From sspitsyn at openjdk.java.net Thu Oct 1 00:22:03 2020 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Thu, 1 Oct 2020 00:22:03 GMT Subject: RFR: 8230664: Fix TestInstanceKlassSize for PowerPC [v2] In-Reply-To: References: Message-ID: On Wed, 30 Sep 2020 21:28:19 GMT, Ziviani wrote: >> TestInstanceKlassSize was failing because, for PowerPC, the following code (instanceKlass.cpp) always compiles to >> `return false;` bool InstanceKlass::has_stored_fingerprint() const { >> #if INCLUDE_AOT >> return should_store_fingerprint() || is_shared(); >> #else >> return false; >> #endif >> } >> However, in `hasStoredFingerprint()@InstanceKlass.java` the condition `shouldStoreFingerprint() || isShared();` is >> always evaluated and may return true (_AFAIK isShared() returns true_). Such condition adds 8 bytes in the >> `getSize()@InstanceKlass.java` causing the failure in TestInstanceKlassSize: public long getSize() { // in number of >> bytes >> ... >> if (hasStoredFingerprint()) { >> size += 8; // uint64_t >> } >> return alignSize(size); >> } >> Considering these tests are failing for PowerPC only (_based on ProblemList.txt_), my solution checks if >> `hasStoredFingerprint()` is running on a PowerPC platform. I decided to go this way because there is no existing flag >> informing whether AOT is included or not and creating a new one just to handle the PowerPC case seems too much. This >> patch is an attempt to fix https://bugs.openjdk.java.net/browse/JDK-8230664 > > Ziviani has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes > the unrelated changes brought in by the merge/rebase. The pull request contains one additional commit since the last > revision: > 8230664: Fix TestInstanceKlassSize > > The code hasStoredFingerprint() at InstanceKlass.java is not considering > AOT disabled at compilation time, like has_stored_fingerprint() at > instanceKlass.cpp does. 
Such difference can cause TestInstanceKlassSize > failures because all objects will have an extra 8-bytes. LGTM ------------- Marked as reviewed by sspitsyn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/358 From jiefu at openjdk.java.net Thu Oct 1 00:28:24 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Thu, 1 Oct 2020 00:28:24 GMT Subject: RFR: 8223347: Integration of Vector API (Incubator) In-Reply-To: References: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> <-JIanIoqecCK7WfRPElGiS9Gora2wP3NfpGZ3hNL_Hg=.2ccfa0e9-33c6-430b-9303-66829e97e6ff@github.com> Message-ID: On Wed, 30 Sep 2020 18:19:53 GMT, Paul Sandoz wrote: >> Hi @PaulSandoz , >> >> This integration seems to miss https://github.com/openjdk/panama-vector/pull/1, which had fixed crashes on AVX512 >> machines. >> Thanks. > > @DamonFool we can follow up later for that fix (and others in `vectorIntrinsics`), after this PR integrates. I don't > want to perturb the code that has already been reviewed, which requires yet more additional review. Hi @PaulSandoz , I think it would be better to integrate it [1] in this MR. I have tested this MR on our AVX512 machines and it still crashes. Also, for the sake of maintenance, it seems NOT a good idea to push a problematic commit into the jdk main-line repo. As for the review process, I don't think it's a problem since the fix [1] is clear and small enough. What do you think? Thanks. 
[1] https://github.com/openjdk/panama-vector/commit/1af35c357066743935bd3f48ce3610a41761f89a ------------- PR: https://git.openjdk.java.net/jdk/pull/367 From psandoz at openjdk.java.net Thu Oct 1 01:04:44 2020 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Thu, 1 Oct 2020 01:04:44 GMT Subject: RFR: 8223347: Integration of Vector API (Incubator) In-Reply-To: References: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> <-JIanIoqecCK7WfRPElGiS9Gora2wP3NfpGZ3hNL_Hg=.2ccfa0e9-33c6-430b-9303-66829e97e6ff@github.com> Message-ID: On Thu, 1 Oct 2020 00:25:46 GMT, Jie Fu wrote: >> @DamonFool we can follow up later for that fix (and others in `vectorIntrinsics`), after this PR integrates. I don't >> want to perturb the code that has already been reviewed, which requires yet more additional review. > > Hi @PaulSandoz , > > I think it would be better to integrate it [1] in this MR. > > I have tested this MR on our AVX512 machines and it still crashes. > Also, for the sake of maintenance, it seems NOT a good idea to push a problematic commit into the jdk main-line repo. > > As for the review process, I don't think it's a problem since the fix [1] is clear and small enough. > > What do you think? > > Thanks. > > [1] https://github.com/openjdk/panama-vector/commit/1af35c357066743935bd3f48ce3610a41761f89a @DamonFool I appreciate your efforts on this but i want to hold back on that issue and follow up very quickly after integration of this PR. This change has been through an extremely long and arduous review process, and i want to stick to what was reviewed and not ask reviewers to go through further cycles on what overall is a very large change. Unfortunately this change is in a holding pattern waiting for the CSR to be approved thereby increasing the window where we might find further issues (that if we had already integrated may have been dealt with separately perhaps in a less timely fashion with respect to that integration). 
Unless an issue is extremely severe I think we should queue them up in `panama-vector/vectorIntrinsics` (there is at least one more for ARM SVE that is queued up). Since the issue you describe effects one instruction, for one type, on AVX512, its impact is limited and will be mitigated by a quick follow up. ------------- PR: https://git.openjdk.java.net/jdk/pull/367 From jiefu at openjdk.java.net Thu Oct 1 01:27:19 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Thu, 1 Oct 2020 01:27:19 GMT Subject: RFR: 8223347: Integration of Vector API (Incubator) In-Reply-To: References: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> <-JIanIoqecCK7WfRPElGiS9Gora2wP3NfpGZ3hNL_Hg=.2ccfa0e9-33c6-430b-9303-66829e97e6ff@github.com> Message-ID: On Thu, 1 Oct 2020 01:01:23 GMT, Paul Sandoz wrote: >> Hi @PaulSandoz , >> >> I think it would be better to integrate it [1] in this MR. >> >> I have tested this MR on our AVX512 machines and it still crashes. >> Also, for the sake of maintenance, it seems NOT a good idea to push a problematic commit into the jdk main-line repo. >> >> As for the review process, I don't think it's a problem since the fix [1] is clear and small enough. >> >> What do you think? >> >> Thanks. >> >> [1] https://github.com/openjdk/panama-vector/commit/1af35c357066743935bd3f48ce3610a41761f89a > > @DamonFool I appreciate your efforts on this but i want to hold back on that issue and follow up very quickly after > integration of this PR. This change has been through an extremely long and arduous review process, and i want to stick > to what was reviewed and not ask reviewers to go through further cycles on what overall is a very large change. 
> Unfortunately this change is in a holding pattern waiting for the CSR to be approved thereby increasing the window > where we might find further issues (that if we had already integrated may have been dealt with separately perhaps in a > less timely fashion with respect to that integration). Unless an issue is extremely severe I think we should queue them > up in `panama-vector/vectorIntrinsics` (there is at least one more for ARM SVE that is queued up). Since the issue you > describe effects one instruction, for one type, on AVX512, its impact is limited and will be mitigated by a quick > follow up. Okay. I can understand it. Vector API is very valuable to us. Hope the follow-ups can be integrated as soon as possible. And thank you all for your great work. Best regards, Jie ------------- PR: https://git.openjdk.java.net/jdk/pull/367 From dholmes at openjdk.java.net Thu Oct 1 01:32:38 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 1 Oct 2020 01:32:38 GMT Subject: RFR: 8253433: Remove -XX:+Debugging product option In-Reply-To: References: Message-ID: On Wed, 30 Sep 2020 12:38:30 GMT, Coleen Phillimore wrote: > The Debugging option shouldn't be used on the command line. There's a SuppressErrorAt option to ignore certain > asserts, if there is some situation needing that. Debugging should never be used. > Tested with tier1 tests on 4 platforms. Seems fine. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/434 From david.holmes at oracle.com Thu Oct 1 02:18:35 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 1 Oct 2020 12:18:35 +1000 Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: Message-ID: Hi Alan, On 1/10/2020 2:30 am, Alan Hayward wrote: > The AArch64 port uses maybe_isb in places where an ISB might be required > because the code may have safepointed. 
These maybe_isbs are very conservative > and are used in many places are used when a safepoint has not happened. > > cross_modify_fence was added in common code to place a barrier in all the > places after a safepoint has occurred. All the uses of it are in common code, > yet it remains unimplemented on AArch64. > > This set of patches implements cross_modify_fence for AArch64 and reconsiders > every uses of maybe_isb, discarding many of them. In addition, it introduces > a new diagnostic option, which when enabled on AArch64 tests the correct > usage of the barriers. > > Advantage of this patch is threefold: > * Reducing the number of ISBs - giving a theoretical performance improvement. > * Use of common code instead of backend specific code. > * Additional test diagnostic options > > Patch 1: Split cross_modify_fence > ================================= > This is simply refactoring work split out to simplify the other two patches. > > instruction_fence() is provided by each target and simply places > a fence for the instruction stream. > > cross_modify_fence() is now a member of JavaThread and just calls > instruction_fence. This function will be extended in Patch 3. I don't agree with the change here. The cross_modify_fence() is not related to thread API imo, it belongs in OrderAccess. The name was deliberately selected to abstract away from the specific details of why a given platform may need this fence: http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-March/037153.html "The name "instruction_pipeline" seems a bit implementation specific about what HW architectural features need to be taken care of due to cross-modifying code, which may or may not apply to a given platform. Perhaps cross_modify_fence(), or something along those lines, would be better. That makes it more clear what we are protecting against, as opposed to what HW architectural features that might concern on a given platform." @robehn , @fisk please chime in here. 
:) Thanks, David > Patch 2: Use cross_modify_fence instead of maybe_isb > ==================================================== > > The [n] References refer to the comments for cross_modify_fence in > thread.hpp. > > This is all the existing uses of maybe_isb in the AArch64 target: > > 1) Instances of Java code calling a VM function > * This encapsulates the changes to: > ** MacroAssembler::call_VM_leaf_base() > ** generate_fast_get_int_field0() > ** stubGenerator_aarch64 generate_throw_exception() > ** sharedRuntime_aarch64 generate_handler_blob() > ** SharedRuntime::generate_resolve_blob() > ** C1 LIR_Assembler::rt_call > ** C1 StubAssembler::call_RT(): used by Used by generate_exception_throw, > generate_handle_exception, generate_code_for. > ** OptoRuntime::generate_exception_blob() > * Any changes will be caught due to calls to [2] or [3] by the VM function. > * Any calls that do not call [2] or [3] do not require an ISB. > * This patch is more optimal for these cases. > > 2) Instances of Java code calling a JNI function > * This encapsulates the changes to: > ** SharedRuntime::generate_native_wrapper() > ** TemplateInterpreterGenerator::generate_native_entry() > * A safepoint still in progress after the call with be caught by [4]. > * An ISB is still required for the case where there was a safepoint > but it completed during the call. This happens if the code doesn't > branch on safepoint_in_progress > * In the SharedRuntime version, the two possible calls to > reguard_yellow_pages and complete_monitor_unlocking_C are after the thread > goes back into it's original state, so are covered by [2] and [3], the > same as a normal VM call. > * This patch is only more optimal for the two post-JNI calls. > > 3) Patching functions > * This encapsulates the changes to: > ** patch_callers_callsite() (called by gen_c2i_adapter()) > * This results in code being patched, but does not safepoint > * Therefore an ISB is required. > * This patch introduces no change here. 
> > 4) C1 MacroAssembler::emit_static_call_stub() > * Calls ISB (not maybe_isb) > * By design, the patching doesn't require that the up-to-date > destination is required for proper functioning. > * However, the ISB makes it most likely that the new destination will > be picked up. > * This patch introduces no change here. > > Patch 3: Add cross modify fence verification > ============================================ > > The VerifyCrossModifyFence diagnostic flag enables confirmation to the correct > usage of instruction barriers. It can safely be enabled on any Java run. > > Enabling it will cause the following: > > * Once all threads have been brought to a safepoint, each thread will be > marked. > > * On a cross_modify_fence and safepoint_fence the mark for that thread > will be cleared. > > * On entry to a method and in a safepoint poll, then the thread is checked. > If it is marked, then the code will error. > > ------------- > > Commit messages: > - AArch64: Add cross modify fence verification > - AArch64: Use cross_modify_fence instead of maybe_isb > - Split cross_modify_fence > > Changes: https://git.openjdk.java.net/jdk/pull/428/files > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=428&range=00 > Issue: https://bugs.openjdk.java.net/browse/JDK-8221554 > Stats: 179 lines in 26 files changed: 127 ins; 7 del; 45 mod > Patch: https://git.openjdk.java.net/jdk/pull/428.diff > Fetch: git fetch https://git.openjdk.java.net/jdk pull/428/head:pull/428 > > PR: https://git.openjdk.java.net/jdk/pull/428 > From eosterlund at openjdk.java.net Thu Oct 1 06:06:41 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 1 Oct 2020 06:06:41 GMT Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: Message-ID: On Wed, 30 Sep 2020 08:36:32 GMT, Alan Hayward wrote: > The AArch64 port uses maybe_isb in places where an ISB might be required > because the code may have safepointed. 
These maybe_isbs are very conservative > and are used in many places are used when a safepoint has not happened. > > cross_modify_fence was added in common code to place a barrier in all the > places after a safepoint has occurred. All the uses of it are in common code, > yet it remains unimplemented on AArch64. > > This set of patches implements cross_modify_fence for AArch64 and reconsiders > every uses of maybe_isb, discarding many of them. In addition, it introduces > a new diagnostic option, which when enabled on AArch64 tests the correct > usage of the barriers. > > Advantage of this patch is threefold: > * Reducing the number of ISBs - giving a theoretical performance improvement. > * Use of common code instead of backend specific code. > * Additional test diagnostic options > > Patch 1: Split cross_modify_fence > ================================= > This is simply refactoring work split out to simplify the other two patches. > > instruction_fence() is provided by each target and simply places > a fence for the instruction stream. > > cross_modify_fence() is now a member of JavaThread and just calls > instruction_fence. This function will be extended in Patch 3. > > Patch 2: Use cross_modify_fence instead of maybe_isb > ==================================================== > > The [n] References refer to the comments for cross_modify_fence in > thread.hpp. > > This is all the existing uses of maybe_isb in the AArch64 target: > > 1) Instances of Java code calling a VM function > * This encapsulates the changes to: > ** MacroAssembler::call_VM_leaf_base() > ** generate_fast_get_int_field0() > ** stubGenerator_aarch64 generate_throw_exception() > ** sharedRuntime_aarch64 generate_handler_blob() > ** SharedRuntime::generate_resolve_blob() > ** C1 LIR_Assembler::rt_call > ** C1 StubAssembler::call_RT(): used by Used by generate_exception_throw, > generate_handle_exception, generate_code_for. 
> ** OptoRuntime::generate_exception_blob() > * Any changes will be caught due to calls to [2] or [3] by the VM function. > * Any calls that do not call [2] or [3] do not require an ISB. > * This patch is more optimal for these cases. > > 2) Instances of Java code calling a JNI function > * This encapsulates the changes to: > ** SharedRuntime::generate_native_wrapper() > ** TemplateInterpreterGenerator::generate_native_entry() > * A safepoint still in progress after the call with be caught by [4]. > * An ISB is still required for the case where there was a safepoint > but it completed during the call. This happens if the code doesn't > branch on safepoint_in_progress > * In the SharedRuntime version, the two possible calls to > reguard_yellow_pages and complete_monitor_unlocking_C are after the thread > goes back into it's original state, so are covered by [2] and [3], the > same as a normal VM call. > * This patch is only more optimal for the two post-JNI calls. > > 3) Patching functions > * This encapsulates the changes to: > ** patch_callers_callsite() (called by gen_c2i_adapter()) > * This results in code being patched, but does not safepoint > * Therefore an ISB is required. > * This patch introduces no change here. > > 4) C1 MacroAssembler::emit_static_call_stub() > * Calls ISB (not maybe_isb) > * By design, the patching doesn't require that the up-to-date > destination is required for proper functioning. > * However, the ISB makes it most likely that the new destination will > be picked up. > * This patch introduces no change here. > > Patch 3: Add cross modify fence verification > ============================================ > > The VerifyCrossModifyFence diagnostic flag enables confirmation to the correct > usage of instruction barriers. It can safely be enabled on any Java run. > > Enabling it will cause the following: > > * Once all threads have been brought to a safepoint, each thread will be > marked. 
> > * On a cross_modify_fence and safepoint_fence the mark for that thread > will be cleared. > > * On entry to a method and in a safepoint poll, then the thread is checked. > If it is marked, then the code will error. I am about to integrate concurrent stack scanning for ZGC. This patch changes things a bit so that disarming of the poll word is always done only by the thread itself. The consequence is that if a thread is in native or blocked, and the safepoint finishes, then the thread will when waking up, still take one slow path to figure out all is done, disarm the poll and run the cross modifying fence. The consequence is that we trivially need the cross modifying fence only in the poll slow path and nowhere else. So maybe if you wait for that one, we can delete a few more ISBs, and IMO also delete the verification code, as you can't really mess it up any more. ------------- PR: https://git.openjdk.java.net/jdk/pull/428 From stefank at openjdk.java.net Thu Oct 1 07:09:39 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 1 Oct 2020 07:09:39 GMT Subject: RFR: 8247912: Make narrowOop a scoped enum [v5] In-Reply-To: References: Message-ID: On Wed, 30 Sep 2020 18:50:46 GMT, Kim Barrett wrote: >> Please review this change to the type narrowOop from a typedef for juint to a >> scoped enum with uint32_t as the representation type. This provides stronger >> type checking when using this type. >> >> For the most part this was fairly straightforward, and the patch size is >> relatively small. The implementation of some existing CompressedOops >> "primitives" required adjustment. An explicit conversion to narrowOop was >> added, with casts change to use it. There were a few places that were type >> punning and needed explicit conversions,, mostly in platform-specific assembly >> support. >> >> There are a couple of lingering problems. >> >> Relocation::pd_set_data_value in relocInfo_ppc.cpp is treating a narrowKlass >> as a narrowOop. 
I adjusted the code to accommodate the narrowOop change, but >> this probably ought to be done differently. >> >> There are a couple of `(narrowOop)` casts remaining in s390.ad. I'm not sure >> whether these can be safely converted to CompressedOops::narrow_oop_cast. >> >> There might still be some casts from narrowOop to an integral type. Those are >> hard to find in our cast-happy code base. >> >> Testing: >> tier1-6 for Oracle supported platforms. >> Build fastdebug linux-ppc64le and linux-s390x. > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev > excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since > the last revision: > - Merge branch 'master' into strong_narrowoop > - add missing inlines for consistency > - stefank review > - improve assertion > - remove NarrowType > - 8247912: Make narrowOop a scoped enum Marked as reviewed by stefank (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/273 From github.com+4146708+a74nh at openjdk.java.net Thu Oct 1 08:37:57 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Thu, 1 Oct 2020 08:37:57 GMT Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 06:04:03 GMT, Erik Österlund wrote: > I am about to integrate concurrent stack scanning for ZGC. This patch changes things a bit so that disarming of the > poll word is always done only by the thread itself. The consequence is that if a thread is in native or blocked, and > the safepoint finishes, then the thread will when waking up, still take one slow path to figure out all is done, disarm > the poll and run the cross modifying fence. The consequence is that we trivially need the cross modifying fence only in > the poll slow path and nowhere else. 
So maybe if you wait for that one, we can delete a few more ISBs, and IMO also > delete the verification code, as you can't really mess it up any more. I'd need to see what the code looks like - but I'm thinking it just reduces the number of cross_modify_fence calls, instead of reducing explicit isb calls in the backend (of which only 3 remain with this patch). Either way, it's good news. I'm going to try rebasing locally on the top of your patch (I'm assuming it's this one https://github.com/openjdk/jdk/pull/296 ) and see where that gets me. I'd be cautious about removing the verification code - it should be useful for helping to catch issues if we do start getting 1 run in a million crash dumps. But agreed, doesn't need to be there if the whole isb process has become trivial. ------------- PR: https://git.openjdk.java.net/jdk/pull/428 From eosterlund at openjdk.java.net Thu Oct 1 09:18:38 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Thu, 1 Oct 2020 09:18:38 GMT Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v9] In-Reply-To: References: Message-ID: <4lDALRmzwYmR7Fjl7vyUhbjpryVmVjh_lDKAwTp8Fwc=.511a9972-8ba7-4304-99ab-3c548ccc111b@github.com> > This PR contains the implementation of "JEP 376: ZGC: Concurrent Thread-Stack Processing" (cf. > https://openjdk.java.net/jeps/376). > Basically, this patch modifies the epilog safepoint when returning from a frame (supporting interpreter frames, c1, c2, > and native wrapper frames), to compare the stack pointer against a thread-local value. This turns return polls into > more of a swiss army knife that can be used to poll for safepoints, handshakes, but also returns into not yet safe to > expose frames, denoted by a "stack watermark". ZGC will leave frames (and other thread oops) in a state of a mess in > the GC checkpoint safepoints, rather than processing all threads and their stacks. 
Processing is initialized > automagically when threads wake up for a safepoint, or get poked by a handshake or safepoint. Said initialization > processes a few (3) frames and other thread oops. The rest - the bulk of the frame processing, is deferred until it is > actually needed. It is needed when a frame is exposed to either 1) execution (returns or unwinding due to exception > handling), or 2) stack walker APIs. A hook is then run to go and finish the lazy processing of frames. Mutator and GC > threads can compete for processing. The processing is therefore performed under a per-thread lock. Note that disarming > of the poll word (that the returns are comparing against) is only performed by the thread itself. So sliding the > watermark up will require one runtime call for a thread to note that nothing needs to be done, and then update the poll > word accordingly. Downgrading the poll word concurrently by other threads was simply not worth the complexity it > brought (and is only possible on TSO machines). So left that one out. 
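As a rough illustration of the return poll and lazy frame processing described above, here is a simplified model. It is a sketch under stated assumptions, not the HotSpot implementation: the names `StackWatermarkModel`, `Frame`, `kArmed`, `kDisarmed` and `kInitialFrames` are hypothetical, the stack is assumed to grow downward (callers have higher stack pointers than callees), and the real poll word encoding may differ.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical constants for this model only.
constexpr std::uintptr_t kArmed = 0;               // every return takes the slow path
constexpr std::uintptr_t kDisarmed = UINTPTR_MAX;  // no return takes the slow path
constexpr std::size_t kInitialFrames = 3;          // frames processed eagerly on wake-up

struct Frame {
  std::uintptr_t sp;  // stack pointer of the frame
  bool processed;     // true once the frame is safe to expose
};

class StackWatermarkModel {
 public:
  // frames must be ordered youngest first, i.e. by ascending sp.
  explicit StackWatermarkModel(std::vector<Frame> frames)
      : _frames(std::move(frames)), _next(0), _poll_word(kDisarmed) {
    for (std::size_t i = 1; i < _frames.size(); ++i) {
      assert(_frames[i - 1].sp < _frames[i].sp);
    }
  }

  // GC checkpoint: all frames become stale and the poll word is armed.
  void arm() {
    std::lock_guard<std::mutex> guard(_lock);
    for (Frame& f : _frames) f.processed = false;
    _next = 0;
    _poll_word.store(kArmed);
  }

  // Fast path at a return: compare the caller's sp with the poll word.
  bool return_needs_slow_path(std::uintptr_t caller_sp) const {
    return caller_sp > _poll_word.load();
  }

  // Run when the thread wakes up: process only the first few frames.
  void start_processing() { process_up_to(kInitialFrames); }

  // Slow path of the return poll: finish processing up to the caller's frame.
  void return_slow_path(std::uintptr_t caller_sp) {
    std::size_t wanted = 0;
    while (wanted < _frames.size() && _frames[wanted].sp <= caller_sp) ++wanted;
    process_up_to(wanted);
  }

  std::uintptr_t poll_word() const { return _poll_word.load(); }

 private:
  void process_up_to(std::size_t count) {
    std::lock_guard<std::mutex> guard(_lock);  // processing is serialized per thread
    while (_next < count && _next < _frames.size()) {
      _frames[_next++].processed = true;
    }
    if (_next == _frames.size()) {
      _poll_word.store(kDisarmed);              // nothing left to process
    } else if (_next > 0) {
      _poll_word.store(_frames[_next - 1].sp);  // watermark: last processed frame
    }
  }

  std::vector<Frame> _frames;
  std::size_t _next;  // index of the first unprocessed frame
  std::atomic<std::uintptr_t> _poll_word;
  std::mutex _lock;
};
```

In this model a return into a frame at or below the watermark stays on the fast path, while a return into an older frame takes the slow path once, which processes the missing frames and moves the watermark up.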
Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Review: Kim CR 1 and exception handling fix ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/296/files - new: https://git.openjdk.java.net/jdk/pull/296/files/2ffbd764..83c40895 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=296&range=08 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=296&range=07-08 Stats: 82 lines in 9 files changed: 48 ins; 26 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/296.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/296/head:pull/296 PR: https://git.openjdk.java.net/jdk/pull/296 From eosterlund at openjdk.java.net Thu Oct 1 09:53:21 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Thu, 1 Oct 2020 09:53:21 GMT Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v8] In-Reply-To: <4sawJHiIuc7oH5ETjrwJtJE3gkB1U2VBMVJdPmxJrg4=.e4e9b4d3-a118-4870-9b5b-f23b351093e2@github.com> References: <4sawJHiIuc7oH5ETjrwJtJE3gkB1U2VBMVJdPmxJrg4=.e4e9b4d3-a118-4870-9b5b-f23b351093e2@github.com> Message-ID: On Tue, 29 Sep 2020 16:09:48 GMT, Robbin Ehn wrote: >> Erik Österlund has updated the pull request with a new target base due to a merge or a rebase. The pull request now >> contains 12 commits: >> - Review: Move barrier detach >> - Review: Remove assert that has outstayed its welcome >> - Merge branch 'master' into 8253180_conc_stack_scanning >> - Review: Albert CR2 and defensive programming >> - Review: StefanK CR 3 >> - Review: Per CR 1 >> - Merge branch 'master' into 8253180_conc_stack_scanning >> - Review: Albert CR 1 >> - Review: SteafanK CR 2 >> - Merge branch 'master' into 8253180_conc_stack_scanning >> - ... and 2 more: https://git.openjdk.java.net/jdk/compare/6bddeb70...2ffbd764 > > Marked as reviewed by rehn (Reviewer). 
> _Mailing list message from [Kim Barrett](mailto:kim.barrett at oracle.com) on > [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ > I've only looked at scattered pieces, but what I've looked at seemed to be > in good shape. Only a few minor comments. > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/frame.cpp > 456 // for(StackFrameStream fst(thread); !fst.is_done(); fst.next()) { > > Needs to be updated for the new constructor arguments. Just in general, the > class documentation seems to need some updating for this change. Fixed. > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/frame.cpp > 466 StackFrameStream(JavaThread *thread, bool update, bool process_frames); > > Something to consider is that bool parameters like that, especially when > there are multiple, are error prone. An alternative is distinct enums, which > likely also obviates the need for comments in calls. Coleen also had the same comment, and we agreed to file a follow-up RFE to clean that up. This also applies to all the existing parameters passed in. So I would still like to do that, but in a follow-up RFE. Said RFE will also make the parameter use in RegisterMap and vframeStream explicit. > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/thread.hpp > 956 void set_processed_thread(Thread *thread) { _processed_thread = thread; } > > I think this should assert that either _processed_thread or thread are NULL. > Or maybe the RememberProcessedThread constructor should be asserting that > _cur_thr->processed_thread() is NULL. Fixed. > ------------------------------------------------------------------------------ Thanks for the review Kim! 
------------- PR: https://git.openjdk.java.net/jdk/pull/296 From shade at openjdk.java.net Thu Oct 1 09:58:51 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 1 Oct 2020 09:58:51 GMT Subject: RFR: 8253891: Debug x86_32 builds fail after JDK-8239090 Message-ID: `CPU_MAX_FEATURE` is actually a `uint64_t`, with at least 46 bits set. `exact_log2` expects `intptr_t`. The implicit conversion works on 64-bit, but fails on 32-bit. Calling to `exact_log2_long` seems to cater for both bitnesses. Testing: - [x] tier1 on Linux x86_64 - [x] tier1 on Linux x86_32 (some unrelated failures) ------------- Commit messages: - 8253891: Debug x86_32 builds fail after JDK-8239090 Changes: https://git.openjdk.java.net/jdk/pull/455/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=455&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253891 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/455.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/455/head:pull/455 PR: https://git.openjdk.java.net/jdk/pull/455 From eosterlund at openjdk.java.net Thu Oct 1 10:15:32 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Thu, 1 Oct 2020 10:15:32 GMT Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v8] In-Reply-To: References: <4sawJHiIuc7oH5ETjrwJtJE3gkB1U2VBMVJdPmxJrg4=.e4e9b4d3-a118-4870-9b5b-f23b351093e2@github.com> Message-ID: On Thu, 1 Oct 2020 09:50:45 GMT, Erik Österlund wrote: >> Marked as reviewed by rehn (Reviewer). > >> _Mailing list message from [Kim Barrett](mailto:kim.barrett at oracle.com) on >> [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ >> I've only looked at scattered pieces, but what I've looked at seemed to be >> in good shape. Only a few minor comments. 
>> >> ------------------------------------------------------------------------------ >> src/hotspot/share/runtime/frame.cpp >> 456 // for(StackFrameStream fst(thread); !fst.is_done(); fst.next()) { >> >> Needs to be updated for the new constructor arguments. Just in general, the >> class documentation seems to need some updating for this change. > > Fixed. > >> ------------------------------------------------------------------------------ >> src/hotspot/share/runtime/frame.cpp >> 466 StackFrameStream(JavaThread *thread, bool update, bool process_frames); >> >> Something to consider is that bool parameters like that, especially when >> there are multiple, are error prone. An alternative is distinct enums, which >> likely also obviates the need for comments in calls. > > Coleen also had the same comment, and we agreed to file a follow-up RFE to clean that up. This also applies to all the > existing parameters passed in. So I would still like to do that, but in a follow-up RFE. Said RFE will also make the > parameter use in RegisterMap and vframeStream explicit. >> ------------------------------------------------------------------------------ >> src/hotspot/share/runtime/thread.hpp >> 956 void set_processed_thread(Thread *thread) { _processed_thread = thread; } >> >> I think this should assert that either _processed_thread or thread are NULL. >> Or maybe the RememberProcessedThread constructor should be asserting that >> _cur_thr->processed_thread() is NULL. > > Fixed. > >> ------------------------------------------------------------------------------ > > Thanks for the review Kim! In my last PR update, I included a fix to an exception handling problem that I encountered after lots of stress testing that I have been running for a while now. I managed to catch the issue, get a reliable reproducer, and fix it. The root problem is that the hook I had placed in SharedRuntime::exception_handler_for_return_address has been ignored. 
The reason is that the stack is not walkable at this point. The hook then just ignores it. This had some unexpected consequences. After looking closer at this code, I found that if we did have a walkable stack when we call SharedRuntime::raw_exception_handler_for_return_address, that would have been the only hook we need at all for exception handling. It is always the common root point where we unwind into a caller frame due to an exception throwing into the caller, and we need to look up the rethrow handler of the caller. However, we are indeed not walkable here. To deal with this, I have rearranged the exception hooks a bit. First of all, I have deleted all before_unwind hooks for exception handling, because they should not be needed if the after_unwind hook is reliably called on the caller side instead. And those hooks do indeed need to be there, because we do not always have a point where we can call before_unwind (e.g. C1 unwind exception code, that just unwinds and looks up the rethrow handler via SharedRuntime::exception_handler_for_return_address). I have then traced all paths from SharedRuntime::raw_exception_handler_for_return_address into runtime rethrow handlers called, for each rethrow exception handler PC exposed in the function. They are: * OptoRuntime::rethrow_C when unwinding into C2 code * exception_handler_for_pc_helper via Runtime1::handle_exception_from_callee_id when unwinding into C1 code * JavaCallWrapper::~JavaCallWrapper when unwinding into a Java call stub * InterpreterRuntime::exception_handler_for_exception when unwinding into an interpreted method * Deoptimization::fetch_unroll_info (with exec_mode == Unpack_exception) when unwinding into a deoptimized nmethod Each rethrow handler returned has a corresponding comment saying which runtime rethrow handler it will end up in, once the stack has become walkable and we have transferred control into the caller. And all of those runtime hooks now have an after_unwind() hook.
The good news is that now the responsibility for who calls the unwind hook for exception is clearer: it is never done by the callee, and always done by the caller, in its rethrow handler, at which point the stack is walkable. In order to avoid further issues where an unwind hook is ignored, I have changed them to assert that there is a last_Java_frame present. Previously I did not assert that, because there was shared code between runtime native transitions and the native wrapper, that called an unwind hook. This prevented the unwind hooks to assert this, as compiler threads would still perform native transitions, that just ignored the request. I moved the problematic hook for native up one level in the hierarchy to a path where it is only called by native wrappers (where we always have a last_Java_frame), so that I can finally assert that the unwind hooks always are called at points where we have a last_Java_frame. This makes me feel confident that I do not have another hook that is being accidentally ignored. However, the relationship for the various exception handling code executed in the caller, the callee, and between the two (before we are walkable) is rather complicated. So it would be good to have someone that knows the exception code very well have a look at this, to make sure I have not missed anything. I have rerun all testing and done a load of stress testing to sanity check this. The reproducer I eventually found that reproduced the issue with 100% success rate, was run many times with the new patch, and no longer reproduces any issue. 
------------- PR: https://git.openjdk.java.net/jdk/pull/296 From stuefe at openjdk.java.net Thu Oct 1 10:16:41 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 1 Oct 2020 10:16:41 GMT Subject: RFR: 8253891: Debug x86_32 builds fail after JDK-8239090 In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 09:52:54 GMT, Aleksey Shipilev wrote: > `CPU_MAX_FEATURE` is actually a `uint64_t`, with at least 46 bits set. `exact_log2` expects `intptr_t`. The implicit > conversion works on 64-bit, but fails on 32-bit. Calling to `exact_log2_long` seems to cater for both bitnesses. > Testing: > - [x] tier1 on Linux x86_64 > - [x] tier1 on Linux x86_32 (some unrelated failures) Looks good. I like the plural of bitness. jlong is signed and intptr_t is unsigned, but I don't think that is a problem here. I see a gtest for the exact_log2_long testing all bits. Would be clearer though to have an uint64_t variant. ------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/455 From eosterlund at openjdk.java.net Thu Oct 1 10:17:58 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 1 Oct 2020 10:17:58 GMT Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 08:35:06 GMT, Alan Hayward wrote: >> I am about to integrate concurrent stack scanning for ZGC. This patch changes things a bit so that disarming of the >> poll word is always done only by the thread itself. The consequence is that if a thread is in native or blocked, and >> the safepoint finishes, then the thread will when waking up, still take one slow path to figure out all is done, disarm >> the poll and run the cross modifying fence. The consequence is that we trivially need the cross modifying fence only in >> the poll slow path and nowhere else. 
So maybe if you wait for that one, we can delete a few more ISBs, and IMO also >> delete the verification code, as you can't really mess it up any more. > >> I am about to integrate concurrent stack scanning for ZGC. This patch changes things a bit so that disarming of the >> poll word is always done only by the thread itself. The consequence is that if a thread is in native or blocked, and >> the safepoint finishes, then the thread will when waking up, still take one slow path to figure out all is done, disarm >> the poll and run the cross modifying fence. The consequence is that we trivially need the cross modifying fence only in >> the poll slow path and nowhere else. So maybe if you wait for that one, we can delete a few more ISBs, and IMO also >> delete the verification code, as you can't really mess it up any more. > > I'd need to see what the code looks like - but I'm thinking it just reduces the number of cross_modify_fence calls, > instead of reducing explicit isb calls in the backend (of which only 3 remain with this patch). Either way, it's good > news. I'm going to try rebasing locally on the top of your patch (I'm assuming it's this one > https://github.com/openjdk/jdk/pull/296 ) and see where that gets me. I'd be cautious about removing the verification > code - it should be useful for helping to catch issues if we do start getting 1 run in a million crash dumps. But > agreed, doesn't need to be there if the whole isb process has become trivial. You no longer need any isb when returning from the native wrappers (interpreted or compiled variant) after my patch. That *should* be 2 if the mentioned 3 hooks (without looking closely at the code). Because that will be done in the runtime instead when waking up from native. Which hook do we have left? 
------------- PR: https://git.openjdk.java.net/jdk/pull/428 From stuefe at openjdk.java.net Thu Oct 1 10:21:09 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 1 Oct 2020 10:21:09 GMT Subject: RFR: 8253891: Debug x86_32 builds fail after JDK-8239090 In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 10:14:22 GMT, Thomas Stuefe wrote: > Looks good. I like the plural of bitness. > jlong is signed and intptr_t is unsigned, but I don't think that is a problem here. I see a gtest for the > exact_log2_long testing all bits. Would be clearer though to have an uint64_t variant. Oh never mind the last remark. The intptr_t variant was signed too, so we do not change anything for 64bit. ------------- PR: https://git.openjdk.java.net/jdk/pull/455 From kim.barrett at oracle.com Thu Oct 1 10:36:38 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 1 Oct 2020 06:36:38 -0400 Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v8] In-Reply-To: References: <4sawJHiIuc7oH5ETjrwJtJE3gkB1U2VBMVJdPmxJrg4=.e4e9b4d3-a118-4870-9b5b-f23b351093e2@github.com> Message-ID: <9D9D71DC-1AB6-41BE-A515-C0510FA47EC0@oracle.com> > On Oct 1, 2020, at 5:53 AM, Erik Österlund wrote: > > >> _Mailing list message from [Kim Barrett](mailto:kim.barrett at oracle.com) onsrc/hotspot/share/runtime/frame.cpp >> 466 StackFrameStream(JavaThread *thread, bool update, bool process_frames); >> >> Something to consider is that bool parameters like that, especially when >> there are multiple, are error prone. An alternative is distinct enums, which >> likely also obviates the need for comments in calls. > > Coleen also had the same comment, and we agreed to file a follow-up RFE to clean that up. This also applies to all the > existing parameters passed in. So I would still like to do that, but in a follow-up RFE. Said RFE will also make the > parameter use in RegisterMap and vframeStream explicit. That's fine with me.
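As a rough illustration of the distinct-enum alternative discussed above for the two bool parameters of StackFrameStream, here is a minimal C++ sketch. The names UpdateMap, ProcessFrames and FrameWalker are hypothetical and chosen only for this example; they are not taken from the patch or from the follow-up RFE.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical scoped enums standing in for the two bool parameters.
// Each flag gets its own type, so the call site documents itself.
enum class UpdateMap { skip, include };
enum class ProcessFrames { skip, include };

// Hypothetical stand-in for a stack walker; not the HotSpot class.
class FrameWalker {
 public:
  FrameWalker(UpdateMap update, ProcessFrames process)
      : _update(update == UpdateMap::include),
        _process(process == ProcessFrames::include) {}

  bool updates_map() const { return _update; }
  bool processes_frames() const { return _process; }

 private:
  bool _update;
  bool _process;
};

// Scoped enums do not convert implicitly, so swapped arguments or raw
// bools are rejected at compile time instead of silently accepted.
static_assert(!std::is_constructible<FrameWalker, ProcessFrames, UpdateMap>::value,
              "swapped arguments must not compile");
static_assert(!std::is_constructible<FrameWalker, bool, bool>::value,
              "raw bools must not compile");
```

A call then reads FrameWalker w(UpdateMap::include, ProcessFrames::skip); which needs no trailing comments to explain what each argument means.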
From kbarrett at openjdk.java.net Thu Oct 1 10:48:23 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 1 Oct 2020 10:48:23 GMT Subject: Integrated: 8247912: Make narrowOop a scoped enum In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 04:01:34 GMT, Kim Barrett wrote: > Please review this change to the type narrowOop from a typedef for juint to a > scoped enum with uint32_t as the representation type. This provides stronger > type checking when using this type. > > For the most part this was fairly straightforward, and the patch size is > relatively small. The implementation of some existing CompressedOops > "primitives" required adjustment. An explicit conversion to narrowOop was > added, with casts changed to use it. There were a few places that were type > punning and needed explicit conversions, mostly in platform-specific assembly > support. > > There are a couple of lingering problems. > > Relocation::pd_set_data_value in relocInfo_ppc.cpp is treating a narrowKlass > as a narrowOop. I adjusted the code to accommodate the narrowOop change, but > this probably ought to be done differently. > > There are a couple of `(narrowOop)` casts remaining in s390.ad. I'm not sure > whether these can be safely converted to CompressedOops::narrow_oop_cast. > > There might still be some casts from narrowOop to an integral type. Those are > hard to find in our cast-happy code base. > > Testing: > tier1-6 for Oracle supported platforms. > Build fastdebug linux-ppc64le and linux-s390x. This pull request has now been integrated.
Changeset: 2d9fa9da Author: Kim Barrett URL: https://git.openjdk.java.net/jdk/commit/2d9fa9da Stats: 97 lines in 22 files changed: 42 ins; 9 del; 46 mod 8247912: Make narrowOop a scoped enum Reviewed-by: iklam, stefank ------------- PR: https://git.openjdk.java.net/jdk/pull/273 From tschatzl at openjdk.java.net Thu Oct 1 11:53:28 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 1 Oct 2020 11:53:28 GMT Subject: RFR: 8253650: Cleanup: remove alignment_hint parameter from os::reserve_memory [v2] In-Reply-To: References: Message-ID: On Wed, 30 Sep 2020 12:34:45 GMT, Thomas Stuefe wrote: >> Hi all, >> >> since ancient times os::reserve_memory() carried around an "alignment_hint" parameter. It was undocumented but from the >> name it suggests it would cause os::reserve_memory() to try to align the start of the mapping to a given alignment. >> However, the only platform ever doing anything with this parameter was AIX, and there only in mmap() mode. All other >> platforms ignored the parameter. So it can be removed, provided we fix the AIX case. >> Notes: >> - if one really needs alignment memory, there is os::reserve_memory_aligned() which guarantees the alignment. It will do >> the usual over-reserving-and-chopping-away to do that. >> - On AIX there is a second reason why we align the mmap() result pointer to 64K, since we "fake" 64K pages in some >> places. I disentangled that alignment handling from the caller provided alignment. >> - This affects os::reserve_memory() as well as the new os::reserve_memory_with_fd() >> - I also fixed comments in virtualSpace.cpp which do not apply anymore after JDK-8253638 >> >> Tests: tier1, manual builds and tests on AIX > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Comment fixes Marked as reviewed by tschatzl (Reviewer). src/hotspot/os/posix/os_posix.cpp line 327: > 325: // Todo: this probably works more out of accident. 
Using reserve_mmapped_memory would require an munmap > 326: // to release, but later in this function os::release_memory is used which is not guaranteed to use mmap. > 327: // See JDK-8253851. Looks good apart from these two todo comments (i.e. the TODO:really? one and the last paragraph): they only replicate the information you added to JDK-8253851. Source code isn't a good place to track issues, so I would prefer if these comments were removed since they seem superfluous. Your call. ------------- PR: https://git.openjdk.java.net/jdk/pull/430 From stuefe at openjdk.java.net Thu Oct 1 12:00:08 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 1 Oct 2020 12:00:08 GMT Subject: RFR: 8253650: Cleanup: remove alignment_hint parameter from os::reserve_memory [v2] In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 11:50:43 GMT, Thomas Schatzl wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> Comment fixes > > src/hotspot/os/posix/os_posix.cpp line 327: > >> 325: // Todo: this probably works more out of accident. Using reserve_mmapped_memory would require an munmap >> 326: // to release, but later in this function os::release_memory is used which is not guaranteed to use mmap. >> 327: // See JDK-8253851. > > Looks good apart from these two todo comments (i.e. the TODO:really? one and the last paragraph): they only replicate > the information you added to JDK-8253851. Source code isn't a good place to track issues, so I would prefer if these > comments were removed since they seem superfluous. Your call. Thanks Thomas. I will remove the TODOs as you requested. As outside contributor I find those remarks very useful to make internal "xxx works on it" herd knowledge visible. But here I have no strong emotions. 
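For readers unfamiliar with the "over-reserving-and-chopping-away" technique mentioned in this thread for os::reserve_memory_aligned(), the following is a minimal POSIX sketch of the idea. It is not the HotSpot implementation; the helper name reserve_aligned is hypothetical, and it assumes that size is a multiple of the page size and that alignment is a power of two and a multiple of the page size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>

// Hypothetical helper, for illustration only.
// Assumptions: size is a multiple of the page size; alignment is a
// power of two and a multiple of the page size.
void* reserve_aligned(size_t size, size_t alignment) {
  // Reserve enough extra space that an aligned block of `size` bytes
  // must fit somewhere inside the mapping.
  const size_t extra = size + alignment;
  void* raw = mmap(nullptr, extra, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (raw == MAP_FAILED) {
    return nullptr;
  }

  const uintptr_t start = reinterpret_cast<uintptr_t>(raw);
  const uintptr_t mask = static_cast<uintptr_t>(alignment) - 1;
  const uintptr_t aligned = (start + mask) & ~mask;

  // Chop away the unaligned head and the unused tail.
  const size_t head = aligned - start;
  const size_t tail = extra - head - size;
  if (head > 0) {
    munmap(raw, head);
  }
  if (tail > 0) {
    munmap(reinterpret_cast<void*>(aligned + size), tail);
  }
  return reinterpret_cast<void*>(aligned);
}
```

The caller releases the result with munmap(p, size). This pairing of mmap with munmap is also why the release path has to match the reserve path, which is the concern raised in the os_posix.cpp comment quoted above.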
------------- PR: https://git.openjdk.java.net/jdk/pull/430 From stuefe at openjdk.java.net Thu Oct 1 12:13:01 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 1 Oct 2020 12:13:01 GMT Subject: RFR: 8253650: Cleanup: remove alignment_hint parameter from os::reserve_memory [v3] In-Reply-To: References: Message-ID: > Hi all, > > since ancient times os::reserve_memory() carried around an "alignment_hint" parameter. It was undocumented but from the > name it suggests it would cause os::reserve_memory() to try to align the start of the mapping to a given alignment. > However, the only platform ever doing anything with this parameter was AIX, and there only in mmap() mode. All other > platforms ignored the parameter. So it can be removed, provided we fix the AIX case. > Notes: > - if one really needs alignment memory, there is os::reserve_memory_aligned() which guarantees the alignment. It will do > the usual over-reserving-and-chopping-away to do that. > - On AIX there is a second reason why we align the mmap() result pointer to 64K, since we "fake" 64K pages in some > places. I disentangled that alignment handling from the caller provided alignment. 
> - This affects os::reserve_memory() as well as the new os::reserve_memory_with_fd() > - I also fixed comments in virtualSpace.cpp which do not apply anymore after JDK-8253638 > > Tests: tier1, manual builds and tests on AIX Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: Update comment in os_posix.cpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/430/files - new: https://git.openjdk.java.net/jdk/pull/430/files/d02225e7..b48c782a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=430&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=430&range=01-02 Stats: 7 lines in 1 file changed: 0 ins; 3 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/430.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/430/head:pull/430 PR: https://git.openjdk.java.net/jdk/pull/430 From stuefe at openjdk.java.net Thu Oct 1 12:13:06 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 1 Oct 2020 12:13:06 GMT Subject: Integrated: 8253650: Cleanup: remove alignment_hint parameter from os::reserve_memory In-Reply-To: References: Message-ID: <69iwH3l41NAYaYqZqoAGi1MCPDEu1prJRXhIm_HSS9s=.c19428b6-b354-4fb9-bbd0-7b6d728fb625@github.com> On Wed, 30 Sep 2020 10:08:09 GMT, Thomas Stuefe wrote: > Hi all, > > since ancient times os::reserve_memory() carried around an "alignment_hint" parameter. It was undocumented but from the > name it suggests it would cause os::reserve_memory() to try to align the start of the mapping to a given alignment. > However, the only platform ever doing anything with this parameter was AIX, and there only in mmap() mode. All other > platforms ignored the parameter. So it can be removed, provided we fix the AIX case. > Notes: > - if one really needs alignment memory, there is os::reserve_memory_aligned() which guarantees the alignment. It will do > the usual over-reserving-and-chopping-away to do that. 
> - On AIX there is a second reason why we align the mmap() result pointer to 64K, since we "fake" 64K pages in some > places. I disentangled that alignment handling from the caller provided alignment. > - This affects os::reserve_memory() as well as the new os::reserve_memory_with_fd() > - I also fixed comments in virtualSpace.cpp which do not apply anymore after JDK-8253638 > > Tests: tier1, manual builds and tests on AIX This pull request has now been integrated. Changeset: 44e6820c Author: Thomas Stuefe URL: https://git.openjdk.java.net/jdk/commit/44e6820c Stats: 92 lines in 12 files changed: 5 ins; 37 del; 50 mod 8253650: Cleanup: remove alignment_hint parameter from os::reserve_memory Reviewed-by: stefank, tschatzl ------------- PR: https://git.openjdk.java.net/jdk/pull/430 From mdoerr at openjdk.java.net Thu Oct 1 12:23:47 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Thu, 1 Oct 2020 12:23:47 GMT Subject: RFR: 8230664: Fix TestInstanceKlassSize for PowerPC [v2] In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 00:19:09 GMT, Serguei Spitsyn wrote: >> Ziviani has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains >> one commit: >> 8230664: Fix TestInstanceKlassSize >> >> The code hasStoredFingerprint() at InstanceKlass.java is not considering >> AOT disabled at compilation time, like has_stored_fingerprint() at >> instanceKlass.cpp does. Such difference can cause TestInstanceKlassSize >> failures because all objects will have an extra 8-bytes. > > LGTM I think this fix deserves a new JBS issue. It doesn't resolve the rounding problem described in JDK-8230664. It fixes an additional issue.
------------- PR: https://git.openjdk.java.net/jdk/pull/358 From github.com+4146708+a74nh at openjdk.java.net Thu Oct 1 13:35:40 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Thu, 1 Oct 2020 13:35:40 GMT Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 10:15:38 GMT, Erik Österlund wrote: >>> I am about to integrate concurrent stack scanning for ZGC. This patch changes things a bit so that disarming of the >>> poll word is always done only by the thread itself. The consequence is that if a thread is in native or blocked, and >>> the safepoint finishes, then the thread will when waking up, still take one slow path to figure out all is done, disarm >>> the poll and run the cross modifying fence. The consequence is that we trivially need the cross modifying fence only in >>> the poll slow path and nowhere else. So maybe if you wait for that one, we can delete a few more ISBs, and IMO also >>> delete the verification code, as you can't really mess it up any more. >> >> I'd need to see what the code looks like - but I'm thinking it just reduces the number of cross_modify_fence calls, >> instead of reducing explicit isb calls in the backend (of which only 3 remain with this patch). Either way, it's good >> news. I'm going to try rebasing locally on the top of your patch (I'm assuming it's this one >> https://github.com/openjdk/jdk/pull/296 ) and see where that gets me. I'd be cautious about removing the verification >> code - it should be useful for helping to catch issues if we do start getting 1 run in a million crash dumps. But >> agreed, doesn't need to be there if the whole isb process has become trivial. > > You no longer need any isb when returning from the native wrappers (interpreted or compiled variant) after my patch. > That *should* be 2 if the mentioned 3 hooks (without looking closely at the code).
Because that will be done in the > runtime instead when waking up from native. Which hook do we have left? Agreed. With both patches, and with the JNI isbs removed, that leaves just one safepoint isb in the AArch64 code (plus the other isb in emit_static_call_stub). The test patch exists not just to test the AArch64 but the common code too (ideally it would be extended to other targets too). Are we happy that the cross_modify_fence is called at all the required points? A hole in the common code would fail only very rarely. It does require quite a bit of code to add this test though. ------------- PR: https://git.openjdk.java.net/jdk/pull/428 From phh at openjdk.java.net Thu Oct 1 13:42:13 2020 From: phh at openjdk.java.net (Paul Hohensee) Date: Thu, 1 Oct 2020 13:42:13 GMT Subject: RFR: 8253891: Debug x86_32 builds fail after JDK-8239090 In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 09:52:54 GMT, Aleksey Shipilev wrote: > `CPU_MAX_FEATURE` is actually a `uint64_t`, with at least 46 bits set. `exact_log2` expects `intptr_t`. The implicit > conversion works on 64-bit, but fails on 32-bit. Calling to `exact_log2_long` seems to cater for both bitnesses. > Testing: > - [x] tier1 on Linux x86_64 > - [x] tier1 on Linux x86_32 (some unrelated failures) Thanks for cleaning this up. ------------- Marked as reviewed by phh (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/455 From github.com+670087+jrziviani at openjdk.java.net Thu Oct 1 14:04:04 2020 From: github.com+670087+jrziviani at openjdk.java.net (Ziviani) Date: Thu, 1 Oct 2020 14:04:04 GMT Subject: RFR: 8230664: Fix TestInstanceKlassSize for PowerPC [v2] In-Reply-To: References: Message-ID: <02CIxrp2dIe5YnVvBmmVkRuyVxA0hezEUSzcX0zD080=.c29afb6f-7994-4c52-84a6-e979b7a5d761@github.com> On Thu, 1 Oct 2020 12:21:22 GMT, Martin Doerr wrote: > I think this fix deserves a new JBS issue. It doesn't resolve the rounding problem described in JDK-8230664. It fixes > an additional issue. 
The rounding problem may reoccur when other class layout changes are done. Totally agreed. But, would you mind to create the ticket for me, please? Then, I change the commit header/message. Thank you!! ------------- PR: https://git.openjdk.java.net/jdk/pull/358 From luhenry at openjdk.java.net Thu Oct 1 14:34:22 2020 From: luhenry at openjdk.java.net (Ludovic Henry) Date: Thu, 1 Oct 2020 14:34:22 GMT Subject: RFR: 8253757: Add LLVM-based backend for hsdis [v6] In-Reply-To: <91erxiMDb4ftvSomuJYHPi9SX-v8Z2VLD2qEwCbz5tk=.b9ed01b5-f0e0-4ed7-9c1a-b06bc0e64640@github.com> References: <91erxiMDb4ftvSomuJYHPi9SX-v8Z2VLD2qEwCbz5tk=.b9ed01b5-f0e0-4ed7-9c1a-b06bc0e64640@github.com> Message-ID: > When bringing up Hotspot onto new platforms, it is not always possible to compile hsdis because gcc is not yet > available. For example, for Windows-AArch64 and macOS-AArch64. > For some such platforms, it is possible to use LLVM as an alternative backend as it also supports a disassembler > feature. Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision: Match assembly notation with gcc ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/392/files - new: https://git.openjdk.java.net/jdk/pull/392/files/2a1c724c..84c63455 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=392&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=392&range=04-05 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/392.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/392/head:pull/392 PR: https://git.openjdk.java.net/jdk/pull/392 From mdoerr at openjdk.java.net Thu Oct 1 15:01:08 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Thu, 1 Oct 2020 15:01:08 GMT Subject: RFR: 8230664: Fix TestInstanceKlassSize for PowerPC [v2] In-Reply-To: <02CIxrp2dIe5YnVvBmmVkRuyVxA0hezEUSzcX0zD080=.c29afb6f-7994-4c52-84a6-e979b7a5d761@github.com> References: 
<02CIxrp2dIe5YnVvBmmVkRuyVxA0hezEUSzcX0zD080=.c29afb6f-7994-4c52-84a6-e979b7a5d761@github.com> Message-ID: <2ZNc3x7LB3mMjwN9ONCccw-vgN-O95tZUcWLGetenOU=.394e3aef-85f6-41ef-861a-621a7fe25f78@github.com> On Thu, 1 Oct 2020 14:01:29 GMT, Ziviani wrote: >> I think this fix deserves a new JBS issue. It doesn't resolve the rounding problem described in JDK-8230664. It fixes >> an additional issue. The rounding problem may reoccur when other class layout changes are done. > >> I think this fix deserves a new JBS issue. It doesn't resolve the rounding problem described in JDK-8230664. It fixes >> an additional issue. The rounding problem may reoccur when other class layout changes are done. > > Totally agreed. But, would you mind to create the ticket for me, please? Then, I change the commit header/message. > Thank you!! Hi Jose, here's your new bug: https://bugs.openjdk.java.net/browse/JDK-8253900 ------------- PR: https://git.openjdk.java.net/jdk/pull/358 From cjplummer at openjdk.java.net Thu Oct 1 15:06:05 2020 From: cjplummer at openjdk.java.net (Chris Plummer) Date: Thu, 1 Oct 2020 15:06:05 GMT Subject: RFR: 8230664: Fix TestInstanceKlassSize for PowerPC [v2] In-Reply-To: <02CIxrp2dIe5YnVvBmmVkRuyVxA0hezEUSzcX0zD080=.c29afb6f-7994-4c52-84a6-e979b7a5d761@github.com> References: <02CIxrp2dIe5YnVvBmmVkRuyVxA0hezEUSzcX0zD080=.c29afb6f-7994-4c52-84a6-e979b7a5d761@github.com> Message-ID: On Thu, 1 Oct 2020 14:01:29 GMT, Ziviani wrote: > I think this fix deserves a new JBS issue. It doesn't resolve the rounding problem described in JDK-8230664. It fixes > an additional issue. The rounding problem may reoccur when other class layout changes are done. Does it still need to be problem listed due to this issue? I guess it may or may not be an issue based on the current size InstanceKlass, so probably best to keep in problem listed. 
------------- PR: https://git.openjdk.java.net/jdk/pull/358 From vkempik at openjdk.java.net Thu Oct 1 15:09:09 2020 From: vkempik at openjdk.java.net (Vladimir Kempik) Date: Thu, 1 Oct 2020 15:09:09 GMT Subject: RFR: 8253899: Make IsClassUnloadingEnabled signature match specification Message-ID: Please review this change for hotspot and one test. There are a few JVMTI callback/event functions in the JDK whose signatures don't match the specification. For example: static jvmtiError JNICALL IsClassUnloadingEnabled(const jvmtiEnv* env, jboolean* enabled, ...) but according to the JVMTI spec it should be: static jvmtiError JNICALL IsClassUnloadingEnabled(const jvmtiEnv* env, ...) The same goes for ClassUnload(jvmtiEnv* jvmti_env, JNIEnv* jni_env, const char* name, ...) in tests. For many years that didn't matter, but with the coming JEP-391 it becomes important to make them match the spec: https://developer.apple.com/documentation/apple_silicon/addressing_architectural_differences_in_your_macos_code This commit makes the above-mentioned functions have signatures matching the JVMTI specification. ------------- Commit messages: - 8253899: Make IsClassUnloadingEnabled signature match specification + jcheck - 8253899: Make IsClassUnloadingEnabled signature match specification Changes: https://git.openjdk.java.net/jdk/pull/466/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=466&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253899 Stats: 17 lines in 2 files changed: 15 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/466.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/466/head:pull/466 PR: https://git.openjdk.java.net/jdk/pull/466 From mdoerr at openjdk.java.net Thu Oct 1 15:18:05 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Thu, 1 Oct 2020 15:18:05 GMT Subject: RFR: 8230664: Fix TestInstanceKlassSize for PowerPC [v2] In-Reply-To: References: <02CIxrp2dIe5YnVvBmmVkRuyVxA0hezEUSzcX0zD080=.c29afb6f-7994-4c52-84a6-e979b7a5d761@github.com>
Message-ID: On Thu, 1 Oct 2020 15:02:55 GMT, Chris Plummer wrote: >>> I think this fix deserves a new JBS issue. It doesn't resolve the rounding problem described in JDK-8230664. It fixes >>> an additional issue. The rounding problem may reoccur when other class layout changes are done. >> >> Totally agreed. But, would you mind to create the ticket for me, please? Then, I change the commit header/message. >> Thank you!! > >> I think this fix deserves a new JBS issue. It doesn't resolve the rounding problem described in JDK-8230664. It fixes >> an additional issue. The rounding problem may reoccur when other class layout changes are done. > > Does it still need to be problem listed due to this issue? I guess it may or may not be an issue based on the current > size InstanceKlass, so probably best to keep in problem listed. If these tests are currently passing with Jose's fix, I suggest to comment them out in the problem list with a note that we may need to disable them again because of JDK-8230664. This way we can test the functionality Jose is fixing until then. And we'll see when the issue comes up again. ------------- PR: https://git.openjdk.java.net/jdk/pull/358 From luhenry at microsoft.com Thu Oct 1 15:48:12 2020 From: luhenry at microsoft.com (Ludovic Henry) Date: Thu, 1 Oct 2020 15:48:12 +0000 Subject: RFR: 8248238: Implementation of JEP: Windows AArch64 Support [v12] In-Reply-To: References: Message-ID: Hi, As we now have a whole bunch of reviews (thank you all!), we would need a sponsor to get it merged. 
Thank you :) ------------- PR: https://github.com/openjdk/jdk/pull/212 From github.com+670087+jrziviani at openjdk.java.net Thu Oct 1 16:20:12 2020 From: github.com+670087+jrziviani at openjdk.java.net (Ziviani) Date: Thu, 1 Oct 2020 16:20:12 GMT Subject: RFR: 8230664: Fix TestInstanceKlassSize for PowerPC [v3] In-Reply-To: References: Message-ID: > TestInstanceKlassSize was failing because, for PowerPC, the following code (instanceKlass.cpp) always compiles to > `return false;` bool InstanceKlass::has_stored_fingerprint() const { > #if INCLUDE_AOT > return should_store_fingerprint() || is_shared(); > #else > return false; > #endif > } > However, in `hasStoredFingerprint()@InstanceKlass.java` the condition `shouldStoreFingerprint() || isShared();` is > always evaluated and may return true (_AFAIK isShared() returns true_). Such condition adds 8 bytes in the > `getSize()@InstanceKlass.java` causing the failure in TestInstanceKlassSize: public long getSize() { // in number of > bytes > ... > if (hasStoredFingerprint()) { > size += 8; // uint64_t > } > return alignSize(size); > } > Considering these tests are failing for PowerPC only (_based on ProblemList.txt_), my solution checks if > `hasStoredFingerprint()` is running on a PowerPC platform. I decided to go this way because there is no existing flag > informing whether AOT is included or not and creating a new one just to handle the PowerPC case seems too much. This > patch is an attempt to fix https://bugs.openjdk.java.net/browse/JDK-8230664 Ziviani has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit: 8253900: SA: wrong size computation when JVM was built without AOT The code hasStoredFingerprint() at InstanceKlass.java is not considering AOT disabled at compilation time, like has_stored_fingerprint() at instanceKlass.cpp does. Such difference can cause TestInstanceKlassSize failures because all objects will have an extra 8-bytes. 
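To make the mismatch described above concrete, here is a small, simplified C++ model of the two size computations. It is only a sketch under stated assumptions: the function names are hypothetical, the build-time INCLUDE_AOT switch is modelled as a runtime bool for illustration, and 8-byte alignment is assumed for alignSize().

```cpp
#include <cassert>
#include <cstdint>

// Assumed alignment for this sketch only.
constexpr uint64_t kAlignment = 8;

constexpr uint64_t align_size(uint64_t size) {
  return (size + kAlignment - 1) & ~(kAlignment - 1);
}

// Models the VM side: without AOT in the build, the answer is always false.
// (In the VM this is a preprocessor switch, not a runtime parameter.)
constexpr bool vm_has_stored_fingerprint(bool aot_included,
                                         bool should_store,
                                         bool is_shared) {
  return aot_included ? (should_store || is_shared) : false;
}

// Models the agent side before the fix: the build configuration is ignored.
constexpr bool agent_has_stored_fingerprint(bool should_store, bool is_shared) {
  return should_store || is_shared;
}

// Adds the 8-byte (uint64_t) fingerprint slot only when one is stored.
constexpr uint64_t klass_size(uint64_t base_size, bool has_fingerprint) {
  return align_size(base_size + (has_fingerprint ? 8 : 0));
}
```

With a shared class on a build without AOT, the first model adds nothing while the second adds the extra 8 bytes, which is the discrepancy the test reports.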
------------- Changes: https://git.openjdk.java.net/jdk/pull/358/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=358&range=02 Stats: 32 lines in 6 files changed: 29 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/358.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/358/head:pull/358 PR: https://git.openjdk.java.net/jdk/pull/358 From github.com+670087+jrziviani at openjdk.java.net Thu Oct 1 16:40:06 2020 From: github.com+670087+jrziviani at openjdk.java.net (Ziviani) Date: Thu, 1 Oct 2020 16:40:06 GMT Subject: RFR: 8253900: SA: wrong size computation when JVM was built without AOT [v2] In-Reply-To: References: <02CIxrp2dIe5YnVvBmmVkRuyVxA0hezEUSzcX0zD080=.c29afb6f-7994-4c52-84a6-e979b7a5d761@github.com> Message-ID: On Thu, 1 Oct 2020 15:15:09 GMT, Martin Doerr wrote: > If these tests are currently passing with Jose's fix, I suggest to comment them out in the problem list with a note > that we may need to disable them again because of JDK-8230664. This way we can test the functionality Jose is fixing > until then. And we'll see when the issue comes up again. Done! Please, check if you're fine with this: +# The solution to bug JDK-8253900 seems to fix tests TestInstanceKlassSize and +# TestInstanceKlassSizeForInterface. However, while JDK-8230664 is not resolved, +# these tests may be disabled again if necessary. +# serviceability/sa/TestInstanceKlassSize.java 8230664 linux-ppc64le,linux-ppc64 +# serviceability/sa/TestInstanceKlassSizeForInterface.java 8230664 linux-ppc64le,linux-ppc64 Thank you Martin! ------------- PR: https://git.openjdk.java.net/jdk/pull/358 From rehn at openjdk.java.net Thu Oct 1 17:10:13 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 1 Oct 2020 17:10:13 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts Message-ID: The issue is that this test doesn't consider Handshake All operation. Depending if/when such operation is scheduled it can lockup the VM thread. 
And the safepoint that should timeout never happens. See issue for more information. So I changed the test to "try timeout" the safepoint, but if there was no safepoint (blocked by a handshake all), we retry. We sleep unsafe much longer than the interval SafepointALot generates operations, which 'guarantees' we will timeout if there is no handshake all. (some extreme case of kernel scheduling causing a very long context switch could also make us not timeout) Passes t1, t3, and repeat runs of the test. ------------- Commit messages: - Fixed test Changes: https://git.openjdk.java.net/jdk/pull/465/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=465&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253794 Stats: 53 lines in 3 files changed: 16 ins; 26 del; 11 mod Patch: https://git.openjdk.java.net/jdk/pull/465.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/465/head:pull/465 PR: https://git.openjdk.java.net/jdk/pull/465 From pchilanomate at openjdk.java.net Thu Oct 1 19:01:05 2020 From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo) Date: Thu, 1 Oct 2020 19:01:05 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 14:35:45 GMT, Robbin Ehn wrote: > The issue is that this test doesn't consider Handshake All operation. > Depending if/when such operation is scheduled it can lockup the VM thread. > And the safepoint that should timeout never happens. > See issue for more information. > > So I changed the test to "try timeout" the safepoint, but if there was no safepoint (blocked by a handshake all), we > retry. We sleep unsafe much longer than the interval SafepointALot generates operations, which 'guarantees' we will > timeout if there is no handshake all. (some extreme case of kernel scheduling causing a very long context switch could > also make us not timeout) Passes t1, t3, and repeat runs of the test. 
LGTM

-------------

Marked as reviewed by pchilanomate (Committer).

PR: https://git.openjdk.java.net/jdk/pull/465

From cjplummer at openjdk.java.net Thu Oct 1 19:25:06 2020
From: cjplummer at openjdk.java.net (Chris Plummer)
Date: Thu, 1 Oct 2020 19:25:06 GMT
Subject: RFR: 8253900: SA: wrong size computation when JVM was built without AOT [v2]
In-Reply-To:
References: <02CIxrp2dIe5YnVvBmmVkRuyVxA0hezEUSzcX0zD080=.c29afb6f-7994-4c52-84a6-e979b7a5d761@github.com>
Message-ID:

On Thu, 1 Oct 2020 16:37:11 GMT, Ziviani wrote:

> > If these tests are currently passing with Jose's fix, I suggest to comment them out in the problem list with a note
> > that we may need to disable them again because of JDK-8230664. This way we can test the functionality Jose is fixing
> > until then. And we'll see when the issue comes up again.
>
> Done! Please, check if you're fine with this:
>
> ```
> +# The solution to bug JDK-8253900 seems to fix tests TestInstanceKlassSize and
> +# TestInstanceKlassSizeForInterface. However, while JDK-8230664 is not resolved,
> +# these tests may be disabled again if necessary.
> +# serviceability/sa/TestInstanceKlassSize.java 8230664 linux-ppc64le,linux-ppc64
> +# serviceability/sa/TestInstanceKlassSizeForInterface.java 8230664 linux-ppc64le,linux-ppc64
> ```

I don't believe there are any other cases where we comment out a test in a problem list. I think it would be best just to remove it completely from ProblemList.txt. JDK-8230664 will still be filed. I suggest maybe adding a comment there saying that the test was removed from the problem list when JDK-8253900 was fixed, but should be re-added if JDK-8230664 starts to reproduce again.

------------- PR: https://git.openjdk.java.net/jdk/pull/358 From dcubed at openjdk.java.net Thu Oct 1 20:14:05 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Thu, 1 Oct 2020 20:14:05 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 14:35:45 GMT, Robbin Ehn wrote: > The issue is that this test doesn't consider Handshake All operation. > Depending if/when such operation is scheduled it can lockup the VM thread. > And the safepoint that should timeout never happens. > See issue for more information. > > So I changed the test to "try timeout" the safepoint, but if there was no safepoint (blocked by a handshake all), we > retry. We sleep unsafe much longer than the interval SafepointALot generates operations, which 'guarantees' we will > timeout if there is no handshake all. (some extreme case of kernel scheduling causing a very long context switch could > also make us not timeout) Passes t1, t3, and repeat runs of the test. Changes requested by dcubed (Reviewer). test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 71: > 69: ProcessBuilder pb = ProcessTools.createJavaProcessBuilder( > 70: "-XX:+UnlockDiagnosticVMOptions", > 71: "-XX:-UseBiasedLocking", I think "-XX:-UseBiasedLocking" is specified to make sure that Biased Locking is disabled even in test tasks where it is enabled by task specific flags. test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 66: > 64: "-Xms64m", > 65: "TestAbortVMOnSafepointTimeout", > 66: "" + unsafe_wait Cheap conversion from int to String? test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 42: > 40: public static void main(String[] args) throws Exception { > 41: if (args.length > 0) { > 42: Integer waitTime = Integer.parseInt(args[0]); What is this going to do if no argument is passed? Looks like it's going to throw NumberFormatException... 
Update: I missed the `if (args.length > 0)` above... test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 35: > 33: * @build TestAbortVMOnSafepointTimeout > 34: * @run driver ClassFileInstaller sun.hotspot.WhiteBox > 35: * @run main/othervm -Xbootclasspath/a:. -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI > TestAbortVMOnSafepointTimeout L42 below parses args[0], but you're not passing a parameter here... Update: Ahhh... this just launches the test and the test launches another VM... got it. test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 49: > 47: System.out.println("This message would occur after some time."); > 48: } else { > 49: testWith(50, 1, 999); Please consider: `testWith(50 /* sfpt_interval */, 1 /* timeout_delay */, 999 /* unsafe_wait */);` test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 30: > 28: /* > 29: * @test TestAbortVMOnSafepointTimeout > 30: * @summary Check if VM can kill thread which doesn't reach safepoint. Not your bug, but this summary is wrong. Perhaps: `@summary Check if VM aborts when a thread doesn't reach safepoint.` src/hotspot/share/prims/whitebox.cpp line 2294: > 2292: > 2293: WB_ENTRY(jboolean, WB_WaitUnsafe(JNIEnv* env, jobject wb, jint time)) > 2294: SafepointStateTracker tracker = SafepointSynchronize::safepoint_state_tracker(); I had to go back and reread the `SafepointStateTracker` code... Because this JavaThread is executing and not at a safepoint, the call to `SafepointSynchronize::safepoint_state_tracker()` will save state as not-at-a-safepoint (with some safepoint_id value). src/hotspot/share/prims/whitebox.cpp line 2296: > 2294: SafepointStateTracker tracker = SafepointSynchronize::safepoint_state_tracker(); > 2295: os::naked_short_sleep(time); > 2296: return tracker.safepoint_state_changed(); Ahhh... returns true when we've had a state change or when the safepoint ID has changed, but... 
how can the system change the `SafepointSynchronize::is_at_safepoint()` return value or the safepoint_id value while we're sleeping? The system shouldn't be able to go to a safepoint or change safepoint_id values while this calling thread is not safepoint safe. I would think that this function would always return false, but maybe I'm missing something here. test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 47: > 45: System.out.println("Waiting for safepoint"); > 46: } > 47: System.out.println("This message would occur after some time."); Maybe I'm missing something, but I don't see how `wb.waitUnsafe(waitTime)` is ever going to return anything but false so this message should never be printed. test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 79: > 77: } > 78: } > 79: output.shouldNotHaveExitValue(0); Looks like the test doesn't require that this mesg get printed: `System.out.println("This message would occur after some time.");` And it is set up to detect that the SafepointTimeout happened which is what we want the test to verify at the core. ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From dcubed at openjdk.java.net Thu Oct 1 20:14:06 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Thu, 1 Oct 2020 20:14:06 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 19:40:04 GMT, Daniel D. Daugherty wrote: >> The issue is that this test doesn't consider Handshake All operation. >> Depending if/when such operation is scheduled it can lockup the VM thread. >> And the safepoint that should timeout never happens. >> See issue for more information. >> >> So I changed the test to "try timeout" the safepoint, but if there was no safepoint (blocked by a handshake all), we >> retry. 
We sleep unsafe much longer than the interval SafepointALot generates operations, which 'guarantees' we will >> timeout if there is no handshake all. (some extreme case of kernel scheduling causing a very long context switch could >> also make us not timeout) Passes t1, t3, and repeat runs of the test. > > test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 49: > >> 47: System.out.println("This message would occur after some time."); >> 48: } else { >> 49: testWith(50, 1, 999); > > Please consider: > > `testWith(50 /* sfpt_interval */, 1 /* timeout_delay */, 999 /* unsafe_wait */);` Also, I think the test would be more clear if this testWith() part was in a `if (args.length == 0)` block at the top and the else part was the rest of the test. After all, code flow wise, you execute the `(args.length == 0)` case first and then come back for the case with the unsafe_wait value. ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From github.com+670087+jrziviani at openjdk.java.net Thu Oct 1 20:34:11 2020 From: github.com+670087+jrziviani at openjdk.java.net (Ziviani) Date: Thu, 1 Oct 2020 20:34:11 GMT Subject: RFR: 8253900: SA: wrong size computation when JVM was built without AOT [v4] In-Reply-To: References: Message-ID: > TestInstanceKlassSize was failing because, for PowerPC, the following code (instanceKlass.cpp) always compiles to > `return false;` bool InstanceKlass::has_stored_fingerprint() const { > #if INCLUDE_AOT > return should_store_fingerprint() || is_shared(); > #else > return false; > #endif > } > However, in `hasStoredFingerprint()@InstanceKlass.java` the condition `shouldStoreFingerprint() || isShared();` is > always evaluated and may return true (_AFAIK isShared() returns true_). Such condition adds 8 bytes in the > `getSize()@InstanceKlass.java` causing the failure in TestInstanceKlassSize: public long getSize() { // in number of > bytes > ... 
> if (hasStoredFingerprint()) { > size += 8; // uint64_t > } > return alignSize(size); > } > Considering these tests are failing for PowerPC only (_based on ProblemList.txt_), my solution checks if > `hasStoredFingerprint()` is running on a PowerPC platform. I decided to go this way because there is no existing flag > informing whether AOT is included or not and creating a new one just to handle the PowerPC case seems too much. This > patch is an attempt to fix https://bugs.openjdk.java.net/browse/JDK-8230664 Ziviani has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit: 8253900: SA: wrong size computation when JVM was built without AOT The code hasStoredFingerprint() at InstanceKlass.java is not considering AOT disabled at compilation time, like has_stored_fingerprint() at instanceKlass.cpp does. Such difference can cause TestInstanceKlassSize failures because all objects will have an extra 8-bytes. ------------- Changes: https://git.openjdk.java.net/jdk/pull/358/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=358&range=03 Stats: 30 lines in 6 files changed: 27 ins; 2 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/358.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/358/head:pull/358 PR: https://git.openjdk.java.net/jdk/pull/358 From github.com+670087+jrziviani at openjdk.java.net Thu Oct 1 20:39:05 2020 From: github.com+670087+jrziviani at openjdk.java.net (Ziviani) Date: Thu, 1 Oct 2020 20:39:05 GMT Subject: RFR: 8253900: SA: wrong size computation when JVM was built without AOT [v2] In-Reply-To: References: <02CIxrp2dIe5YnVvBmmVkRuyVxA0hezEUSzcX0zD080=.c29afb6f-7994-4c52-84a6-e979b7a5d761@github.com> Message-ID: On Thu, 1 Oct 2020 19:22:24 GMT, Chris Plummer wrote: > ... I suggest maybe adding a comment there saying that the test was removed from the problem list when JDK-8253900 was > fixed, but should be re-added if JDK-8230664 starts to reproduce again. 
I like this idea because a simple `grep` will be enough to understand what happened. Just pushed my branch, any issue just let me know. Thank you Chris! ------------- PR: https://git.openjdk.java.net/jdk/pull/358 From iklam at openjdk.java.net Thu Oct 1 21:09:05 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 1 Oct 2020 21:09:05 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v7] In-Reply-To: References: Message-ID: On Wed, 30 Sep 2020 04:59:20 GMT, Yumin Qi wrote: >> This patch is reorganized after 8252725, which is separated from this patch to refactor jlink glugin code. The previous >> webrev with hg can be found at: http://cr.openjdk.java.net/~minqi/2020/8247536/webrev-05. With 8252725 integrated, the >> regeneration of holder classes is simply to call the new added GenerateJLIClassesHelper.cdsGenerateHolderClasses >> function. Tests: tier1-4 > > Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: > > Remove trailing word of line which is not used in holder class regeneration. There is a trailing LF (Line Feed) so trim > white spaces from both front and end of the line or it will fail method type validation. Changes requested by iklam (Reviewer). test/hotspot/jtreg/runtime/cds/appcds/DumpClassListWithLF.java line 70: > 68: "Hello", > 69: "@lambda-form-invoker [LF_RESOLVE] java.lang.invoke.DirectMethodHandle$Holder invokeNothing L7_L > (anyword)", 70: "@lambda-form-invoker [LF_RESOLVE] java.lang.invoke.DirectMethodHandle$Holder > invokeNothing LL_I anyword"), We shouldn't allow the classlist to contain arbitrary data. These two cases should generate an error. src/hotspot/share/classfile/lambdaFormInvokers.cpp line 52: > 50: > 51: // trim white spaces from front and end of string. > 52: char* trim(char* s) { I think this creates unnecessary dependency between the C code and the Java code. 
The C code assumes that the Java code has appended something like "(salvaged)" into the output, and tries to get rid of that in a non-obvious way. It's better to modify the Java code from

    static void traceSpeciesType(String cn, Class salvage) {
        if (TRACE_RESOLVE || CDS.isDumpLoadedClassList()) {
            String traceSP = SPECIES_RESOLVE + " " + cn + (salvage != null ? " (salvaged)" : " (generated)");
            if (TRACE_RESOLVE) {
                System.out.println(traceSP);
            }
            CDS.logTraceResolve(traceSP);
        }
    }

to

        if (TRACE_RESOLVE || CDS.isDumpLoadedClassList()) {
            String traceSP = SPECIES_RESOLVE + " " + cn;
            if (TRACE_RESOLVE) {
                System.out.println(traceSP + (salvage != null ? " (salvaged)" : " (generated)"));
            }
            CDS.logTraceResolve(traceSP);
        }

test/hotspot/jtreg/runtime/cds/appcds/DumpClassListWithLF.java line 85:

> 83: appJar, classlist(
> 84: "Hello",
> 85: "@lambda-form-invoker [LF_XYRESOLVE] java.lang.invoke.DirectMethodHandle$Holder invokeStatic L7_L (any)",

We should not allow incorrect input. This should generate an error.

test/hotspot/jtreg/runtime/cds/appcds/DumpClassListWithLF.java line 78:

> 76: "Hello",
> 77: "@lambda-form-invoker [LF_RESOLVE] my.nonexist.package.MyNonExistClassName$holder invokeStatic L7_L",
> 78: "@lambda-form-invoker [LF_RESOLVE] my.nonexist.package.MyNonExistClassName$holder invokeStatic LL_I"),

I think it's dangerous to allow arbitrary class names here. InvokerBytecodeGenerator doesn't check the classname. This will make it possible to overwrite the contents of arbitrary classes. We should have a check here and allow only the specific holder classes that are supported.

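
The kind of check requested in the review comments above could look roughly like the following. This is only a sketch under stated assumptions, not the code that was eventually integrated: the class `LambdaFormInvokerLineCheck`, the method `isValid`, and the exact contents of the allowlist are hypothetical, and only `[LF_RESOLVE]` lines of the five-token form shown in the quoted test are handled.

```java
import java.util.Set;

// Hypothetical validator for "@lambda-form-invoker" classlist lines.
final class LambdaFormInvokerLineCheck {
    private static final String TAG = "@lambda-form-invoker";
    private static final String LF_RESOLVE = "[LF_RESOLVE]";

    // Assumed allowlist of holder classes; the real supported set may differ.
    private static final Set<String> SUPPORTED_HOLDERS = Set.of(
            "java.lang.invoke.DirectMethodHandle$Holder",
            "java.lang.invoke.DelegatingMethodHandle$Holder",
            "java.lang.invoke.Invokers$Holder",
            "java.lang.invoke.LambdaForm$Holder");

    /**
     * Accepts only lines of the form
     *   @lambda-form-invoker [LF_RESOLVE] <holder class> <method name> <method type>
     * with no trailing tokens and a holder class from the allowlist.
     */
    static boolean isValid(String line) {
        String[] tokens = line.trim().split("\\s+");
        return tokens.length == 5
                && TAG.equals(tokens[0])
                && LF_RESOLVE.equals(tokens[1])
                && SUPPORTED_HOLDERS.contains(tokens[2]);
    }

    public static void main(String[] args) {
        String ok = "@lambda-form-invoker [LF_RESOLVE] "
                + "java.lang.invoke.DirectMethodHandle$Holder invokeStatic L7_L";
        System.out.println(isValid(ok));
        // Trailing data, an unknown tag, or an unknown class are all rejected.
        System.out.println(isValid(ok + " (anyword)"));
    }
}
```

Rejecting the line outright, instead of trimming or ignoring the extra tokens, keeps the classlist format strict and avoids passing an unchecked class name on to the bytecode generator.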
-------------

PR: https://git.openjdk.java.net/jdk/pull/193

From iklam at openjdk.java.net Thu Oct 1 21:14:05 2020
From: iklam at openjdk.java.net (Ioi Lam)
Date: Thu, 1 Oct 2020 21:14:05 GMT
Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v7]
In-Reply-To:
References:
Message-ID: <0pxum63ap0sTm7YsoOevKNQEhKZQQmfvigtIxyg2clc=.50b5da47-754e-4b03-9111-1e99d0311312@github.com>

On Wed, 30 Sep 2020 04:59:20 GMT, Yumin Qi wrote:

>> This patch is reorganized after 8252725, which is separated from this patch to refactor jlink plugin code. The previous
>> webrev with hg can be found at: http://cr.openjdk.java.net/~minqi/2020/8247536/webrev-05. With 8252725 integrated, the
>> regeneration of holder classes is simply to call the newly added GenerateJLIClassesHelper.cdsGenerateHolderClasses
>> function. Tests: tier1-4
>
> Yumin Qi has updated the pull request incrementally with one additional commit since the last revision:
>
> Remove trailing word of line which is not used in holder class regeneration. There is a trailing LF (Line Feed) so trim
> white spaces from both front and end of the line or it will fail method type validation.

Changes requested by iklam (Reviewer).

src/java.base/share/classes/jdk/internal/misc/CDS.java line 30:

> 28: public class CDS {
> 29: // cache the result
> 30: static private boolean isDumpLoadedClassList;

`isDumpLoadedClassList` is not grammatically correct. Also the field should be final. How about:

    static final private boolean isDumpingClassList = isDumpingClassList0();

    public static boolean isDumpingClassList() {
        return isDumpingClassList;
    }

    private static native boolean isDumpingClassList0();

src/java.base/share/classes/jdk/internal/misc/CDS.java line 82:

> 80: * log output to DumpLoadedClassList
> 81: */
> 82: public static void logTraceResolve(String line) {

`logTraceResolve` is too generic. How about `CDS.logLambdaFormInvoker()` to match the `@lambda-form-invoker` in the classlist file?

-------------

PR: https://git.openjdk.java.net/jdk/pull/193

From iignatyev at openjdk.java.net Thu Oct 1 21:49:09 2020
From: iignatyev at openjdk.java.net (Igor Ignatyev)
Date: Thu, 1 Oct 2020 21:49:09 GMT
Subject: RFR: 8253913: unify gtest test names
Message-ID:

Hi all,

could you please review this small and trivial patch that unifies the names of hotspot gtests? in some cases, "_test" is added to gtest names, in others it isn't. some tests already specify "test" in their names, so we get "test_test" like in `UninitializedDoubleElementWorkerDataArrayTest.sum_test_test_vm`. the patch removes `_test` from the suffixes added by `TEST*` macros.

testing:
- `make test TEST=gtest` on macosx-x64
- `test/hotspot/jtreg/gtest/` on {linux,windows,macosx}-x64

-------------

Commit messages:
 - 8253913: unify gtest test names

Changes: https://git.openjdk.java.net/jdk/pull/475/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=475&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8253913
Stats: 11 lines in 2 files changed: 4 ins; 0 del; 7 mod
Patch: https://git.openjdk.java.net/jdk/pull/475.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/475/head:pull/475

PR: https://git.openjdk.java.net/jdk/pull/475

From cjplummer at openjdk.java.net Thu Oct 1 21:50:05 2020
From: cjplummer at openjdk.java.net (Chris Plummer)
Date: Thu, 1 Oct 2020 21:50:05 GMT
Subject: RFR: 8253900: SA: wrong size computation when JVM was built without AOT [v2]
In-Reply-To:
References: <02CIxrp2dIe5YnVvBmmVkRuyVxA0hezEUSzcX0zD080=.c29afb6f-7994-4c52-84a6-e979b7a5d761@github.com>
Message-ID:

On Thu, 1 Oct 2020 20:35:57 GMT, Ziviani wrote:

> > ... I suggest maybe adding a comment there saying that the test was removed from the problem list when JDK-8253900 was
> > fixed, but should be re-added if JDK-8230664 starts to reproduce again.
>
> I like this idea because a simple `grep` will be enough to understand what happened. Just pushed my branch, any issue
> just let me know.

Hi Ziviani, My suggestion was to put the comment in the CR. I really don't think there should be any comment in ProblemList.txt. This is not an uncommon situation. Tests are often removed from the problem list simply because they have not been seen for a while (and we want to determine if the bug is still an issue) or because failures are rare, or possibly even so hard to reproduce that we want to eventually trigger a failure as part of regular testing to help with failure analysis. We have not added comments in the past to document these problem list removals. ------------- PR: https://git.openjdk.java.net/jdk/pull/358 From iignatyev at openjdk.java.net Thu Oct 1 22:16:16 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Thu, 1 Oct 2020 22:16:16 GMT Subject: RFR: 8253913: unify gtest test names [v2] In-Reply-To: References: Message-ID: > Hi all, > > could you please review this small and trivial patch that unifies the names of hotspot gtests? in some cases, "_test" > is added to gtest names, in others it isn't. given some tests specify "test" in their names, so we get "test_test" like > in `UninitializedDoubleElementWorkerDataArrayTest.sum_test_test_vm`. the patch removes `_test` from the suffixes added > by `TEST*` macros. testing: ? `make test TEST=gtest` on macosx-x64 > ? 
`test/hotspot/jtreg/gtest/` on {linux,windows,macosx}-x64 Igor Ignatyev has updated the pull request incrementally with one additional commit since the last revision: adjust test class names in LogStream's friends ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/475/files - new: https://git.openjdk.java.net/jdk/pull/475/files/d0a45837..5d7e7f50 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=475&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=475&range=00-01 Stats: 3 lines in 1 file changed: 1 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/475.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/475/head:pull/475 PR: https://git.openjdk.java.net/jdk/pull/475 From david.holmes at oracle.com Thu Oct 1 22:49:24 2020 From: david.holmes at oracle.com (David Holmes) Date: Fri, 2 Oct 2020 08:49:24 +1000 Subject: RFR: 8253899: Make IsClassUnloadingEnabled signature match specification In-Reply-To: References: Message-ID: <067adc98-e612-a1b2-63c2-ea84232941b4@oracle.com> Hi Vladimir, On 2/10/2020 1:09 am, Vladimir Kempik wrote: > Please review this change for hotspot and one test. > There is few JVMTI callback/event functions in jdk which signature doesn't match specification. > for example: > static jvmtiError JNICALL IsClassUnloadingEnabled(const jvmtiEnv* env, jboolean* enabled, ...) > but according to jvmti specs it should be: > static jvmtiError JNICALL IsClassUnloadingEnabled(const jvmtiEnv* env, ...) > same with ClassUnload(jvmtiEnv* jvmti_env, JNIEnv* jni_env, const char* name, ...) in tests Sorry I'm missing something - where in the specification is this? This is an extension event and I don't see it documented. 
Thanks, David > for many years that didn't matter but with coming JEP-391 it becomes important to make it match the spec > https://developer.apple.com/documentation/apple_silicon/addressing_architectural_differences_in_your_macos_code > This commit makes the above mentioned functions to have signature matching jvmti specification > > ------------- > > Commit messages: > - 8253899: Make IsClassUnloadingEnabled signature match specification + jcheck > - 8253899: Make IsClassUnloadingEnabled signature match specification > > Changes: https://git.openjdk.java.net/jdk/pull/466/files > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=466&range=00 > Issue: https://bugs.openjdk.java.net/browse/JDK-8253899 > Stats: 17 lines in 2 files changed: 15 ins; 0 del; 2 mod > Patch: https://git.openjdk.java.net/jdk/pull/466.diff > Fetch: git fetch https://git.openjdk.java.net/jdk pull/466/head:pull/466 > > PR: https://git.openjdk.java.net/jdk/pull/466 > From github.com+670087+jrziviani at openjdk.java.net Fri Oct 2 02:40:16 2020 From: github.com+670087+jrziviani at openjdk.java.net (Ziviani) Date: Fri, 2 Oct 2020 02:40:16 GMT Subject: RFR: 8253900: SA: wrong size computation when JVM was built without AOT [v5] In-Reply-To: References: Message-ID: > TestInstanceKlassSize was failing because, for PowerPC, the following code (instanceKlass.cpp) always compiles to > `return false;` bool InstanceKlass::has_stored_fingerprint() const { > #if INCLUDE_AOT > return should_store_fingerprint() || is_shared(); > #else > return false; > #endif > } > However, in `hasStoredFingerprint()@InstanceKlass.java` the condition `shouldStoreFingerprint() || isShared();` is > always evaluated and may return true (_AFAIK isShared() returns true_). Such condition adds 8 bytes in the > `getSize()@InstanceKlass.java` causing the failure in TestInstanceKlassSize: public long getSize() { // in number of > bytes > ... 
> if (hasStoredFingerprint()) { > size += 8; // uint64_t > } > return alignSize(size); > } > Considering these tests are failing for PowerPC only (_based on ProblemList.txt_), my solution checks if > `hasStoredFingerprint()` is running on a PowerPC platform. I decided to go this way because there is no existing flag > informing whether AOT is included or not and creating a new one just to handle the PowerPC case seems too much. This > patch is an attempt to fix https://bugs.openjdk.java.net/browse/JDK-8230664 Ziviani has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: 8253900: SA: wrong size computation when JVM was built without AOT The code hasStoredFingerprint() at InstanceKlass.java is not considering AOT disabled at compilation time, like has_stored_fingerprint() at instanceKlass.cpp does. Such difference can cause TestInstanceKlassSize failures because all objects will have an extra 8-bytes. TestInstanceKlassSize and TestInstanceKlassSizeForInterface were removed from ProblemList.txt because it cannot be reproduced after this change, but, if JDK-8230664 starts to reproduce again, these tests can be disabled. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/358/files - new: https://git.openjdk.java.net/jdk/pull/358/files/5c08095f..8e9d1368 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=358&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=358&range=03-04 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/358.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/358/head:pull/358 PR: https://git.openjdk.java.net/jdk/pull/358 From github.com+670087+jrziviani at openjdk.java.net Fri Oct 2 02:40:16 2020 From: github.com+670087+jrziviani at openjdk.java.net (Ziviani) Date: Fri, 2 Oct 2020 02:40:16 GMT Subject: RFR: 8253900: SA: wrong size computation when JVM was built without AOT [v2] In-Reply-To: References: <02CIxrp2dIe5YnVvBmmVkRuyVxA0hezEUSzcX0zD080=.c29afb6f-7994-4c52-84a6-e979b7a5d761@github.com> Message-ID: On Thu, 1 Oct 2020 21:47:01 GMT, Chris Plummer wrote: > My suggestion was to put the comment in the CR. I really don't think there should be any comment in ProblemList.txt. > This is not an uncommon situation. Tests are often removed from the problem list simply because they have not been seen > for a while (and we want to determine if the bug is still an issue) or because failures are rare, or possibly even so > hard to reproduce that we want to eventually trigger a failure as part of regular testing to help with failure > analysis. We have not added comments in the past to document these problem list removals. It's clear now. Sorry for the confusion. :-). I removed the tests from problemlist again and add a comment in my commit message. Any `git blame` will easily find it. Hope I did it right now :-D Thanks Chris! 
------------- PR: https://git.openjdk.java.net/jdk/pull/358 From cjplummer at openjdk.java.net Fri Oct 2 03:36:04 2020 From: cjplummer at openjdk.java.net (Chris Plummer) Date: Fri, 2 Oct 2020 03:36:04 GMT Subject: RFR: 8253900: SA: wrong size computation when JVM was built without AOT [v5] In-Reply-To: References: Message-ID: <-CBayWf5LBODPitJO2x4vQNdr7_BSUhoo0SH_1oJlUE=.dda4c9d9-aba3-4d49-8a6b-35c33463bd8e@github.com> On Fri, 2 Oct 2020 02:40:16 GMT, Ziviani wrote: >> TestInstanceKlassSize was failing because, for PowerPC, the following code (instanceKlass.cpp) always compiles to >> `return false;` bool InstanceKlass::has_stored_fingerprint() const { >> #if INCLUDE_AOT >> return should_store_fingerprint() || is_shared(); >> #else >> return false; >> #endif >> } >> However, in `hasStoredFingerprint()@InstanceKlass.java` the condition `shouldStoreFingerprint() || isShared();` is >> always evaluated and may return true (_AFAIK isShared() returns true_). Such condition adds 8 bytes in the >> `getSize()@InstanceKlass.java` causing the failure in TestInstanceKlassSize: public long getSize() { // in number of >> bytes >> ... >> if (hasStoredFingerprint()) { >> size += 8; // uint64_t >> } >> return alignSize(size); >> } >> Considering these tests are failing for PowerPC only (_based on ProblemList.txt_), my solution checks if >> `hasStoredFingerprint()` is running on a PowerPC platform. I decided to go this way because there is no existing flag >> informing whether AOT is included or not and creating a new one just to handle the PowerPC case seems too much. This >> patch is an attempt to fix https://bugs.openjdk.java.net/browse/JDK-8230664 > > Ziviani has refreshed the contents of this pull request, and previous commits have been removed. The incremental views > will show differences compared to the previous content of the PR. Looks good. ------------- Marked as reviewed by cjplummer (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/358 From dholmes at openjdk.java.net Fri Oct 2 04:27:04 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 2 Oct 2020 04:27:04 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 20:11:46 GMT, Daniel D. Daugherty wrote: >> The issue is that this test doesn't consider Handshake All operation. >> Depending if/when such operation is scheduled it can lockup the VM thread. >> And the safepoint that should timeout never happens. >> See issue for more information. >> >> So I changed the test to "try timeout" the safepoint, but if there was no safepoint (blocked by a handshake all), we >> retry. We sleep unsafe much longer than the interval SafepointALot generates operations, which 'guarantees' we will >> timeout if there is no handshake all. (some extreme case of kernel scheduling causing a very long context switch could >> also make us not timeout) Passes t1, t3, and repeat runs of the test. > > Changes requested by dcubed (Reviewer). Hi Robbin, So.... The old test used an "uncounted loop" (based on internal JIT knowledge) to create looping code with no safepoint polls so that it remains safepoint-unsafe (and Patricio had to tweak the test conditions to avoid unexpected safepoints). The new code has a WhiteBox entry that uses an internal naked_sleep which keeps the thread _thread_in_VM IIUC, which is not safepoint-safe, but also potentially different to being _thread_in_Java. But lets just accept the net effect is the same - the thread will prevent a safepoint from being reached until the sleep time has elapsed. If that time is > (GuaranteedSafepointInterval + SafepointTimeoutDelay) then we should see a safepoint timeout and the VM abort. Okay ... so how does that solve the problem the test currently experiences with handshakes ... 
if we are at a handshake the handshake can't proceed until the sleep time expires, but then when we transition back to Java the thread will see the handshake and so the handshake will proceed. As long as the WB function returns false we will repeat the process, eventually when the expected safepoint is requested we should again trigger the safepoint timeout and abort. But like Dan I'm unclear how the WB function can ever return true as the safepoint state can't change whilst the thread is in the naked sleep. ?? Aside: rather than using "args.length > 0" to discriminate between the original and subsequent executions of the test class, it can be clearer (IMO) to add a static nested class which has the main method that performs the actual test, and you invoke that via ProcessTools. That all said, for the record, we really should have a handshake timeout mechanism the same as we have the safepoint timeout mechanism. Thanks, David ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From sspitsyn at openjdk.java.net Fri Oct 2 04:50:04 2020 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Fri, 2 Oct 2020 04:50:04 GMT Subject: RFR: 8253900: SA: wrong size computation when JVM was built without AOT [v5] In-Reply-To: References: Message-ID: <0pw_Crky-y-0bmejUyDxB1cxuY1ESU_F6EVO4pO4a6U=.60aa8fdc-4d0f-47e3-be57-06f975228c63@github.com> On Fri, 2 Oct 2020 02:40:16 GMT, Ziviani wrote: >> TestInstanceKlassSize was failing because, for PowerPC, the following code (instanceKlass.cpp) always compiles to >> `return false;` bool InstanceKlass::has_stored_fingerprint() const { >> #if INCLUDE_AOT >> return should_store_fingerprint() || is_shared(); >> #else >> return false; >> #endif >> } >> However, in `hasStoredFingerprint()@InstanceKlass.java` the condition `shouldStoreFingerprint() || isShared();` is >> always evaluated and may return true (_AFAIK isShared() returns true_). 
Such condition adds 8 bytes in the >> `getSize()@InstanceKlass.java` causing the failure in TestInstanceKlassSize: public long getSize() { // in number of >> bytes >> ... >> if (hasStoredFingerprint()) { >> size += 8; // uint64_t >> } >> return alignSize(size); >> } >> Considering these tests are failing for PowerPC only (_based on ProblemList.txt_), my solution checks if >> `hasStoredFingerprint()` is running on a PowerPC platform. I decided to go this way because there is no existing flag >> informing whether AOT is included or not and creating a new one just to handle the PowerPC case seems too much. This >> patch is an attempt to fix https://bugs.openjdk.java.net/browse/JDK-8230664 > > Ziviani has refreshed the contents of this pull request, and previous commits have been removed. The incremental views > will show differences compared to the previous content of the PR. LGTM ------------- Marked as reviewed by sspitsyn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/358 From sspitsyn at openjdk.java.net Fri Oct 2 05:13:02 2020 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Fri, 2 Oct 2020 05:13:02 GMT Subject: RFR: 8253899: Make IsClassUnloadingEnabled signature match specification In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 15:02:01 GMT, Vladimir Kempik wrote: > Please review this change for hotspot and one test. > There is few JVMTI callback/event functions in jdk which signature doesn't match specification. > for example: > static jvmtiError JNICALL IsClassUnloadingEnabled(const jvmtiEnv* env, jboolean* enabled, ...) > but according to jvmti specs it should be: > static jvmtiError JNICALL IsClassUnloadingEnabled(const jvmtiEnv* env, ...) > same with ClassUnload(jvmtiEnv* jvmti_env, JNIEnv* jni_env, const char* name, ...) 
in tests > for many years that didn't matter but with coming JEP-391 it becomes important to make it match the spec > https://developer.apple.com/documentation/apple_silicon/addressing_architectural_differences_in_your_macos_code > This commit makes the above mentioned functions to have signature matching jvmti specification Vladimir, it looks good to me. David, I think Vladimir is referring to the JVMTI extension mechanism spec: https://docs.oracle.com/en/java/javase/15/docs/specs/jvmti.html#jvmtiExtensionFunction https://docs.oracle.com/en/java/javase/15/docs/specs/jvmti.html#jvmtiExtensionEvent ------------- Marked as reviewed by sspitsyn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/466 From stefank at openjdk.java.net Fri Oct 2 06:25:05 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Fri, 2 Oct 2020 06:25:05 GMT Subject: RFR: 8253913: unify gtest test names [v2] In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 22:16:16 GMT, Igor Ignatyev wrote: >> Hi all, >> >> could you please review this small and trivial patch that unifies the names of hotspot gtests? in some cases, "_test" >> is added to gtest names, in others it isn't. given some tests specify "test" in their names, so we get "test_test" like >> in `UninitializedDoubleElementWorkerDataArrayTest.sum_test_test_vm`. the patch removes `_test` from the suffixes added >> by `TEST*` macros. testing: - `make test TEST=gtest` on macosx-x64 >> - `test/hotspot/jtreg/gtest/` on {linux,windows,macosx}-x64 > > Igor Ignatyev has updated the pull request incrementally with one additional commit since the last revision: > > adjust test class names in LogStream's friends Looks good. I think it's great that you clean this up. I started doing something similar myself, but never got around to creating an RFE for it. IIRC, when I looked at it there were more inconsistencies with the names, but maybe they are gone now. ------------- Marked as reviewed by stefank (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/475 From sspitsyn at openjdk.java.net Fri Oct 2 06:35:05 2020 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Fri, 2 Oct 2020 06:35:05 GMT Subject: RFR: 8252657: JVMTI agent is not unloaded when Agent_OnAttach is failed In-Reply-To: References: <1H1wUQdxCLU2qddqEIYSx2iOhIKL3b5etUmjsS6NBlU=.0bf1fe0c-8dcf-4ca0-bd57-b8794d5f2810@github.com> Message-ID: On Tue, 8 Sep 2020 07:12:51 GMT, Yasumasa Suenaga wrote: >> If `Agent_OnAttach()` in JVMTI agent which is attempted to load via JVMTI.agent_load dcmd is failed, it would not be >> unloaded. We've [discussed it on >> serviceability-dev](https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-September/032839.html). This PR is >> a continuation of that. This PR also includes to call `Agent_OnUnload()` when `Agent_OnAttach()` failed. >> >> How to reproduce: >> >> 1. Build JVMTI agent for test >> $ git clone https://github.com/YaSuenag/jvmti-examples.git >> $ cd jvmti-examples/helloworld/out/build >> $ cmake ../.. >> >> 2. Run JShell >> >> 3. Load JVMTI agent via `jcmd JVMTI.agent_load` with "error" ("error" means `Agent_OnAttach()` returns JNI_ERR) >> $ jcmd >> 89456 jdk.jshell.execution.RemoteExecutionControl 45651 >> 89547 sun.tools.jcmd.JCmd >> 89436 jdk.jshell/jdk.internal.jshell.tool.JShellToolProvider >> $ jcmd 89436 JVMTI.agent_load `pwd`/libhelloworld.so error >> 89436: >> return code: -1 >> >> 4. 
Check loaded libraries via `jcmd VM.dynlibs` >> $ jcmd 89436 VM.dynlibs | grep libhelloworld >> 7f2f8b06b000-7f2f8b06c000 r--p 00000000 fd:00 11818202 >> /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so 7f2f8b06c000-7f2f8b06d000 r-xp 00001000 >> fd:00 11818202 /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so 7f2f8b06d000-7f2f8b06e000 >> r--p 00002000 fd:00 11818202 /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so >> 7f2f8b06e000-7f2f8b06f000 r--p 00002000 fd:00 11818202 >> /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so 7f2f8b06f000-7f2f8b070000 rw-p 00003000 >> fd:00 11818202 /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so > > @edvbld Can you approve me to run tier1 tests with /test PR command again? I was initially wrong by supporting this, and now I share David's concerns about unclear semantics of this. The questions are: - Q1: Is it necessary to call the Agent_OnUnload()? - Q2: Would it be a JVMTI spec violation to call the Agent_OnAttach() multiple times? (It seems to be the case to me.) - Q3: What has to be done for statically linked agent? - Q4: Should the agent be correctly loadable in the first place? What were the reasons its loading to fail? Yes, at least, a CSR is needed for this. ------------- PR: https://git.openjdk.java.net/jdk/pull/19 From shade at openjdk.java.net Fri Oct 2 07:00:06 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 2 Oct 2020 07:00:06 GMT Subject: Integrated: 8253891: Debug x86_32 builds fail after JDK-8239090 In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 09:52:54 GMT, Aleksey Shipilev wrote: > `CPU_MAX_FEATURE` is actually a `uint64_t`, with at least 46 bits set. `exact_log2` expects `intptr_t`. The implicit > conversion works on 64-bit, but fails on 32-bit. Calling to `exact_log2_long` seems to cater for both bitnesses. 
> Testing: > - [x] tier1 on Linux x86_64 > - [x] tier1 on Linux x86_32 (some unrelated failures) This pull request has now been integrated. Changeset: 6f40a414 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/6f40a414 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8253891: Debug x86_32 builds fail after JDK-8239090 Reviewed-by: stuefe, phh ------------- PR: https://git.openjdk.java.net/jdk/pull/455 From vkempik at openjdk.java.net Fri Oct 2 07:02:03 2020 From: vkempik at openjdk.java.net (Vladimir Kempik) Date: Fri, 2 Oct 2020 07:02:03 GMT Subject: RFR: 8253899: Make IsClassUnloadingEnabled signature match specification In-Reply-To: References: Message-ID: <5UPvhVjwK3TOKrd2rSzeCz-xT0tXsCUkMT9R8tgFR8I=.f1723d22-6883-4ad3-af55-a7437ef905de@github.com> On Fri, 2 Oct 2020 05:10:20 GMT, Serguei Spitsyn wrote: >> Please review this change for hotspot and one test. >> There is few JVMTI callback/event functions in jdk which signature doesn't match specification. >> for example: >> static jvmtiError JNICALL IsClassUnloadingEnabled(const jvmtiEnv* env, jboolean* enabled, ...) >> but according to jvmti specs it should be: >> static jvmtiError JNICALL IsClassUnloadingEnabled(const jvmtiEnv* env, ...) >> same with ClassUnload(jvmtiEnv* jvmti_env, JNIEnv* jni_env, const char* name, ...) in tests >> for many years that didn't matter but with coming JEP-391 it becomes important to make it match the spec >> https://developer.apple.com/documentation/apple_silicon/addressing_architectural_differences_in_your_macos_code >> This commit makes the above mentioned functions to have signature matching jvmti specification > > Vladimir, it looks good to me. 
> David, > I think, Vladimir is referring to the JVMTI extension mechanism spec: > https://docs.oracle.com/en/java/javase/15/docs/specs/jvmti.html#jvmtiExtensionFunction > https://docs.oracle.com/en/java/javase/15/docs/specs/jvmti.html#jvmtiExtensionEvent Hello Serguei, you are right, I was talking about this documents. Thank you. ------------- PR: https://git.openjdk.java.net/jdk/pull/466 From shade at openjdk.java.net Fri Oct 2 07:06:03 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 2 Oct 2020 07:06:03 GMT Subject: RFR: 8253882: remove PropertyResolvingWrapper In-Reply-To: <1DMgNmvJfsDOtn5b8yNiFMzREHTlTfH7rNQzB4-zVAo=.8f35b465-4ac9-4e7f-bf8e-7571ea9bc4aa@github.com> References: <1DMgNmvJfsDOtn5b8yNiFMzREHTlTfH7rNQzB4-zVAo=.8f35b465-4ac9-4e7f-bf8e-7571ea9bc4aa@github.com> Message-ID: On Wed, 30 Sep 2020 22:43:35 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this trivial patch which removes `PropertyResolvingWrapper`? > from JBS: >> w/ all [JDK-8219140]'s sub-tasks being done, there are no more usages of `PropertyResolvingWrapper`, so this class can >> be removed. > > [JDK-8219140]: https://bugs.openjdk.java.net/browse/JDK-8219140 Marked as reviewed by shade (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/446 From rehn at openjdk.java.net Fri Oct 2 07:18:08 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Fri, 2 Oct 2020 07:18:08 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts In-Reply-To: References: Message-ID: <9CQIDFYeLLRibIAwy6T_XgWPoB4STYNycFra37XBvGw=.34931f06-646a-4b93-be55-594bfdea6931@github.com> On Thu, 1 Oct 2020 19:49:20 GMT, Daniel D. Daugherty wrote: >> The issue is that this test doesn't consider Handshake All operation. >> Depending if/when such operation is scheduled it can lockup the VM thread. >> And the safepoint that should timeout never happens. >> See issue for more information. 
>> >> So I changed the test to "try timeout" the safepoint, but if there was no safepoint (blocked by a handshake all), we >> retry. We sleep unsafe much longer than the interval SafepointALot generates operations, which 'guarantees' we will >> timeout if there is no handshake all. (some extreme case of kernel scheduling causing a very long context switch could >> also make us not timeout) Passes t1, t3, and repeat runs of the test. > > src/hotspot/share/prims/whitebox.cpp line 2294: > >> 2292: >> 2293: WB_ENTRY(jboolean, WB_WaitUnsafe(JNIEnv* env, jobject wb, jint time)) >> 2294: SafepointStateTracker tracker = SafepointSynchronize::safepoint_state_tracker(); > > I had to go back and reread the `SafepointStateTracker` code... > Because this JavaThread is executing and not at a safepoint, the call > to `SafepointSynchronize::safepoint_state_tracker()` will save state > as not-at-a-safepoint (with some safepoint_id value). Yes. We can never return true from this function, e.g. a safepoint happened when unsafe, if we don't have a bug. > src/hotspot/share/prims/whitebox.cpp line 2296: > >> 2294: SafepointStateTracker tracker = SafepointSynchronize::safepoint_state_tracker(); >> 2295: os::naked_short_sleep(time); >> 2296: return tracker.safepoint_state_changed(); > > Ahhh... returns true when we've had a state change or when the > safepoint ID has changed, but... how can the system change the > `SafepointSynchronize::is_at_safepoint()` return value or the > safepoint_id value while we're sleeping? The system shouldn't > be able to go to a safepoint or change safepoint_id values while > this calling thread is not safepoint safe. > > I would think that this function would always return false, but maybe > I'm missing something here. Yes it is like saying return false when everything is working. So yes it may not be so useful. 
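The point made in this exchange can be illustrated outside the VM. What follows is a toy model in Java, not HotSpot code: `ToyTracker` is a hypothetical stand-in for `SafepointStateTracker`, the "safepoint id" is just an `AtomicLong`, and only the id comparison is modelled. It sketches why a snapshot-then-compare tracker reports a change only if something bumped the state in between, which, per the discussion, a working VM should not be able to do while the calling thread is not safepoint-safe.

```java
import java.util.concurrent.atomic.AtomicLong;

public class ToyTrackerSketch {
    // Hypothetical stand-in for the VM-wide safepoint id (assumption, not HotSpot state).
    static final AtomicLong safepointId = new AtomicLong();

    // Snapshots the id at construction and compares against it later.
    static final class ToyTracker {
        private final long savedId = safepointId.get();

        boolean stateChanged() {
            return safepointId.get() != savedId;
        }
    }

    // Same shape as the quoted WB_WaitUnsafe: snapshot, sleep, compare.
    // The duringSleep hook exists only so the sketch can simulate a "buggy VM".
    public static boolean waitUnsafe(long millis, Runnable duringSleep) {
        ToyTracker tracker = new ToyTracker();
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        duringSleep.run();
        return tracker.stateChanged();
    }

    public static void main(String[] args) {
        // Nothing touches the id while we sleep: the tracker sees no change.
        System.out.println(waitUnsafe(1, () -> { }));
        // Something advances the id while we sleep: the tracker reports it.
        System.out.println(waitUnsafe(1, () -> safepointId.incrementAndGet()));
    }
}
```

In this model the `true` result is reachable only through the artificial hook, which matches the reading above that the return value is mainly a guard against a VM bug.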
> test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 30: > >> 28: /* >> 29: * @test TestAbortVMOnSafepointTimeout >> 30: * @summary Check if VM can kill thread which doesn't reach safepoint. > > Not your bug, but this summary is wrong. Perhaps: > `@summary Check if VM aborts when a thread doesn't reach safepoint.` The timeout shoots a SIGILL at the 'slow' thread; it does not abort (it does abort if it can't send the signal). The test also checks that the log says we have done this. > test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 47: > >> 45: System.out.println("Waiting for safepoint"); >> 46: } >> 47: System.out.println("This message would occur after some time."); > > Maybe I'm missing something, but I don't see how `wb.waitUnsafe(waitTime)` is > ever going to return anything but false so this message should never be printed. It's not a new line. The line was there for the case where the VM fails to time out, so it was never printed assuming a working VM. And now it is still not printed assuming a working VM. > test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 71: > >> 69: ProcessBuilder pb = ProcessTools.createJavaProcessBuilder( >> 70: "-XX:+UnlockDiagnosticVMOptions", >> 71: "-XX:-UseBiasedLocking", > > I think "-XX:-UseBiasedLocking" is specified to make sure > that Biased Locking is disabled even in test tasks where it > is enabled by task specific flags. Yes. But now this test is fine with using biased locking. > test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 66: > >> 64: "-Xms64m", >> 65: "TestAbortVMOnSafepointTimeout", >> 66: "" + unsafe_wait > > Cheap conversion from int to String? 
I followed the other ones: "-XX:SafepointTimeoutDelay=" + timeout_delay In this case we have no name for the 'option', so it becomes: "" + unsafe_wait > test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 79: > >> 77: } >> 78: } >> 79: output.shouldNotHaveExitValue(0); > > Looks like the test doesn't require that this mesg get printed: > `System.out.println("This message would occur after some time.");` > > And it is set up to detect that the SafepointTimeout happened > which is what we want the test to verify at the core. The line "This message would occur after some time." should never be printed if the VM is working. If the VM fails for some reason and the timeout is not performed, the line "Timed out while spinning to reach a safepoint." is never printed and the OutputAnalyzer fails the test. If we did time out and it was printed, we know that we didn't print the other message, since the only thread that can time out is the one printing that message. The second part verifies that the SIGILL was delivered. ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From rehn at openjdk.java.net Fri Oct 2 07:18:04 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Fri, 2 Oct 2020 07:18:04 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts In-Reply-To: References: Message-ID: On Fri, 2 Oct 2020 04:24:44 GMT, David Holmes wrote: > Hi Robbin, > Hi David, > So.... The old test used an "uncounted loop" (based on internal JIT knowledge) to create looping code with no safepoint > polls so that it remains safepoint-unsafe (and Patricio had to tweak the test conditions to avoid unexpected > safepoints). The new code has a WhiteBox entry that uses an internal naked_sleep which keeps the thread _thread_in_VM > IIUC, which is not safepoint-safe, but also potentially different to being _thread_in_Java. 
But lets just accept the > net effect is the same - the thread will prevent a safepoint from being reached until the sleep time has elapsed. If > that time is > (GuaranteedSafepointInterval + SafepointTimeoutDelay) then we should see a safepoint timeout and the VM > abort. Okay ... so how does that solve the problem the test currently experiences with handshakes ... if we are at a > handshake the handshake can't proceed until the sleep time expires, but then when we transition back to Java the thread > will see the handshake and so the handshake will proceed. As long as the WB function returns false we will repeat the > process, eventually when the expected safepoint is requested we should again trigger the safepoint timeout and abort. > But like Dan I'm unclear how the WB function can ever return true as the safepoint state can't change whilst the thread > is in the naked sleep. ?? It can't return true if the VM is working. So yes, the safepoint tracker may be overkill. > > Aside: rather than using "args.length > 0" to discriminate between the original and subsequent executions of the test > class, it can be clearer (IMO) to add a static nested class which has the main method that performs the actual test, > and you invoke that via ProcessTools. I didn't change any of that. > That all said, for the record, we really should have a handshake timeout mechanism the same as we have the safepoint > timeout mechanism. We have a timeout mechanism, HandshakeTimeout, but it is off by default. And it doesn't fire a SIGILL at the troubled thread as the safepoint timeout does. Thanks, Robbin > > Thanks, > David ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From rehn at openjdk.java.net Fri Oct 2 07:18:08 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Fri, 2 Oct 2020 07:18:08 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 20:11:12 GMT, Daniel D. 
Daugherty wrote: >> test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 49: >> >>> 47: System.out.println("This message would occur after some time."); >>> 48: } else { >>> 49: testWith(50, 1, 999); >> >> Please consider: >> >> `testWith(50 /* sfpt_interval */, 1 /* timeout_delay */, 999 /* unsafe_wait */);` > > Also, I think the test would be more clear if this testWith() part > was in a `if (args.length == 0)` block at the top and the else > part was the rest of the test. After all, code flow wise, you > execute the `(args.length == 0)` case first and then come > back for the case with the unsafe_wait value. I only removed the return in favor of the else, the rest is the same as before. ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From dholmes at openjdk.java.net Fri Oct 2 07:30:04 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 2 Oct 2020 07:30:04 GMT Subject: RFR: 8253899: Make IsClassUnloadingEnabled signature match specification In-Reply-To: <5UPvhVjwK3TOKrd2rSzeCz-xT0tXsCUkMT9R8tgFR8I=.f1723d22-6883-4ad3-af55-a7437ef905de@github.com> References: <5UPvhVjwK3TOKrd2rSzeCz-xT0tXsCUkMT9R8tgFR8I=.f1723d22-6883-4ad3-af55-a7437ef905de@github.com> Message-ID: On Fri, 2 Oct 2020 06:59:13 GMT, Vladimir Kempik wrote: >> Vladimir, it looks good to me. > >> David, >> I think, Vladimir is referring to the JVMTI extension mechanism spec: >> https://docs.oracle.com/en/java/javase/15/docs/specs/jvmti.html#jvmtiExtensionFunction >> https://docs.oracle.com/en/java/javase/15/docs/specs/jvmti.html#jvmtiExtensionEvent > > Hello Serguei, you are right, I was talking about this documents. > Thank you. Okay but look at the example that documentation gives: > For example, if the jvmtiParamInfo returned by GetExtensionEvents indicates that there is a jint parameter, the event > handler should be declared: > void JNICALL myHandler(jvmtiEnv* jvmti_env, jint myInt, ...) 
The myInt is explicit, just as our "jboolean* enabled" is explicit. I think the key point is that the signature must end with "...", which it does. I don't see anything here that needs to be fixed. ------------- PR: https://git.openjdk.java.net/jdk/pull/466 From ysuenaga at openjdk.java.net Fri Oct 2 07:31:03 2020 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Fri, 2 Oct 2020 07:31:03 GMT Subject: RFR: 8252657: JVMTI agent is not unloaded when Agent_OnAttach is failed In-Reply-To: References: <1H1wUQdxCLU2qddqEIYSx2iOhIKL3b5etUmjsS6NBlU=.0bf1fe0c-8dcf-4ca0-bd57-b8794d5f2810@github.com> Message-ID: <80LJDTCsT_y-KlThryd5Bxu5RRyrjmKfs5p9vJUn61E=.68b594a0-fe58-4f4d-a49c-eec2e90f9373@github.com> On Fri, 2 Oct 2020 06:30:34 GMT, Serguei Spitsyn wrote: > * Q1: Is it necessary to call the Agent_OnUnload()? The [JVMTI spec of Agent_OnUnload()](https://docs.oracle.com/en/java/javase/15/docs/specs/jvmti.html#onunload) says this function will be called when the agent library is about to be unloaded by a platform-specific mechanism. OTOH it also says `Agent_OnUnload()` will be called both at VM termination and **for other reasons**. The spec doesn't say what happens if `Agent_OnAttach()` fails. IMHO `Agent_OnUnload()` should be called, because this PR would unload the library if `Agent_OnAttach()` failed. > * Q2: Would it be a JVMTI spec violation to call the Agent_OnAttach() multiple times? (It seems to be the case to me.) `Agent_OnAttach()` should be called only once per attach request, but the VM should accept multiple attach requests for the same agent library. For example, we can add multiple `-agentlib` and `-agentpath` requests as below. A JVMTI agent might change its behavior depending on arguments or a configuration file. -agentlib:test=profile=A -agentlib:test=profile=B -agentpath:/path/to/libtest=profile=C Agent developers are responsible for the behavior when more than one agent is loaded at a time. > * Q3: What has to be done for statically linked agent? 
JVMTI spec says "unless it is statically linked into the executable", so I think we can ignore Agent_OnUnload_L() in this PR. > * Q4: Should the agent be correctly loadable in the first place? What were the reasons its loading to fail? An agent (`Agent_OnAttach()`) might fail due to an error in the agent logic. For example, some agents load a configuration file at initialization. If the user gives a wrong value, it will fail. > Yes, at least, a CSR is needed for this. I will file a CSR for this PR after this discussion. ------------- PR: https://git.openjdk.java.net/jdk/pull/19 From vkempik at openjdk.java.net Fri Oct 2 07:37:03 2020 From: vkempik at openjdk.java.net (Vladimir Kempik) Date: Fri, 2 Oct 2020 07:37:03 GMT Subject: RFR: 8253899: Make IsClassUnloadingEnabled signature match specification In-Reply-To: References: <5UPvhVjwK3TOKrd2rSzeCz-xT0tXsCUkMT9R8tgFR8I=.f1723d22-6883-4ad3-af55-a7437ef905de@github.com> Message-ID: On Fri, 2 Oct 2020 07:27:17 GMT, David Holmes wrote: > Okay but look at the example that documentation gives: > > > For example, if the jvmtiParamInfo returned by GetExtensionEvents indicates that there is a jint parameter, the event > > handler should be declared: ``` > > void JNICALL myHandler(jvmtiEnv* jvmti_env, jint myInt, ...) > > ``` > > The myInt is explicit, just as our "jboolean* enabled" is explicit. I think they key point is that the signature must > end with "..." which it does. > I don't see anything here that needs to be fixed. Hello David. On the majority of platforms this would be fine. But on some platforms variadic and non-variadic arguments are passed differently (for example, on macos-aarch64 variadic args are always passed on the stack, while non-variadic args go in registers, and on the stack from the 9th arg onward), and that causes issues. If you still see no issue here, we can delay and make this changeset part of JEP-391. But since this changeset isn't really macos-aarch64 specific, I thought it would be good to integrate it separately from JEP-391. 
Regards, Vladimir ------------- PR: https://git.openjdk.java.net/jdk/pull/466 From rehn at openjdk.java.net Fri Oct 2 07:45:21 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Fri, 2 Oct 2020 07:45:21 GMT Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 13:30:23 GMT, Alan Hayward wrote: >> You no longer need any isb when returning from the native wrappers (interpreted or compiled variant) after my patch. >> That *should* be 2 if the mentioned 3 hooks (without looking closely at the code). Because that will be done in the >> runtime instead when waking up from native. Which hook do we have left? > > Agreed. With both patches, and with the JNI isbs removed, that leaves just one safepoint isb in the AArch64 code (plus > the other isb in emit_static_call_stub). > The test patch exists not just to test the AArch64 but the common code too (ideally it would be extended to other > targets too). Are we happy that the cross_modify_fence is called at all the required points? A hole in the common code > would fail only very rarely. It does require quite a bit of code to add this test though. The only reason for having cmf() on thread is the validation in debug builds, right? If you do it the opposite way: inline void OrderAccess::cross_modify_fence_non_arch_specific() { OrderAccess::cross_modify_fence_arch_impl(); #ifdef ASSERT if (VerifyCrossModifyFence) { Thread::current()->set_requires_cross_modify_fence(false); } #endif } You only need the boolean in debug builds on JavaThreads and you don't need to move the cmf() from OA and create the new one? 
------------- PR: https://git.openjdk.java.net/jdk/pull/428 From github.com+4146708+a74nh at openjdk.java.net Fri Oct 2 08:08:39 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Fri, 2 Oct 2020 08:08:39 GMT Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: Message-ID: On Fri, 2 Oct 2020 07:42:28 GMT, Robbin Ehn wrote: >> Agreed. With both patches, and with the JNI isbs removed, that leaves just one safepoint isb in the AArch64 code (plus >> the other isb in emit_static_call_stub). >> The test patch exists not just to test the AArch64 but the common code too (ideally it would be extended to other >> targets too). Are we happy that the cross_modify_fence is called at all the required points? A hole in the common code >> would fail only very rarely. It does require quite a bit of code to add this test though. > > The only reason for having cmf() on thread is the validation in debug builds, right? > If you do it the opposite way: > > inline void OrderAccess::cross_modify_fence_non_arch_specific() { > OrderAccess::cross_modify_fence_arch_impl(); > #ifdef ASSERT > if (VerifyCrossModifyFence) { > Thread::current()->set_requires_cross_modify_fence(false); > } > #endif > } > > You only need the boolean in debug builds on JavaThreads and you don't need to move the cmf() from OA and create the > new one? > _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on > [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ > Hi Alan, > > On 1/10/2020 2:30 am, Alan Hayward wrote: > > > The AArch64 port uses maybe_isb in places where an ISB might be required > > because the code may have safepointed. These maybe_isbs are very conservative > > and are used in many places are used when a safepoint has not happened. > > cross_modify_fence was added in common code to place a barrier in all the > > places after a safepoint has occurred. 
All the uses of it are in common code, > > yet it remains unimplemented on AArch64. > > This set of patches implements cross_modify_fence for AArch64 and reconsiders > > every uses of maybe_isb, discarding many of them. In addition, it introduces > > a new diagnostic option, which when enabled on AArch64 tests the correct > > usage of the barriers. > > Advantage of this patch is threefold: > > * Reducing the number of ISBs - giving a theoretical performance improvement. > > * Use of common code instead of backend specific code. > > * Additional test diagnostic options > > Patch 1: Split cross_modify_fence > > ================================= > > This is simply refactoring work split out to simplify the other two patches. > > instruction_fence() is provided by each target and simply places > > a fence for the instruction stream. > > cross_modify_fence() is now a member of JavaThread and just calls > > instruction_fence. This function will be extended in Patch 3. > > I don't agree with the change here. The cross_modify_fence() is not > related to thread API imo, it belongs in OrderAccess. The name was > deliberately selected to abstract away from the specific details of why > a given platform may need this fence: > > http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-March/037153.html > > "The name "instruction_pipeline" seems a bit implementation specific > about what HW architectural features need to be taken care of due to > cross-modifying code, which may or may not apply to a given platform. > Perhaps cross_modify_fence(), or something along those lines, would be > better. That makes it more clear what we are protecting against, as > opposed to what HW architectural features that might concern on a given > platform." > > @robehn , @fisk please chime in here. :) > > Thanks, > David I have no strong feeling on the names of these functions. The reason for moving it was that in the third part (the verification testing) it needs JavaThread. 
In an earlier version I simply had OrderAccess::cross_modify_fence(JavaThread *thread). But then OrderAccess is dependent on JavaThread, which felt wrong. The obvious solution was to add a wrapper in JavaThread that calls down to OrderAccess. Alternatively, I could switch the name of instruction_fence back to cross_modify_fence, and then think of a name for the added function in JavaThread. Alternatively alternatively, they could both be called cross_modify_fence. If the verification test patch was removed from the set, then most of the first patch wouldn't be needed either. ------------- PR: https://git.openjdk.java.net/jdk/pull/428 From aph at redhat.com Fri Oct 2 08:39:34 2020 From: aph at redhat.com (Andrew Haley) Date: Fri, 2 Oct 2020 09:39:34 +0100 Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: Message-ID: <09b0fe78-d1eb-389a-843b-a1b864488d6b@redhat.com> On 01/10/2020 03:18, David Holmes wrote: >> cross_modify_fence() is now a member of JavaThread and just calls >> instruction_fence. This function will be extended in Patch 3. > I don't agree with the change here. The cross_modify_fence() is not > related to thread API imo, it belongs in OrderAccess. I agree with you, David. cross_modify_fence() should not be moved. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd.
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From github.com+70893615+jasontatton-aws at openjdk.java.net Fri Oct 2 08:40:58 2020 From: github.com+70893615+jasontatton-aws at openjdk.java.net (Jason Tatton) Date: Fri, 2 Oct 2020 08:40:58 GMT Subject: RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v4] In-Reply-To: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com> References: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com> Message-ID: > This is an implementation of the indexOf(char) intrinsic for StringLatin1 (1 byte encoded Strings). It is provided for > x86 and ARM64. The implementation is greatly inspired by the indexOf(char) intrinsic for StringUTF16. To incorporate it > I had to make a small change to StringLatin1.java (refactor of functionality to intrinsified private method) as well as > code for C2. Submitted to: hotspot-compiler-dev and core-libs-dev as this patch contains a change to hotspot and > java/lang/StringLatin1.java https://bugs.openjdk.java.net/browse/JDK-8173585 > > Details of testing: > ============ > I have created a jtreg test "compiler/intrinsics/string/TestStringLatin1IndexOfChar" to cover this new intrinsic. Note > that, particularly for the x86 implementation of the intrinsic, the code path taken is dependent upon the length of the > input String. Hence the test has been designed to cover all these cases. In summary they are: > - A "short" string of < 16 characters. > - A SIMD String of 16-31 characters. > - An AVX2 SIMD String of 32+ characters. > > Hardware used for testing: > ----------------------------- > > - Intel Xeon CPU E5-2680 (JVM did not recognize this as having AVX2 support) > - Intel i7 processor (with AVX2 support). > - AWS Graviton 2 (ARM 64 processor). > > I also ran "run-test-tier1" and "run-test-tier2" for x86_64 and aarch64.
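The three length classes listed above can be exercised with a small stand-alone check. The following is a minimal sketch, not the jtreg test from the patch: the class name `Latin1IndexOfSketch`, its helper methods, and the chosen lengths are assumptions made for illustration. It compares `String.indexOf(int, int)` against a plain scalar loop on strings that contain only Latin-1 characters. Whether the intrinsic is actually reached depends on the JVM, its flags, and on C2 having compiled the method.

```java
class Latin1IndexOfSketch {
    // Scalar reference implementation: a plain loop that does not call String.indexOf.
    static int referenceIndexOf(String s, char ch, int fromIndex) {
        for (int i = Math.max(fromIndex, 0); i < s.length(); i++) {
            if (s.charAt(i) == ch) {
                return i;
            }
        }
        return -1;
    }

    // Builds a string of exactly `length` characters in which every run of
    // `gap` 'a' characters is followed by one 'z' (gap = 1 gives "azaza...").
    static String pattern(int gap, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(i % (gap + 1) == gap ? 'z' : 'a');
        }
        return sb.toString();
    }

    // Counts disagreements between String.indexOf and the scalar reference
    // for every start offset from -1 up to and including the string length.
    static int countMismatches(int[] lengths, int maxGap) {
        int mismatches = 0;
        for (int length : lengths) {
            for (int gap = 1; gap <= maxGap; gap++) {
                String s = pattern(gap, length);
                for (int from = -1; from <= length; from++) {
                    if (s.indexOf('z', from) != referenceIndexOf(s, 'z', from)) {
                        mismatches++;
                    }
                }
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // Lengths below 16, within 16-31, and 32 or more, as in the list above.
        int[] lengths = {1, 15, 16, 31, 32, 33, 64, 257};
        System.out.println("mismatches: " + countMismatches(lengths, 40));
    }
}
```

Gaps larger than the vector width place the first match beyond the first 16 or 32 characters, so the looping part of each path gets a chance to run rather than only the first iteration.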
> > Possible future enhancements: > ==================== > For the x86 implementation there may be two further improvements we can make in order to improve performance of both > the StringUTF16 and StringLatin1 indexOf(char) intrinsics: > 1. Make use of AVX-512 instructions. > 2. For "short" Strings (see below), I think it may be possible to modify the existing algorithm to still use SSE SIMD > instructions instead of a loop. > Benchmark results: > ============ > **Without** the new StringLatin1 indexOf(char) intrinsic: > > | Benchmark | Mode | Cnt | Score | Error | Units | > | ------------- | ------------- |------------- |------------- |------------- |------------- | > | IndexOfBenchmark.latin1_mixed_char | avgt | 5 | **26,389.129** | ± 182.581 | ns/op | > | IndexOfBenchmark.utf16_mixed_char | avgt | 5 | 17,885.383 | ± 435.933 | ns/op | > > > **With** the new StringLatin1 indexOf(char) intrinsic: > > | Benchmark | Mode | Cnt | Score | Error | Units | > | ------------- | ------------- |------------- |------------- |------------- |------------- | > | IndexOfBenchmark.latin1_mixed_char | avgt | 5 | **17,875.185** | ± 407.716 | ns/op | > | IndexOfBenchmark.utf16_mixed_char | avgt | 5 | 18,292.802 | ± 167.306 | ns/op | > > > The objective of the patch is to bring the performance of StringLatin1 indexOf(char) in line with StringUTF16 > indexOf(char) for x86 and ARM64. We can see above that this has been achieved. Similar results were obtained when > running on ARM. Jason Tatton has updated the pull request incrementally with one additional commit since the last revision: 8173585: Intrinsify StringLatin1.indexOf(char) Rewrite of unit test and newlines added to end of files Changes to unit test: - main test adjusted such that Strings generated are much longer (up to 2048 characters) and of the form: azaza, aazaazaa, aaazaaazaaa, etc with 'z' being the character searched for.
Multiple instances of the search character are included in the String in order to validate that the starting offset is correctly handled. Results are compared to the non-intrinsified version of the code. Longer strings mean that the looping functionality of the various paths is entered into. - Run configurations introduced to check behaviour where the use of SSE and AVX instructions is restricted. - Tier4InvocationThreshold adjusted so as to ensure C2 code is invoked. Other changes: - newlines added at end of files ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/71/files - new: https://git.openjdk.java.net/jdk/pull/71/files/c8cc441e..c8a2849e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=71&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=71&range=02-03 Stats: 60 lines in 3 files changed: 26 ins; 6 del; 28 mod Patch: https://git.openjdk.java.net/jdk/pull/71.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/71/head:pull/71 PR: https://git.openjdk.java.net/jdk/pull/71 From github.com+70893615+jasontatton-aws at openjdk.java.net Fri Oct 2 08:41:01 2020 From: github.com+70893615+jasontatton-aws at openjdk.java.net (Jason Tatton) Date: Fri, 2 Oct 2020 08:41:01 GMT Subject: RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v2] In-Reply-To: References: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com> <-DRWR4_f5u6DsSGHAuPnpHrhaG8Una8BXf4zDekQjLM=.469b08b6-a8b2-4a6b-8ab8-1a40810aede0@github.com> Message-ID: On Mon, 21 Sep 2020 10:11:28 GMT, Volker Simonis wrote: >> Jason Tatton has updated the pull request with a new target base due to a merge or a rebase.
The pull request now >> contains four commits: >> - Merge master >> - 8173585: further whitespace changes required by jcheck >> - JDK-8173585 - whitespace changes required by jcheck >> - JDK-8173585 > > test/hotspot/jtreg/compiler/intrinsics/string/TestStringLatin1IndexOfChar.java line 24: > >> 22: >> 23: public static void main(String[] args) throws Exception { >> 24: for (int i = 0; i < 100_0; ++i) {//repeat such that we enter into C2 code... > > The placement of the underscore looks strange to me. I'd expect it to separate thousands (like 1_000) if at all but not > sure if I'd use it for one thousand at all as that's really not such a big number that it is hard to read. > Also, the Tier4InvocationThreshold is 5000 so I'm not sure you're reaching C2? I have added Tier4InvocationThreshold=200 to the unit test config in order to trigger generation earlier. ------------- PR: https://git.openjdk.java.net/jdk/pull/71 From github.com+70893615+jasontatton-aws at openjdk.java.net Fri Oct 2 08:47:47 2020 From: github.com+70893615+jasontatton-aws at openjdk.java.net (Jason Tatton) Date: Fri, 2 Oct 2020 08:47:47 GMT Subject: RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v2] In-Reply-To: References: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com> <-DRWR4_f5u6DsSGHAuPnpHrhaG8Una8BXf4zDekQjLM=.469b08b6-a8b2-4a6b-8ab8-1a40810aede0@github.com> Message-ID: On Tue, 22 Sep 2020 15:19:37 GMT, Volker Simonis wrote: >> Jason Tatton has updated the pull request with a new target base due to a merge or a rebase. The pull request now >> contains four commits: >> - Merge master >> - 8173585: further whitespace changes required by jcheck >> - JDK-8173585 - whitespace changes required by jcheck >> - JDK-8173585 > > Hi Jason, > > thanks for bringing String.indexOf() for latin strings up to date with the Unicode version. > > Your changes look good except a few minor issues I've commented on right in the code.
> > I'd only like to ask you if you could possibly improve your test a little bit. As far as I understand, your search text > is a consecutive sequence of "abc" characters, so you'll always find the character you're searching for within the next > three characters of the source text. This won't exercise the loops of your intrinsic. Maybe you can also add some test > versions where the search character will be found beyond the first 32/64 characters after "fromIndex"? @simonis Thank you for the corrections, I have amended them in the latest commit as follows: Changes to unit test: - main test adjusted such that Strings generated are much longer (up to 2048 characters) and of the form: `azaza`, `aazaazaa`, `aaazaaazaaa`, etc with `'z'` being the character searched for. Multiple instances of the search character are included in the String in order to validate that the starting offset is correctly handled. Results are compared to the non-intrinsified version of the code. Longer strings mean that the looping functionality of the various paths is entered into. - Run configurations introduced to check behaviour where the use of SSE and AVX instructions is restricted. - Tier4InvocationThreshold adjusted so as to ensure C2 code is invoked. Other changes: - newlines added at end of files @vnkozlov here are the performance numbers as requested. I have included performance of the UTF16 version of the intrinsic for reference: | UseAVX= | UseSSE= | Benchmark | Mode | Cnt | Score | Error | Units | |---------|---------|-----------------------------------|------|-----|-------------|-------------|-------| | | 0 | IndexOfBenchmark.latin1_long_char | avgt | 5 | **447,493.398** | ± 4,666.386 | ns/op | | 0 | | IndexOfBenchmark.latin1_long_char | avgt | 5 | **104,735.941** | ± 2,484.403 | ns/op | | 1 | | IndexOfBenchmark.latin1_long_char | avgt | 5 | **104,342.844** | ± 2,656.343 | ns/op | | 2 | | IndexOfBenchmark.latin1_long_char | avgt | 5 | **61,000.418** | ±
1,543.951 | ns/op | | 3 | | IndexOfBenchmark.latin1_long_char | avgt | 5 | **60,607.988** | ± 1,466.354 | ns/op | | | 0 | IndexOfBenchmark.utf16_long_char | avgt | 5 | 672,475.302 | ± 4,998.596 | ns/op | | 0 | | IndexOfBenchmark.utf16_long_char | avgt | 5 | 175,521.654 | ± 7,549.094 | ns/op | | 1 | | IndexOfBenchmark.utf16_long_char | avgt | 5 | 172,514.981 | ± 3,561.040 | ns/op | | 2 | | IndexOfBenchmark.utf16_long_char | avgt | 5 | 120,725.748 | ± 2,004.400 | ns/op | | 3 | | IndexOfBenchmark.utf16_long_char | avgt | 5 | 120,664.623 | ± 1,988.419 | ns/op | I think the results are as expected: we see improvements in performance as the range of SSE and AVX instructions which can be used is expanded upon. Note that no improvement is observed with UseAVX=3 because there is no AVX-512 code in these intrinsics. ------------- PR: https://git.openjdk.java.net/jdk/pull/71 From aph at redhat.com Fri Oct 2 08:48:37 2020 From: aph at redhat.com (Andrew Haley) Date: Fri, 2 Oct 2020 09:48:37 +0100 Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: Message-ID: <9b64062a-6761-6999-ebc1-1383f7bd34c0@redhat.com> On 02/10/2020 09:08, Alan Hayward wrote: > I have no strong feeling on the names of these functions. > The reason for moving it was that in the third part (the verification testing) it needs JavaThread. Right, but you can get the JavaThread efficiently any time you want, so you don't need to pass it to cross_modify_fence(). -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From rehn at openjdk.java.net Fri Oct 2 09:00:54 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Fri, 2 Oct 2020 09:00:54 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v2] In-Reply-To: References: Message-ID: > The issue is that this test doesn't consider Handshake All operation.
> Depending if/when such operation is scheduled it can lockup the VM thread. > And the safepoint that should timeout never happens. > See issue for more information. > > So I changed the test to "try timeout" the safepoint, but if there was no safepoint (blocked by a handshake all), we > retry. We sleep unsafe much longer than the interval SafepointALot generates operations, which 'guarantees' we will > timeout if there is no handshake all. (some extreme case of kernel scheduling causing a very long context switch could > also make us not timeout) Passes t1, t3, and repeat runs of the test. Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Update with input from reviews ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/465/files - new: https://git.openjdk.java.net/jdk/pull/465/files/13b40338..90fb3106 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=465&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=465&range=00-01 Stats: 34 lines in 3 files changed: 12 ins; 15 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/465.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/465/head:pull/465 PR: https://git.openjdk.java.net/jdk/pull/465 From github.com+4146708+a74nh at openjdk.java.net Fri Oct 2 09:18:38 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Fri, 2 Oct 2020 09:18:38 GMT Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: Message-ID: <35eLsMpWmcCUoiEWhnYdSpZNmvLy4ra56Qtd6eRW574=.4e7c9278-3e0d-457d-9c15-eef45bae9755@github.com> On Fri, 2 Oct 2020 08:06:16 GMT, Alan Hayward wrote: >> The only reason for having cmf() on thread is the validation in debug builds, right? 
>> If you do it the opposite way: >> >> inline void OrderAccess::cross_modify_fence_non_arch_specific() { >> OrderAccess::cross_modify_fence_arch_impl(); >> #ifdef ASSERT >> if (VerifyCrossModifyFence) { >> Thread::current()->set_requires_cross_modify_fence(false); >> } >> #endif >> } >> >> You only need the boolean in debug builds on JavaThreads and you don't need to move the cmf() from OA and create the >> new one? > >> _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on >> [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ >> Hi Alan, >> >> On 1/10/2020 2:30 am, Alan Hayward wrote: >> >> > The AArch64 port uses maybe_isb in places where an ISB might be required >> > because the code may have safepointed. These maybe_isbs are very conservative >> > and are used in many places are used when a safepoint has not happened. >> > cross_modify_fence was added in common code to place a barrier in all the >> > places after a safepoint has occurred. All the uses of it are in common code, >> > yet it remains unimplemented on AArch64. >> > This set of patches implements cross_modify_fence for AArch64 and reconsiders >> > every uses of maybe_isb, discarding many of them. In addition, it introduces >> > a new diagnostic option, which when enabled on AArch64 tests the correct >> > usage of the barriers. >> > Advantage of this patch is threefold: >> > * Reducing the number of ISBs - giving a theoretical performance improvement. >> > * Use of common code instead of backend specific code. >> > * Additional test diagnostic options >> > Patch 1: Split cross_modify_fence >> > ================================= >> > This is simply refactoring work split out to simplify the other two patches. >> > instruction_fence() is provided by each target and simply places >> > a fence for the instruction stream. >> > cross_modify_fence() is now a member of JavaThread and just calls >> > instruction_fence. This function will be extended in Patch 3. 
>> >> I don't agree with the change here. The cross_modify_fence() is not >> related to thread API imo, it belongs in OrderAccess. The name was >> deliberately selected to abstract away from the specific details of why >> a given platform may need this fence: >> >> http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-March/037153.html >> >> "The name "instruction_pipeline" seems a bit implementation specific >> about what HW architectural features need to be taken care of due to >> cross-modifying code, which may or may not apply to a given platform. >> Perhaps cross_modify_fence(), or something along those lines, would be >> better. That makes it more clear what we are protecting against, as >> opposed to what HW architectural features that might concern on a given >> platform." >> >> @robehn , @fisk please chime in here. :) >> >> Thanks, >> David > > I have no strong feeling on the names of these functions. > The reason for moving it was that in the third part (the verification testing) it needs JavaThread. > In an earlier version I simply had OrderAccess::cross_modify_fence(JavaThread *thread). But then OrderAccess is > dependant on JavaThread, which felt wrong. Obvious solution was to add a wrapper in JavaThread that calls down to > OrderAccess. Alternatively, I could switch the name of instruction_fence back to cross_modify_fence, and then think of > a name for the added function in JavaThread. Alternatively alternatively, they could both be call cross_modify_fence. > If the verification test patch was removed from the set, then most of the first patch wouldn't be needed either. > _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ > > On 02/10/2020 09:08, Alan Hayward wrote: > > > I have no strong feeling on the names of these functions. > > The reason for moving it was that in the third part (the verification testing) it needs JavaThread. 
> > Right, but you can get the JavaThread efficiently any time you want, > so you don't need to pass it to cross_modify_fence(). > Oh, ok, didn't spot that. This would result in code in OrderAccess.cpp calling a function in JavaThread. It feels that OrderAccess should be much lower level than JavaThread. But, that might be ok. ------------- PR: https://git.openjdk.java.net/jdk/pull/428 From mdoerr at openjdk.java.net Fri Oct 2 09:52:41 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Fri, 2 Oct 2020 09:52:41 GMT Subject: RFR: 8253900: SA: wrong size computation when JVM was built without AOT [v5] In-Reply-To: <0pw_Crky-y-0bmejUyDxB1cxuY1ESU_F6EVO4pO4a6U=.60aa8fdc-4d0f-47e3-be57-06f975228c63@github.com> References: <0pw_Crky-y-0bmejUyDxB1cxuY1ESU_F6EVO4pO4a6U=.60aa8fdc-4d0f-47e3-be57-06f975228c63@github.com> Message-ID: On Fri, 2 Oct 2020 04:46:53 GMT, Serguei Spitsyn wrote: >> Ziviani has refreshed the contents of this pull request, and previous commits have been removed. The incremental views >> will show differences compared to the previous content of the PR. > > LGTM Added comment to JDK-8230664. Test results look good on our side, too. ------------- PR: https://git.openjdk.java.net/jdk/pull/358 From ihse at openjdk.java.net Fri Oct 2 11:47:42 2020 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Fri, 2 Oct 2020 11:47:42 GMT Subject: RFR: 8253757: Add LLVM-based backend for hsdis In-Reply-To: References: <91erxiMDb4ftvSomuJYHPi9SX-v8Z2VLD2qEwCbz5tk=.b9ed01b5-f0e0-4ed7-9c1a-b06bc0e64640@github.com> Message-ID: On Wed, 30 Sep 2020 00:55:23 GMT, Ludovic Henry wrote: >> When bringing up Hotspot onto new platforms, it is not always possible to compile hsdis because gcc is not yet >> available. For example, for Windows-AArch64 and macOS-AArch64. >> For some such platforms, it is possible to use LLVM as an alternative backend as it also supports a disassembler >> feature. 
> > @navyxliu I've merged the sources into `src/utils/hsdis` and added support to build it in the Makefile. This is an interesting suggestion. There is a similar attempt at replacing binutils with capstone in https://bugs.openjdk.java.net/browse/JDK-8188073, which unfortunately has not seen much progress due to lack of resources; I don't know if you are aware of that? There is also an (extremely low priority) effort to rewrite the hsdis makefile to be part of the normal build system, see e.g. https://bugs.openjdk.java.net/browse/JDK-8208495. Neither of these should be any blocker for your change, but I think it might be good if you know about them. I have a couple of concerns with your patch. One is the method in which LLVM is selected instead of binutils; afaict this depends on having the `LLVM` variable set when executing the makefile. At the very least, this should be documented in the README. I don't think any more complicated configuration is really necessary at this point. With full integration with the build system, a more user-friendly way of selecting the hsdis backend should be implemented, though. Second, and I don't know if this is an artifact of git/github/the new skara tooling, but if you renamed hsdis.c to hsdis.cpp, this relationship does not show up, not even in the generated webrevs. Instead they are considered a new + a deleted file. This makes it hard to see what code changes you have done in that file. And third: have you tested that your changes (both changing the main file from C to C++, and any code changes in it) do not break the old binutils functionality? Afaik there are no test suites for exercising hsdis :-( so manual ad-hoc testing is likely needed.
------------- PR: https://git.openjdk.java.net/jdk/pull/392 From rehn at openjdk.java.net Fri Oct 2 12:10:40 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Fri, 2 Oct 2020 12:10:40 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v2] In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 18:58:03 GMT, Patricio Chilano Mateo wrote: > LGTM Thanks! There is an update, please consider. ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From iignatyev at openjdk.java.net Fri Oct 2 13:50:44 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 2 Oct 2020 13:50:44 GMT Subject: RFR: 8253913: unify gtest test names [v2] In-Reply-To: References: Message-ID: On Fri, 2 Oct 2020 06:22:29 GMT, Stefan Karlsson wrote: >> Igor Ignatyev has updated the pull request incrementally with one additional commit since the last revision: >> >> adjust test class names in LogStream's friends > > Looks good. I think it's great that you clean this up. I started doing something similar myself, but never got around > to create an RFE for it. IIRC, when I looked at it there were more inconsistencies with the names, but maybe they are > gone now. @stefank, thank you for your review. please let me know if you notice any other inconsistencies after the patch ------------- PR: https://git.openjdk.java.net/jdk/pull/475 From iignatyev at openjdk.java.net Fri Oct 2 13:50:47 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 2 Oct 2020 13:50:47 GMT Subject: Integrated: 8253913: unify gtest test names In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 21:42:10 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this small and trivial patch that unifies the names of hotspot gtests? in some cases, "_test" > is added to gtest names, in others it isn't. given some tests specify "test" in their names, so we get "test_test" like > in `UninitializedDoubleElementWorkerDataArrayTest.sum_test_test_vm`. 
the patch removes `_test` from the suffixes added > by `TEST*` macros. testing: ? `make test TEST=gtest` on macosx-x64 > ? `test/hotspot/jtreg/gtest/` on {linux,windows,macosx}-x64 This pull request has now been integrated. Changeset: 406db1c2 Author: Igor Ignatyev URL: https://git.openjdk.java.net/jdk/commit/406db1c2 Stats: 14 lines in 3 files changed: 5 ins; 0 del; 9 mod 8253913: unify gtest test names Reviewed-by: stefank ------------- PR: https://git.openjdk.java.net/jdk/pull/475 From iignatyev at openjdk.java.net Fri Oct 2 13:51:39 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 2 Oct 2020 13:51:39 GMT Subject: RFR: 8253882: remove PropertyResolvingWrapper In-Reply-To: References: <1DMgNmvJfsDOtn5b8yNiFMzREHTlTfH7rNQzB4-zVAo=.8f35b465-4ac9-4e7f-bf8e-7571ea9bc4aa@github.com> Message-ID: On Fri, 2 Oct 2020 07:03:36 GMT, Aleksey Shipilev wrote: >> Hi all, >> >> could you please review this trivial patch which removes `PropertyResolvingWrapper`? >> from JBS: >>> w/ all [JDK-8219140]'s sub-tasks being done, there are no more usages of `PropertyResolvingWrapper`, so this class can >>> be removed. >> >> [JDK-8219140]: https://bugs.openjdk.java.net/browse/JDK-8219140 > > Marked as reviewed by shade (Reviewer). Thanks, Aleksey. ------------- PR: https://git.openjdk.java.net/jdk/pull/446 From iignatyev at openjdk.java.net Fri Oct 2 13:51:40 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 2 Oct 2020 13:51:40 GMT Subject: Integrated: 8253882: remove PropertyResolvingWrapper In-Reply-To: <1DMgNmvJfsDOtn5b8yNiFMzREHTlTfH7rNQzB4-zVAo=.8f35b465-4ac9-4e7f-bf8e-7571ea9bc4aa@github.com> References: <1DMgNmvJfsDOtn5b8yNiFMzREHTlTfH7rNQzB4-zVAo=.8f35b465-4ac9-4e7f-bf8e-7571ea9bc4aa@github.com> Message-ID: On Wed, 30 Sep 2020 22:43:35 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this trivial patch which removes `PropertyResolvingWrapper`? 
> from JBS: >> w/ all [JDK-8219140]'s sub-tasks being done, there are no more usages of `PropertyResolvingWrapper`, so this class can >> be removed. > > [JDK-8219140]: https://bugs.openjdk.java.net/browse/JDK-8219140 This pull request has now been integrated. Changeset: fff8c8de Author: Igor Ignatyev URL: https://git.openjdk.java.net/jdk/commit/fff8c8de Stats: 136 lines in 1 file changed: 0 ins; 136 del; 0 mod 8253882: remove PropertyResolvingWrapper Reviewed-by: shade ------------- PR: https://git.openjdk.java.net/jdk/pull/446 From avoitylov at openjdk.java.net Fri Oct 2 13:55:18 2020 From: avoitylov at openjdk.java.net (Aleksei Voitylov) Date: Fri, 2 Oct 2020 13:55:18 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port [v3] In-Reply-To: References: Message-ID: > continuing the review thread from here https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/068546.html > >> The download side of using JNI in these tests is that it complicates the >> setup a bit for those that run jtreg directly and/or just build the JDK >> and not the test libraries. You could reduce this burden a bit by >> limiting the load library/isMusl check to Linux only, meaning isMusl >> would not be called on other platforms. >> >> The alternative you suggest above might indeed be better. I assume you >> don't mean splitting the tests but rather just adding a second @test >> description so that the vm.musl case runs the test with a system >> property that allows the test know the expected load library path behavior. > > I have updated the PR to split the two tests in multiple @test s. > >> The updated comment in java_md.c in this looks good. A minor comment on >> Platform.isBusybox is Files.isSymbolicLink returning true implies that >> the link exists so no need to check for exists too. Also the >> if-then-else style for the new class in ProcessBuilder/Basic.java is >> inconsistent with the rest of the test so it stands out. 
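The remark about `Files.isSymbolicLink` can be illustrated with a short sketch. This is not the code from the patch: the class name `BusyboxCheckSketch` and the assumed link target `/bin/busybox` are hypothetical, chosen only to show why a separate `Files.exists` call adds nothing. `Files.isSymbolicLink` already returns false when the path does not exist.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class BusyboxCheckSketch {
    // Hypothetical helper: reports whether `tool` is a symbolic link that
    // points at /bin/busybox (the target path is an assumption of this sketch).
    static boolean isBusybox(String tool) {
        try {
            Path toolPath = Paths.get(tool);
            // isSymbolicLink() is false for a missing path, so no exists() check is needed.
            return Files.isSymbolicLink(toolPath)
                    && Paths.get("/bin/busybox").equals(Files.readSymbolicLink(toolPath));
        } catch (IOException e) {
            // The link could not be read; treat the tool as not being busybox.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("/bin/sh is busybox: " + isBusybox("/bin/sh"));
    }
}
```

The result of `main` depends on the machine it runs on, which is why no expected output is given.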
> > Thank you, these changes are done in the updated PR. > >> Given the repo transition this weekend then I assume you'll create a PR >> for the final review at least. Also I see JEP 386 hasn't been targeted >> yet but I assume Boris, as owner, will propose-to-target and wait for it >> to be targeted before it is integrated. > > Yes. How can this be best accomplished with the new git workflow? > - we can continue the review process till the end and I will request the integration to happen only after the JEP is > targeted. I guess this step is now done by typing "slash integrate" in a comment. > - we can pause the review process now until the JEP is targeted. > > In the first case I'm kindly asking the Reviewers who already chimed in on that to re-confirm the review here. Aleksei Voitylov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Merge branch 'JDK-8247589' of https://github.com/voitylov/jdk into JDK-8247589 - JDK-8247589: Implementation of Alpine Linux/x64 Port - JDK-8247589: Implementation of Alpine Linux/x64 Port - JDK-8247589: Implementation of Alpine Linux/x64 Port - JDK-8247589: Implementation of Alpine Linux/x64 Port ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/49/files - new: https://git.openjdk.java.net/jdk/pull/49/files/d5994cb5..705b8555 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=49&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=49&range=01-02 Stats: 73505 lines in 3006 files changed: 26172 ins; 37386 del; 9947 mod Patch: https://git.openjdk.java.net/jdk/pull/49.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/49/head:pull/49 PR: https://git.openjdk.java.net/jdk/pull/49 From david.holmes at oracle.com Fri Oct 2 15:12:08 2020 From: david.holmes at oracle.com (David Holmes) Date: Sat, 3 Oct 
2020 01:12:08 +1000 Subject: RFR: 8253899: Make IsClassUnloadingEnabled signature match specification In-Reply-To: References: <5UPvhVjwK3TOKrd2rSzeCz-xT0tXsCUkMT9R8tgFR8I=.f1723d22-6883-4ad3-af55-a7437ef905de@github.com> Message-ID: Hi Vladimir, On 2/10/2020 5:37 pm, Vladimir Kempik wrote: > On Fri, 2 Oct 2020 07:27:17 GMT, David Holmes wrote: > >> Okay but look at the example that documentation gives: >> >>> For example, if the jvmtiParamInfo returned by GetExtensionEvents indicates that there is a jint parameter, the event >>> handler should be declared: ``` >>> void JNICALL myHandler(jvmtiEnv* jvmti_env, jint myInt, ...) >>> ``` >> >> The myInt is explicit, just as our "jboolean* enabled" is explicit. I think the key point is that the signature must >> end with "..." which it does. >> I don't see anything here that needs to be fixed. > > Hello David. On the majority of platforms this would be fine. > > But on some platforms, variadic arguments and non-variadic arguments are passed differently (for example on > macos-aarch64, variadic args are always passed on the stack, non-variadic in registers (and on the stack for the 9th+ arg)), and that > causes issues. Okay - I see the potential for a problem here but ... > If you still see no issues here we can delay and make this changeset part of JEP-391. > But since this changeset isn't much macos-aarch64 specific, I thought it would be good to integrate it separately from > jep-391. ... this change actually goes against the example in the spec, so if you make this change it indicates the spec needs to be updated too.
Cheers, David ----- > Regards, Vladimir > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/466 > From vkempik at openjdk.java.net Fri Oct 2 15:29:38 2020 From: vkempik at openjdk.java.net (Vladimir Kempik) Date: Fri, 2 Oct 2020 15:29:38 GMT Subject: RFR: 8253899: Make IsClassUnloadingEnabled signature match specification In-Reply-To: References: <5UPvhVjwK3TOKrd2rSzeCz-xT0tXsCUkMT9R8tgFR8I=.f1723d22-6883-4ad3-af55-a7437ef905de@github.com> Message-ID: On Fri, 2 Oct 2020 07:34:45 GMT, Vladimir Kempik wrote: >> Okay but look at the example that documentation gives: >> >>> For example, if the jvmtiParamInfo returned by GetExtensionEvents indicates that there is a jint parameter, the event >>> handler should be declared: >>> void JNICALL myHandler(jvmtiEnv* jvmti_env, jint myInt, ...) >> >> The myInt is explicit, just as our "jboolean* enabled" is explicit. I think they key point is that the signature must >> end with "..." which it does. >> I don't see anything here that needs to be fixed. > >> Okay but look at the example that documentation gives: >> >> > For example, if the jvmtiParamInfo returned by GetExtensionEvents indicates that there is a jint parameter, the event >> > handler should be declared: ``` >> > void JNICALL myHandler(jvmtiEnv* jvmti_env, jint myInt, ...) >> > ``` >> >> The myInt is explicit, just as our "jboolean* enabled" is explicit. I think they key point is that the signature must >> end with "..." which it does. >> I don't see anything here that needs to be fixed. > > Hello David. On majority of platforms this would be fine. > > But on some platforms, variadic arguments and non variadic arguments are passed differently ( for example on > macos-aarch64, variadic args are passed always on stack, non variadic on registers (and on stack for 9th+ arg) , that > causes issues. If you still see no issues here we can delay and make this changeset part of JEP-391. 
> But since this changeset isn't much macos-aarch64 specific, I thought it would be good to integrate it separately from > jep-391. > Regards, Vladimir > _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on > [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ > Hi Vladimir, > > On 2/10/2020 5:37 pm, Vladimir Kempik wrote: > > > On Fri, 2 Oct 2020 07:27:17 GMT, David Holmes wrote: > > > Okay but look at the example that documentation gives: > > > > For example, if the jvmtiParamInfo returned by GetExtensionEvents indicates that there is a jint parameter, the event > > > > handler should be declared: ``` > > > > void JNICALL myHandler(jvmtiEnv* jvmti_env, jint myInt, ...) > > > > ``` > > > > > > > > > The myInt is explicit, just as our "jboolean* enabled" is explicit. I think the key point is that the signature must > > > end with "..." which it does. > > > I don't see anything here that needs to be fixed. > > > > > > Hello David. On the majority of platforms this would be fine. > > But on some platforms, variadic arguments and non variadic arguments are passed differently (for example on > > macos-aarch64, variadic args are passed always on stack, non variadic on registers (and on stack for 9th+ arg)), which > > causes issues. > > Okay - I see the potential for a problem here but ... > > > If you still see no issues here we can delay and make this changeset part of JEP-391. > > But since this changeset isn't much macos-aarch64 specific, I thought it would be good to integrate it separately from > > jep-391. > > ... this change actually goes against the example in the spec, so if you > make this change it indicates the spec needs to be updated too. > > Cheers, > David > ----- Hello David, I really believe the problem is in the documentation here (in the examples). First, the doc clearly specifies the type: typedef jvmtiError (JNICALL *jvmtiExtensionFunction) (jvmtiEnv* jvmti_env, ...); then in the examples it declares a function that does not match this spec.
Is it a good idea to update the docs in a separate bug ? Thanks, Vladimir ------------- PR: https://git.openjdk.java.net/jdk/pull/466 From avoitylov at openjdk.java.net Fri Oct 2 15:39:09 2020 From: avoitylov at openjdk.java.net (Aleksei Voitylov) Date: Fri, 2 Oct 2020 15:39:09 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port [v4] In-Reply-To: References: Message-ID: > continuing the review thread from here https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/068546.html > >> The download side of using JNI in these tests is that it complicates the >> setup a bit for those that run jtreg directly and/or just build the JDK >> and not the test libraries. You could reduce this burden a bit by >> limiting the load library/isMusl check to Linux only, meaning isMusl >> would not be called on other platforms. >> >> The alternative you suggest above might indeed be better. I assume you >> don't mean splitting the tests but rather just adding a second @test >> description so that the vm.musl case runs the test with a system >> property that allows the test know the expected load library path behavior. > > I have updated the PR to split the two tests in multiple @test s. > >> The updated comment in java_md.c in this looks good. A minor comment on >> Platform.isBusybox is Files.isSymbolicLink returning true implies that >> the link exists so no need to check for exists too. Also the >> if-then-else style for the new class in ProcessBuilder/Basic.java is >> inconsistent with the rest of the test so it stands out. > > Thank you, these changes are done in the updated PR. > >> Given the repo transition this weekend then I assume you'll create a PR >> for the final review at least. Also I see JEP 386 hasn't been targeted >> yet but I assume Boris, as owner, will propose-to-target and wait for it >> to be targeted before it is integrated. > > Yes. How can this be best accomplished with the new git workflow? 
> - we can continue the review process till the end and I will request the integration to happen only after the JEP is > targeted. I guess this step is now done by typing "slash integrate" in a comment. > - we can pause the review process now until the JEP is targeted. > > In the first case I'm kindly asking the Reviewers who already chimed in on that to re-confirm the review here. Aleksei Voitylov has updated the pull request incrementally with one additional commit since the last revision: test2 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/49/files - new: https://git.openjdk.java.net/jdk/pull/49/files/705b8555..5feda5ff Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=49&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=49&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/49.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/49/head:pull/49 PR: https://git.openjdk.java.net/jdk/pull/49 From iignatyev at openjdk.java.net Fri Oct 2 15:41:41 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 2 Oct 2020 15:41:41 GMT Subject: RFR: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR In-Reply-To: References: Message-ID: On Tue, 29 Sep 2020 00:06:02 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review the patch which updates `jdk.test.lib.Utils` to use md5 hash-sum of `java.vm.version` property > as default seed for `Utils.RANDOM_GENERATOR`? > from JBS: >> using the same seed for all runs of a build will make it possible (easier) to compare results from different test runs >> (e.g. on different platforms, w/ different flags) and consequently will make test results analysis easier. > > the patch also updates `RandomGeneratorTest` test, so it expects now that the same values are generated if no seed is > provided. > testing: ? tier1 ping? 
------------- PR: https://git.openjdk.java.net/jdk/pull/391 From avoitylov at openjdk.java.net Fri Oct 2 15:45:56 2020 From: avoitylov at openjdk.java.net (Aleksei Voitylov) Date: Fri, 2 Oct 2020 15:45:56 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port [v5] In-Reply-To: References: Message-ID: > continuing the review thread from here https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/068546.html > >> The download side of using JNI in these tests is that it complicates the >> setup a bit for those that run jtreg directly and/or just build the JDK >> and not the test libraries. You could reduce this burden a bit by >> limiting the load library/isMusl check to Linux only, meaning isMusl >> would not be called on other platforms. >> >> The alternative you suggest above might indeed be better. I assume you >> don't mean splitting the tests but rather just adding a second @test >> description so that the vm.musl case runs the test with a system >> property that allows the test know the expected load library path behavior. > > I have updated the PR to split the two tests in multiple @test s. > >> The updated comment in java_md.c in this looks good. A minor comment on >> Platform.isBusybox is Files.isSymbolicLink returning true implies that >> the link exists so no need to check for exists too. Also the >> if-then-else style for the new class in ProcessBuilder/Basic.java is >> inconsistent with the rest of the test so it stands out. > > Thank you, these changes are done in the updated PR. > >> Given the repo transition this weekend then I assume you'll create a PR >> for the final review at least. Also I see JEP 386 hasn't been targeted >> yet but I assume Boris, as owner, will propose-to-target and wait for it >> to be targeted before it is integrated. > > Yes. How can this be best accomplished with the new git workflow? 
> - we can continue the review process till the end and I will request the integration to happen only after the JEP is > targeted. I guess this step is now done by typing "slash integrate" in a comment. > - we can pause the review process now until the JEP is targeted. > > In the first case I'm kindly asking the Reviewers who already chimed in on that to re-confirm the review here. Aleksei Voitylov has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/49/files - new: https://git.openjdk.java.net/jdk/pull/49/files/5feda5ff..b7ffed87 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=49&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=49&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/49.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/49/head:pull/49 PR: https://git.openjdk.java.net/jdk/pull/49 From kvn at openjdk.java.net Fri Oct 2 16:00:42 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 2 Oct 2020 16:00:42 GMT Subject: RFR: 8223347: Integration of Vector API (Incubator) In-Reply-To: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> References: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> Message-ID: On Fri, 25 Sep 2020 20:14:29 GMT, Paul Sandoz wrote: > This pull request is for integration of the Vector API. It was previously reviewed under conditions when mercurial was > used for the source code control system. 
Review threads can be found here (searching for issue number 8223347 in the > title): https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-April/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-May/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-July/thread.html > > If mercurial was still being used the code would be pushed directly, once the CSR is approved. However, in this case a > pull request is required and needs explicit reviewer approval. Between the final review and this pull request no code > has changed, except for that related to merging. Good ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/367 From chegar at openjdk.java.net Fri Oct 2 16:00:41 2020 From: chegar at openjdk.java.net (Chris Hegarty) Date: Fri, 2 Oct 2020 16:00:41 GMT Subject: RFR: 8223347: Integration of Vector API (Incubator) In-Reply-To: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> References: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> Message-ID: On Fri, 25 Sep 2020 20:14:29 GMT, Paul Sandoz wrote: > This pull request is for integration of the Vector API. It was previously reviewed under conditions when mercurial was > used for the source code control system. Review threads can be found here (searching for issue number 8223347 in the > title): https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-April/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-May/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-July/thread.html > > If mercurial was still being used the code would be pushed directly, once the CSR is approved. However, in this case a > pull request is required and needs explicit reviewer approval. Between the final review and this pull request no code > has changed, except for that related to merging. 
Marked as reviewed by chegar (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/367 From rriggs at openjdk.java.net Fri Oct 2 17:11:38 2020 From: rriggs at openjdk.java.net (Roger Riggs) Date: Fri, 2 Oct 2020 17:11:38 GMT Subject: RFR: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR In-Reply-To: References: Message-ID: <-nJ85Iyg2_335BQ19H6gKf2gEc4GU_Bg5iTzrvGSKUk=.d17d917e-8afa-455c-985b-d7280013d5d3@github.com> On Fri, 2 Oct 2020 15:39:01 GMT, Igor Ignatyev wrote: >> Hi all, >> >> could you please review the patch which updates `jdk.test.lib.Utils` to use md5 hash-sum of `java.vm.version` property >> as default seed for `Utils.RANDOM_GENERATOR`? >> from JBS: >>> using the same seed for all runs of a build will make it possible (easier) to compare results from different test runs >>> (e.g. on different platforms, w/ different flags) and consequently will make test results analysis easier. >> >> the patch also updates `RandomGeneratorTest` test, so it expects now that the same values are generated if no seed is >> provided. >> testing: ? tier1 > > ping? Is this really a good idea? The purpose of using random numbers is to get broader coverage on multiple runs. If the seed only changes once per version (6 months), that reduces test coverage. At least for dev submitted runs, I would like to be different for every build (unless overridden). 
------------- PR: https://git.openjdk.java.net/jdk/pull/391 From mbeckwit at openjdk.java.net Fri Oct 2 17:20:45 2020 From: mbeckwit at openjdk.java.net (Monica Beckwith) Date: Fri, 2 Oct 2020 17:20:45 GMT Subject: RFR: 8248238: Implementation: JEP 388: Windows AArch64 Support [v12] In-Reply-To: References: Message-ID: On Wed, 30 Sep 2020 00:40:29 GMT, David Holmes wrote: >> Monica Beckwith has updated the pull request incrementally with one additional commit since the last revision: >> >> change string representation for r18 to "r18_tls" on every platform > > Marked as reviewed by dholmes (Reviewer). > The JEP is not yet targeted so we have to wait for that formality. But > once that happens I can sponsor for you. Thanks. > > Also note that the PR references the wrong JEP so can you please edit > the description to fix that. Added JEP # (388) here and updated the JBS entry. After looking at JEPs 386, 377, and 379, I also did the following: - listed JDK-8248238 as a sub-task for JDK-8248496 - added this PR link in a comment for the JEP. As soon as the JEP is targeted, I will update the "Fix version" for the 'Implementation' (JDK-8248238) and ping you @dholmes-ora . > > Meanwhile I'll see if I can take this for a spin through our internal > testing. Thanks so much. Regards, Monica > > Cheers, > David ------------- PR: https://git.openjdk.java.net/jdk/pull/212 From joe.darcy at oracle.com Fri Oct 2 17:26:37 2020 From: joe.darcy at oracle.com (Joe Darcy) Date: Fri, 2 Oct 2020 10:26:37 -0700 Subject: RFR: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR In-Reply-To: <-nJ85Iyg2_335BQ19H6gKf2gEc4GU_Bg5iTzrvGSKUk=.d17d917e-8afa-455c-985b-d7280013d5d3@github.com> References: <-nJ85Iyg2_335BQ19H6gKf2gEc4GU_Bg5iTzrvGSKUk=.d17d917e-8afa-455c-985b-d7280013d5d3@github.com> Message-ID: I agree with Roger that this change should *not* go forward since it would have the effect of reducing test coverage.
Regression tests that use randomness should use the "randomness" jtreg label and should output the seed value used so a failing result can be replicated. Thanks, -Joe On 10/2/2020 10:11 AM, Roger Riggs wrote: > On Fri, 2 Oct 2020 15:39:01 GMT, Igor Ignatyev wrote: > >>> Hi all, >>> >>> could you please review the patch which updates `jdk.test.lib.Utils` to use md5 hash-sum of `java.vm.version` property >>> as default seed for `Utils.RANDOM_GENERATOR`? >>> from JBS: >>>> using the same seed for all runs of a build will make it possible (easier) to compare results from different test runs >>>> (e.g. on different platforms, w/ different flags) and consequently will make test results analysis easier. >>> the patch also updates `RandomGeneratorTest` test, so it expects now that the same values are generated if no seed is >>> provided. >>> testing: ? tier1 >> ping? > Is this really a good idea? The purpose of using random numbers is to get broader coverage on multiple runs. > If the seed only changes once per version (6 months), that reduces test coverage. > At least for dev submitted runs, I would like to be different for every build (unless overridden). > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/391 From iignatyev at openjdk.java.net Fri Oct 2 17:50:38 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 2 Oct 2020 17:50:38 GMT Subject: RFR: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR In-Reply-To: <-nJ85Iyg2_335BQ19H6gKf2gEc4GU_Bg5iTzrvGSKUk=.d17d917e-8afa-455c-985b-d7280013d5d3@github.com> References: <-nJ85Iyg2_335BQ19H6gKf2gEc4GU_Bg5iTzrvGSKUk=.d17d917e-8afa-455c-985b-d7280013d5d3@github.com> Message-ID: On Fri, 2 Oct 2020 17:09:07 GMT, Roger Riggs wrote: >> ping? > > Is this really a good idea? The purpose of using random numbers is to get broader coverage on multiple runs. > If the seed only changes once per version (6 months), that reduces test coverage.
> At least for dev submitted runs, I would like to be different for every build (unless overridden). Hi Roger, it's exactly what you want it to be. It is different for every build (not every release, as we are using `java.vm.version` not `java.version`) unless overridden; each dev submitted run and each CI build has a different `java.vm.version` (e.g. the last two mach5 CI builds have `16-ea+19-938` and `16-ea+19-939` as their `java.vm.version`, one of my ad-hoc mach5 runs -- `16-internal+0-2020-10-01-2150482.igor.ignatyev.jdk`, my local build -- `16-internal+0-2020-10-01-2252075.iignatye...`) and hence would get different seeds. All the test tasks for these builds, on the other hand, would use the same seed, so one could be more confident that if test `T` passed on all platforms but platform `P`, it's a platform `P` specific problem, whereas before this fix, one would need to rerun test `T` on each platform with the _failing_ seed and on platform `P` with at least one _passing_ seed. This problem becomes even more acute when you start considering all the build flavors, VM flags, and host configurations we cover in our testing. ------------- PR: https://git.openjdk.java.net/jdk/pull/391 From iignatyev at openjdk.java.net Fri Oct 2 17:57:38 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 2 Oct 2020 17:57:38 GMT Subject: RFR: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR In-Reply-To: References: <-nJ85Iyg2_335BQ19H6gKf2gEc4GU_Bg5iTzrvGSKUk=.d17d917e-8afa-455c-985b-d7280013d5d3@github.com> Message-ID: On Fri, 2 Oct 2020 17:47:40 GMT, Igor Ignatyev wrote: >> Is this really a good idea? The purpose of using random numbers is to get broader coverage on multiple runs. >> If the seed only changes once per version (6 months), that reduces test coverage. >> At least for dev submitted runs, I would like to be different for every build (unless overridden). > > Hi Roger, > > it's exactly what you want it to be.
It is different for every build (not every release, as we are using > `java.vm.version` not `java.version`) unless overridden; each dev submitted run and each CI build has a different > `java.vm.version` (e.g. the last two mach5 CI builds have `16-ea+19-938` and `16-ea+19-939` as their `java.vm.version`, > one of my ad-hoc mach5 runs -- `16-internal+0-2020-10-01-2150482.igor.ignatyev.jdk`, my local build -- > `16-internal+0-2020-10-01-2252075.iignatye...`) and hence would get different seeds. All the test tasks for these > builds, on the other hand, would use the same seed, so one could be more confident that if test `T` passed on all > platforms but platform `P`, it's a platform `P` specific problem, whereas before this fix, one would need to rerun test `T` > on each platform with the _failing_ seed and on platform `P` with at least one _passing_ seed. This problem becomes even > more acute when you start considering all the build flavors, VM flags, and host configurations we cover in our testing. Hi Joe, this change indeed reduces test coverage, but not drastically (one can even argue that it doesn't reduce coverage, as generally speaking you can merge coverage from runs with different configurations, e.g. runs on different platforms). Due to that and the reasons I wrote in my response to Roger, I don't think the reduced coverage is an issue here, nor do I think that the possible impact on coverage outweighs the benefits we get from comparable runs.
------------- PR: https://git.openjdk.java.net/jdk/pull/391 From rriggs at openjdk.java.net Fri Oct 2 18:49:37 2020 From: rriggs at openjdk.java.net (Roger Riggs) Date: Fri, 2 Oct 2020 18:49:37 GMT Subject: RFR: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR In-Reply-To: References: Message-ID: On Tue, 29 Sep 2020 00:06:02 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review the patch which updates `jdk.test.lib.Utils` to use md5 hash-sum of `java.vm.version` property > as default seed for `Utils.RANDOM_GENERATOR`? > from JBS: >> using the same seed for all runs of a build will make it possible (easier) to compare results from different test runs >> (e.g. on different platforms, w/ different flags) and consequently will make test results analysis easier. > > the patch also updates `RandomGeneratorTest` test, so it expects now that the same values are generated if no seed is > provided. > testing: ? tier1 ok, if I read the code closely and know how the promoted build process works then I see your rationale. Please update the bug report and edit the PR description to describe the conditions under which the seed for random is computed from the build number. It might be clearer to refer to the Runtime.Version build information in the description. Also make it clear that unless the system property is set, it will use a 'random' seed. ------------- Changes requested by rriggs (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/391 From luhenry at microsoft.com Fri Oct 2 19:49:39 2020 From: luhenry at microsoft.com (Ludovic Henry) Date: Fri, 2 Oct 2020 19:49:39 +0000 Subject: [jdk11u] 8253947: Implementation: JEP 388: Windows AArch64 Support Message-ID: Hi, As we are getting closer to merging JEP 388 (Windows/AArch64 Port), I wanted to share the work I've done for the backport to JDK 11. This email is not an RFR as the webrev is not finalized yet (the JEP implementation hasn't landed on JDK tip).
I'm sharing it just to give an idea of the work which will be required, what kind of changes we are facing, and for other parties who have expressed interest in this backport. I'll keep updating it as we get reviews on [1]. Once we've landed the JEP implementation into JDK tip, I'll create the appropriate bug IDs for all the necessary backports, and contact the relevant mailing lists with the different webrevs and JBS. I've tested this change on Windows-AArch64 against hotspot:tier1, jdk:tier1 and langtools, but not yet on Linux-AArch64. Webrev: http://cr.openjdk.java.net/~luhenry/8253947/webrev.00/ Thank you, Ludovic [1] https://github.com/openjdk/jdk/pull/212 From iignatyev at openjdk.java.net Fri Oct 2 20:46:48 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 2 Oct 2020 20:46:48 GMT Subject: RFR: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR [v2] In-Reply-To: References: Message-ID: > Hi all, > > could you please review the patch which updates `jdk.test.lib.Utils` to use md5 hash-sum of `java.vm.version` property > as default seed for `Utils.RANDOM_GENERATOR`? > from JBS: >> using the same seed for all runs of a build will make it possible (easier) to compare results from different test runs >> (e.g. on different platforms, w/ different flags) and consequently will make test results analysis easier. the proposed >> solution is to use the seed based on Runtime.version() / "java.vm.version", which are different from build to build, if >> there is no seed specified by "jdk.test.lib.random.seed" property. > > the patch also updates `RandomGeneratorTest` test, so it expects now that the same values are generated if no seed is > provided. > testing: ? 
tier1 Igor Ignatyev has updated the pull request incrementally with one additional commit since the last revision: used Runtime.version() instead of ${java.vm.version} updated javadoc ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/391/files - new: https://git.openjdk.java.net/jdk/pull/391/files/b5437f75..381e14f7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=391&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=391&range=00-01 Stats: 10 lines in 1 file changed: 4 ins; 0 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/391.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/391/head:pull/391 PR: https://git.openjdk.java.net/jdk/pull/391 From iignatyev at openjdk.java.net Fri Oct 2 20:50:38 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 2 Oct 2020 20:50:38 GMT Subject: RFR: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR [v2] In-Reply-To: References: Message-ID: On Fri, 2 Oct 2020 18:46:39 GMT, Roger Riggs wrote: >> Igor Ignatyev has updated the pull request incrementally with one additional commit since the last revision: >> >> used Runtime.version() instead of ${java.vm.version} >> updated javadoc > > ok, if I read the code closely and know how the promoted build process works then I see your rationale. > Please update the bug report and edit the PR description to describe the conditions under which the seed > for random is computed from the build number. It might be clearer to refer to the Runtime.Version build information > in the description. Also make it clear that unless the system property is set, it will use a 'random' seed. @RogerRiggs , I have updated the code to use `Runtime::version` and updated the docs to better reflect how the `seed` value is set, and I have added some explanation to both the JBS issue and the PR. Please let me know if it's still not clear enough.
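[Editorial sketch] The seed policy discussed in this thread (use the value of the `jdk.test.lib.random.seed` property when it is set, otherwise derive a seed from an MD5 hash of the build's version string) could look roughly like the Java below. This is a minimal illustration under stated assumptions, not the actual `jdk.test.lib.Utils` code: the class name `SeedSketch`, the method names `seedFromVersion` and `chooseSeed`, and the choice of the first 8 digest bytes as the seed are all hypothetical. Only the property name and the MD5-of-version idea come from the messages above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Random;

public class SeedSketch {
    // Property name as quoted in the thread; the rest of this class is illustrative.
    public static final String SEED_PROPERTY = "jdk.test.lib.random.seed";

    // Derive a 64-bit seed from a version string: MD5 the UTF-8 bytes and
    // read the first 8 of the 16 digest bytes as a long (an assumption of this sketch).
    public static long seedFromVersion(String version) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(version.getBytes(StandardCharsets.UTF_8));
            return ByteBuffer.wrap(digest).getLong();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // An explicit override wins; otherwise fall back to the build-stable seed.
    public static long chooseSeed(String override, String version) {
        if (override != null) {
            return Long.parseLong(override);
        }
        return seedFromVersion(version);
    }

    public static void main(String[] args) {
        long seed = chooseSeed(System.getProperty(SEED_PROPERTY),
                               Runtime.version().toString());
        // Print the seed so that a failing run can be reproduced.
        System.out.println("seed = " + seed);
        Random random = new Random(seed);
        System.out.println("first value = " + random.nextInt());
    }
}
```

With a scheme like this, two test tasks run against the same build print the same seed, while builds whose version strings differ (for example `16-ea+19-938` and `16-ea+19-939`) get different ones; passing `-Djdk.test.lib.random.seed=<value>` replays a particular run.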
------------- PR: https://git.openjdk.java.net/jdk/pull/391 From rriggs at openjdk.java.net Fri Oct 2 23:13:39 2020 From: rriggs at openjdk.java.net (Roger Riggs) Date: Fri, 2 Oct 2020 23:13:39 GMT Subject: RFR: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR [v2] In-Reply-To: References: Message-ID: On Fri, 2 Oct 2020 20:46:48 GMT, Igor Ignatyev wrote: >> Hi all, >> >> could you please review the patch which updates `jdk.test.lib.Utils` to use md5 hash-sum of `java.vm.version` property >> as default seed for `Utils.RANDOM_GENERATOR`? >> from JBS: >>> using the same seed for all runs of a build will make it possible (easier) to compare results from different test runs >>> (e.g. on different platforms, w/ different flags) and consequently will make test results analysis easier. the proposed >>> solution is to use the seed based on Runtime.version() / "java.vm.version", which are different from build to build, if >>> there is no seed specified by "jdk.test.lib.random.seed" property. >> >> the patch also updates `RandomGeneratorTest` test, so it expects now that the same values are generated if no seed is >> provided. >> testing: ? tier1 > > Igor Ignatyev has updated the pull request incrementally with one additional commit since the last revision: > > used Runtime.version() instead of ${java.vm.version} > updated javadoc The code and description are reversed from what I suggested. The seed should *only* be computed from the version information if it is a promoted or weekly build. All other builds should use a random seed. ------------- Changes requested by rriggs (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/391 From luhenry at microsoft.com Sat Oct 3 19:21:28 2020 From: luhenry at microsoft.com (Ludovic Henry) Date: Sat, 3 Oct 2020 19:21:28 +0000 Subject: [aarch64-port-dev ] [jdk11u] 8253947: Implementation: JEP 388: Windows AArch64 Support In-Reply-To: <99a84e5c-0838-7281-8eed-f6bf7c6342f1@redhat.com> References: <99a84e5c-0838-7281-8eed-f6bf7c6342f1@redhat.com> Message-ID: Hi Andrew, > I warn you now that you may never get approval to backport this stuff to mainline jdk11u because it's too disruptive. That is in line with what we discussed previously. > That needn't worry you: we've never backported AArch64 to jdk8u, but that doesn't matter to anyone as long as it all works. Microsoft does intend to do the backport in the Microsoft distribution of OpenJDK, similar to other features in other distributions. But instead of keeping everything internal, we would much rather have this patch open to the community. Even if the patch doesn't get integrated, what would be the best place to keep it in the open? Could we create a repo in the aarch64-port project [1], like a jdk11-windows repo for example? The same problem will arise for macOS-AArch64. Thank you, Ludovic [1] https://hg.openjdk.java.net/aarch64-port From vkempik at azul.com Sat Oct 3 19:57:32 2020 From: vkempik at azul.com (Vladimir Kempik) Date: Sat, 3 Oct 2020 19:57:32 +0000 Subject: [aarch64-port-dev ] [jdk11u] 8253947: Implementation: JEP 388: Windows AArch64 Support In-Reply-To: References: <99a84e5c-0838-7281-8eed-f6bf7c6342f1@redhat.com> Message-ID: <957d85208f5d4bcab0d83d3f65ee4995@azul.com> Hello Andrew. >I warn you now that you may never get approval to backport this stuff to mainline jdk11u because it's too disruptive. Shenandoah? Could you also please advise us about jep-391? Would Red Hat like to see macos-aarch64 support in openjdk11, or is it better to leave users with the Rosetta translator?
If jep-391 is welcomed into 11u, then it will need some parts of the windows-aarch64 port (x18 exclusion). Kind regards, Vladimir. On 3 October 2020 at 22:21:51, Ludovic Henry wrote: Hi Andrew, I warn you now that you may never get approval to backport this stuff to mainline jdk11u because it's too disruptive. That is in line with what we discussed previously. That needn't worry you: we've never backported AArch64 to jdk8u, but that doesn't matter to anyone as long as it all works. Microsoft does intend to do the backport in the Microsoft distribution of OpenJDK, similar to other features in other distributions. But instead of keeping everything internal, we would much rather have this patch open to the community. Even if the patch doesn't get integrated, what would be the best place to keep it in the open? Could we create a repo in the aarch64-port project [1], like a jdk11-windows repo for example? The same problem will arise for macOS-AArch64. Thank you, Ludovic [1] https://hg.openjdk.java.net/aarch64-port From luhenry at microsoft.com Sun Oct 4 15:11:48 2020 From: luhenry at microsoft.com (Ludovic Henry) Date: Sun, 4 Oct 2020 15:11:48 +0000 Subject: [aarch64-port-dev ] [jdk11u] 8253947: Implementation: JEP 388: Windows AArch64 Support In-Reply-To: <957d85208f5d4bcab0d83d3f65ee4995@azul.com> References: <99a84e5c-0838-7281-8eed-f6bf7c6342f1@redhat.com> <957d85208f5d4bcab0d83d3f65ee4995@azul.com> Message-ID: Hi Vladimir, For macOS-AArch64, I'd be happy to work with you to create a common webrev for both Windows-AArch64 and macOS-AArch64 support for JDK 11. And if we go down the road of a `jdk11-windows` repo in the aarch64-port project, we should extend the idea to a `jdk11u` repo that would host both Windows-AArch64 and macOS-AArch64 patches.
Thank you, Ludovic From: Vladimir Kempik Sent: Saturday, October 3, 2020 12:58 PM To: Ludovic Henry ; aarch64-port-dev at openjdk.java.net; aph at redhat.com; hotspot-dev at openjdk.java.net Cc: openjdk-aarch64 Subject: RE: [aarch64-port-dev ] [jdk11u] 8253947: Implementation: JEP 388: Windows AArch64 Support Hello Andrew. >I warn you now that you may never get approval to backport this stuff to mainline jdk11u because it's too disruptive. Shenandoah? Could you also please advise us about jep-391? Would Red Hat like to see macos-aarch64 support in openjdk11, or is it better to leave users with the Rosetta translator? If jep-391 is welcomed into 11u, then it will need some parts of the windows-aarch64 port (x18 exclusion). Kind regards, Vladimir. On 3 October 2020 at 22:21:51, Ludovic Henry wrote: Hi Andrew, I warn you now that you may never get approval to backport this stuff to mainline jdk11u because it's too disruptive. That is in line with what we discussed previously. That needn't worry you: we've never backported AArch64 to jdk8u, but that doesn't matter to anyone as long as it all works. Microsoft does intend to do the backport in the Microsoft distribution of OpenJDK, similar to other features in other distributions. But instead of keeping everything internal, we would much rather have this patch open to the community. Even if the patch doesn't get integrated, what would be the best place to keep it in the open? Could we create a repo in the aarch64-port project [1], like a jdk11-windows repo for example? The same problem will arise for macOS-AArch64.
Thank you, Ludovic [1] https://hg.openjdk.java.net/aarch64-port From dholmes at openjdk.java.net Mon Oct 5 03:24:44 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 5 Oct 2020 03:24:44 GMT Subject: RFR: 8248238: Implementation: JEP 388: Windows AArch64 Support [v12] In-Reply-To: References: Message-ID: On Fri, 2 Oct 2020 17:16:50 GMT, Monica Beckwith wrote: >> Marked as reviewed by dholmes (Reviewer). > >> The JEP is not yet targeted so we have to wait for that formality. But >> once that happens I can sponsor for you. > > Thanks. >> >> Also note that the PR references the wrong JEP so can you please edit >> the description to fix that. > > Added JEP # (388) here and updated the JBS entry. > After looking at JEPs 386, 377, and 379, I also did the following: > - listed JDK-8248238 as a sub-task for JDK-8248496 > - added this PR link in a comment for the JEP. > > As soon as the JEP is targetted, I will update the "Fix version" for the 'Implementation' (JDK-8248238) and ping you > @dholmes-ora . >> >> Meanwhile I'll see if I can take this for a spin through our internal >> testing. > Thanks so much. > > Regards, > Monica > >> >> Cheers, >> David @mo-beck The initial comment still has this incorrect link: [2] https://openjdk.java.net/jeps/8251280 Please edit the comment and fix the link. ------------- PR: https://git.openjdk.java.net/jdk/pull/212 From mbeckwit at openjdk.java.net Mon Oct 5 03:24:45 2020 From: mbeckwit at openjdk.java.net (Monica Beckwith) Date: Mon, 5 Oct 2020 03:24:45 GMT Subject: Integrated: 8248238: Implementation: JEP 388: Windows AArch64 Support In-Reply-To: References: Message-ID: On Wed, 16 Sep 2020 20:26:10 GMT, Monica Beckwith wrote: > This is a continuation of https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-August/009566.html > > Changes since then: > * We've improved the write barrier as suggested by Andrew [1] > * The define-guards around R18 have been changed to `R18_RESERVED`. 
This will be enabled for Windows only for now but > will be required for the upcoming macOS+Aarch64 [2] port as well. > * We've incorporated https://github.com/openjdk/jdk/pull/154 by @AntonKozlov in our PR for now and built the > Windows-specific CPU feature detection on top of it. > > [1] https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-August/009597.html > [2] https://openjdk.java.net/jeps/8251280 This pull request has now been integrated. Changeset: 9604ee82 Author: Monica Beckwith Committer: David Holmes URL: https://git.openjdk.java.net/jdk/commit/9604ee82 Stats: 2566 lines in 62 files changed: 2208 ins; 126 del; 232 mod 8248238: Implementation: JEP 388: Windows AArch64 Support Co-authored-by: Monica Beckwith Co-authored-by: Ludovic Henry Co-authored-by: Bernhard Urban-Forster Reviewed-by: dholmes, cjplummer, aph, ihse ------------- PR: https://git.openjdk.java.net/jdk/pull/212 From david.holmes at oracle.com Mon Oct 5 04:00:40 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 5 Oct 2020 14:00:40 +1000 Subject: RFR: 8253899: Make IsClassUnloadingEnabled signature match specification In-Reply-To: References: <5UPvhVjwK3TOKrd2rSzeCz-xT0tXsCUkMT9R8tgFR8I=.f1723d22-6883-4ad3-af55-a7437ef905de@github.com> Message-ID: <3ac4f938-71ec-ddf7-ed27-7fafe8a0c204@oracle.com> Hi Vladimir, On 3/10/2020 1:29 am, Vladimir Kempik wrote: > On Fri, 2 Oct 2020 07:34:45 GMT, Vladimir Kempik wrote: >>> If you still see no issues here we can delay and make this changeset part of JEP-391. >>> But since this changeset isn't much macos-aarch64 specific, I thought it would be good to integrate it separately from >>> jep-391. >> >> ... this change actually goes against the example in the spec, so if you >> make this change it indicates the spec needs to be updated too. 
>> >> Cheers, >> David >> ----- > > Hello David > > I really believe the problem is in document here ( in examples) > first, the doc clearly specify the type > > typedef jvmtiError (JNICALL *jvmtiExtensionFunction) > (jvmtiEnv* jvmti_env, > ...); > > then in examples it declares the function not matching this spec. > > Is it a good idea to update the docs in a separate bug ? I don't think it really matters one way or the other as long as any new bug is promptly fixed. At the moment the code matches the example in the spec but they are both "wrong". I'd like to see them both "right" asap. Thanks, David > Thanks, Vladimir > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/466 > From mbeckwit at openjdk.java.net Mon Oct 5 04:11:47 2020 From: mbeckwit at openjdk.java.net (Monica Beckwith) Date: Mon, 5 Oct 2020 04:11:47 GMT Subject: RFR: 8248238: Implementation: JEP 388: Windows AArch64 Support [v12] In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 03:20:43 GMT, David Holmes wrote: > @mo-beck The initial comment still has this incorrect link: > > [2] https://openjdk.java.net/jeps/8251280 > > Please edit the comment and fix the link. That was a link to the macOS + Arm64 port. But I have removed it as it wasn't needed in the description of this implementation. ------------- PR: https://git.openjdk.java.net/jdk/pull/212 From akozlov at azul.com Mon Oct 5 08:07:17 2020 From: akozlov at azul.com (Anton Kozlov) Date: Mon, 5 Oct 2020 11:07:17 +0300 Subject: [aarch64-port-dev ] [jdk11u] 8253947: Implementation: JEP 388: Windows AArch64 Support In-Reply-To: References: Message-ID: Hi, Adding jdk-updates, where jdk11u is developed. In general, it looks it clearly falls into the category of changes worth backporting[1] (although written for jdk8u) > New features should not generally be back-ported to 8u, except where it is necessary to adapt OpenJDK to new computing environments. For example, [...] ports to new hardware or operating systems [might qualify]. 
These are necessary for JDK 8u to remain relevant. Process-wise, you don't need to create backports; this is done automatically by a push hook on the HG repo. You need to commit changes under the same bug IDs that were used for integration into the mainline. [2] IMHO, to simplify reviewing (I'm not a reviewer, just expressing what I would find beneficial), it may be useful to split the patch into two. One patch would hold the changes that carry risk for existing jdk11u functionality, including support of other operating systems. The other patch would hold the new platform support only, such as a new file or an extra #ifdef that compiles to nothing on the rest of the platforms. The review of the second patch could then be reduced to ensuring that it actually does no harm. Thanks, Anton [1] https://mail.openjdk.java.net/pipermail/jdk8u-dev/2020-June/012002.html [2] https://wiki.openjdk.java.net/display/JDKUpdates/How+to+contribute+a+fix On 02.10.2020 22:49, Ludovic Henry wrote: > Hi, > > As we are getting closer to merge JEP 388 (Windows/AArch64 Port), I wanted to share the work I've done for the backport to JDK 11. > > This email is not an RFR as the webrev is not finalized yet (the JEP implementation hasn't landed on JDK tip). I'm sharing it just to give an idea of the work which will be required, what kind of changes we are facing, and for other parties who have expressed interest in this backport. I'll keep updating it as we get reviews on [1]. > > Once we've landed the JEP implementation into JDK tip, I'll create the appropriate bug IDs for all the necessary backports, and contact the relevant mailing lists with the different webrevs and JBS. > > I've tested this change on Windows-AArch64 against hotspot:tier1, jdk:tier1 and langtools, but not yet on Linux-AArch64.
> > Webrev: http://cr.openjdk.java.net/~luhenry/8253947/webrev.00/ > > Thank you, > Ludovic > > [1] https://github.com/openjdk/jdk/pull/212 > From akozlov at azul.com Mon Oct 5 08:44:06 2020 From: akozlov at azul.com (Anton Kozlov) Date: Mon, 5 Oct 2020 11:44:06 +0300 Subject: [aarch64-port-dev ] [jdk11u] 8253947: Implementation: JEP 388: Windows AArch64 Support In-Reply-To: References: <99a84e5c-0838-7281-8eed-f6bf7c6342f1@redhat.com> <957d85208f5d4bcab0d83d3f65ee4995@azul.com> Message-ID: <09bd3b72-43e9-e543-8143-53dfd9e367a1@azul.com> Hi, I think jdk-updates-dev is the right place to discuss this, since all the stakeholders are there. I took the liberty of replying to your original message with jdk-updates included, as this thread has become quite messy. IMHO, we should not mix windows and macos in one patch. It would be harder to review and understand, and those are the factors that cause resistance to new features in update projects. There is always a risk of breaking something that works and finding out relatively late, because any update project fragments the OpenJDK codebase as a whole. I also hope to see the windows port integrated into jdk11u, to avoid even more fragmentation between jdk11 and e.g. jdk11-aarch64. Thanks, Anton On 04.10.2020 18:11, Ludovic Henry wrote: > Hi Vladimir, > > For macOS-AArch64, I'd be happy to work with you to create a common webrev for both Windows-AArch64 and macOS-AArch64 support for JDK 11. And if we go the route of a `jdk11-windows` repo in the aarch64-port project, we should extend the idea to a `jdk11u` repo that would host both Windows-AArch64 and macOS-AArch64 patches. > > Thank you, > Ludovic > > From: Vladimir Kempik > Sent: Saturday, October 3, 2020 12:58 PM > To: Ludovic Henry; aarch64-port-dev at openjdk.java.net; aph at redhat.com; hotspot-dev at openjdk.java.net > Cc: openjdk-aarch64 > Subject: RE: [aarch64-port-dev ] [jdk11u] 8253947: Implementation: JEP 388: Windows AArch64 Support > > Hello Andrew.
> >> I warn you now that you may never get approval to backport this stuff to mainline jdk11u because it's too disruptive. > > Shenandoah? > > Could you also please advise us about jep-391? > Would Red Hat like to see macos-aarch64 support in openjdk11, or is it better to leave users with the Rosetta translator? > > If jep-391 is welcomed into 11u, then it will need some parts of the windows-aarch64 port (x18 exclusion). > > Kind regards, Vladimir. > > On 3 October 2020 at 22:21:51, Ludovic Henry wrote: > > Hi Andrew, > > I warn you now that you may never get approval to backport this stuff to mainline jdk11u because it's too disruptive. > > That is in line with what we discussed previously. > > That needn't worry you: we've never backported AArch64 to jdk8u, but that doesn't matter to anyone as long as it all works. > > Microsoft do intend to do the backport on the Microsoft distribution of the OpenJDK, similarly to other features in other distributions. But instead of keeping everything internally, we would much rather have this patch open to the community. > > Even if the patch doesn't get integrated, what would be the best place to keep it in the open? Could we create a repo in the aarch64-port project [1], like a jdk11-windows repo for example? The same problem will arise for macOS-AArch64. > > Thank you, > Ludovic > > [1] https://hg.openjdk.java.net/aarch64-port > From thartmann at openjdk.java.net Mon Oct 5 10:09:48 2020 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Mon, 5 Oct 2020 10:09:48 GMT Subject: RFR: 8254010: GrowableArrayView::print fails to compile Message-ID: When adding some debugging code, I noticed that the (currently unused) GrowableArrayView::print fails to compile. The fix is to use the `%d` format specifier for the int fields `_len` and `_max` and cast `this` to `intptr_t`.
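For illustration only, and not the actual patch: a minimal standalone C++ sketch of the kind of format-specifier fix described above. `ToyArrayView` and `describe` are hypothetical names, and the sketch assumes plain `snprintf` with the standard `PRIxPTR` macro instead of HotSpot's own output stream and format macros.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for an array view; this is NOT HotSpot's
// GrowableArrayView, only a toy with the same two int fields.
struct ToyArrayView {
  int _len;  // number of elements in use
  int _max;  // allocated capacity

  // Builds a one-line description. The int fields are formatted with %d,
  // and the object address is converted to an integer type before it is
  // handed to the printf-style formatter.
  std::string describe() const {
    const intptr_t self = reinterpret_cast<intptr_t>(this);
    char buf[96];
    // PRIxPTR is specified for uintptr_t, so convert once more for printing.
    std::snprintf(buf, sizeof(buf),
                  "Toy Array 0x%" PRIxPTR ": length %d (_max %d)",
                  static_cast<uintptr_t>(self), _len, _max);
    return std::string(buf);
  }
};
```

A caller could write `ToyArrayView v{3, 8};` and print `v.describe()`. The point of the sketch is only that every argument's type matches its conversion specifier, which is what format checking in the compiler enforces.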
Thanks, Tobias ------------- Commit messages: - 8254010: GrowableArrayView::print fails to compile Changes: https://git.openjdk.java.net/jdk/pull/502/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=502&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254010 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/502.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/502/head:pull/502 PR: https://git.openjdk.java.net/jdk/pull/502 From bulasevich at openjdk.java.net Mon Oct 5 10:43:42 2020 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Mon, 5 Oct 2020 10:43:42 GMT Subject: RFR: 8253901: ARM32 build crashes after JDK-8253540 Message-ID: [JDK-8253540](https://bugs.openjdk.java.net/browse/JDK-8253540) changed InterpreterRuntime::monitorexit call from call_VM to call_VM_leaf. This requires additional arrangement for ARM32: the parameter must be in R0. ------------- Commit messages: - 8253901: ARM32 build crashes after JDK-8253540 Changes: https://git.openjdk.java.net/jdk/pull/503/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=503&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253901 Stats: 9 lines in 3 files changed: 2 ins; 0 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/503.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/503/head:pull/503 PR: https://git.openjdk.java.net/jdk/pull/503 From aph at redhat.com Mon Oct 5 10:43:42 2020 From: aph at redhat.com (Andrew Haley) Date: Mon, 5 Oct 2020 11:43:42 +0100 Subject: [aarch64-port-dev ] [jdk11u] 8253947: Implementation: JEP 388: Windows AArch64 Support In-Reply-To: <957d85208f5d4bcab0d83d3f65ee4995@azul.com> References: <99a84e5c-0838-7281-8eed-f6bf7c6342f1@redhat.com> <957d85208f5d4bcab0d83d3f65ee4995@azul.com> Message-ID: <88ffa3cc-2a72-2211-f69b-9fd281d66b75@redhat.com> On 03/10/2020 20:57, Vladimir Kempik wrote: > >>I warn you now that you may never get approval to backport this >>stuff to 
mainline jdk11u because it's too disruptive. > > Shenandoah? Shenandoah doesn't give blanket permission for any and every backport to be done. There are differences between Shenandoah and an AArch64/Windows backport. The first three I think of are: 1. Over time, a release branch becomes more and more stable, and fewer and fewer backports are accepted. JDK 11u is getting well into middle age. 2. Shenandoah had been maintained in a fairly stable out-of-mainline repo for a very long time. It was not in any way new when it was merged. 3. The Shenandoah backport into 11u provably had no effect unless it was enabled. > Could you also please advise us about jep-391? Same as above. > Would Red Hat like to see macos-aarch64 support in openjdk11, or > is it better to leave users with the Rosetta translator? I'm not sure what Red Hat's opinion might be, or even how I'd find out. However, I do know that stability is the first priority for the JDK 11u project, which is why we went to such extraordinary lengths to ensure we didn't break anything with the Shenandoah backport. In addition, there will be ways for users to get JDK 11 / MacOS / AArch64 even if it's not in the main 11u tree. > If jep-391 is welcomed into 11u, then it will need some parts of > the windows-aarch64 port (x18 exclusion). I see. Note that I am *not* ruling out AArch64 support in JDK 11. If it can be done cleanly and safely that will be great. However, to begin with, it will not go into OpenJDK 11u. The stability of the main release branch of OpenJDK is far too important for it to be broken by an untested new port. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd.
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From stefank at openjdk.java.net Mon Oct 5 10:55:39 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 5 Oct 2020 10:55:39 GMT Subject: RFR: 8254010: GrowableArrayView::print fails to compile In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 10:04:17 GMT, Tobias Hartmann wrote: > When adding some debugging code, I've noticed that the (currently unused) GrowableArrayView::print fails to compile. > The fix is to use the `%d` format specifier for the int fields `_len` and `_max` and cast `this` to `intptr_t`. > Thanks, > Tobias Marked as reviewed by stefank (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/502 From mdoerr at openjdk.java.net Mon Oct 5 11:00:38 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Mon, 5 Oct 2020 11:00:38 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v2] In-Reply-To: References: Message-ID: On Fri, 2 Oct 2020 09:00:54 GMT, Robbin Ehn wrote: >> The issue is that this test doesn't consider Handshake All operation. >> Depending if/when such operation is scheduled it can lockup the VM thread. >> And the safepoint that should timeout never happens. >> See issue for more information. >> >> So I changed the test to "try timeout" the safepoint, but if there was no safepoint (blocked by a handshake all), we >> retry. We sleep unsafe much longer than the interval SafepointALot generates operations, which 'guarantees' we will >> timeout if there is no handshake all. (some extreme case of kernel scheduling causing a very long context switch could >> also make us not timeout) Passes t1, t3, and repeat runs of the test. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Update with input from reviews Thanks for reimplementing it to resolve problems with handshake all operations. 
test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 97: Can we check for another frame like e.g. WB_WaitUnsafe? AbortVMOnSafepointTimeout is designed to provide a stack trace of the thread which is blocking the safepoint. ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From eosterlund at openjdk.java.net Mon Oct 5 11:43:52 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Mon, 5 Oct 2020 11:43:52 GMT Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v10] In-Reply-To: References: Message-ID: > This PR is the implementation of "JEP 376: ZGC: Concurrent Thread-Stack Processing" (cf. > https://openjdk.java.net/jeps/376). > Basically, this patch modifies the epilog safepoint when returning from a frame (supporting interpreter frames, c1, c2, > and native wrapper frames), to compare the stack pointer against a thread-local value. This turns return polls into > more of a swiss army knife that can be used to poll for safepoints, handshakes, but also returns into not yet safe to > expose frames, denoted by a "stack watermark". ZGC will leave frames (and other thread oops) in a state of a mess in > the GC checkpoint safepoints, rather than processing all threads and their stacks. Processing is initialized > automagically when threads wake up for a safepoint, or get poked by a handshake or safepoint. Said initialization > processes a few (3) frames and other thread oops. The rest - the bulk of the frame processing, is deferred until it is > actually needed. It is needed when a frame is exposed to either 1) execution (returns or unwinding due to exception > handling), or 2) stack walker APIs. A hook is then run to go and finish the lazy processing of frames. Mutator and GC > threads can compete for processing. The processing is therefore performed under a per-thread lock.
Note that disarming > of the poll word (that the returns are comparing against) is only performed by the thread itself. So sliding the > watermark up will require one runtime call for a thread to note that nothing needs to be done, and then update the poll > word accordingly. Downgrading the poll word concurrently by other threads was simply not worth the complexity it > brought (and is only possible on TSO machines). So left that one out. Erik Österlund has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits: - Review: Deal with new assert from mainline - Merge branch 'master' into 8253180_conc_stack_scanning - Review: StackWalker hook - Review: Kim CR 1 and exception handling fix - Review: Move barrier detach - Review: Remove assert that has outstayed its welcome - Merge branch 'master' into 8253180_conc_stack_scanning - Review: Albert CR2 and defensive programming - Review: StefanK CR 3 - Review: Per CR 1 - ... and 6 more: https://git.openjdk.java.net/jdk/compare/9604ee82...e633cb94 ------------- Changes: https://git.openjdk.java.net/jdk/pull/296/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=296&range=09 Stats: 2740 lines in 131 files changed: 2167 ins; 311 del; 262 mod Patch: https://git.openjdk.java.net/jdk/pull/296.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/296/head:pull/296 PR: https://git.openjdk.java.net/jdk/pull/296 From rehn at openjdk.java.net Mon Oct 5 12:38:41 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 5 Oct 2020 12:38:41 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v2] In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 10:55:08 GMT, Martin Doerr wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Update with input from reviews > > test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 97: > > > Can we check for
another frame like e.g. WB_WaitUnsafe? AbortVMOnSafepointTimeout is designed to provide a stack trace > of the thread which is blocking the safepoint. Since top frame is always different (platform dependent, e.g. clock_nanosleep) I removed that. But "sun.hotspot.WhiteBox.waitUnsafe" or "TestAbortVMOnSafepointTimeout$Test.main" should be below, we could use something like that. So sure I'll add it back with "TestAbortVMOnSafepointTimeout$Test.main". ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From vkempik at openjdk.java.net Mon Oct 5 12:47:02 2020 From: vkempik at openjdk.java.net (Vladimir Kempik) Date: Mon, 5 Oct 2020 12:47:02 GMT Subject: RFR: 8253899: Make IsClassUnloadingEnabled signature match specification In-Reply-To: References: <5UPvhVjwK3TOKrd2rSzeCz-xT0tXsCUkMT9R8tgFR8I=.f1723d22-6883-4ad3-af55-a7437ef905de@github.com> Message-ID: On Fri, 2 Oct 2020 15:26:30 GMT, Vladimir Kempik wrote: >>> Okay but look at the example that documentation gives: >>> >>> > For example, if the jvmtiParamInfo returned by GetExtensionEvents indicates that there is a jint parameter, the event >>> > handler should be declared: ``` >>> > void JNICALL myHandler(jvmtiEnv* jvmti_env, jint myInt, ...) >>> > ``` >>> >>> The myInt is explicit, just as our "jboolean* enabled" is explicit. I think they key point is that the signature must >>> end with "..." which it does. >>> I don't see anything here that needs to be fixed. >> >> Hello David. On majority of platforms this would be fine. >> >> But on some platforms, variadic arguments and non variadic arguments are passed differently ( for example on >> macos-aarch64, variadic args are passed always on stack, non variadic on registers (and on stack for 9th+ arg) , that >> causes issues. If you still see no issues here we can delay and make this changeset part of JEP-391. >> But since this changeset isn't much macos-aarch64 specific, I thought it would be good to integrate it separately from >> jep-391. 
>> Regards, Vladimir > >> _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on >> [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ >> Hi Vladimir, >> >> On 2/10/2020 5:37 pm, Vladimir Kempik wrote: >> >> > On Fri, 2 Oct 2020 07:27:17 GMT, David Holmes wrote: >> > > Okay but look at the example that documentation gives: >> > > > For example, if the jvmtiParamInfo returned by GetExtensionEvents indicates that there is a jint parameter, the event >> > > > handler should be declared: ``` >> > > > void JNICALL myHandler(jvmtiEnv* jvmti_env, jint myInt, ...) >> > > > ``` >> > > >> > > >> > > The myInt is explicit, just as our "jboolean* enabled" is explicit. I think they key point is that the signature must >> > > end with "..." which it does. >> > > I don't see anything here that needs to be fixed. >> > >> > >> > Hello David. On majority of platforms this would be fine. >> > But on some platforms, variadic arguments and non variadic arguments are passed differently ( for example on >> > macos-aarch64, variadic args are passed always on stack, non variadic on registers (and on stack for 9th+ arg) , that >> > causes issues. >> >> Okay - I see the potential for a problme here but ... >> >> > If you still see no issues here we can delay and make this changeset part of JEP-391. >> > But since this changeset isn't much macos-aarch64 specific, I thought it would be good to integrate it separately from >> > jep-391. >> >> ... this change actually goes against the example in the spec, so if you >> make this change it indicates the spec needs to be updated too. >> >> Cheers, >> David >> ----- > > Hello David > > I really believe the problem is in document here ( in examples) > first, the doc clearly specify the type > > typedef jvmtiError (JNICALL *jvmtiExtensionFunction) > (jvmtiEnv* jvmti_env, > ...); > > then in examples it declares the function not matching this spec. > > Is it a good idea to update the docs in a separate bug ? 
> > Thanks, Vladimir Hello David I have created CSR draft https://bugs.openjdk.java.net/browse/JDK-8254014 Regards, Vladimir ------------- PR: https://git.openjdk.java.net/jdk/pull/466 From rehn at openjdk.java.net Mon Oct 5 12:55:42 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 5 Oct 2020 12:55:42 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v2] In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 12:33:56 GMT, Robbin Ehn wrote: >> test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 97: >> >> >> Can we check for another frame like e.g. WB_WaitUnsafe? AbortVMOnSafepointTimeout is designed to provide a stack trace >> of the thread which is blocking the safepoint. > > Since top frame is always different (platform dependent, e.g. clock_nanosleep) I removed that. > And only top frame is in output. The other frames are only in hs_err. > > So I couldn't keep it. > > You have some idea how to? I update comment before mail was sent, still the old comment was sent, please see edited comment :) ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From pchilanomate at openjdk.java.net Mon Oct 5 14:03:44 2020 From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo) Date: Mon, 5 Oct 2020 14:03:44 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v2] In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 10:58:01 GMT, Martin Doerr wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Update with input from reviews > > Thanks for reimplementing it to resolve problems with handshake all operations. > > LGTM > > Thanks! There is an update, please consider. Still looks good. 
------------- PR: https://git.openjdk.java.net/jdk/pull/465 From mdoerr at openjdk.java.net Mon Oct 5 14:47:43 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Mon, 5 Oct 2020 14:47:43 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v2] In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 12:52:41 GMT, Robbin Ehn wrote: >> Since top frame is always different (platform dependent, e.g. clock_nanosleep) I removed that. >> And only top frame is in output. The other frames are only in hs_err. >> >> So I couldn't keep it. >> >> You have some idea how to? > > I update comment before mail was sent, still the old comment was sent, please see edited comment :) You're right, OutputAnalyzer can only see the top frame, which is platform dependent. I think scanning the hs_err file would be too complicated. So I'm ok with omitting the check. ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From mdoerr at openjdk.java.net Mon Oct 5 14:47:43 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Mon, 5 Oct 2020 14:47:43 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v2] In-Reply-To: References: Message-ID: On Fri, 2 Oct 2020 09:00:54 GMT, Robbin Ehn wrote: >> The issue is that this test doesn't consider Handshake All operation. >> Depending if/when such operation is scheduled it can lockup the VM thread. >> And the safepoint that should timeout never happens. >> See issue for more information. >> >> So I changed the test to "try timeout" the safepoint, but if there was no safepoint (blocked by a handshake all), we >> retry. We sleep unsafe much longer than the interval SafepointALot generates operations, which 'guarantees' we will >> timeout if there is no handshake all. (some extreme case of kernel scheduling causing a very long context switch could >> also make us not timeout) Passes t1, t3, and repeat runs of the test.
> > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Update with input from reviews Looks good to me. Thanks. ------------- Marked as reviewed by mdoerr (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/465 From dcubed at openjdk.java.net Mon Oct 5 15:35:51 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 5 Oct 2020 15:35:51 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v2] In-Reply-To: <9CQIDFYeLLRibIAwy6T_XgWPoB4STYNycFra37XBvGw=.34931f06-646a-4b93-be55-594bfdea6931@github.com> References: <9CQIDFYeLLRibIAwy6T_XgWPoB4STYNycFra37XBvGw=.34931f06-646a-4b93-be55-594bfdea6931@github.com> Message-ID: On Fri, 2 Oct 2020 06:48:35 GMT, Robbin Ehn wrote: >> test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 71: >> >>> 69: ProcessBuilder pb = ProcessTools.createJavaProcessBuilder( >>> 70: "-XX:+UnlockDiagnosticVMOptions", >>> 71: "-XX:-UseBiasedLocking", >> >> I think "-XX:-UseBiasedLocking" is specified to make sure >> that Biased Locking is disabled even in test tasks where it >> is enabled by task specific flags. > > Yes. But now this test is fine with using biased locking. Okay thanks for the clarification. >> test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 79: >> >>> 77: } >>> 78: } >>> 79: output.shouldNotHaveExitValue(0); >> >> Looks like the test doesn't require that this mesg get printed: >> `System.out.println("This message would occur after some time.");` >> >> And it is set up to detect that the SafepointTimeout happened >> which is what we want the test to verify at the core. > > The line "This message would occur after some time." should never be print if VM is working. > If the VM fails for some reason and the timeout is not performed, line: > "Timed out while spinning to reach a safepoint." is never printed and the OutputAnalyzer fails the test. 
> If we did timeout and it was printed we know that we didn't print the other message, since the only thread that can > timeout is the one printing that message. > The second part verifies that the SIGILL was delivered. Okay, but then this message when you're reading the code is misleading: `System.out.println("This message would occur after some time.");` It should be printing something like: `System.out.println("This message only prints if something is broken.");` ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From dcubed at openjdk.java.net Mon Oct 5 15:35:50 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 5 Oct 2020 15:35:50 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v2] In-Reply-To: References: Message-ID: On Fri, 2 Oct 2020 09:00:54 GMT, Robbin Ehn wrote: >> The issue is that this test doesn't consider Handshake All operation. >> Depending if/when such operation is scheduled it can lockup the VM thread. >> And the safepoint that should timeout never happens. >> See issue for more information. >> >> So I changed the test to "try timeout" the safepoint, but if there was no safepoint (blocked by a handshake all), we >> retry. We sleep unsafe much longer than the interval SafepointALot generates operations, which 'guarantees' we will >> timeout if there is no handshake all. (some extreme case of kernel scheduling causing a very long context switch could >> also make us not timeout) Passes t1, t3, and repeat runs of the test. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Update with input from reviews Changes requested by dcubed (Reviewer). src/hotspot/share/prims/whitebox.cpp line 77: > 75: #include "runtime/jniHandles.inline.hpp" > 76: #include "runtime/os.hpp" > 77: #include "runtime/safepoint.hpp" I don't think you need this include change anymore. 
test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 74: > 72: Integer waitTime = Integer.parseInt(args[0]); > 73: WhiteBox wb = WhiteBox.getWhiteBox(); > 74: // While no safepoint timeout. Perhaps: // Loop here to cause a safepoint timeout. ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From dcubed at openjdk.java.net Mon Oct 5 15:43:46 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 5 Oct 2020 15:43:46 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v2] In-Reply-To: <9CQIDFYeLLRibIAwy6T_XgWPoB4STYNycFra37XBvGw=.34931f06-646a-4b93-be55-594bfdea6931@github.com> References: <9CQIDFYeLLRibIAwy6T_XgWPoB4STYNycFra37XBvGw=.34931f06-646a-4b93-be55-594bfdea6931@github.com> Message-ID: On Fri, 2 Oct 2020 06:37:21 GMT, Robbin Ehn wrote: >> test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 30: >> >>> 28: /* >>> 29: * @test TestAbortVMOnSafepointTimeout >>> 30: * @summary Check if VM can kill thread which doesn't reach safepoint. >> >> Not your bug, but this summary is wrong. Perhaps: >> `@summary Check if VM aborts when a thread doesn't reach safepoint.` > > The timeout shoots a SIGILL at the 'slow' thread; it does not abort (it does abort if it can't send the signal). > The test also checks that the log says we have done this. Okay, thanks for the clarification. ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From dcubed at openjdk.java.net Mon Oct 5 15:47:49 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 5 Oct 2020 15:47:49 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v2] In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 15:33:01 GMT, Daniel D. Daugherty wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Update with input from reviews > > Changes requested by dcubed (Reviewer). Robbin replied: > David H.
wrote: > > That all said, for the record, we really should have a handshake timeout mechanism the same as we have the safepoint > > timeout mechanism. > > We have a timeout mechanism but default off HandshakeTimeout. > But it doesn't fire SIGILL to troubled thread as safepoint does. What's the conclusion here? Are there going to be changes to the test to use the HandshakeTimeout option? Should the test have failed in a different way than it did? ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From coleenp at openjdk.java.net Mon Oct 5 15:48:44 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 5 Oct 2020 15:48:44 GMT Subject: RFR: 8253433: Remove -XX:+Debugging product option In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 01:30:16 GMT, David Holmes wrote: >> The Debugging option shouldn't be used on the command line. There's a SuppressErrorAt option to ignore certain >> asserts, if there is some situation needing that. Debugging should never be used. >> Tested with tier1 tests on 4 platforms. > > Seems fine. Thanks. > This is really just about the one existing assignment. See the Command context class in debug.cpp. This variable probably shouldn't be set/reset much of anywhere else. There's also a place in macroAssembler that sets it to true, which calls os::print_location(), where we also want to prevent crashes. We could rewrite that to use the debug.cpp find functions, but it doesn't seem worth doing. Thank you for the code reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/434 From coleenp at openjdk.java.net Mon Oct 5 15:48:45 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 5 Oct 2020 15:48:45 GMT Subject: Integrated: 8253433: Remove -XX:+Debugging product option In-Reply-To: References: Message-ID: On Wed, 30 Sep 2020 12:38:30 GMT, Coleen Phillimore wrote: > The Debugging option shouldn't be used on the command line.
There's a SuppressErrorAt option to ignore certain > asserts, if there is some situation needing that. Debugging should never be used. > Tested with tier1 tests on 4 platforms. This pull request has now been integrated. Changeset: 4d29116d Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/4d29116d Stats: 15 lines in 4 files changed: 8 ins; 4 del; 3 mod 8253433: Remove -XX:+Debugging product option Reviewed-by: kbarrett, stuefe, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/434 From mdoerr at openjdk.java.net Mon Oct 5 15:55:40 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Mon, 5 Oct 2020 15:55:40 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding In-Reply-To: References: <3zdj-jaGgS6_W7mNPSnlZr6JECPKSybWxR1-yy5bZ8Q=.90bff33b-0da1-49cf-9833-ee08caf55c6f@github.com> Message-ID: <7fS4cztWp4ebESEXcO7lrRepZZSye1KBoQGOyYvUwfM=.df0e2d21-7867-431c-89b9-dcf0ab001640@github.com> On Wed, 30 Sep 2020 19:16:58 GMT, CoreyAshford wrote: >>> Did you try on x86? AOT is not supported on PPC64. >> >> I didn't. No wonder. Thank you! > >> Did you try on x86? AOT is not supported on PPC64. > > After looking at this a bit, I find that there seems to be an assumption in the code that if there is an intrinsic > symbol defined in aotCodeHeap.cpp using the SET_AOT_GLOBAL_SYMBOL_VALUE macro, it is required that the intrinsic is > implemented for every arch that implements AOT. In this case, there isn't an implementation for x86_64 (yet), so > that's why the failure is occurring. I was tempted to put in an arch-specific #if for ppc arch only, but I don't see > any arch-specific code in this area, and it doesn't make sense either because AOT isn't supported on ppc at all. > Another alternative is to remove the SET_AOT_GLOBAL_SYMBOL_VALUE for decodeBlock, since the implementation is not > defined (yet) for any arch which supports AOT. 
A third alternative would be to leave the macro call in, but comment it > out, saying to uncomment it when it's supported on all AOT-capable arches. Any thoughts? Nobody replied, so I suggest to leave aotCodeHeap.cpp as it was. AOT folks can add it when they need it. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From github.com+4146708+a74nh at openjdk.java.net Mon Oct 5 16:04:54 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Mon, 5 Oct 2020 16:04:54 GMT Subject: RFR: 8221554: aarch64 cross-modifying code [v2] In-Reply-To: References: Message-ID: > The AArch64 port uses maybe_isb in places where an ISB might be required > because the code may have safepointed. These maybe_isbs are very conservative > and are used in many places where a safepoint has not happened. > > cross_modify_fence was added in common code to place a barrier in all the > places after a safepoint has occurred. All the uses of it are in common code, > yet it remains unimplemented on AArch64. > > This set of patches implements cross_modify_fence for AArch64 and reconsiders > every use of maybe_isb, discarding many of them. In addition, it introduces > a new diagnostic option, which when enabled on AArch64 tests the correct > usage of the barriers. > > The advantage of this patch set is threefold: > * Reducing the number of ISBs - giving a theoretical performance improvement. > * Use of common code instead of backend specific code. > * Additional test diagnostic options > > Patch 1: Split cross_modify_fence > ================================= > This is simply refactoring work split out to simplify the other two patches. > > instruction_fence() is provided by each target and simply places > a fence for the instruction stream. > > cross_modify_fence() is now a member of JavaThread and just calls > instruction_fence. This function will be extended in Patch 3.
> > Patch 2: Use cross_modify_fence instead of maybe_isb > ==================================================== > > The [n] references refer to the comments for cross_modify_fence in > thread.hpp. > > These are all the existing uses of maybe_isb in the AArch64 target: > > 1) Instances of Java code calling a VM function > * This encapsulates the changes to: > ** MacroAssembler::call_VM_leaf_base() > ** generate_fast_get_int_field0() > ** stubGenerator_aarch64 generate_throw_exception() > ** sharedRuntime_aarch64 generate_handler_blob() > ** SharedRuntime::generate_resolve_blob() > ** C1 LIR_Assembler::rt_call > ** C1 StubAssembler::call_RT(): used by generate_exception_throw, > generate_handle_exception, generate_code_for. > ** OptoRuntime::generate_exception_blob() > * Any changes will be caught due to calls to [2] or [3] by the VM function. > * Any calls that do not call [2] or [3] do not require an ISB. > * This patch is more optimal for these cases. > > 2) Instances of Java code calling a JNI function > * This encapsulates the changes to: > ** SharedRuntime::generate_native_wrapper() > ** TemplateInterpreterGenerator::generate_native_entry() > * A safepoint still in progress after the call will be caught by [4]. > * An ISB is still required for the case where there was a safepoint > but it completed during the call. This happens if the code doesn't > branch on safepoint_in_progress. > * In the SharedRuntime version, the two possible calls to > reguard_yellow_pages and complete_monitor_unlocking_C are after the thread > goes back into its original state, so are covered by [2] and [3], the > same as a normal VM call. > * This patch is only more optimal for the two post-JNI calls. > > 3) Patching functions > * This encapsulates the changes to: > ** patch_callers_callsite() (called by gen_c2i_adapter()) > * This results in code being patched, but does not safepoint > * Therefore an ISB is required. > * This patch introduces no change here.
> > 4) C1 MacroAssembler::emit_static_call_stub() > * Calls ISB (not maybe_isb) > * By design, the patching doesn't require the up-to-date > destination for proper functioning. > * However, the ISB makes it most likely that the new destination will > be picked up. > * This patch introduces no change here. > > Patch 3: Add cross modify fence verification > ============================================ > > The VerifyCrossModifyFence diagnostic flag enables confirmation of the correct > usage of instruction barriers. It can safely be enabled on any Java run. > > Enabling it will cause the following: > > * Once all threads have been brought to a safepoint, each thread will be > marked. > > * On a cross_modify_fence and safepoint_fence the mark for that thread > will be cleared. > > * On entry to a method and in a safepoint poll, the thread is checked. > If it is marked, then the code will error. Alan Hayward has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains three commits: - AArch64: Add cross modify fence verification - AArch64: Use cross_modify_fence instead of maybe_isb - Split cross_modify_fence ------------- Changes: https://git.openjdk.java.net/jdk/pull/428/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=428&range=01 Stats: 170 lines in 25 files changed: 123 ins; 8 del; 39 mod Patch: https://git.openjdk.java.net/jdk/pull/428.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/428/head:pull/428 PR: https://git.openjdk.java.net/jdk/pull/428 From github.com+4146708+a74nh at openjdk.java.net Mon Oct 5 16:08:41 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Mon, 5 Oct 2020 16:08:41 GMT Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: <35eLsMpWmcCUoiEWhnYdSpZNmvLy4ra56Qtd6eRW574=.4e7c9278-3e0d-457d-9c15-eef45bae9755@github.com> References: <35eLsMpWmcCUoiEWhnYdSpZNmvLy4ra56Qtd6eRW574=.4e7c9278-3e0d-457d-9c15-eef45bae9755@github.com> Message-ID: On Fri, 2 Oct 2020 09:15:46 GMT, Alan Hayward wrote: >>> _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on >>> [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ >>> Hi Alan, >>> >>> On 1/10/2020 2:30 am, Alan Hayward wrote: >>> >>> > The AArch64 port uses maybe_isb in places where an ISB might be required >>> > because the code may have safepointed. These maybe_isbs are very conservative >>> > and are used in many places are used when a safepoint has not happened. >>> > cross_modify_fence was added in common code to place a barrier in all the >>> > places after a safepoint has occurred. All the uses of it are in common code, >>> > yet it remains unimplemented on AArch64. >>> > This set of patches implements cross_modify_fence for AArch64 and reconsiders >>> > every uses of maybe_isb, discarding many of them. In addition, it introduces >>> > a new diagnostic option, which when enabled on AArch64 tests the correct >>> > usage of the barriers. 
>>> > Advantage of this patch is threefold: >>> > * Reducing the number of ISBs - giving a theoretical performance improvement. >>> > * Use of common code instead of backend specific code. >>> > * Additional test diagnostic options >>> > Patch 1: Split cross_modify_fence >>> > ================================= >>> > This is simply refactoring work split out to simplify the other two patches. >>> > instruction_fence() is provided by each target and simply places >>> > a fence for the instruction stream. >>> > cross_modify_fence() is now a member of JavaThread and just calls >>> > instruction_fence. This function will be extended in Patch 3. >>> >>> I don't agree with the change here. The cross_modify_fence() is not >>> related to thread API imo, it belongs in OrderAccess. The name was >>> deliberately selected to abstract away from the specific details of why >>> a given platform may need this fence: >>> >>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-March/037153.html >>> >>> "The name "instruction_pipeline" seems a bit implementation specific >>> about what HW architectural features need to be taken care of due to >>> cross-modifying code, which may or may not apply to a given platform. >>> Perhaps cross_modify_fence(), or something along those lines, would be >>> better. That makes it more clear what we are protecting against, as >>> opposed to what HW architectural features that might concern on a given >>> platform." >>> >>> @robehn , @fisk please chime in here. :) >>> >>> Thanks, >>> David >> >> I have no strong feeling on the names of these functions. >> The reason for moving it was that in the third part (the verification testing) it needs JavaThread. >> In an earlier version I simply had OrderAccess::cross_modify_fence(JavaThread *thread). But then OrderAccess is >> dependant on JavaThread, which felt wrong. Obvious solution was to add a wrapper in JavaThread that calls down to >> OrderAccess. 
Alternatively, I could switch the name of instruction_fence back to cross_modify_fence, and then think of >> a name for the added function in JavaThread. As yet another alternative, they could both be called cross_modify_fence. >> If the verification test patch was removed from the set, then most of the first patch wouldn't be needed either. > >> _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ >> >> On 02/10/2020 09:08, Alan Hayward wrote: >> >> > I have no strong feeling on the names of these functions. >> > The reason for moving it was that in the third part (the verification testing) it needs JavaThread. >> >> Right, but you can get the JavaThread efficiently any time you want, >> so you don't need to pass it to cross_modify_fence(). >> > > Oh, ok, didn't spot that. > This would result in code in OrderAccess.cpp calling a function in JavaThread. > It feels that OrderAccess should be much lower level than JavaThread. But, that might be ok. Patch updated. * cross_modify_fence now calls cross_modify_fence_impl as suggested. * ISBs in the JNI calls have been removed. This means that this PR is currently unsafe to merge until https://github.com/openjdk/jdk/pull/296 has been merged.
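The verification scheme discussed in this thread (mark every thread at a safepoint, clear the mark on a cross-modify fence, and report an error if a still-marked thread enters a method) can be modeled in a few lines. The following is only a toy sketch in Java, not HotSpot code: `CrossModifyFenceModel`, `ModelThread`, and all method names are hypothetical stand-ins for the per-thread state the patch keeps in JavaThread, and the real fence would be an instruction barrier (an ISB on AArch64) rather than a plain field write.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Toy model (NOT HotSpot code) of the VerifyCrossModifyFence idea:
 * a safepoint marks every thread, a fence clears the mark, and
 * running code while still marked is reported as an error.
 */
final class CrossModifyFenceModel {

    /** Hypothetical stand-in for the per-thread state kept in JavaThread. */
    static final class ModelThread {
        final String name;
        private boolean needsFence; // the "mark" set at a safepoint

        ModelThread(String name) {
            this.name = name;
        }

        /** Models cross_modify_fence(): fence, then clear the mark. */
        void crossModifyFence() {
            // A real VM would execute an instruction barrier here.
            needsFence = false;
        }

        /** Models the check on method entry and in a safepoint poll. */
        void checkOnMethodEntry() {
            if (needsFence) {
                throw new IllegalStateException(
                        name + " ran code without a cross-modify fence");
            }
        }
    }

    private final List<ModelThread> threads = new ArrayList<>();

    ModelThread addThread(String name) {
        ModelThread t = new ModelThread(name);
        threads.add(t);
        return t;
    }

    /** Models "all threads have been brought to a safepoint": mark each one. */
    void safepoint() {
        for (ModelThread t : threads) {
            t.needsFence = true;
        }
    }

    public static void main(String[] args) {
        CrossModifyFenceModel vm = new CrossModifyFenceModel();
        ModelThread fenced = vm.addThread("fenced");
        ModelThread unfenced = vm.addThread("unfenced");

        vm.safepoint();

        fenced.crossModifyFence();
        fenced.checkOnMethodEntry(); // passes: the mark was cleared

        try {
            unfenced.checkOnMethodEntry(); // still marked, so this throws
        } catch (IllegalStateException e) {
            System.out.println("detected: " + e.getMessage());
        }
    }
}
```

The point of the model is that the check is independent of where the fence is placed, which is what lets the diagnostic catch a path that returns to Java code without passing through any fence.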
------------- PR: https://git.openjdk.java.net/jdk/pull/428 From github.com+51754783+coreyashford at openjdk.java.net Mon Oct 5 16:11:41 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Mon, 5 Oct 2020 16:11:41 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding In-Reply-To: <7fS4cztWp4ebESEXcO7lrRepZZSye1KBoQGOyYvUwfM=.df0e2d21-7867-431c-89b9-dcf0ab001640@github.com> References: <3zdj-jaGgS6_W7mNPSnlZr6JECPKSybWxR1-yy5bZ8Q=.90bff33b-0da1-49cf-9833-ee08caf55c6f@github.com> <7fS4cztWp4ebESEXcO7lrRepZZSye1KBoQGOyYvUwfM=.df0e2d21-7867-431c-89b9-dcf0ab001640@github.com> Message-ID: On Mon, 5 Oct 2020 15:53:26 GMT, Martin Doerr wrote: > Nobody replied, so I suggest to leave aotCodeHeap.cpp as it was. AOT folks can add it when they need it. Ok, I will drop that change from that PR. Thank you. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From minqi at openjdk.java.net Mon Oct 5 16:42:52 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Mon, 5 Oct 2020 16:42:52 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v8] In-Reply-To: References: Message-ID: <-7yd4KLXoqQqOIaMdI0m_6GxtCoo0LWHFBBPWvv2sBA=.30020f59-27f6-4cb1-ab22-13ed3355fc31@github.com> > This patch is reorganized after 8252725, which is separated from this patch to refactor jlink plugin code. The previous > webrev with hg can be found at: http://cr.openjdk.java.net/~minqi/2020/8247536/webrev-05. With 8252725 integrated, the > regeneration of holder classes simply calls the newly added GenerateJLIClassesHelper.cdsGenerateHolderClasses > function. Tests: tier1-4 Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: Move the check work to java, restore code in VM. Modified test code according to the changes.
The invoke name verification is not implemented since not all the holder classes are processed, and not all the functions of the processed holder classes are added. For a holder class with DirectMethodHandle in its name, only names in the DMH_METHOD_TYPE_MAP keyset are added; lines with other names are skipped silently. This makes verification of invoke names difficult, since a name not in the keyset should not fail the test. Also add a boolean to cdsGenerateHolderClasses to indicate the call path. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/193/files - new: https://git.openjdk.java.net/jdk/pull/193/files/9b0f523b..125112b3 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=193&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=193&range=06-07 Stats: 87 lines in 3 files changed: 70 ins; 10 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/193.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/193/head:pull/193 PR: https://git.openjdk.java.net/jdk/pull/193 From iignatyev at openjdk.java.net Mon Oct 5 16:46:55 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Mon, 5 Oct 2020 16:46:55 GMT Subject: RFR: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR [v3] In-Reply-To: References: Message-ID: > Hi all, > > could you please review the patch which updates `jdk.test.lib.Utils` to use md5 hash-sum of `java.vm.version` property > as default seed for `Utils.RANDOM_GENERATOR`? > from JBS: >> using the same seed for all runs of a build will make it possible (easier) to compare results from different test runs >> (e.g. on different platforms, w/ different flags) and consequently will make test results analysis easier. the proposed >> solution is to use the seed based on Runtime.version() / "java.vm.version", which are different from build to build, if >> there is no seed specified by "jdk.test.lib.random.seed" property.
> > the patch also updates `RandomGeneratorTest` test, so it now expects that the same values are generated if no seed is > provided. > testing: tier1 Igor Ignatyev has updated the pull request incrementally with one additional commit since the last revision: use random seed for personal/internal builds ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/391/files - new: https://git.openjdk.java.net/jdk/pull/391/files/381e14f7..07d85a20 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=391&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=391&range=01-02 Stats: 41 lines in 2 files changed: 19 ins; 4 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/391.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/391/head:pull/391 PR: https://git.openjdk.java.net/jdk/pull/391 From iignatyev at openjdk.java.net Mon Oct 5 16:46:55 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Mon, 5 Oct 2020 16:46:55 GMT Subject: RFR: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR [v2] In-Reply-To: References: Message-ID: On Fri, 2 Oct 2020 23:10:13 GMT, Roger Riggs wrote: >> Igor Ignatyev has updated the pull request incrementally with one additional commit since the last revision: >> >> used Runtime.version() instead of ${java.vm.version} >> updated javadoc > > The code and description are reversed from what I suggested. > > The seed should *only* be computed from the version information if it is a promoted or weekly build. > All other builds should use a random seed. Hi @RogerRiggs, I guess I have misinterpreted your sentence about system property being set (as `java.vm.version` is always set and the only other property which mattered in this context is `jdk.test.lib.random.seed`).
in any case, I agree that for _personal_ builds, it's more desirable to have different seeds on each execution, especially given the fact that the version string is set at configure-time and not at build-time, so one might end up with the same version string for a very long time. I have reworked the code a bit, so now version-based seed is used only for _promotable_ builds (i.e. ones that have build number and it's greater than 0); local and remote/mach5 ad-hoc builds (by default) don't specify a build number, so a random seed value will be used for them. and as before, if `jdk.test.lib.random.seed` is set, its value will be used as seed oblivious to build type. ------------- PR: https://git.openjdk.java.net/jdk/pull/391 From rehn at openjdk.java.net Mon Oct 5 18:24:46 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 5 Oct 2020 18:24:46 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v2] In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 15:14:16 GMT, Daniel D. Daugherty wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Update with input from reviews > > src/hotspot/share/prims/whitebox.cpp line 77: > >> 75: #include "runtime/jniHandles.inline.hpp" >> 76: #include "runtime/os.hpp" >> 77: #include "runtime/safepoint.hpp" > > I don't think you need this include change anymore. Fixed > test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 74: > >> 72: Integer waitTime = Integer.parseInt(args[0]); >> 73: WhiteBox wb = WhiteBox.getWhiteBox(); >> 74: // While no safepoint timeout. > > Perhaps: // Loop here to cause a safepoint timeout. 
Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From rehn at openjdk.java.net Mon Oct 5 18:24:45 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 5 Oct 2020 18:24:45 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v2] In-Reply-To: References: Message-ID: <_lTsmQHKLm72VMTs4jDuZcTprHyNmX38FlA7IyTY0rk=.a5b3f1db-15ad-4d0b-ab5d-7503f25015a6@github.com> On Fri, 2 Oct 2020 09:00:54 GMT, Robbin Ehn wrote: >> The issue is that this test doesn't consider Handshake All operation. >> Depending if/when such operation is scheduled it can lockup the VM thread. >> And the safepoint that should timeout never happens. >> See issue for more information. >> >> So I changed the test to "try timeout" the safepoint, but if there was no safepoint (blocked by a handshake all), we >> retry. We sleep unsafe much longer than the interval SafepointALot generates operations, which 'guarantees' we will >> timeout if there is no handshake all. (some extreme case of kernel scheduling causing a very long context switch could >> also make us not timeout) Passes t1, t3, and repeat runs of the test. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Update with input from reviews Pushing the small update in a minute. ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From rehn at openjdk.java.net Mon Oct 5 18:24:47 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 5 Oct 2020 18:24:47 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v2] In-Reply-To: References: <9CQIDFYeLLRibIAwy6T_XgWPoB4STYNycFra37XBvGw=.34931f06-646a-4b93-be55-594bfdea6931@github.com> Message-ID: On Mon, 5 Oct 2020 15:31:01 GMT, Daniel D. Daugherty wrote: >> The line "This message would occur after some time." should never be print if VM is working. >> If the VM fails for some reason and the timeout is not performed, line: >> "Timed out while spinning to reach a safepoint." 
is never printed and the OutputAnalyzer fails the test. >> If we did timeout and it was printed we know that we didn't print the other message, since the only thread that can >> timeout is the one printing that message. >> The second part verifies that the SIGILL was delivered. > > Okay, but then this message when you're reading the code is misleading: > `System.out.println("This message would occur after some time.");` > It should be printing something like: > `System.out.println("This message only prints if something is broken.");` > > Update: Yes, I realize that this is an existing problem, but it still reads wrong. I removed the comment in the last update, since it can't be printed. ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From github.com+51754783+coreyashford at openjdk.java.net Mon Oct 5 18:29:58 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Mon, 5 Oct 2020 18:29:58 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: References: Message-ID: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> > This patch set encompasses the following commits: > > - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for > the intrinsic. The API is similar to the existing encodeBlock intrinsic. > - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic > implementation > - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. > - Adds a JMH microbenchmark for both Base64 encoding and decoding. > - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. CoreyAshford has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains ten commits: - AOT: Revert change to aotCodeHeap.cpp for decodeBlock Don't add the SET_AOT_GLOBAL_SYMBOL_VALUE macro for decode block until all arches that implement AOT, implement the decodeBlock intrinsic. - Base64.java decodeBlock: Changes from PR review * Make comparison safer and consistent with the while loop * Update comment about the decodeBlock intrinsic so that it matches the new structure * Add comment about the lack of a length check on the destination buffer * As per issue 8138732, change HotSpotIntrinsicCandidate to IntrinsicCandidate - stubGenerator_ppc.cpp: Changes from PR review * Fix clearing of upper bits to clear 32 bits instead of 31 (due to misreading of clrldi instruction) * change and document loop_unrolls setting from 8 to 2 after re-running the benchmark * align unrolled loop on a 32-byte boundary * replace instruction used for checking isURL from a double word to single word instruction since the register is effectively 32 bits wide * cosmetic change to realign register comments. - TestBase64.java: Changes from PR review * Use Utils.toByteArrays() method instead of a locally-defined method * Generate the two non-Base64 tables dynamically rather than use static initialization * Added comments describing the two above-mentioned arrays - Expand the Base64 intrinsic regression test to cover decodeBlock This patch makes four significant changes: 1) The Power implementation of the decodeBlock intrinsic, at least, requires a decode length of at least 128 bytes, but the existing test cases are much shorter, maxing out at 111 bytes. So the patch adds a new input data file which has longer test cases in it. 2) The original test cases only covers the encoding of just the printable subset of the 7-bit ASCII characters. However, Base64 encoding requires being able to encode arbitrary binary data, i.e. it must handle all 256 8-bit byte encodings. 
To remedy this, but keep the original line-oriented style of the input data, I added another input file type that uses a simple ASCII hexadecimal encoding - two ASCII hex characters per 8-bit byte. When test0 is called, a new parameter is passed that specifies the type of the input file, which is either the original ASCII type or the hexadecimal format. So to test both longer input data and arbitrary 8-bit data, the newly added input test file has test cases which are both longer and encoded in ASCII hex so as to give full 8-bit capability. When reading this type of file, test0 calls a newly-added function to translate the ASCII hex to binary data. Except for the first line of input data, which contains all possible 8-bit values sequentially, the input data was generated using a random length (between 111 and 520 bytes) buffer filled with random 8-bit data, which should give adequate coverage. 3) The original test did not test that the decoder detects illegal Base64 bytes. This change chooses a random location in the encoded data to corrupt with a randomly-chosen byte which is illegal for the specific Base64 encoding that is chosen (i.e. standard or URLsafe). It then calls the decode function to verify that the illegal byte is detected and the proper exception is thrown. 4) The test iteration count was originally 100K, but that is far more than enough iterations to test the intrinsic. It takes 20K iterations on each intrinsic for HotSpot C2 to begin calling it. The test originally had three types of encodings to test and called the encode intrinsic four times for each iteration, which works out to 100K * 3 * 4 = 1.2M calls just to encode. Decode was called four times as well (now five because of the illegal byte test). I believe this is excessive and with the extra test data I have added, the test was timing out after ten minutes of execution.
It appears that it is timing out, not because the intrinsics take a long time to run, but because test0 generates an enormous number of discarded data buffers for the GC system to recover (the test runs at about 39GB of virtual memory on my test machine). To remedy the timeout problem, I have changed the code so that a warmup function of 20K repetitions is performed on a fixed buffer, to activate the intrinsic(s). After the warmup, I have reduced the number of iterations to 5K on each test0 call. This should give adequate coverage. - Add JMH benchmark for Base64 variable length buffer decoding - Add Power9+ intrinsic implementation for Base64 decoding - Add HotSpot code to implement Base64 decodeBlock API - Add HotSpotIntrinsicCandidate and API for Base64 decoding ------------- Changes: https://git.openjdk.java.net/jdk/pull/293/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=293&range=01 Stats: 1873 lines in 23 files changed: 1846 ins; 4 del; 23 mod Patch: https://git.openjdk.java.net/jdk/pull/293.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/293/head:pull/293 PR: https://git.openjdk.java.net/jdk/pull/293 From github.com+51754783+coreyashford at openjdk.java.net Mon Oct 5 18:29:59 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Mon, 5 Oct 2020 18:29:59 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: References: <3zdj-jaGgS6_W7mNPSnlZr6JECPKSybWxR1-yy5bZ8Q=.90bff33b-0da1-49cf-9833-ee08caf55c6f@github.com> <7fS4cztWp4ebESEXcO7lrRepZZSye1KBoQGOyYvUwfM=.df0e2d21-7867-431c-89b9-dcf0ab001640@github.com> Message-ID: On Mon, 5 Oct 2020 16:08:55 GMT, CoreyAshford wrote: >> Nobody replied, so I suggest to leave aotCodeHeap.cpp as it was. AOT folks can add it when they need it. > >> Nobody replied, so I suggest to leave aotCodeHeap.cpp as it was. AOT folks can add it when they need it. > > Ok, I will drop that change from that PR. Thank you.
I have rebased this PR and made the requested changes. The force push is only because of the rebase; I have preserved the commit history. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From rehn at openjdk.java.net Mon Oct 5 18:34:04 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 5 Oct 2020 18:34:04 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v2] In-Reply-To: <_lTsmQHKLm72VMTs4jDuZcTprHyNmX38FlA7IyTY0rk=.a5b3f1db-15ad-4d0b-ab5d-7503f25015a6@github.com> References: <_lTsmQHKLm72VMTs4jDuZcTprHyNmX38FlA7IyTY0rk=.a5b3f1db-15ad-4d0b-ab5d-7503f25015a6@github.com> Message-ID: On Mon, 5 Oct 2020 18:21:43 GMT, Robbin Ehn wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Update with input from reviews > > Pushing the small update in a minute. > Robbin replied: > > > David H. wrote: > > > That all said, for the record, we really should have a handshake timeout mechanism the same as we have the safepoint > > > timeout mechanism. > > > > > > We have a timeout mechanism but default off HandshakeTimeout. > > But it doesn't fire SIGILL to troubled thread as safepoint does. > > What's the conclusion here? Are there going to be changes to the > test to use the HandshakeTimeout option? Should the test have > failed in a different way than it did? In https://bugs.openjdk.java.net/browse/JDK-8198730 I have been looking into setting these (safepoint and handshake timeout) to default 1 second. There were some impediments which now seem to have been resolved. If any of these operations takes longer you want to know it because A: you have a bug, or B: you have performance problems. So I don't think the default value is what any user wants.
------------- PR: https://git.openjdk.java.net/jdk/pull/465 From rehn at openjdk.java.net Mon Oct 5 18:34:04 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 5 Oct 2020 18:34:04 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v3] In-Reply-To: References: Message-ID: > The issue is that this test doesn't consider Handshake All operation. > Depending if/when such operation is scheduled it can lockup the VM thread. > And the safepoint that should timeout never happens. > See issue for more information. > > So I changed the test to "try timeout" the safepoint, but if there was no safepoint (blocked by a handshake all), we > retry. We sleep unsafe much longer than the interval SafepointALot generates operations, which 'guarantees' we will > timeout if there is no handshake all. (some extreme case of kernel scheduling causing a very long context switch could > also make us not timeout) Passes t1, t3, and repeat runs of the test. Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains four additional commits since the last revision: - Fixed include and comment - Merge branch 'master' into 8253794 - Update with input from reviews - Fixed test ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/465/files - new: https://git.openjdk.java.net/jdk/pull/465/files/90fb3106..b63b6a09 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=465&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=465&range=01-02 Stats: 8155 lines in 239 files changed: 3444 ins; 1770 del; 2941 mod Patch: https://git.openjdk.java.net/jdk/pull/465.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/465/head:pull/465 PR: https://git.openjdk.java.net/jdk/pull/465 From dcubed at openjdk.java.net Mon Oct 5 18:45:51 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 5 Oct 2020 18:45:51 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v3] In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 18:34:04 GMT, Robbin Ehn wrote: >> The issue is that this test doesn't consider Handshake All operation. >> Depending if/when such operation is scheduled it can lockup the VM thread. >> And the safepoint that should timeout never happens. >> See issue for more information. >> >> So I changed the test to "try timeout" the safepoint, but if there was no safepoint (blocked by a handshake all), we >> retry. We sleep unsafe much longer than the interval SafepointALot generates operations, which 'guarantees' we will >> timeout if there is no handshake all. (some extreme case of kernel scheduling causing a very long context switch could >> also make us not timeout) Passes t1, t3, and repeat runs of the test. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev > excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains four additional commits since > the last revision: > - Fixed include and comment > - Merge branch 'master' into 8253794 > - Update with input from reviews > - Fixed test Thumbs up. ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/465 From rehn at openjdk.java.net Mon Oct 5 18:52:46 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 5 Oct 2020 18:52:46 GMT Subject: RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts [v3] In-Reply-To: References: Message-ID: <1HFCvpyTXeU3CT2i_HPePoYxkQcUkFY_CRh-NCp1Pbw=.30f77ae5-d0f9-43f7-a6a6-0454478562d0@github.com> On Mon, 5 Oct 2020 18:43:05 GMT, Daniel D. Daugherty wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev >> excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since >> the last revision: >> - Fixed include and comment >> - Merge branch 'master' into 8253794 >> - Update with input from reviews >> - Fixed test > > Thumbs up. Thanks for review @pchilano, @dcubed-ojdk, @TheRealMDoerr. Update was trivial so integrating in a bit. ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From rehn at openjdk.java.net Mon Oct 5 19:21:45 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 5 Oct 2020 19:21:45 GMT Subject: Integrated: 8253794: TestAbortVMOnSafepointTimeout never timeouts In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 14:35:45 GMT, Robbin Ehn wrote: > The issue is that this test doesn't consider Handshake All operation. > Depending if/when such operation is scheduled it can lockup the VM thread. > And the safepoint that should timeout never happens. > See issue for more information. > > So I changed the test to "try timeout" the safepoint, but if there was no safepoint (blocked by a handshake all), we > retry. 
We sleep unsafe much longer than the interval SafepointALot generates operations, which 'guarantees' we will > timeout if there is no handshake all. (some extreme case of kernel scheduling causing a very long context switch could > also make us not timeout) Passes t1, t3, and repeat runs of the test. This pull request has now been integrated. Changeset: c9d0407e Author: Robbin Ehn URL: https://git.openjdk.java.net/jdk/commit/c9d0407e Stats: 67 lines in 3 files changed: 22 ins; 36 del; 9 mod 8253794: TestAbortVMOnSafepointTimeout never timeouts Reviewed-by: pchilanomate, dcubed, mdoerr ------------- PR: https://git.openjdk.java.net/jdk/pull/465 From david.holmes at oracle.com Thu Oct 1 21:50:28 2020 From: david.holmes at oracle.com (David Holmes) Date: Fri, 2 Oct 2020 07:50:28 +1000 Subject: RFR: 8248238: Implementation of JEP: Windows AArch64 Support [v12] In-Reply-To: References: Message-ID: <1b89dc62-87fe-baa0-24c7-1f07bbdd5e48@oracle.com> Hi, On 2/10/2020 1:48 am, Ludovic Henry wrote: > Hi, > > As we now have a whole bunch of reviews (thank you all!), we would need a sponsor to get it merged. The JEP is not yet targeted so we have to wait for that formality. But once that happens I can sponsor for you. Also note that the PR references the wrong JEP so can you please edit the description to fix that. Meanwhile I'll see if I can take this for a spin through our internal testing. Cheers, David ----- > Thank you :) > > ------------- > > PR: https://github.com/openjdk/jdk/pull/212 > From luhenry at microsoft.com Thu Oct 1 21:56:53 2020 From: luhenry at microsoft.com (Ludovic Henry) Date: Thu, 1 Oct 2020 21:56:53 +0000 Subject: RFR: 8248238: Implementation of JEP: Windows AArch64 Support [v12] In-Reply-To: <1b89dc62-87fe-baa0-24c7-1f07bbdd5e48@oracle.com> References: <1b89dc62-87fe-baa0-24c7-1f07bbdd5e48@oracle.com> Message-ID: Hi David, > The JEP is not yet targeted so we have to wait for that formality. But once that happens I can sponsor for you. 
Perfect, I didn't know about the need for the JEP to be targeted before the merge. > Also note that the PR references the wrong JEP so can you please edit the description to fix that. I'll work with @Monica to update the PR's description to point to https://openjdk.java.net/jeps/391 instead. Thank you! Ludovic From daniel.daugherty at oracle.com Thu Oct 1 22:05:03 2020 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 1 Oct 2020 18:05:03 -0400 Subject: RFR: 8248238: Implementation of JEP: Windows AArch64 Support [v12] In-Reply-To: References: <1b89dc62-87fe-baa0-24c7-1f07bbdd5e48@oracle.com> Message-ID: So I'm confused... this PR is associated with this bug ID: > > Issue > > * JDK-8248238 : > Implementation of JEP: Windows AArch64 Support > and JDK-8248238 is associated with this JEP: > JDK-8248496 JEP > 388: Windows/AArch64 Port Am I missing something here? Dan On 10/1/20 5:56 PM, Ludovic Henry wrote: > Hi David, > >> The JEP is not yet targeted so we have to wait for that formality. But once that happens I can sponsor for you. > Perfect, I didn't know about the need for the JEP to be targeted before the merge. > >> Also note that the PR references the wrong JEP so can you please edit the description to fix that. > I'll work with @Monica to update the PR's description to point to https://openjdk.java.net/jeps/391 instead. > > Thank you! > Ludovic > From luhenry at microsoft.com Thu Oct 1 22:14:16 2020 From: luhenry at microsoft.com (Ludovic Henry) Date: Thu, 1 Oct 2020 22:14:16 +0000 Subject: RFR: 8248238: Implementation of JEP: Windows AArch64 Support [v12] In-Reply-To: References: <1b89dc62-87fe-baa0-24c7-1f07bbdd5e48@oracle.com> Message-ID: It's me who made a mistake. This PR should be associated with JEP 388 as you are rightly pointing out. From: Daniel D.
Daugherty Sent: Thursday, October 1, 2020 3:05 PM To: Ludovic Henry ; David Holmes ; David Holmes ; Andrew Haley ; Chris Plummer ; Magnus Ihse Bursie ; build-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; serviceability-dev at openjdk.java.net; shenandoah-dev at openjdk.java.net; Monica Beckwith Subject: Re: RFR: 8248238: Implementation of JEP: Windows AArch64 Support [v12] So I'm confused... this PR is associated with this bug ID: Issue * JDK-8248238: Implementation of JEP: Windows AArch64 Support and JDK-8248238 is associated with this JEP: JDK-8248496 JEP 388: Windows/AArch64 Port Am I missing something here? Dan On 10/1/20 5:56 PM, Ludovic Henry wrote: Hi David, The JEP is not yet targeted so we have to wait for that formality. But once that happens I can sponsor for you. Perfect, I didn't know about the need for the JEP to be targeted before the merge. Also note that the PR references the wrong JEP so can you please edit the description to fix that. I'll work with @Monica to update the PR's description to point to https://openjdk.java.net/jeps/391 instead. Thank you! Ludovic From rriggs at openjdk.java.net Mon Oct 5 20:59:44 2020 From: rriggs at openjdk.java.net (Roger Riggs) Date: Mon, 5 Oct 2020 20:59:44 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> Message-ID: On Mon, 5 Oct 2020 18:29:58 GMT, CoreyAshford wrote: >> This patch set encompasses the following commits: >> >> - Adds a new HotSpot intrinsic candidate to the java.lang.Base64 class - decodeBlock(), and provides a flexible API for >> the intrinsic. The API is similar to the existing encodeBlock intrinsic. 
>> - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic >> implementation >> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. >> - Adds a JMH microbenchmark for both Base64 encoding and decoding. >> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. > > CoreyAshford has updated the pull request with a new target base due to a merge or a rebase. The pull request now > contains ten commits: > - AOT: Revert change to aotCodeHeap.cpp for decodeBlock > > Don't add the SET_AOT_GLOBAL_SYMBOL_VALUE macro for decode block until all > arches that implement AOT, implement the decodeBlock intrinsic. > - Base64.java decodeBlock: Changes from PR review > > * Make comparison safer and consistent with the while loop > * Update comment about the decodeBlock intrinsic so that it matches the new structure > * Add comment about the lack of a length check on the destination buffer > * As per issue 8138732, change HotSpotIntrinsicCandidate to IntrinsicCandidate > - stubGenerator_ppc.cpp: Changes from PR review > > * Fix clearing of upper bits to clear 32 bits instead of 31 (due to misreading of clrldi instruction) > * change and document loop_unrolls setting from 8 to 2 after re-running the benchmark > * align unrolled loop on a 32-byte boundary > * replace instruction used for checking isURL from a double word to single > word instruction since the register is effectively 32 bits wide > * cosmetic change to realign register comments.
> - TestBase64.java: Changes from PR review > > * Use Utils.toByteArrays() method instead of a locally-defined method > * Generate the two non-Base64 tables dynamically rather than use static initialization > * Added comments describing the two above-mentioned arrays > - Expand the Base64 intrinsic regression test to cover decodeBlock > > This patch makes four significant changes: > > 1) The Power implementation of the decodeBlock intrinsic, at least, > requires a decode length of at least 128 bytes, but the existing test cases > are much shorter, maxing out at 111 bytes. So the patch adds a new input > data file which has longer test cases in it. > > 2) The original test cases only covers the encoding of just the printable > subset of the 7-bit ASCII characters. However, Base64 encoding requires > being able to encode arbitrary binary data, i.e. it must handle all 256 > 8-bit byte encodings. To remedy this, but keep the original line-oriented > style of the input data, I added another input file type that uses a simple > ASCII hexadecimal encoding - two ASCII hex characters per 8-bit byte. When > test0 is called, a new parameter is passed that specifies the type of the > input file, which is either the original ASCII type or the hexadecimal > format. So to test both longer input data and arbitrary 8-bit data, the > newly added input test file has test cases which are both longer and > encoded in ASCII hex so as to give full 8-bit capability. When reading > this type of file, test0 calls a newly-added function to translate the > ASCII hex to binary data. Except for the first line of input data, which > contains all possible 8-bit values sequentially, the input data was > generated using a random length (between 111 and 520 bytes) buffer filled > with random 8-bit data, which should give adequate coverage. > > 3) The original test did not test that the decoder detects illegal Base64 > bytes. 
This change chooses a random location in the encoded data to > corrupt with a randomly-chosen byte which is illegal for the specific > Base64 encoding that is chosen (i.e. standard or URLsafe). It then calls > the decode function to verify that the illegal byte is detected and the > proper exception is thrown. > > 4) The test iteration count was originally 100K, but that is far more than > enough iterations to test the intrinsic. It takes 20K iterations on each > intrinsic for HotSpot C2 to begin calling it. The test originally had > three types of encodings to test and called the encode intrinsic four times > for each iteration, which works out to 100K * 3 * 4 = 1.2M calls just to > encode. Decode was called four times as well (now five because of the > illegal byte test). I believe this is excessive and with the extra test > data I have added, the test was timing out after ten minutes of execution. > It appears that it is timing out, not because the intrinsics take a long > time to run, but because test0 generates an enormous number of discarded > data buffers for the GC system to recover (the test runs at about 39GB of > virtual memory on my test machine). To remedy the timeout problem, I have > changed the code so that a warmup function of 20K repetitions is performed > on a fixed buffer, to activate the intrinsic(s). After the warmup, I have > reduced the number of iterations to 5K on each test0 call. This should > give adequate coverage. > - Add JMH benchmark for Base64 variable length buffer decoding > - Add Power9+ intrinsic implementation for Base64 decoding > - Add HotSpot code to implement Base64 decodeBlock API > - Add HotSpotIntrinsicCandidate and API for Base64 decoding Core Libs changes and the test look fine. ------------- Marked as reviewed by rriggs (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/293 From rriggs at openjdk.java.net Mon Oct 5 21:07:42 2020 From: rriggs at openjdk.java.net (Roger Riggs) Date: Mon, 5 Oct 2020 21:07:42 GMT Subject: RFR: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR [v3] In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 16:46:55 GMT, Igor Ignatyev wrote: >> Hi all, >> >> could you please review the patch which updates `jdk.test.lib.Utils` to use md5 hash-sum of `java.vm.version` property >> as default seed for `Utils.RANDOM_GENERATOR`? >> from JBS: >>> using the same seed for all runs of a build will make it possible (easier) to compare results from different test runs >>> (e.g. on different platforms, w/ different flags) and consequently will make test results analysis easier. the proposed >>> solution is to use the seed based on Runtime.version() / "java.vm.version", which are different from build to build, if >>> there is no seed specified by "jdk.test.lib.random.seed" property. >> >> the patch also updates `RandomGeneratorTest` test, so it expects now that the same values are generated if no seed is >> provided. >> testing: >> - tier1 >> - `test/lib-test/jdk/test/lib/` against personal build on linux,windows,macos-x64 >> - `test/lib-test/jdk/test/lib/` against CI build on linux,windows,macos-x64 > > Igor Ignatyev has updated the pull request incrementally with one additional commit since the last revision: > > use random seed for personal/internal builds Can you take a look at the existing jdk.test.lib.RandomFactory and see if we can avoid a separate random number generator and property?
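For readers following the thread, the seeding scheme in the quoted patch description (an MD5 hash of `java.vm.version`, unless `jdk.test.lib.random.seed` is set) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in `jdk.test.lib.Utils`: the class name `BuildStableSeedSketch`, the helpers `seedFrom` and `newRandom`, and the choice of folding the digest into a `long` by taking its first eight bytes are all hypothetical, and the sketch does not model the later revision that uses a random seed for personal builds.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Random;

// Hypothetical illustration; names and the digest-to-long folding are assumptions.
final class BuildStableSeedSketch {

    // Derive a seed that is identical for every run against the same version string.
    static long seedFrom(String version) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(version.getBytes(StandardCharsets.UTF_8));
            // An MD5 digest is 16 bytes; use the first 8 as a big-endian long.
            return ByteBuffer.wrap(digest).getLong();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // An explicitly supplied seed wins; otherwise fall back to the version-based seed.
    static Random newRandom(String version, String explicitSeed) {
        long seed = (explicitSeed != null) ? Long.parseLong(explicitSeed)
                                           : seedFrom(version);
        return new Random(seed);
    }

    // Convenience wrapper reading the two properties named in the thread.
    static Random newRandom() {
        return newRandom(System.getProperty("java.vm.version"),
                         System.getProperty("jdk.test.lib.random.seed"));
    }
}
```

With this shape, two test runs against the same build draw the same random sequence, while passing `-Djdk.test.lib.random.seed=<n>` still overrides the default.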
------------- PR: https://git.openjdk.java.net/jdk/pull/391 From iignatyev at openjdk.java.net Mon Oct 5 21:29:40 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Mon, 5 Oct 2020 21:29:40 GMT Subject: RFR: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR [v3] In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 21:04:52 GMT, Roger Riggs wrote: > Can you take a look at the existing jdk.test.lib.RandomFactory and see if we can avoid a separate random number > generator and property? I am not adding a new separate random number generator or a new property, `jdk.test.lib.Utils.getRandomInstance` has been around for a long time and is used by lots of tests (most of them are in `/test/hotspot/jtreg`). merging `Utils.getRandomInstance` and `RandomFactory` (or rather replacing one w/ another) is tracked by [JDK-8212077](https://bugs.openjdk.java.net/browse/JDK-8212077). ------------- PR: https://git.openjdk.java.net/jdk/pull/391 From github.com+51754783+coreyashford at openjdk.java.net Mon Oct 5 21:47:42 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Mon, 5 Oct 2020 21:47:42 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> Message-ID: On Mon, 5 Oct 2020 20:57:25 GMT, Roger Riggs wrote: >> CoreyAshford has updated the pull request with a new target base due to a merge or a rebase. The pull request now >> contains ten commits: >> - AOT: Revert change to aotCodeHeap.cpp for decodeBlock >> >> Don't add the SET_AOT_GLOBAL_SYMBOL_VALUE macro for decode block until all >> arches that implement AOT, implement the decodeBlock intrinsic. 
>> - Base64.java decodeBlock: Changes from PR review >> >> * Make comparison safer and consistent with the while loop >> * Update comment about the decodeBlock intrinsic so that it matches the new structure >> * Add comment about the lack of a length check on the destination buffer >> * As per issue 8138732, change HotSpotIntrinsicCandidate to IntrinsicCandidate >> - stubGenerator_ppc.cpp: Changes from PR review >> >> * Fix clearing of upper bits to clear 32 bits instead of 31 (due to misreading of clrldi instruction) >> * change and document loop_unrolls setting from 8 to 2 after re-running the benchmark >> * align unrolled loop on a 32-byte boundary >> * replace instruction used for checking isURL from a double word to single >> word instruction since the register is effectively 32 bits wide >> * cosmetic change to realign register comments. >> - TestBase64.java: Changes from PR review >> >> * Use Utils.toByteArrays() method instead of a locally-defined method >> * Generate the two non-Base64 tables dynamically rather than use static initialization >> * Added comments describing the two above-mentioned arrays >> - Expand the Base64 intrinsic regression test to cover decodeBlock >> >> This patch makes four significant changes: >> >> 1) The Power implementation of the decodeBlock intrinsic, at least, >> requires a decode length of at least 128 bytes, but the existing test cases >> are much shorter, maxing out at 111 bytes. So the patch adds a new input >> data file which has longer test cases in it. >> >> 2) The original test cases only covers the encoding of just the printable >> subset of the 7-bit ASCII characters. However, Base64 encoding requires >> being able to encode arbitrary binary data, i.e. it must handle all 256 >> 8-bit byte encodings. To remedy this, but keep the original line-oriented >> style of the input data, I added another input file type that uses a simple >> ASCII hexadecimal encoding - two ASCII hex characters per 8-bit byte. 
When >> test0 is called, a new parameter is passed that specifies the type of the >> input file, which is either the original ASCII type or the hexadecimal >> format. So to test both longer input data and arbitrary 8-bit data, the >> newly added input test file has test cases which are both longer and >> encoded in ASCII hex so as to give full 8-bit capability. When reading >> this type of file, test0 calls a newly-added function to translate the >> ASCII hex to binary data. Except for the first line of input data, which >> contains all possible 8-bit values sequentially, the input data was >> generated using a random length (between 111 and 520 bytes) buffer filled >> with random 8-bit data, which should give adequate coverage. >> >> 3) The original test did not test that the decoder detects illegal Base64 >> bytes. This change chooses a random location in the encoded data to >> corrupt with a randomly-chosen byte which is illegal for the specific >> Base64 encoding that is chosen (i.e. standard or URLsafe). It then calls >> the decode function to verify that the illegal byte is detected and the >> proper exception is thrown. >> >> 4) The test iteration count was originally 100K, but that is far more than >> enough iterations to test the intrinsic. It takes 20K iterations on each >> intrinsic for HotSpot C2 to begin calling it. The test originally had >> three types of encodings to test and called the encode intrinsic four times >> for each iteration, which works out to 100K * 3 * 4 = 1.2M calls just to >> encode. Decode was called four times as well (now five because of the >> illegal byte test). I believe this is excessive and with the extra test >> data I have added, the test was timing out after ten minutes of execution.
>> It appears that it is timing out, not because the intrinsics take a long >> time to run, but because test0 generates an enormous number of discarded >> data buffers for the GC system to recover (the test runs at about 39GB of >> virtual memory on my test machine). To remedy the timeout problem, I have >> changed the code so that a warmup function of 20K repetitions is performed >> on a fixed buffer, to activate the intrinsic(s). After the warmup, I have >> reduced the number of iterations to 5K on each test0 call. This should >> give adequate coverage. >> - Add JMH benchmark for Base64 variable length buffer decoding >> - Add Power9+ intrinsic implementation for Base64 decoding >> - Add HotSpot code to implement Base64 decodeBlock API >> - Add HotSpotIntrinsicCandidate and API for Base64 decoding > > Core Libs changes and the test look fine. 8248188: Add IntrinsicCandidate and API for Base64 decoding, add Power64LE intrinsic implementation. This patch set encompasses the following commits: Adds a new intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for the intrinsic. The API is similar to the existing encodeBlock intrinsic. Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic implementation. Adds a Power64LE-specific implementation of the decodeBlock intrinsic. Adds a JMH microbenchmark for both Base64 encoding and decoding. Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding.
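As an illustration of test changes 2) and 3) in the quoted description, the ASCII-hex translation and the illegal-byte check might look roughly like the sketch below. It is not the code in TestBase64.java: the class name `Base64TestSketch`, the helpers `hexToBytes` and `illegalByteIsDetected`, and the fixed choice of corrupting byte (`-` for the standard alphabet, `+` for the URL-safe one, where the description picks an illegal byte at random) are assumptions made for the example. It uses only the public `java.util.Base64` API, whose decoders throw `IllegalArgumentException` on input that is not valid for the chosen scheme.

```java
import java.util.Base64;
import java.util.Random;

// Hypothetical illustration; class and method names are not from the actual test.
final class Base64TestSketch {

    // Translate one line of ASCII hex (two hex characters per 8-bit byte) to binary.
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex characters");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("non-hex character near index " + 2 * i);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    // Encode, overwrite one randomly chosen encoded byte with a byte that is
    // illegal for the chosen alphabet, and report whether decoding rejects it.
    // The input must be non-empty so that there is a byte to corrupt.
    static boolean illegalByteIsDetected(byte[] plain, boolean urlSafe, Random rnd) {
        if (plain.length == 0) {
            throw new IllegalArgumentException("need at least one input byte");
        }
        Base64.Encoder enc = urlSafe ? Base64.getUrlEncoder() : Base64.getEncoder();
        Base64.Decoder dec = urlSafe ? Base64.getUrlDecoder() : Base64.getDecoder();
        byte[] encoded = enc.encode(plain);
        // '-' is outside the standard alphabet; '+' is outside the URL-safe one.
        encoded[rnd.nextInt(encoded.length)] = (byte) (urlSafe ? '+' : '-');
        try {
            dec.decode(encoded);
            return false; // corruption went unnoticed
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }
}
```

A caller would typically feed each hex line through `hexToBytes` and then run `illegalByteIsDetected` once per encoding under test, failing the test when it returns false.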
------------- PR: https://git.openjdk.java.net/jdk/pull/293 From rriggs at openjdk.java.net Mon Oct 5 22:14:40 2020 From: rriggs at openjdk.java.net (Roger Riggs) Date: Mon, 5 Oct 2020 22:14:40 GMT Subject: RFR: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR [v3] In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 21:27:00 GMT, Igor Ignatyev wrote: >> Can you take a look at the existing jdk.test.lib.RandomFactory and see if we can avoid a separate random number >> generator and property? > >> Can you take a look at the existing jdk.test.lib.RandomFactory and see if we can avoid a separate random number >> generator and property? > > I am not adding a new separate random number generator or a new property, `jdk.test.lib.Utils.getRandomInstance` has > been around for a long time and is used by lots of tests (most of them are in `/test/hotspot/jtreg`). merging > `Utils.getRandomInstance` and `RandomFactory` (or rather replacing one w/ another) is tracked by > [JDK-8212077](https://bugs.openjdk.java.net/browse/JDK-8212077). I was thinking that the code added to Utils.SEED initialization could just have easily been added to RandomFactory.getSystemSeed() and not have to be changed later. ------------- PR: https://git.openjdk.java.net/jdk/pull/391 From iignatyev at openjdk.java.net Mon Oct 5 23:33:40 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Mon, 5 Oct 2020 23:33:40 GMT Subject: RFR: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR [v3] In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 22:12:21 GMT, Roger Riggs wrote: >>> Can you take a look at the existing jdk.test.lib.RandomFactory and see if we can avoid a separate random number >>> generator and property? >> >> I am not adding a new separate random number generator or a new property, `jdk.test.lib.Utils.getRandomInstance` has >> been around for a long time and is used by lots of tests (most of them are in `/test/hotspot/jtreg`). 
merging >> `Utils.getRandomInstance` and `RandomFactory` (or rather replacing one w/ another) is tracked by >> [JDK-8212077](https://bugs.openjdk.java.net/browse/JDK-8212077). > > I was thinking that the code added to Utils.SEED initialization could just as easily have been added to > RandomFactory.getSystemSeed() and not have to be changed later. well, `RandomFactory::getSystemSeed` reads `seed` property, while `Utils.SEED` reads `jdk.test.lib.random.seed`, and there are tests which use these properties to specify seeds, I'd prefer to leave changing these tests to [8212077](https://bugs.openjdk.java.net/browse/JDK-8212077). o/c we can add a new method, e.g. `RandomFactory::getBuildStableSeed`, or refactor `RandomFactory::getSystemSeed` to take property name as an argument, but I don't think it's really worth it as 8212077 is starting to slowly bubble up in my working queue. ------------- PR: https://git.openjdk.java.net/jdk/pull/391 From minqi at openjdk.java.net Tue Oct 6 00:45:53 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Tue, 6 Oct 2020 00:45:53 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v9] In-Reply-To: References: Message-ID: <5zY0JGVR6RGcAnTicYrgLIW1NL0f5-tkR1MMDjNYBTs=.0087b87b-b1f6-497a-8f4b-927246ee71cc@github.com> > This patch is reorganized after 8252725, which is separated from this patch to refactor jlink plugin code. The previous > webrev with hg can be found at: http://cr.openjdk.java.net/~minqi/2020/8247536/webrev-05. With 8252725 integrated, the > regeneration of holder classes is simply to call the newly added GenerateJLIClassesHelper.cdsGenerateHolderClasses > function. Tests: tier1-4 Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: Moved and renamed cdsGenerateHolderClasses from GenerateJLIClassesHelper to CDS as generateLambdaFormHolderClasses. Added input verification function in CDS before class generation.
Added more test scenarios. Removed trailing unused ending words for output of lambda form trace line in case of DumpLoadedClassList. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/193/files - new: https://git.openjdk.java.net/jdk/pull/193/files/125112b3..52764a6e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=193&range=08 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=193&range=07-08 Stats: 290 lines in 10 files changed: 143 ins; 100 del; 47 mod Patch: https://git.openjdk.java.net/jdk/pull/193.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/193/head:pull/193 PR: https://git.openjdk.java.net/jdk/pull/193 From david.holmes at oracle.com Tue Oct 6 01:28:02 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 6 Oct 2020 11:28:02 +1000 Subject: RFR: 8253899: Make IsClassUnloadingEnabled signature match specification In-Reply-To: References: <5UPvhVjwK3TOKrd2rSzeCz-xT0tXsCUkMT9R8tgFR8I=.f1723d22-6883-4ad3-af55-a7437ef905de@github.com> Message-ID: On 5/10/2020 10:47 pm, Vladimir Kempik wrote: > On Fri, 2 Oct 2020 15:26:30 GMT, Vladimir Kempik wrote: > >>>> Okay but look at the example that documentation gives: >>>> >>>>> For example, if the jvmtiParamInfo returned by GetExtensionEvents indicates that there is a jint parameter, the event >>>>> handler should be declared: ``` >>>>> void JNICALL myHandler(jvmtiEnv* jvmti_env, jint myInt, ...) >>>>> ``` >>>> >>>> The myInt is explicit, just as our "jboolean* enabled" is explicit. I think they key point is that the signature must >>>> end with "..." which it does. >>>> I don't see anything here that needs to be fixed. >>> >>> Hello David. On majority of platforms this would be fine. >>> >>> But on some platforms, variadic arguments and non variadic arguments are passed differently ( for example on >>> macos-aarch64, variadic args are passed always on stack, non variadic on registers (and on stack for 9th+ arg) , that >>> causes issues. 
If you still see no issues here we can delay and make this changeset part of JEP-391.
>>> But since this changeset isn't much macos-aarch64 specific, I thought it would be good to integrate it separately from
>>> jep-391.
>>> Regards, Vladimir
>>
>>> _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on
>>> [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_
>>> Hi Vladimir,
>>>
>>> On 2/10/2020 5:37 pm, Vladimir Kempik wrote:
>>>
>>>> On Fri, 2 Oct 2020 07:27:17 GMT, David Holmes wrote:
>>>>> Okay but look at the example that documentation gives:
>>>>>> For example, if the jvmtiParamInfo returned by GetExtensionEvents indicates that there is a jint parameter, the event
>>>>>> handler should be declared: ```
>>>>>> void JNICALL myHandler(jvmtiEnv* jvmti_env, jint myInt, ...)
>>>>>> ```
>>>>>
>>>>>
>>>>> The myInt is explicit, just as our "jboolean* enabled" is explicit. I think the key point is that the signature must
>>>>> end with "..." which it does.
>>>>> I don't see anything here that needs to be fixed.
>>>>
>>>>
>>>> Hello David. On the majority of platforms this would be fine.
>>>> But on some platforms, variadic arguments and non-variadic arguments are passed differently (for example, on
>>>> macos-aarch64 variadic args are always passed on the stack, non-variadic ones in registers (and on the stack for the 9th+ arg)), and that
>>>> causes issues.
>>>
>>> Okay - I see the potential for a problem here but ...
>>>
>>>> If you still see no issues here we can delay and make this changeset part of JEP-391.
>>>> But since this changeset isn't much macos-aarch64 specific, I thought it would be good to integrate it separately from
>>>> jep-391.
>>>
>>> ... this change actually goes against the example in the spec, so if you
>>> make this change it indicates the spec needs to be updated too.
>>> >>> Cheers, >>> David >>> ----- >> >> Hello David >> >> I really believe the problem is in document here ( in examples) >> first, the doc clearly specify the type >> >> typedef jvmtiError (JNICALL *jvmtiExtensionFunction) >> (jvmtiEnv* jvmti_env, >> ...); >> >> then in examples it declares the function not matching this spec. >> >> Is it a good idea to update the docs in a separate bug ? >> >> Thanks, Vladimir > > Hello David > I have created CSR draft > https://bugs.openjdk.java.net/browse/JDK-8254014 Thanks. I have updated and reviewed the CSR request. David > Regards, Vladimir > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/466 > From dholmes at openjdk.java.net Tue Oct 6 02:02:45 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 6 Oct 2020 02:02:45 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port [v2] In-Reply-To: References: <6jqlCPXe69fPRvYFrytJsECkaa9tJ1hYWISNgyPP4Eg=.40944ef5-93b0-4db4-948b-80bb7898e9e8@github.com> Message-ID: On Fri, 18 Sep 2020 10:56:56 GMT, Aleksei Voitylov wrote: >> thank you Alan, Erik, and David! When the JEP becomes Targeted, I'll use this PR to integrate the changes. > > I added the contributors that could be found in the portola project commits. If anyone knows some other contributors I > missed, I'll be happy to stand corrected. @voitylov For future reference please don't force-push commits on open PRs as it breaks the commit history. I can no longer just look at the two most recent commits and see what they added relative to what I had previously reviewed. Thanks. 
------------- PR: https://git.openjdk.java.net/jdk/pull/49 From dholmes at openjdk.java.net Tue Oct 6 02:28:51 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 6 Oct 2020 02:28:51 GMT Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v10] In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 11:43:52 GMT, Erik ?sterlund wrote: >> This PR the implementation of "JEP 376: ZGC: Concurrent Thread-Stack Processing" (cf. >> https://openjdk.java.net/jeps/376). >> Basically, this patch modifies the epilog safepoint when returning from a frame (supporting interpreter frames, c1, c2, >> and native wrapper frames), to compare the stack pointer against a thread-local value. This turns return polls into >> more of a swiss army knife that can be used to poll for safepoints, handshakes, but also returns into not yet safe to >> expose frames, denoted by a "stack watermark". ZGC will leave frames (and other thread oops) in a state of a mess in >> the GC checkpoint safepoints, rather than processing all threads and their stacks. Processing is initialized >> automagically when threads wake up for a safepoint, or get poked by a handshake or safepoint. Said initialization >> processes a few (3) frames and other thread oops. The rest - the bulk of the frame processing, is deferred until it is >> actually needed. It is needed when a frame is exposed to either 1) execution (returns or unwinding due to exception >> handling), or 2) stack walker APIs. A hook is then run to go and finish the lazy processing of frames. Mutator and GC >> threads can compete for processing. The processing is therefore performed under a per-thread lock. Note that disarming >> of the poll word (that the returns are comparing against) is only performed by the thread itself. So sliding the >> watermark up will require one runtime call for a thread to note that nothing needs to be done, and then update the poll >> word accordingly. 
Downgrading the poll word concurrently by other threads was simply not worth the complexity it >> brought (and is only possible on TSO machines). So left that one out. > > Erik ?sterlund has updated the pull request with a new target base due to a merge or a rebase. The pull request now > contains 16 commits: > - Review: Deal with new assert from mainline > - Merge branch 'master' into 8253180_conc_stack_scanning > - Review: StackWalker hook > - Review: Kim CR 1 and exception handling fix > - Review: Move barrier detach > - Review: Remove assert that has outstayed its welcome > - Merge branch 'master' into 8253180_conc_stack_scanning > - Review: Albert CR2 and defensive programming > - Review: StefanK CR 3 > - Review: Per CR 1 > - ... and 6 more: https://git.openjdk.java.net/jdk/compare/9604ee82...e633cb94 src/hotspot/share/runtime/safepointMechanism.cpp line 89: > 87: // > 88: // The call has been carefully placed here to cater for a few situations: > 89: // 1) After we exit from block after a global pool Typo: pool -> poll ------------- PR: https://git.openjdk.java.net/jdk/pull/296 From dholmes at openjdk.java.net Tue Oct 6 02:42:49 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 6 Oct 2020 02:42:49 GMT Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v10] In-Reply-To: References: Message-ID: <5ssr6WzpUH1emI5RNWd7rBRlhHkFA3EKh_yhnI-XALo=.c3537a8c-669e-44ff-8335-40cca1e36aa9@github.com> On Mon, 5 Oct 2020 11:43:52 GMT, Erik ?sterlund wrote: >> This PR the implementation of "JEP 376: ZGC: Concurrent Thread-Stack Processing" (cf. >> https://openjdk.java.net/jeps/376). >> Basically, this patch modifies the epilog safepoint when returning from a frame (supporting interpreter frames, c1, c2, >> and native wrapper frames), to compare the stack pointer against a thread-local value. 
This turns return polls into >> more of a swiss army knife that can be used to poll for safepoints, handshakes, but also returns into not yet safe to >> expose frames, denoted by a "stack watermark". ZGC will leave frames (and other thread oops) in a state of a mess in >> the GC checkpoint safepoints, rather than processing all threads and their stacks. Processing is initialized >> automagically when threads wake up for a safepoint, or get poked by a handshake or safepoint. Said initialization >> processes a few (3) frames and other thread oops. The rest - the bulk of the frame processing, is deferred until it is >> actually needed. It is needed when a frame is exposed to either 1) execution (returns or unwinding due to exception >> handling), or 2) stack walker APIs. A hook is then run to go and finish the lazy processing of frames. Mutator and GC >> threads can compete for processing. The processing is therefore performed under a per-thread lock. Note that disarming >> of the poll word (that the returns are comparing against) is only performed by the thread itself. So sliding the >> watermark up will require one runtime call for a thread to note that nothing needs to be done, and then update the poll >> word accordingly. Downgrading the poll word concurrently by other threads was simply not worth the complexity it >> brought (and is only possible on TSO machines). So left that one out. > > Erik ?sterlund has updated the pull request with a new target base due to a merge or a rebase. The pull request now > contains 16 commits: > - Review: Deal with new assert from mainline > - Merge branch 'master' into 8253180_conc_stack_scanning > - Review: StackWalker hook > - Review: Kim CR 1 and exception handling fix > - Review: Move barrier detach > - Review: Remove assert that has outstayed its welcome > - Merge branch 'master' into 8253180_conc_stack_scanning > - Review: Albert CR2 and defensive programming > - Review: StefanK CR 3 > - Review: Per CR 1 > - ... 
and 6 more: https://git.openjdk.java.net/jdk/compare/9604ee82...e633cb94 src/hotspot/share/runtime/stackWatermark.cpp line 223: > 221: void StackWatermark::yield_processing() { > 222: update_watermark(); > 223: MutexUnlocker mul(&_lock, Mutex::_no_safepoint_check_flag); This seems a little dubious - is it just a heuristic? There is no guarantee that unlocking the Mutex will allow another thread to claim it before this thread re-locks it. src/hotspot/share/runtime/stackWatermark.hpp line 91: > 89: JavaThread* _jt; > 90: StackWatermarkFramesIterator* _iterator; > 91: Mutex _lock; How are you guaranteeing that the Mutex is unused at the time the StackWatermark is deleted? ------------- PR: https://git.openjdk.java.net/jdk/pull/296 From dholmes at openjdk.java.net Tue Oct 6 02:59:45 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 6 Oct 2020 02:59:45 GMT Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v8] In-Reply-To: References: <4sawJHiIuc7oH5ETjrwJtJE3gkB1U2VBMVJdPmxJrg4=.e4e9b4d3-a118-4870-9b5b-f23b351093e2@github.com> Message-ID: On Thu, 1 Oct 2020 10:12:54 GMT, Erik ?sterlund wrote: >>> _Mailing list message from [Kim Barrett](mailto:kim.barrett at oracle.com) on >>> [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ >>> I've only looked at scattered pieces, but what I've looked at seemed to be >>> in good shape. Only a few minor comments. >>> >>> ------------------------------------------------------------------------------ >>> src/hotspot/share/runtime/frame.cpp >>> 456 // for(StackFrameStream fst(thread); !fst.is_done(); fst.next()) { >>> >>> Needs to be updated for the new constructor arguments. Just in general, the >>> class documentation seems to need some updating for this change. >> >> Fixed. 
>> >>> ------------------------------------------------------------------------------ >>> src/hotspot/share/runtime/frame.cpp >>> 466 StackFrameStream(JavaThread *thread, bool update, bool process_frames); >>> >>> Something to consider is that bool parameters like that, especially when >>> there are multiple, are error prone. An alternative is distinct enums, which >>> likely also obviates the need for comments in calls. >> >> Coleen also had the same comment, and we agreed to file a follow-up RFE to clean that up. This also applies to all the >> existing parameters passed in. So I would still like to do that, but in a follow-up RFE. Said RFE will also make the >> parameter use in RegisterMap and vframeStream explicit. >>> ------------------------------------------------------------------------------ >>> src/hotspot/share/runtime/thread.hpp >>> 956 void set_processed_thread(Thread *thread) { _processed_thread = thread; } >>> >>> I think this should assert that either _processed_thread or thread are NULL. >>> Or maybe the RememberProcessedThread constructor should be asserting that >>> _cur_thr->processed_thread() is NULL. >> >> Fixed. >> >>> ------------------------------------------------------------------------------ >> >> Thanks for the review Kim! > > In my last PR update, I included a fix to an exception handling problem that I encountered after lots of stress testing > that I have been running for a while now. I managed to catch the issue, get a reliable reproducer, and fix it. > The root problem is that the hook I had placed in SharedRuntime::exception_handler_for_return_address has been ignored. > The reason is that the stack is not walkable at this point. The hook then just ignores it. This had some unexpected > consequences. After looking closer at this code, I found that if we did have a walkable stack when we call > SharedRuntime::raw_exception_handler_for_return_address, that would have been the only hook we need at all for > exception handling. 
It is always the common root point where we unwind into a caller frame due to an exception being thrown
> into the caller, and we need to look up the rethrow handler of the caller. However, we are indeed not walkable here. To
> deal with this, I have rearranged the exception hooks a bit. First of all, I have deleted all before_unwind hooks for
> exception handling, because they should not be needed if the after_unwind hook is reliably called on the caller side
> instead. And those hooks do indeed need to be there, because we do not always have a point where we can call
> before_unwind (e.g. C1 unwind exception code, that just unwinds and looks up the rethrow handler via
> SharedRuntime::exception_handler_for_return_address). I have then traced all paths from
> SharedRuntime::raw_exception_handler_for_return_address into runtime rethrow handlers called, for each rethrow
> exception handler PC exposed in the function. They are:
> * OptoRuntime::rethrow_C when unwinding into C2 code
> * exception_handler_for_pc_helper via Runtime1::handle_exception_from_callee_id when unwinding into C1 code
> * JavaCallWrapper::~JavaCallWrapper when unwinding into a Java call stub
> * InterpreterRuntime::exception_handler_for_exception when unwinding into an interpreted method
> * Deoptimization::fetch_unroll_info (with exec_mode == Unpack_exception) when unwinding into a deoptimized nmethod
>
> Each rethrow handler returned has a corresponding comment saying which runtime rethrow handler it will end up
> in, once the stack has become walkable and we have transferred control into the caller. And all of those runtime hooks
> now have an after_unwind() hook. The good news is that now the responsibility for who calls the unwind hook for
> exception is clearer: it is never done by the callee, and always done by the caller, in its rethrow handler, at which
> point the stack is walkable.
In order to avoid further issues where an unwind hook is ignored, I have changed them to > assert that there is a last_Java_frame present. Previously I did not assert that, because there was shared code between > runtime native transitions and the native wrapper, that called an unwind hook. This prevented the unwind hooks to > assert this, as compiler threads would still perform native transitions, that just ignored the request. I moved the > problematic hook for native up one level in the hierarchy to a path where it is only called by native wrappers (where > we always have a last_Java_frame), so that I can finally assert that the unwind hooks always are called at points where > we have a last_Java_frame. This makes me feel confident that I do not have another hook that is being accidentally > ignored. However, the relationship for the various exception handling code executed in the caller, the callee, and > between the two (before we are walkable) is rather complicated. So it would be good to have someone that knows the > exception code very well have a look at this, to make sure I have not missed anything. I have rerun all testing and > done a load of stress testing to sanity check this. The reproducer I eventually found that reproduced the issue with > 100% success rate, was run many times with the new patch, and no longer reproduces any issue. Hi Erik, Can you give an overview of the use of the "poll word" and its relation to the "poll page" please? Thanks, David ------------- PR: https://git.openjdk.java.net/jdk/pull/296 From dholmes at openjdk.java.net Tue Oct 6 02:59:48 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 6 Oct 2020 02:59:48 GMT Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v10] In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 11:43:52 GMT, Erik ?sterlund wrote: >> This PR the implementation of "JEP 376: ZGC: Concurrent Thread-Stack Processing" (cf. 
>> https://openjdk.java.net/jeps/376). >> Basically, this patch modifies the epilog safepoint when returning from a frame (supporting interpreter frames, c1, c2, >> and native wrapper frames), to compare the stack pointer against a thread-local value. This turns return polls into >> more of a swiss army knife that can be used to poll for safepoints, handshakes, but also returns into not yet safe to >> expose frames, denoted by a "stack watermark". ZGC will leave frames (and other thread oops) in a state of a mess in >> the GC checkpoint safepoints, rather than processing all threads and their stacks. Processing is initialized >> automagically when threads wake up for a safepoint, or get poked by a handshake or safepoint. Said initialization >> processes a few (3) frames and other thread oops. The rest - the bulk of the frame processing, is deferred until it is >> actually needed. It is needed when a frame is exposed to either 1) execution (returns or unwinding due to exception >> handling), or 2) stack walker APIs. A hook is then run to go and finish the lazy processing of frames. Mutator and GC >> threads can compete for processing. The processing is therefore performed under a per-thread lock. Note that disarming >> of the poll word (that the returns are comparing against) is only performed by the thread itself. So sliding the >> watermark up will require one runtime call for a thread to note that nothing needs to be done, and then update the poll >> word accordingly. Downgrading the poll word concurrently by other threads was simply not worth the complexity it >> brought (and is only possible on TSO machines). So left that one out. > > Erik ?sterlund has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now > contains 16 commits: > - Review: Deal with new assert from mainline > - Merge branch 'master' into 8253180_conc_stack_scanning > - Review: StackWalker hook > - Review: Kim CR 1 and exception handling fix > - Review: Move barrier detach > - Review: Remove assert that has outstayed its welcome > - Merge branch 'master' into 8253180_conc_stack_scanning > - Review: Albert CR2 and defensive programming > - Review: StefanK CR 3 > - Review: Per CR 1 > - ... and 6 more: https://git.openjdk.java.net/jdk/compare/9604ee82...e633cb94 src/hotspot/share/runtime/safepointMechanism.cpp line 101: > 99: uintptr_t SafepointMechanism::compute_poll_word(bool armed, uintptr_t stack_watermark) { > 100: if (armed) { > 101: log_debug(stackbarrier)("Computed armed at %d", Thread::current()->osthread()->thread_id()); s/at/for/ ? ------------- PR: https://git.openjdk.java.net/jdk/pull/296 From thartmann at openjdk.java.net Tue Oct 6 05:51:43 2020 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Tue, 6 Oct 2020 05:51:43 GMT Subject: RFR: 8254010: GrowableArrayView::print fails to compile In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 10:52:59 GMT, Stefan Karlsson wrote: >> When adding some debugging code, I've noticed that the (currently unused) GrowableArrayView::print fails to compile. >> The fix is to use the `%d` format specifier for the int fields `_len` and `_max` and cast `this` to `intptr_t`. >> Thanks, >> Tobias > > Marked as reviewed by stefank (Reviewer). @stefank, thanks for the review! 
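
The format-specifier fix quoted above can be illustrated with a small stand-alone sketch. It is not the HotSpot code: `ArrayViewSketch` and `header()` are hypothetical names, the message text is made up, and the standard `PRIxPTR` macro with a `uintptr_t` cast stands in for HotSpot's own pointer-formatting macros and the `intptr_t` cast mentioned in the PR. The point is only that `int` fields pair with `%d`, and that an address should be converted to an integer type that matches its specifier.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a growable-array view; not the HotSpot class.
struct ArrayViewSketch {
  int _len;  // number of elements in use
  int _max;  // capacity

  // Build a header line. "%d" matches the int fields; the address is
  // converted to uintptr_t so that it matches the PRIxPTR specifier.
  std::string header() const {
    char buf[96];
    std::snprintf(buf, sizeof(buf),
                  "Growable Array 0x%" PRIxPTR " length %d (max %d)",
                  reinterpret_cast<std::uintptr_t>(this), _len, _max);
    return std::string(buf);
  }
};
```

Mismatched specifiers of this kind typically go unnoticed in unused code until something calls it with format checking enabled, which matches how the problem was found here.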
------------- PR: https://git.openjdk.java.net/jdk/pull/502 From thartmann at openjdk.java.net Tue Oct 6 05:51:44 2020 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Tue, 6 Oct 2020 05:51:44 GMT Subject: Integrated: 8254010: GrowableArrayView::print fails to compile In-Reply-To: References: Message-ID: <-gh9tabP8mYiIbgyV5-CScRMk17IEksrsXhDfHUTuLQ=.ff81fb85-fb23-42af-bd98-7e738f29bdb9@github.com> On Mon, 5 Oct 2020 10:04:17 GMT, Tobias Hartmann wrote: > When adding some debugging code, I've noticed that the (currently unused) GrowableArrayView::print fails to compile. > The fix is to use the `%d` format specifier for the int fields `_len` and `_max` and cast `this` to `intptr_t`. > Thanks, > Tobias This pull request has now been integrated. Changeset: 17285472 Author: Tobias Hartmann URL: https://git.openjdk.java.net/jdk/commit/17285472 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod 8254010: GrowableArrayView::print fails to compile Reviewed-by: stefank ------------- PR: https://git.openjdk.java.net/jdk/pull/502 From eosterlund at openjdk.java.net Tue Oct 6 07:22:58 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 6 Oct 2020 07:22:58 GMT Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v11] In-Reply-To: References: Message-ID: > This PR the implementation of "JEP 376: ZGC: Concurrent Thread-Stack Processing" (cf. > https://openjdk.java.net/jeps/376). > Basically, this patch modifies the epilog safepoint when returning from a frame (supporting interpreter frames, c1, c2, > and native wrapper frames), to compare the stack pointer against a thread-local value. This turns return polls into > more of a swiss army knife that can be used to poll for safepoints, handshakes, but also returns into not yet safe to > expose frames, denoted by a "stack watermark". 
ZGC will leave frames (and other thread oops) in a state of a mess in > the GC checkpoint safepoints, rather than processing all threads and their stacks. Processing is initialized > automagically when threads wake up for a safepoint, or get poked by a handshake or safepoint. Said initialization > processes a few (3) frames and other thread oops. The rest - the bulk of the frame processing, is deferred until it is > actually needed. It is needed when a frame is exposed to either 1) execution (returns or unwinding due to exception > handling), or 2) stack walker APIs. A hook is then run to go and finish the lazy processing of frames. Mutator and GC > threads can compete for processing. The processing is therefore performed under a per-thread lock. Note that disarming > of the poll word (that the returns are comparing against) is only performed by the thread itself. So sliding the > watermark up will require one runtime call for a thread to note that nothing needs to be done, and then update the poll > word accordingly. Downgrading the poll word concurrently by other threads was simply not worth the complexity it > brought (and is only possible on TSO machines). So left that one out. 
Erik Österlund has updated the pull request incrementally with one additional commit since the last revision:

  Review: David CR 1

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/296/files
  - new: https://git.openjdk.java.net/jdk/pull/296/files/e633cb94..2816b76b

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=296&range=10
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=296&range=09-10

  Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/296.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/296/head:pull/296

PR: https://git.openjdk.java.net/jdk/pull/296

From eosterlund at openjdk.java.net Tue Oct 6 07:23:02 2020
From: eosterlund at openjdk.java.net (Erik Österlund)
Date: Tue, 6 Oct 2020 07:23:02 GMT
Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v10]
In-Reply-To:
References:
Message-ID:

On Tue, 6 Oct 2020 02:26:16 GMT, David Holmes wrote:

>> Erik Österlund has updated the pull request with a new target base due to a merge or a rebase. The pull request now
>> contains 16 commits:
>> - Review: Deal with new assert from mainline
>> - Merge branch 'master' into 8253180_conc_stack_scanning
>> - Review: StackWalker hook
>> - Review: Kim CR 1 and exception handling fix
>> - Review: Move barrier detach
>> - Review: Remove assert that has outstayed its welcome
>> - Merge branch 'master' into 8253180_conc_stack_scanning
>> - Review: Albert CR2 and defensive programming
>> - Review: StefanK CR 3
>> - Review: Per CR 1
>> - ... and 6 more: https://git.openjdk.java.net/jdk/compare/9604ee82...e633cb94
>
> src/hotspot/share/runtime/safepointMechanism.cpp line 89:
>
>> 87: //
>> 88: // The call has been carefully placed here to cater for a few situations:
>> 89: // 1) After we exit from block after a global pool
>
> Typo: pool -> poll

Fixed.

> src/hotspot/share/runtime/stackWatermark.cpp line 223:
>
>> 221: void StackWatermark::yield_processing() {
>> 222: update_watermark();
>> 223: MutexUnlocker mul(&_lock, Mutex::_no_safepoint_check_flag);
>
> This seems a little dubious - is it just a heuristic? There is no guarantee that unlocking the Mutex will allow another
> thread to claim it before this thread re-locks it.

It is indeed just a heuristic. There is no need for a guarantee.

-------------

PR: https://git.openjdk.java.net/jdk/pull/296

From eosterlund at openjdk.java.net Tue Oct 6 07:26:48 2020
From: eosterlund at openjdk.java.net (Erik Österlund)
Date: Tue, 6 Oct 2020 07:26:48 GMT
Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v10]
In-Reply-To: <5ssr6WzpUH1emI5RNWd7rBRlhHkFA3EKh_yhnI-XALo=.c3537a8c-669e-44ff-8335-40cca1e36aa9@github.com>
References: <5ssr6WzpUH1emI5RNWd7rBRlhHkFA3EKh_yhnI-XALo=.c3537a8c-669e-44ff-8335-40cca1e36aa9@github.com>
Message-ID:

On Tue, 6 Oct 2020 02:40:12 GMT, David Holmes wrote:

>> Erik Österlund has updated the pull request with a new target base due to a merge or a rebase. The pull request now
>> contains 16 commits:
>> - Review: Deal with new assert from mainline
>> - Merge branch 'master' into 8253180_conc_stack_scanning
>> - Review: StackWalker hook
>> - Review: Kim CR 1 and exception handling fix
>> - Review: Move barrier detach
>> - Review: Remove assert that has outstayed its welcome
>> - Merge branch 'master' into 8253180_conc_stack_scanning
>> - Review: Albert CR2 and defensive programming
>> - Review: StefanK CR 3
>> - Review: Per CR 1
>> - ... and 6 more: https://git.openjdk.java.net/jdk/compare/9604ee82...e633cb94
>
> src/hotspot/share/runtime/stackWatermark.hpp line 91:
>
>> 89: JavaThread* _jt;
>> 90: StackWatermarkFramesIterator* _iterator;
>> 91: Mutex _lock;
>
> How are you guaranteeing that the Mutex is unused at the time the StackWatermark is deleted?

The StackWatermarks are deleted when the thread is deleted (and its destructor runs). Hence, I'm relying on the Threads SMR project here. Anyone that pokes around at the StackWatermark is either the current thread, or a thread that has a ThreadsListHandle containing the thread, making it safe to access that thread without racing against its deletion.

> src/hotspot/share/runtime/safepointMechanism.cpp line 101:
>
>> 99: uintptr_t SafepointMechanism::compute_poll_word(bool armed, uintptr_t stack_watermark) {
>> 100: if (armed) {
>> 101: log_debug(stackbarrier)("Computed armed at %d", Thread::current()->osthread()->thread_id());
>
> s/at/for/ ?

Fixed.

-------------

PR: https://git.openjdk.java.net/jdk/pull/296

From eosterlund at openjdk.java.net Tue Oct 6 07:37:45 2020
From: eosterlund at openjdk.java.net (Erik Österlund)
Date: Tue, 6 Oct 2020 07:37:45 GMT
Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v8]
In-Reply-To:
References: <4sawJHiIuc7oH5ETjrwJtJE3gkB1U2VBMVJdPmxJrg4=.e4e9b4d3-a118-4870-9b5b-f23b351093e2@github.com>
Message-ID:

On Tue, 6 Oct 2020 02:57:00 GMT, David Holmes wrote:

> Hi Erik,
> Can you give an overview of the use of the "poll word" and its relation to the "poll page" please?
> Thanks,
> David

Hi David,

Thanks for reviewing this code. There are various polls in the VM. We have runtime transitions, interpreter transitions, transitions at returns, native wrappers, transitions in nmethods... and sometimes they are a bit different. The "poll word" encapsulates enough information to be able to poll for returns (stack watermark barrier), or poll for normal handshakes/safepoints, with a conditional branch. So really, we could use the "poll word" for every single poll. A low-order bit is a boolean saying whether a handshake/safepoint is armed, and the rest of the word denotes the watermark indicating which frames have armed returns.
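
To make the encoding described above concrete, here is a minimal sketch. It is not HotSpot's code: the names `kArmedBit`, `make_poll_word`, `safepoint_poll_fires` and `return_poll_fires` are made up for illustration, and the sketch assumes a downward-growing stack (callers at higher addresses) and a watermark that is at least 2-byte aligned, so that bit 0 is free for the flag. How the real `compute_poll_word` combines the two pieces of information may well differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical name: bit 0 of the poll word flags an armed safepoint/handshake.
constexpr std::uintptr_t kArmedBit = 1;

// Pack the armed flag and a watermark address into a single word.
// Assumption: the watermark is at least 2-byte aligned, so bit 0 is free.
inline std::uintptr_t make_poll_word(bool armed, std::uintptr_t stack_watermark) {
  assert((stack_watermark & kArmedBit) == 0);
  return stack_watermark | (armed ? kArmedBit : 0);
}

// Ordinary poll: a single bit test on the poll word.
inline bool safepoint_poll_fires(std::uintptr_t poll_word) {
  return (poll_word & kArmedBit) != 0;
}

// Return poll: fires when a safepoint/handshake is armed, or when the
// stack pointer after the return lies above the watermark, i.e. the
// caller frame has not been processed yet (downward-growing stack assumed).
inline bool return_poll_fires(std::uintptr_t sp_after_return, std::uintptr_t poll_word) {
  const std::uintptr_t watermark = poll_word & ~kArmedBit;
  return safepoint_poll_fires(poll_word) || sp_after_return > watermark;
}
```

With a disarmed word and a watermark of `0x7000`, a return to `sp = 0x6ff0` passes straight through while a return to `sp = 0x7010` takes the slow path; arming the low bit makes every poll fire regardless of the watermark.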
The "poll page" is for polls that do not use conditional branches, but instead use an indirect load. It is still used in nmethod loop polls, because I experimentally found it to perform worse with conditional branches on one machine, and did not want to risk regressions. It is also used for VM configurations that do not yet support stack watermark barriers, such as Graal, PPC, S390 and 32-bit platforms. They will hopefully eventually support this mechanism, but having the poll page allows a smoother transition. And until it is crystal clear that the conditional-branch loop poll really is fast enough on sufficiently many machines, we might keep it.

Hope this makes sense.

Thanks,

-------------

PR: https://git.openjdk.java.net/jdk/pull/296

From github.com+670087+jrziviani at openjdk.java.net Tue Oct 6 08:19:44 2020
From: github.com+670087+jrziviani at openjdk.java.net (Ziviani)
Date: Tue, 6 Oct 2020 08:19:44 GMT
Subject: Integrated: 8253565: PPC64: Fix duplicate if condition in vm_version_ppc.cpp
In-Reply-To:
References:
Message-ID:

On Thu, 24 Sep 2020 16:38:31 GMT, Ziviani wrote:

> This is a very small change. There are two `if (UseRTMLocking) {` conditions, one just after the other. This code simply
> merges them.
> https://bugs.openjdk.java.net/browse/JDK-8253565

This pull request has now been integrated.
Changeset: 91997838
Author: Jose Ricardo Ziviani
Committer: Martin Doerr
URL: https://git.openjdk.java.net/jdk/commit/91997838
Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod

8253565: PPC64: Fix duplicate if condition in vm_version_ppc.cpp

Reviewed-by: mdoerr

-------------

PR: https://git.openjdk.java.net/jdk/pull/338

From aph at redhat.com Tue Oct 6 10:02:14 2020
From: aph at redhat.com (Andrew Haley)
Date: Tue, 6 Oct 2020 11:02:14 +0100
Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v11]
In-Reply-To:
References:
Message-ID: <37965823-78b0-9cae-422f-42daf664f4fd@redhat.com>

On 06/10/2020 08:22, Erik Österlund wrote:

>> This PR is the implementation of "JEP 376: ZGC: Concurrent Thread-Stack Processing" (cf.
>> https://openjdk.java.net/jeps/376).

One small thing: the couple of uses of lea(InternalAddress) should really be adr; this generates much better code.

diff --git a/src/hotspot/cpu/aarch64/c1_CodeStubs_aarch64.cpp b/src/hotspot/cpu/aarch64/c1_CodeStubs_aarch64.cpp
index ce3c97d6746..119bc979e0a 100644
--- a/src/hotspot/cpu/aarch64/c1_CodeStubs_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/c1_CodeStubs_aarch64.cpp
@@ -41,7 +41,7 @@ void C1SafepointPollStub::emit_code(LIR_Assembler* ce) {
   __ bind(_entry);
   InternalAddress safepoint_pc(ce->masm()->pc() - ce->masm()->offset() + safepoint_offset());
-  __ lea(rscratch1, safepoint_pc);
+  __ adr(rscratch1, safepoint_pc);
   __ str(rscratch1, Address(rthread, JavaThread::saved_exception_pc_offset()));
   assert(SharedRuntime::polling_page_return_handler_blob() != NULL,
diff --git a/src/hotspot/cpu/aarch64/c2_safepointPollStubTable_aarch64.cpp b/src/hotspot/cpu/aarch64/c2_safepointPollStubTable_aarch64.cpp
index 1b627172e2d..fb36406fbde 100644
--- a/src/hotspot/cpu/aarch64/c2_safepointPollStubTable_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/c2_safepointPollStubTable_aarch64.cpp
@@ -39,7 +39,7 @@ void C2SafepointPollStubTable::emit_stub_impl(MacroAssembler& masm, C2SafepointP
   __ bind(entry->_stub_label);
   InternalAddress safepoint_pc(masm.pc() - masm.offset() + entry->_safepoint_offset);
-  __ lea(rscratch1, safepoint_pc);
+  __ adr(rscratch1, safepoint_pc);
   __ str(rscratch1, Address(rthread, JavaThread::saved_exception_pc_offset()));
   __ far_jump(callback_addr);
 }

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From rrich at openjdk.java.net Tue Oct 6 10:52:55 2020
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Tue, 6 Oct 2020 10:52:55 GMT
Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v6]
In-Reply-To:
References:
Message-ID:

> Hi,
>
> this is the continuation of the review of the implementation for:
>
> https://bugs.openjdk.java.net/browse/JDK-8227745
> https://bugs.openjdk.java.net/browse/JDK-8233915
>
> It allows for JIT optimizations based on escape analysis even if JVMTI agents acquire capabilities to access references
> to objects that are subject to such optimizations, e.g. scalar replacement. The implementation reverts such
> optimizations just before access very much as when switching from JIT compiled execution to the interpreter, aka
> "deoptimization". Webrev.8 was the last one before the transition to Git/Github:
>
> http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/
>
> Thanks, Richard.

Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits:

 - Merge branch 'master' into JDK-8227745
 - Merge branch 'master' into JDK-8227745
 - Make parameter current_thread of JvmtiEnvBase::check_top_frame() a JavaThread* again.
   With Asynchronous handshakes the type was changed from JavaThread* to Thread* but this is not necessary as
   check_top_frame() is not executed during a handshake / safepoint (robehn confirmed).
- Merge branch 'master' into JDK-8227745 - EATests.java: bugfix to prevent ObjectCollectedException - Better encapsulation of JvmtiDeferredUpdates. Moved jvmtiDeferredLocalVariableSet to jvmtiDeferredUpdates.hpp - EscapeBarrier: moved method comments. - Shuffled parameters of EscapeBarrier constructors to better match each other - Moved class EscapeBarrier and class JvmtiDeferredUpdates into dedicated files. - Merge branch 'master' into JDK-8227745 - ... and 2 more: https://git.openjdk.java.net/jdk/compare/17285472...1c586cfb ------------- Changes: https://git.openjdk.java.net/jdk/pull/119/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=119&range=05 Stats: 5788 lines in 52 files changed: 5568 ins; 116 del; 104 mod Patch: https://git.openjdk.java.net/jdk/pull/119.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/119/head:pull/119 PR: https://git.openjdk.java.net/jdk/pull/119 From eosterlund at openjdk.java.net Tue Oct 6 12:17:05 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 6 Oct 2020 12:17:05 GMT Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v12] In-Reply-To: References: Message-ID: > This PR the implementation of "JEP 376: ZGC: Concurrent Thread-Stack Processing" (cf. > https://openjdk.java.net/jeps/376). > Basically, this patch modifies the epilog safepoint when returning from a frame (supporting interpreter frames, c1, c2, > and native wrapper frames), to compare the stack pointer against a thread-local value. This turns return polls into > more of a swiss army knife that can be used to poll for safepoints, handshakes, but also returns into not yet safe to > expose frames, denoted by a "stack watermark". ZGC will leave frames (and other thread oops) in a state of a mess in > the GC checkpoint safepoints, rather than processing all threads and their stacks. 
Processing is initialized > automagically when threads wake up for a safepoint, or get poked by a handshake or safepoint. Said initialization > processes a few (3) frames and other thread oops. The rest, the bulk of the frame processing, is deferred until it is > actually needed. It is needed when a frame is exposed to either 1) execution (returns or unwinding due to exception > handling), or 2) stack walker APIs. A hook is then run to go and finish the lazy processing of frames. Mutator and GC > threads can compete for processing. The processing is therefore performed under a per-thread lock. Note that disarming > of the poll word (that the returns are comparing against) is only performed by the thread itself. So sliding the > watermark up will require one runtime call for a thread to note that nothing needs to be done, and then update the poll > word accordingly. Downgrading the poll word concurrently by other threads was simply not worth the complexity it > brought (and is only possible on TSO machines). So I left that one out.
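The return poll described above, a single compare of the stack pointer against a thread-local value, can be pictured with a small model. This is only a sketch under stated assumptions: the names (`ThreadModel`, `compute_poll_word`, `return_takes_slow_path`) and the exact encoding of the armed and disarmed values are hypothetical and chosen for illustration, not taken from the patch. The model also assumes a stack that grows toward lower addresses, so frames at addresses above the watermark are the ones not yet processed.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model only. Names and encoding are assumptions made for this
// sketch; they are not HotSpot declarations.
namespace poll_sketch {

using word_t = std::uintptr_t;

// Assumed encoding: a tiny value means "safepoint/handshake armed" (every real
// stack pointer compares above it); the complement means "nothing armed"
// (no stack pointer compares above it).
constexpr word_t kArmed    = 1;
constexpr word_t kDisarmed = ~kArmed;

struct ThreadModel {
  word_t poll_word = kDisarmed;  // thread-local value read by the return poll
  word_t watermark = 0;          // 0 means no stack watermark is active
  bool   armed     = false;      // a safepoint or handshake is pending
};

// Recompute the poll word from the thread's state. In this model only the
// owning thread would call it to disarm, mirroring the rule in the text.
inline word_t compute_poll_word(const ThreadModel& t) {
  // The watermark must be aligned so that the low-order bit stays free.
  assert((t.watermark & kArmed) == 0);
  if (t.armed)          return kArmed;
  if (t.watermark == 0) return kDisarmed;
  return t.watermark;
}

// Return poll: one unsigned compare of the caller's stack pointer against the
// poll word. True means the return must go through the slow path.
inline bool return_takes_slow_path(const ThreadModel& t, word_t caller_sp) {
  return caller_sp > t.poll_word;
}

}  // namespace poll_sketch
```

With this encoding one compare covers three cases: nothing pending (never slow), a pending safepoint or handshake (always slow), and an active watermark (slow only when returning into a frame above it).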
Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Review: Andrew CR 1 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/296/files - new: https://git.openjdk.java.net/jdk/pull/296/files/2816b76b..54fe1f8b Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=296&range=11 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=296&range=10-11 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/296.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/296/head:pull/296 PR: https://git.openjdk.java.net/jdk/pull/296 From coleenp at openjdk.java.net Tue Oct 6 12:17:59 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 6 Oct 2020 12:17:59 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp Message-ID: This change moves the significant amount of stack overflow related code (with ASCII art!) out of the thread files into a new file. Many of the functions are static, and some go through JavaThread::_stack_overflow_state where needed. All functions are moved and not modified except for qualification. I also added a delegating constructor to JavaThread::JavaThread, so I reordered the assignments from JavaThread::initialize as initializers. Tested with tier1-6 and builds on arm32, ppc, s390 and zero.
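The delegating-constructor change mentioned above can be pictured with a generic C++ sketch. The class `WorkerThread` and its fields are hypothetical and stand in for the general pattern only; they are not the JavaThread declarations from the patch. The idea shown is that one primary constructor initializes every field in its initializer list, and the other constructors delegate to it instead of calling a separate initialize function.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class for illustration; this is not JavaThread.
class WorkerThread {
 public:
  using EntryPoint = void (*)();

  // Primary constructor: every field gets its default in the initializer
  // list, written in declaration order.
  WorkerThread()
      : _entry_point(nullptr), _stack_size(0), _attaching(false) {}

  // Delegating constructor: reuse the primary one, then adjust one field.
  explicit WorkerThread(bool attaching) : WorkerThread() {
    _attaching = attaching;
  }

  // Delegating constructor for a thread with an entry point and stack size.
  WorkerThread(EntryPoint entry, std::size_t stack_size) : WorkerThread() {
    _entry_point = entry;
    _stack_size = stack_size;
  }

  EntryPoint entry_point() const { return _entry_point; }
  std::size_t stack_size() const { return _stack_size; }
  bool attaching() const { return _attaching; }

 private:
  EntryPoint  _entry_point;
  std::size_t _stack_size;
  bool        _attaching;
};
```

Keeping the defaults in one initializer list means a newly added field cannot be forgotten in one of the overloads, which is the usual motivation for this pattern.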
------------- Commit messages: - 8253717: Relocate stack overflow code out of thread.hpp/cpp - 8253717: Relocate stack overflow code out of thread.hpp/cpp Changes: https://git.openjdk.java.net/jdk/pull/522/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=522&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253717 Stats: 1265 lines in 50 files changed: 607 ins; 497 del; 161 mod Patch: https://git.openjdk.java.net/jdk/pull/522.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/522/head:pull/522 PR: https://git.openjdk.java.net/jdk/pull/522 From eosterlund at openjdk.java.net Tue Oct 6 12:21:51 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Tue, 6 Oct 2020 12:21:51 GMT Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v8] In-Reply-To: References: <4sawJHiIuc7oH5ETjrwJtJE3gkB1U2VBMVJdPmxJrg4=.e4e9b4d3-a118-4870-9b5b-f23b351093e2@github.com> Message-ID: On Tue, 6 Oct 2020 07:35:16 GMT, Erik Österlund wrote: >> Hi Erik, >> Can you give an overview of the use of the "poll word" and its relation to the "poll page" please? >> Thanks, >> David > >> Hi Erik, >> Can you give an overview of the use of the "poll word" and its relation to the "poll page" please? >> Thanks, >> David > > Hi David, > > Thanks for reviewing this code. > > There are various polls in the VM. We have runtime transitions, interpreter transitions, transitions at returns, native > wrappers, transitions in nmethods... and sometimes they are a bit different. > The "poll word" encapsulates enough information to be able to poll for returns (stack watermark barrier), or poll for > normal handshakes/safepoints, with a conditional branch. So really, we could use the "poll word" for every single poll. > A low order bit is a boolean saying if handshake/safepoint is armed, and the rest of the word denotes the watermark for > which frame has armed returns.
The "poll page" is for polls that do not use conditional branches, but instead uses an > indirect load. It is used still in nmethod loop polls, because I experimentally found it to perform worse with > conditional branches on one machine, and did not want to risk regressions. It is also used for VM configurations that > do not yet support stack watermark barriers, such as Graal, PPC, S390 and 32 bit platforms. They will hopefully > eventually support this mechanism, but having the poll page allows a more smooth transition. And unless it is crystal > clear that the performance of the conditional branch loop poll really is fast enough on sufficiently many machines, we > might keep it until that changes. Hope this makes sense. Thanks, > _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ > > On 06/10/2020 08:22, Erik ?sterlund wrote: > > > > This PR the implementation of "JEP 376: ZGC: Concurrent Thread-Stack Processing" (cf. > > > https://openjdk.java.net/jeps/376). > > One small thing: the couple of uses of lea(InternalAddress) should really be adr; > this generates much better code. Hi Andrew, Thanks for having a look. I applied your patch. Having said that, this is run on the safepoint slow path, so should be a rather cold path, where threads have to wear coats and gloves. But it does not hurt to optimize the encoding further, I suppose. Thanks, ------------- PR: https://git.openjdk.java.net/jdk/pull/296 From rehn at openjdk.java.net Tue Oct 6 12:54:57 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 6 Oct 2020 12:54:57 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 12:13:00 GMT, Coleen Phillimore wrote: > This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a > new file. 
Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where > needed. All functions are moved and not modified except for qualification. I also added a delegating constructor to > JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize. > Tested with tier1-6 and builds on arm32, ppc, s390 and zero. Thanks ------------- Marked as reviewed by rehn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/522 From rriggs at openjdk.java.net Tue Oct 6 13:38:08 2020 From: rriggs at openjdk.java.net (Roger Riggs) Date: Tue, 6 Oct 2020 13:38:08 GMT Subject: RFR: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR [v3] In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 16:46:55 GMT, Igor Ignatyev wrote: >> Hi all, >> >> could you please review the patch which updates `jdk.test.lib.Utils` to use md5 hash-sum of `java.vm.version` property >> as default seed for `Utils.RANDOM_GENERATOR`? >> from JBS: >>> using the same seed for all runs of a build will make it possible (easier) to compare results from different test runs >>> (e.g. on different platforms, w/ different flags) and consequently will make test results analysis easier. the proposed >>> solution is to use the seed based on Runtime.version() / "java.vm.version", which are different from build to build, if >>> there is no seed specified by "jdk.test.lib.random.seed" property. >> >> the patch also updates `RandomGeneratorTest` test, so it expects now that the same values are generated if no seed is >> provided. >> testing: >> ? tier1 >> ? `test/lib-test/jdk/test/lib/` against personal build on linux,windows,macos-x64 >> ? `test/lib-test/jdk/test/lib/` against CI build on linux,windows,macos-x64 > > Igor Ignatyev has updated the pull request incrementally with one additional commit since the last revision: > > use random seed for personal/internal builds Marked as reviewed by rriggs (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/391 From rriggs at openjdk.java.net Tue Oct 6 13:38:09 2020 From: rriggs at openjdk.java.net (Roger Riggs) Date: Tue, 6 Oct 2020 13:38:09 GMT Subject: RFR: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR [v3] In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 23:30:37 GMT, Igor Ignatyev wrote: >> I was thinking that the code added to Utils.SEED initialization could just have easily been added to >> RandomFactory.getSystemSeed() and not have to be changed later. > > well, `RandomFactory::getSystemSeed` reads `seed` property, while `Utils.SEED` reads `jdk.test.lib.random.seed`, and > there are tests which use these properties to specify seeds, I'd prefer to leave changing these tests to > [8212077](https://bugs.openjdk.java.net/browse/JDK-8212077). o/c we can add a new method, e.g. > `RandomFactory::getBuildStableSeed`, or refactor `RandomFactory::getSystemSeed` to take property name as an argument, > but I don't think it's really worth it as 8212077 is starting to slowly bubble up in my working queue. If the factories duplication will get combined/resolved with 8212077, I'll leave that up to you. ------------- PR: https://git.openjdk.java.net/jdk/pull/391 From akozlov at openjdk.java.net Tue Oct 6 13:40:22 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 6 Oct 2020 13:40:22 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v2] In-Reply-To: References: Message-ID: <6CNTkYXWf4_yX13e6G4fusS8fijamDb13jtPDdHNY6g=.0459ff64-960a-4dd5-9715-07ed0ba88012@github.com> > Please review an updated RFR from https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041463.html > > On macOS, MAP_JIT cannot be used with MAP_FIXED[1]. So pd_reserve_memory have to provide MAP_JIT for mmap(NULL, > PROT_NONE), the function was made aware of exec permissions. 
> For executable and data regions, pd_commit_memory only unlocks the memory with mprotect, this should make no difference > compared with old code. > For data regions, pd_uncommit_memory still uses a new overlapping anonymous mmap which returns pages to the OS and > immediately reflects this in diagnostic tools like ps. For executable regions it would require MAP_FIXED|MAP_JIT, so > instead madvise(MADV_FREE)+mprotect(PROT_NONE) are used. They should also allow OS to reclaim pages, but apparently > this does not happen immediately. In practice, it should not be a problem for executable regions, as codecache does not > shrink (if I haven't missed anything, by the implementation and in principle). Tested: > * local tier1 > * jdk-submit > * codesign[2] with hardened runtime and allow-jit but without > allow-unsigned-executable-memory entitlements[3] produce a working bundle. > > (adding GC group as suggested by @dholmes-ora) > > > [1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_mman.c#L227 > [2] > > codesign \ > --sign - \ > --options runtime \ > --entitlements ents.plist \ > --timestamp \ > $J/bin/* $J/lib/server/*.dylib $J/lib/*.dylib > [3] > > > > > com.apple.security.cs.allow-jit > > com.apple.security.cs.disable-library-validation > > com.apple.security.cs.allow-dyld-environment-variables > > > Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: - Minimal working example, no uncommit - Merge remote-tracking branch 'upstream/master' into 8234930 - Revert "Use MAP_JIT for CodeCache pages" This reverts commit 114d9cffd62cab42790b65091648fe75345c4533. 
- Use MAP_JIT for CodeCache pages ------------- Changes: https://git.openjdk.java.net/jdk/pull/294/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=294&range=01 Stats: 48 lines in 6 files changed: 31 ins; 0 del; 17 mod Patch: https://git.openjdk.java.net/jdk/pull/294.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/294/head:pull/294 PR: https://git.openjdk.java.net/jdk/pull/294 From rrich at openjdk.java.net Tue Oct 6 14:09:13 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Tue, 6 Oct 2020 14:09:13 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v3] In-Reply-To: References: Message-ID: On Tue, 29 Sep 2020 13:57:26 GMT, Richard Reingruber wrote: >>> >>> >>> The minor updates in response to my comments are fine. >>> >>> The more major updates ... I can't really comment on. >> >> Thanks for looking at the changes and for giving feedback. > > Hi Serguei, > > thanks for providing feedback! I've pushed the changes based on it now but I > have not yet merged master again. This needs a little work... > > Please find my replies to your comments below. > > Thanks, Richard. > >> Could you consider to place the classes EscapeBarrier and JvmtiDeferredUpdates >> into theyr own .hpp/.cpp files? The class JvmtiDeferredUpdates would be better >> to put into the folder 'prims' then. > > Done. In addition I moved preexisting class jvmtiDeferredLocalVariableSet and > class jvmtiDeferredLocalVariable from runtime/vframe_hp.hpp to > prims/jvmtiDeferredUpdates.hpp. Please let me know if not ok. > >> src/hotspot/share/opto/macro.cpp: >> >> ``` >> @@ -1091,11 +1091,11 @@ >> bool PhaseMacroExpand::eliminate_allocate_node(AllocateNode *alloc) { >> // Don't do scalar replacement if the frame can be popped by JVMTI: >> // if reallocation fails during deoptimization we'll pop all >> // interpreter frames for this compiled frame and that won't play >> // nice with JVMTI popframe. 
>> - if (!EliminateAllocations || JvmtiExport::can_pop_frame() || !alloc->_is_non_escaping) { >> + if (!EliminateAllocations || !alloc->_is_non_escaping) { >> return false; >> } >> ``` >> >> I wonder if the comment is still correct after you removed the check for JvmtiExport::can_pop_frame(). > > Good catch. I fixed it previously with > https://github.com/openjdk/jdk/pull/119/commits/18dd54b4e6f17ca723e4ae1a1e8dc57e81878dd3 > >> src/hotspot/share/runtime/deoptimization.hpp: >> >> ``` >> + EscapeBarrier(JavaThread* calling_thread, JavaThread* deoptee_thread, bool barrier_active) >> + : _calling_thread(calling_thread), _deoptee_thread(deoptee_thread), >> + _barrier_active(barrier_active && (JVMCI_ONLY(UseJVMCICompiler) NOT_JVMCI(false) >> + COMPILER2_PRESENT(|| DoEscapeAnalysis))) >> . . . . . . . . . >> + >> + // Revert ea based optimizations for all java threads >> + EscapeBarrier(JavaThread* calling_thread, bool barrier_active) >> + : _calling_thread(calling_thread), _deoptee_thread(NULL), >> ``` >> >> Nit: would better to make the parameter deoptee_thread to be the 3rd to better mach the seconf constructor. > > I have shuffled the parameters and moved barrier_active at first position. Would > that be ok? > >> >> ``` >> + bool all_threads() const { return _deoptee_thread == NULL; } // Should revert optimizations for all >> threads. + bool self_deopt() const { return _calling_thread == _deoptee_thread; } // Current thread deoptimizes >> its own objects. + bool barrier_active() const { return _barrier_active; } // Inactive barriers are >> created if no local objects can escape. ``` >> >> I'd suggest to put comments in a line before function definitions as it is done for other declarations/definitions. > > Done. 
// Note that there are quite a few locations with the comment on the same line ;) > >> src/hotspot/share/runtime/deoptimization.cpp: >> >> ``` >> @@ -349,12 +408,12 @@ >> >> // Now that the vframeArray has been created if we have any deferred local writes >> // added by jvmti then we can free up that structure as the data is now in the >> // vframeArray >> >> - if (thread->deferred_locals() != NULL) { >> - GrowableArray* list = thread->deferred_locals(); >> + if (JvmtiDeferredUpdates::deferred_locals(thread) != NULL) { >> + GrowableArray* list = JvmtiDeferredUpdates::deferred_locals(thread); >> int i = 0; >> do { >> // Because of inlining we could have multiple vframes for a single frame >> // and several of the vframes could have deferred writes. Find them all. >> if (list->at(i)->id() == array->original().id()) { >> >> @@ -365,13 +424,14 @@ >> } else { >> i++; >> } >> } while ( i < list->length() ); >> if (list->length() == 0) { >> - thread->set_deferred_locals(NULL); >> - // free the list and elements back to C heap. >> - delete list; >> + JvmtiDeferredUpdates* updates = thread->deferred_updates(); >> + thread->set_deferred_updates(NULL); >> + // free deferred updates. >> + delete updates; >> } >> ``` >> >> It is not clear why the 'list' is not deleted anymore. If it is intentional then could you, please, add a comment with >> an explanation? > > 'list' is now embedded in JvmtiDeferredUpdates. It es deleted as part of the > JvmtiDeferredUpdates instance when there are no more deferred updates. > > class JvmtiDeferredUpdates : public CHeapObj { > > [...] > > // Deferred updates of locals, expressions, and monitors > GrowableArray _deferred_locals_updates; > > [...] > > }; > > I introduced JvmtiDeferredUpdates because this patch introduces a new type of > deferred update: _relock_count_after_wait. > > I tried to improve the encapsulation of class JvmtiDeferredUpdates and > simplified the location you are referring to. 
> > So when is memory for deferred updates freed? > > (A) Deferred local variable updates are deleted when the compiled target frame is > replaced with corresponding interpreter frames. > See JvmtiDeferredUpdates::delete_updates_for_frame(). > > (B) A thread's JvmtiDeferredUpdates instance is deleted if all updates where > delivered. All updates where delivered when JvmtiDeferredUpdates::count() > returns 0. This is checked whenever updates are delivered. See call sites in > JvmtiDeferredUpdates::delete_updates_for_frame() and > JvmtiDeferredUpdates::get_and_reset_relock_count_after_wait(). > > (C) Besides (B) a thread's JvmtiDeferredUpdates instance is also deleted when > the thread is destroyed. All not yet delivered updates are deleted then > too. See JavaThread::~JavaThread() and JvmtiDeferredUpdates::~JvmtiDeferredUpdates(). > >> If you are okay to separate the EscapeBarrier class into its own hpp/cpp files >> then the class EscapeBarrierSuspendHandshake is better to be colocated with >> it. > > Done. > >> The below functions EscapeBarrier::sync_and_suspend_one() and do_thread() make a call to the set_obj_deopt_flag() which >> seems to be a duplication. At least, it is not clear why this duplication exist and so, needs to be explained in a >> comment. 
``` >> +void EscapeBarrier::sync_and_suspend_one() { >> + assert(_calling_thread != NULL, "calling thread must not be NULL"); >> + assert(_deoptee_thread != NULL, "deoptee thread must not be NULL"); >> + assert(barrier_active(), "should not call"); >> + >> + // Sync with other threads that might be doing deoptimizations >> + { >> + // Need to switch to _thread_blocked for the wait() call >> + ThreadBlockInVM tbivm(_calling_thread); >> + MonitorLocker ml(_calling_thread, EscapeBarrier_lock, Mutex::_no_safepoint_check_flag); >> + while (_self_deoptimization_in_progress || _deoptee_thread->is_obj_deopt_suspend()) { >> + ml.wait(); >> + } >> + >> + if (self_deopt()) { >> + _self_deoptimization_in_progress = true; >> + return; >> + } >> + >> + // set suspend flag for target thread >> + _deoptee_thread->set_obj_deopt_flag(); >> + } >> + >> + // suspend target thread >> + EscapeBarrierSuspendHandshake sh(NULL, "EscapeBarrierSuspendOne"); >> + Handshake::execute_direct(&sh, _deoptee_thread); >> + assert(!_deoptee_thread->has_last_Java_frame() || _deoptee_thread->frame_anchor()->walkable(), >> + "stack should be walkable now"); >> +} >> . . . . . >> +class EscapeBarrierSuspendHandshake : public HandshakeClosure { >> + JavaThread* _excluded_thread; >> + public: >> + EscapeBarrierSuspendHandshake(JavaThread* excluded_thread, const char* name) : >> + HandshakeClosure(name), >> + _excluded_thread(excluded_thread) {} >> + void do_thread(Thread* th) { >> + if (th->is_Java_thread() && !th->is_hidden_from_external_view() && (th != _excluded_thread)) { >> + th->set_obj_deopt_flag(); >> + } >> + } >> +}; >> ``` > > I previously removed the set_obj_deopt_flag() call from > EscapeBarrierSuspendHandshake::do_thread() in [1]. For synchronization it is > better to set_obj_deopt_flag() before the handshake (see comment in > EscapeBarrier::sync_and_suspend_all()). 
> > [1] https://github.com/openjdk/jdk/pull/119/commits/18dd54b4e6f17ca723e4ae1a1e8dc57e81878dd3 > >> /src/hotspot/share/prims/jvmtiImpl.cpp: >> >> ``` >> 421 // Constructor for non-object getter >> 422 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, jint depth, jint index, BasicType type) >> 423 : _thread(thread) >> 424 , _calling_thread(NULL) >> 425 , _depth(depth) >> 426 , _index(index) >> 427 , _type(type) >> 428 , _jvf(NULL) >> 429 , _set(false) >> 430 , _eb(NULL, NULL, type == T_OBJECT) >> 431 , _result(JVMTI_ERROR_NONE) >> 432 { >> 433 } >> 434 >> 435 // Constructor for object or non-object setter >> 436 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, jint depth, jint index, BasicType type, jvalue value) >> 437 : _thread(thread) >> 438 , _calling_thread(NULL) >> 439 , _depth(depth) >> 440 , _index(index) >> 441 , _type(type) >> 442 , _value(value) >> 443 , _jvf(NULL) >> 444 , _set(true) >> 445 , _eb(JavaThread::current(), thread, type == T_OBJECT) >> 446 , _result(JVMTI_ERROR_NONE) >> 447 { >> 448 } >> 449 >> 450 // Constructor for object getter >> 451 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, JavaThread* calling_thread, jint depth, int index) >> 452 : _thread(thread) >> 453 , _calling_thread(calling_thread) >> 454 , _depth(depth) >> 455 , _index(index) >> 456 , _type(T_OBJECT) >> 457 , _jvf(NULL) >> 458 , _set(false) >> 459 , _eb(calling_thread, thread, true) >> 460 , _result(JVMTI_ERROR_NONE) >> 461 { >> 462 } >> ``` >> >> I think, false has to be passed to the constructors of non-object getters instead of expression: >> "type == T_OBJECT". >> The type can not be T_OBJECT for non-object getters. > > I used to do that. Then I changed it because the c++ compiler can fold the > comparison to "false" and if somebody changes the non-object getter to get > objects too then it would still be correct. > > Let me know if you still think it is better to pass false. Maybe add an > assertion type == T_OBJECT then? 
> >> Q: Is an EscapeBarrier useful if false is passed as the barrier_active parameter? > > The EscapeBarrier is not needed then. In the case of the non-object getter above > I'd hope that most of the constructor/desctructor of EscapeBarrier is eliminated > by the c++ compiler then. > > Besides the changes you suggested I have made a bugfix in > test/jdk/com/sun/jdi/EATests.java to prevent ObjectCollectedException. > > Thanks, Richard. Hi Serguei (@sspitsyn) are you ok with the changes I made based on your comments? Will you further review the change? Thanks, Richard. ------------- PR: https://git.openjdk.java.net/jdk/pull/119 From rriggs at openjdk.java.net Tue Oct 6 15:16:11 2020 From: rriggs at openjdk.java.net (Roger Riggs) Date: Tue, 6 Oct 2020 15:16:11 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> Message-ID: On Mon, 5 Oct 2020 21:45:06 GMT, CoreyAshford wrote: >> Core Libs changes and the test look fine. > > 8248188: Add IntrinsicCandidate and API for Base64 decoding, add Power64LE intrinsic implementation. > > This patch set encompasses the following commits: > > Adds a new intrinsic candidate to the java.lang.Base64 class - decodeBlock(), and provides a flexible API for the > intrinsic. The API is similar to the existing encodeBlock intrinsic. > Adds the code in HotSpot to check and martial the new intrinsic's arguments to the arch-specific intrinsic > implementation. > Adds a Power64LE-specific implementation of the decodeBlock intrinsic. > > Adds a JMH microbenchmark for both Base64 encoding and encoding. > > Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. fyi, there is no need to do a rebase. The preferred way is to do a merge. When the changes are integrated, all of the individual commits are squashed to create a single commit. 
------------- PR: https://git.openjdk.java.net/jdk/pull/293 From akozlov at openjdk.java.net Tue Oct 6 15:27:09 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 6 Oct 2020 15:27:09 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v2] In-Reply-To: References: <6iVRP-20baz0_46SouR-dj9SyspR5QvaL9iJMdeipDE=.92688b4e-ebd3-4681-8e63-a4aee752c407@github.com> Message-ID: On Tue, 29 Sep 2020 07:12:35 GMT, Stefan Karlsson wrote: >> @tstuefe My patch to remove MAP_FIXED from the memory reservation path should make it possible to revert all the >> os::reserve_memory changes in this patch. > >> @tstuefe My patch to remove MAP_FIXED from the memory reservation path should make it possible to revert all the >> os::reserve_memory changes in this patch. > > FWIW, I now understand that the motivation for the changes in os::reserve_memory is *not* that *it* uses MAP_FIXED. > Instead the change there is done so that os::commit_memory doesn't have to mix mmap + MAP_FIXED + MAP_JIT. This is also > the reason why os::uncommit_memory needs to be changed as well. So, ignore my comment above. Hi @tstuefe, Recent refactors interfered with the previous version of the patch, I found it is a bit simpler to start from scratch. https://github.com/openjdk/jdk/pull/294/commits/f8664ca7dcc1cfdfb9a1f032035f2cde77048649 is a minimal patch that allows MAP_JIT. I cannot see any way to simplify the interface now. Please tell me if I miss something. Also, `VirtualSpace` now choose from `reserve_memory_with_fd` and anything that is used for reserve with exec. What I found in the AIX implementation, the commit/uncommit are for memory previously reserved https://github.com/openjdk/jdk/blob/master/src/hotspot/os/aix/os_aix.cpp#L2227. If we attribute reserved region with exec, then uncommit can query bookkeeping info and get the type of memory. >From options of internal bookkeeping info or an asking the system for the mmap flags, I prefer the first one. 
It is faster and we can bookkeep only minimal info, like a list of executable regions. That should reduce the amount of data and make a check for the type in commit/uncommit faster. After looking around, it does look like every place where commit with exec is used, we know enough to use reserve/uncommit with exec. Taking this to the extreme, I still think a specialized set of reserve/commit/uncommit for executable regions would look natural. For example, commit with exec is used in less than five places. I'll do a little research there. ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From minqi at openjdk.java.net Tue Oct 6 15:32:23 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Tue, 6 Oct 2020 15:32:23 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v10] In-Reply-To: References: Message-ID: <2MoJTYvZmY9SqzULTgd67lvOwsUoGKp7AuK7hkx6_T4=.998d347b-c887-4f86-93f3-cd36a84088dc@github.com> > This patch is reorganized after 8252725, which is separated from this patch to refactor jlink glugin code. The previous > webrev with hg can be found at: http://cr.openjdk.java.net/~minqi/2020/8247536/webrev-05. With 8252725 integrated, the > regeneration of holder classes is simply to call the new added GenerateJLIClassesHelper.cdsGenerateHolderClasses > function. Tests: tier1-4 Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: Fixed comments with correct class and method name in CDS, removed unused variables after last change. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/193/files - new: https://git.openjdk.java.net/jdk/pull/193/files/52764a6e..686e211b Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=193&range=09 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=193&range=08-09 Stats: 4 lines in 2 files changed: 0 ins; 3 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/193.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/193/head:pull/193 PR: https://git.openjdk.java.net/jdk/pull/193 From github.com+70893615+jasontatton-aws at openjdk.java.net Tue Oct 6 15:46:12 2020 From: github.com+70893615+jasontatton-aws at openjdk.java.net (Jason Tatton) Date: Tue, 6 Oct 2020 15:46:12 GMT Subject: RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v2] In-Reply-To: References: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com> <-DRWR4_f5u6DsSGHAuPnpHrhaG8Una8BXf4zDekQjLM=.469b08b6-a8b2-4a6b-8ab8-1a40810aede0@github.com> Message-ID: On Fri, 2 Oct 2020 08:44:48 GMT, Jason Tatton wrote: >> Hi Jason, >> >> thanks for bringing String.indexOf() for latin strings up to date with the Unicode version. >> >> Your changes look good except a few minor issues I've commented on right in the code. >> >> I'd only like to ask you if you could possibly improve your test a little bit. As far as I understand, your search text >> is a consecutive sequence of "abc" characters, so you'll always find the character you're searching for within the next >> three characters of the source text. This won't exercise the loops of your intrinsic. Maybe you can also add some test >> versions where the search character will be found beyond the first 32/64 characters after "fromIndex"?
>
> @simonis Thank you for the corrections, I have addressed them in the latest commit as follows:
>
> Changes to unit test:
> - main test adjusted such that Strings generated are much longer (up to 2048 characters) and of the form: `azaza`, `aazaazaa`, `aaazaaazaaa`, etc with `'z'` being the character searched for. Multiple instances of the search character are included in the String in order to validate that the starting offset is correctly handled. Results are compared to the non-intrinsified version of the code. Longer strings mean that the looping functionality of the various paths is entered into.
> - Run configurations introduced such that it checks behaviour where use of SSE and AVX instructions is restricted.
> - Tier4InvocationThreshold adjusted so as to ensure C2 code is invoked.
>
> Other changes:
> - newlines added at end of files
>
> @vnkozlov here are the performance numbers as requested. I have included performance of the UTF16 version of the intrinsic for reference:
>
> | UseAVX= | UseSSE= | Benchmark                         | Mode | Cnt | Score           | Error       | Units |
> |---------|---------|-----------------------------------|------|-----|-----------------|-------------|-------|
> |         | 0       | IndexOfBenchmark.latin1_long_char | avgt | 5   | **447,493.398** | ± 4,666.386 | ns/op |
> | 0       |         | IndexOfBenchmark.latin1_long_char | avgt | 5   | **104,735.941** | ± 2,484.403 | ns/op |
> | 1       |         | IndexOfBenchmark.latin1_long_char | avgt | 5   | **104,342.844** | ± 2,656.343 | ns/op |
> | 2       |         | IndexOfBenchmark.latin1_long_char | avgt | 5   | **61,000.418**  | ± 1,543.951 | ns/op |
> | 3       |         | IndexOfBenchmark.latin1_long_char | avgt | 5   | **60,607.988**  | ± 1,466.354 | ns/op |
> |         | 0       | IndexOfBenchmark.utf16_long_char  | avgt | 5   | 672,475.302     | ± 4,998.596 | ns/op |
> | 0       |         | IndexOfBenchmark.utf16_long_char  | avgt | 5   | 175,521.654     | ± 7,549.094 | ns/op |
> | 1       |         | IndexOfBenchmark.utf16_long_char  | avgt | 5   | 172,514.981     | ± 3,561.040 | ns/op |
> | 2       |         | IndexOfBenchmark.utf16_long_char  | avgt | 5   | 120,725.748     | ± 2,004.400 | ns/op |
> | 3       |         | IndexOfBenchmark.utf16_long_char  | avgt | 5   | 120,664.623     | ± 1,988.419 | ns/op |
>
> I think the results are as expected: we see improvements in performance as the range of SSE and AVX instructions which can be used is expanded. Note that no improvement is observed with UseAVX=3 because there is no AVX-512 code in these intrinsics.

Hi All, Just wondering if there is anything you'd like me to do in order to assist with moving this patch forward? Thanks, Jason ------------- PR: https://git.openjdk.java.net/jdk/pull/71 From stuefe at openjdk.java.net Tue Oct 6 16:41:13 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 6 Oct 2020 16:41:13 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v2] In-Reply-To: References: <6iVRP-20baz0_46SouR-dj9SyspR5QvaL9iJMdeipDE=.92688b4e-ebd3-4681-8e63-a4aee752c407@github.com> Message-ID: <_XaA5cQEInPMn5Q5gj2y7AFCRprFQiYfI6BeUN49FhA=.9f17ae05-b37e-4f40-a83f-fd34aa812575@github.com> On Tue, 6 Oct 2020 15:23:56 GMT, Anton Kozlov wrote: >>> @tstuefe My patch to remove MAP_FIXED from the memory reservation path should make it possible to revert all the >>> os::reserve_memory changes in this patch. >> >> FWIW, I now understand that the motivation for the changes in os::reserve_memory is *not* that *it* uses MAP_FIXED. >> Instead the change there is done so that os::commit_memory doesn't have to mix mmap + MAP_FIXED + MAP_JIT. This is also >> the reason why os::uncommit_memory needs to be changed as well. So, ignore my comment above. > > Hi @tstuefe, > > Recent refactors interfered with the previous version of the patch, I found it is a bit simpler to start from scratch. > https://github.com/openjdk/jdk/pull/294/commits/f8664ca7dcc1cfdfb9a1f032035f2cde77048649 is a minimal patch that allows > MAP_JIT. I cannot see any way to simplify the interface now.
Please tell me if I miss something. Also, `VirtualSpace` > now choose from `reserve_memory_with_fd` and anything that is used for reserve with exec. What I found in the AIX > implementation, the commit/uncommit are for memory previously reserved > https://github.com/openjdk/jdk/blob/master/src/hotspot/os/aix/os_aix.cpp#L2227. If we attribute reserved region with > exec, then uncommit can query bookkeeping info and get the type of memory. From options of internal bookkeeping info > or an asking the system for the mmap flags, I prefer the first one. It is faster and we can bookkeep only minimal info, > like a list of executable regions. That should reduce the amount of data and make a check for the type in > commit/uncommit faster. After looking around, it does look like every place where commit with exec is used, we know > enough to use reserve/uncommit with exec. Taking this to the extreme, I still think a specialized set of > reserve/commit/uncommit for executable regions would look natural. For example, commit with exec is used in less than > five places. I'll do a little research there. > Hi @tstuefe, > > Recent refactors interfered with the previous version of the patch, I found it is a bit simpler to start from scratch. > [f8664ca](https://github.com/openjdk/jdk/commit/f8664ca7dcc1cfdfb9a1f032035f2cde77048649) is a minimal patch that > allows MAP_JIT. I cannot see any way to simplify the interface now. Please tell me if I miss something. Also, > `VirtualSpace` now choose from `reserve_memory_with_fd` and anything that is used for reserve with exec. What I found > in the AIX implementation, the commit/uncommit are for memory previously reserved > https://github.com/openjdk/jdk/blob/master/src/hotspot/os/aix/os_aix.cpp#L2227. If we attribute reserved region with > exec, then uncommit can query bookkeeping info and get the type of memory. 
On AIX there is no explicit commit; committing happens on touch, and uncommitting unfortunately does not really work (there is a disclaim() but it's buggy). All commit does is check the input parameters and maybe explicitly touch the area. > > From options of internal bookkeeping info or an asking the system for the mmap flags, I prefer the first one. It is > faster and we can bookkeep only minimal info, like a list of executable regions. That should reduce the amount of data > and make a check for the type in commit/uncommit faster. After looking around, it does look like every place where > commit with exec is used, we know enough to use reserve/uncommit with exec. Taking this to the extreme, I still think a > specialized set of reserve/commit/uncommit for executable regions would look natural. For example, commit with exec is > used in less than five places. I'll do a little research there. I thought a lot about this. As you said, I believe now that specifying exec (or other parameters, see below) is not necessary on a per-commit level; on a per-mapping level it should be enough. I also see at least three separate cases where we establish a mapping and later need mapping-specific information somewhere until the next interaction - be it commit/uncommit or release:
1) On AIX, where we decide at os::reserve_memory time to use either SystemV or mmap and later need to know which.
2) On Linux, when THP are active, the information "Use THP" is handed down to os::commit via the weird alignment_hint parameter and that os::realign_memory() function. Following that parameter flow, it is basically just a way to flag the code to do one extra madvise at commit time.
3) Your case now, where we need to know at commit time if the mapping was supposed to be executable.
I really think a generic solution would be best.
One simple variant would be to return not a pointer but a handle and let the platform store behind that handle whatever it needs to keep on a per-mapping basis:

    handle_t os::reserve_memory(size, input flags);
    void* start_address = os::get_reserved_memory_start_address(handle_t)
    bool os::commit_memory(handle_t, address, size)

and so on. The only problem with that is that it would cause a lot of call sites to change, and the callers need to hold on to that mapping handle. To be clear, I do not think you should do this with this patch, but I would like your opinion, since you looked at the code closely. I'll review the current version presently. ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From dcubed at openjdk.java.net Tue Oct 6 16:47:11 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Tue, 6 Oct 2020 16:47:11 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 12:13:00 GMT, Coleen Phillimore wrote: > This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a > new file. Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where > needed. All functions are moved and not modified except for qualification. I also added a delegating constructor to > JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize. > Tested with tier1-6 and builds on arm32, ppc, s390 and zero. Really nice refactoring and cleanup. Thumbs up. Any idea whether the change in compilation unit will have any performance effects? src/hotspot/os/linux/os_linux.cpp line 2034: > 2032: if (!_stack_is_executable) { > 2033: for (JavaThreadIteratorWithHandle jtiwh; JavaThread *jt = jtiwh.next(); ) { > 2034: StackOverflow* sto = jt->stack_overflow_state(); So why use `sto` here when you use `overflow_state` above?
The GitHub review UI makes this difference more obvious than webrev... Update: In other source files, you use `overflow_state`. src/hotspot/share/runtime/stackOverflow.cpp line 2: > 1: /* > 2: * Copyright (c) 1997, 2020, Oracle and/or its affiliates. All rights reserved. You have this copyright as a range of years while stackOverflow.hpp is just 2020. src/hotspot/share/runtime/stackOverflow.hpp line 37: > 35: friend class JavaThread; > 36: public: > 37: // State of the stack guard pages for this thread. The "this thread" part no longer reads as well... I don't have a suggested rewording... src/hotspot/share/runtime/thread.cpp line 2954: > 2952: > 2953: void JavaThread::frames_do(void f(frame*, const RegisterMap* map)) { > 2954: // ignore is there is no stack typo - s/is there/if there/ ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/522 From stuefe at openjdk.java.net Tue Oct 6 16:48:18 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 6 Oct 2020 16:48:18 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v2] In-Reply-To: <6CNTkYXWf4_yX13e6G4fusS8fijamDb13jtPDdHNY6g=.0459ff64-960a-4dd5-9715-07ed0ba88012@github.com> References: <6CNTkYXWf4_yX13e6G4fusS8fijamDb13jtPDdHNY6g=.0459ff64-960a-4dd5-9715-07ed0ba88012@github.com> Message-ID: On Tue, 6 Oct 2020 13:40:22 GMT, Anton Kozlov wrote: >> Please review an updated RFR from https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041463.html >> >> On macOS, MAP_JIT cannot be used with MAP_FIXED[1]. So pd_reserve_memory have to provide MAP_JIT for mmap(NULL, >> PROT_NONE), the function was made aware of exec permissions. >> For executable and data regions, pd_commit_memory only unlocks the memory with mprotect, this should make no difference >> compared with old code. 
>> For data regions, pd_uncommit_memory still uses a new overlapping anonymous mmap which returns pages to the OS and >> immediately reflects this in diagnostic tools like ps. For executable regions it would require MAP_FIXED|MAP_JIT, so >> instead madvise(MADV_FREE)+mprotect(PROT_NONE) are used. They should also allow OS to reclaim pages, but apparently >> this does not happen immediately. In practice, it should not be a problem for executable regions, as codecache does not >> shrink (if I haven't missed anything, by the implementation and in principle). Tested: >> * local tier1 >> * jdk-submit >> * codesign[2] with hardened runtime and allow-jit but without >> allow-unsigned-executable-memory entitlements[3] produce a working bundle. >> >> (adding GC group as suggested by @dholmes-ora) >> >> >> [1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_mman.c#L227 >> [2] >> >> codesign \ >> --sign - \ >> --options runtime \ >> --entitlements ents.plist \ >> --timestamp \ >> $J/bin/* $J/lib/server/*.dylib $J/lib/*.dylib >> [3] >> >> >> >> >> com.apple.security.cs.allow-jit >> >> com.apple.security.cs.disable-library-validation >> >> com.apple.security.cs.allow-dyld-environment-variables >> >> >> > > Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now > contains four commits: > - Minimal working example, no uncommit > - Merge remote-tracking branch 'upstream/master' into 8234930 > - Revert "Use MAP_JIT for CodeCache pages" > > This reverts commit 114d9cffd62cab42790b65091648fe75345c4533. > - Use MAP_JIT for CodeCache pages src/hotspot/os/bsd/os_bsd.cpp line 2010: > 2008: return ::mprotect(addr, size, PROT_NONE) == 0; > 2009: #elif defined(__APPLE__) > 2010: if (false) { I'm confused, how would this work? 
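The reserve/commit/uncommit scheme described in this RFR can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the code from the patch: the `sketch_*` function names are hypothetical, `MAP_JIT` is defined to 0 on platforms that lack it so that the sketch also builds outside macOS, and the result of `madvise` is deliberately ignored because it is only a hint to the OS.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>

// MAP_JIT exists only on macOS; fall back to 0 elsewhere (assumption made
// purely so that this sketch compiles on other POSIX systems).
#ifndef MAP_JIT
#define MAP_JIT 0
#endif
#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

// Hypothetical helper: reserve address space without MAP_FIXED.
// For executable regions MAP_JIT is supplied here, at reservation time,
// because it cannot be combined with MAP_FIXED later on.
void* sketch_reserve(size_t size, bool exec) {
  assert(size > 0);
  int flags = MAP_PRIVATE | MAP_ANONYMOUS | (exec ? MAP_JIT : 0);
  void* p = ::mmap(nullptr, size, PROT_NONE, flags, -1, 0);
  return p == MAP_FAILED ? nullptr : p;
}

// Hypothetical helper: "commit" only unlocks the already reserved pages.
bool sketch_commit(void* addr, size_t size, bool exec) {
  int prot = PROT_READ | PROT_WRITE | (exec ? PROT_EXEC : 0);
  return ::mprotect(addr, size, prot) == 0;
}

// Hypothetical helper: uncommit without creating a new overlapping mapping.
// The pages are offered back to the OS and then made inaccessible again.
bool sketch_uncommit_no_remap(void* addr, size_t size) {
#ifdef MADV_FREE
  (void)::madvise(addr, size, MADV_FREE);      // hint only, result ignored
#else
  (void)::madvise(addr, size, MADV_DONTNEED);  // fallback hint
#endif
  return ::mprotect(addr, size, PROT_NONE) == 0;
}
```

A caller would reserve once with the exec flag, then commit and uncommit sub-ranges of that reservation. As the description notes, the OS may not reclaim pages immediately after the `madvise` hint, so tools like ps can keep reporting them for a while.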
------------- PR: https://git.openjdk.java.net/jdk/pull/294 From redestad at openjdk.java.net Tue Oct 6 17:20:12 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Tue, 6 Oct 2020 17:20:12 GMT Subject: RFR: 8254084: Remove TemplateTable::pd_initialize Message-ID: TemplateTable::pd_initialize is empty on all platforms, and has been so for some time. I suggest removing them. ------------- Commit messages: - Remove TemplateTable::pd_initialize Changes: https://git.openjdk.java.net/jdk/pull/529/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=529&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254084 Stats: 33 lines in 6 files changed: 0 ins; 33 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/529.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/529/head:pull/529 PR: https://git.openjdk.java.net/jdk/pull/529 From minqi at openjdk.java.net Tue Oct 6 17:35:18 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Tue, 6 Oct 2020 17:35:18 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v11] In-Reply-To: References: Message-ID: > This patch is reorganized after 8252725, which is separated from this patch to refactor jlink plugin code. The previous > webrev with hg can be found at: http://cr.openjdk.java.net/~minqi/2020/8247536/webrev-05. With 8252725 integrated, the > regeneration of holder classes is simply a call to the newly added GenerateJLIClassesHelper.cdsGenerateHolderClasses > function. Tests: tier1-4 Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: Removed unused imports.
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/193/files - new: https://git.openjdk.java.net/jdk/pull/193/files/686e211b..5d32a547 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=193&range=10 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=193&range=09-10 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/193.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/193/head:pull/193 PR: https://git.openjdk.java.net/jdk/pull/193 From github.com+51754783+coreyashford at openjdk.java.net Tue Oct 6 17:45:08 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Tue, 6 Oct 2020 17:45:08 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> Message-ID: On Tue, 6 Oct 2020 15:13:46 GMT, Roger Riggs wrote: > fyi, there is no need to do a rebase. The preferred way is to do a merge. > When the changes are integrated, all of the individual commits are squashed to create a single commit. Good to know, thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From burban at openjdk.java.net Tue Oct 6 18:17:21 2020 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Tue, 6 Oct 2020 18:17:21 GMT Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build Message-ID: I organized this PR so that each commit contains the warning emitted by MSVC as commit message and its relevant fix. Verified on * Linux+ARM64: `{hotspot,jdk,langtools}:tier1`, no failures. * Windows+ARM64: `{hotspot,jdk,langtools}:tier1`, no (new) failures. * internal macOS+ARM64 port: build without `--disable-warnings-as-errors` still works. 
Just mentioning this here, because it's yet another toolchain (Xcode / clang) that needs to be kept happy [going forward](https://openjdk.java.net/jeps/391). ------------- Commit messages: - ./src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp(1123): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data - ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1312): warning C4267: 'argument': conversion from 'size_t' to 'unsigned int', possible loss of data - ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp(2472): warning C4312: 'type cast': conversion from 'unsigned int' to 'address' of greater size - ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp(1527): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data - ./src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp(2901): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data - ./src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp(2756): warning C4146: unary minus operator applied to unsigned type, result still unsigned - ./src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp(1837): warning C4267: 'initializing': conversion from 'size_t' to 'unsigned int', possible loss of data - ./src/hotspot/cpu/aarch64/immediate_aarch64.cpp(72): warning C4146: unary minus operator applied to unsigned type, result still unsigned - ./src/hotspot/cpu/aarch64/frame_aarch64.cpp(716): warning C4146: unary minus operator applied to unsigned type, result still unsigned - ./src/hotspot/cpu/aarch64/frame_aarch64.cpp(686): warning C4477: 'printf' : format string '%016lx' requires an argument of type 'unsigned long', but variadic argument 1 has type 'uintptr_t' - ... 
and 4 more: https://git.openjdk.java.net/jdk/compare/91997838...3e92e29f Changes: https://git.openjdk.java.net/jdk/pull/530/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=530&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254072 Stats: 22 lines in 9 files changed: 1 ins; 0 del; 21 mod Patch: https://git.openjdk.java.net/jdk/pull/530.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/530/head:pull/530 PR: https://git.openjdk.java.net/jdk/pull/530 From rriggs at openjdk.java.net Tue Oct 6 18:17:17 2020 From: rriggs at openjdk.java.net (Roger Riggs) Date: Tue, 6 Oct 2020 18:17:17 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> Message-ID: On Tue, 6 Oct 2020 17:42:36 GMT, CoreyAshford wrote: >> fyi, there is no need to do a rebase. The preferred way is to do a merge. >> When the changes are integrated, all of the individual commits are squashed to create a single commit. > >> fyi, there is no need to do a rebase. The preferred way is to do a merge. >> When the changes are integrated, all of the individual commits are squashed to create a single commit. > > Good to know, thanks. We'll need a HotSpot Reviewer to approve too. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From mchung at openjdk.java.net Tue Oct 6 18:18:11 2020 From: mchung at openjdk.java.net (Mandy Chung) Date: Tue, 6 Oct 2020 18:18:11 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v11] In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 17:35:18 GMT, Yumin Qi wrote: >> This patch is reorganized after 8252725, which is separated from this patch to refactor jlink plugin code. The previous >> webrev with hg can be found at: http://cr.openjdk.java.net/~minqi/2020/8247536/webrev-05.
With 8252725 integrated, the >> regeneration of holder classes is simply a call to the newly added GenerateJLIClassesHelper.cdsGenerateHolderClasses >> function. Tests: tier1-4 > > Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: > > Removed unused imports.

src/java.base/share/classes/java/lang/invoke/GenerateJLIClassesHelper.java line 63:

> 61: if (TRACE_RESOLVE) {
> 62: System.out.println(traceLF + (resolvedMember != null ? " (success)" : " (fail)"));
> 63: }

I suggest not changing the existing code. Instead, have `CDS::traceLambdaFormInvoker` take individual parameters `Class holder, String name, String shortenSignature` (rather than the formatted string). Something like:

    if (CDS.isDumpLoadedClassList()) {
        CDS.traceLambdaFormInvoker(holder, name, shortenSignature(basicTypeSignature(type)));
    }

This also gives CDS the flexibility to decide what format to write to the class list (in this case, you drop the text "success/fail"). In addition, the conditional check on `CDS.isDumpLoadedClassList()` is hard to relate to why CDS traces these events. I see Ioi's comment on this method name too. I agree with Ioi that `isDumpingClassList` makes more sense.

src/java.base/share/classes/java/lang/invoke/GenerateJLIClassesHelper.java line 74:

> 72: System.out.println(traceSP + (salvage != null ? " (salvaged)" : " (generated)"));
> 73: }
> 74: CDS.traceLambdaFormInvoker(traceSP);

I suggest leaving the existing code unchanged. Instead, add the following:

    if (CDS.isDumpingClassList()) {
        CDS.traceSpeciesType(cn);
    }

The above uses Ioi's suggested method name, which reads better.

src/java.base/share/classes/jdk/internal/misc/CDS.java line 83:

> 81: * check if -XX:+DumpLoadedClassList and given file is open
> 82: */
> 83: public static boolean isDumpLoadedClassList() {

I agree with Ioi's suggestion to rename this to `isDumpingClassList`, which describes what the VM is doing.
------------- PR: https://git.openjdk.java.net/jdk/pull/193 From mdoerr at openjdk.java.net Tue Oct 6 18:19:05 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Tue, 6 Oct 2020 18:19:05 GMT Subject: RFR: 8254084: Remove TemplateTable::pd_initialize In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 17:14:42 GMT, Claes Redestad wrote: > TemplateTable::pd_initialize is empty on all platforms, and has been so for some time. I suggest removing them. Thanks for cleaning this up. Should get removed from templateTable.hpp, too. ------------- Changes requested by mdoerr (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/529 From github.com+51754783+coreyashford at openjdk.java.net Tue Oct 6 18:22:07 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Tue, 6 Oct 2020 18:22:07 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> Message-ID: <5HeCT4pIrQCiAKTemQtl9pu8IDADDLgsQZ2eoDeSbhw=.d6a76689-97ca-4a0e-9ac7-5398b3774de4@github.com> On Tue, 6 Oct 2020 18:13:05 GMT, Roger Riggs wrote: > We'll need a HotSpot Reviewer to approve too. I have requested help from the team here. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From coleenp at openjdk.java.net Tue Oct 6 18:45:06 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 6 Oct 2020 18:45:06 GMT Subject: RFR: 8254084: Remove TemplateTable::pd_initialize In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 17:14:42 GMT, Claes Redestad wrote: > TemplateTable::pd_initialize is empty on all platforms, and has been so for some time. I suggest removing them. Echo @TheRealMDoerr wrt templateTable.hpp. Also trivial. ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/529 From coleenp at openjdk.java.net Tue Oct 6 18:51:13 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 6 Oct 2020 18:51:13 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp In-Reply-To: References: Message-ID: <9Jtsh5LwxmOjwWfTbm3_yLvjcx3A70euu2qOs6Yh8UU=.18e07b33-1d63-4093-9c19-0a2384a699b8@github.com> On Tue, 6 Oct 2020 16:25:58 GMT, Daniel D. Daugherty wrote: >> This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a >> new file. Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where >> needed. All functions are moved and not modified except for qualification. I also added a delegating constructor to >> JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize. >> Tested with tier1-6 and builds on arm32, ppc, s390 and zero. > > src/hotspot/os/linux/os_linux.cpp line 2034: > >> 2032: if (!_stack_is_executable) { >> 2033: for (JavaThreadIteratorWithHandle jtiwh; JavaThread *jt = jtiwh.next(); ) { >> 2034: StackOverflow* sto = jt->stack_overflow_state(); > > So why use `sto` here when you use `overflow_state` above? > The GitHub review UI makes this difference more obvious than webrev... > > Update: In other source files, you use `overflow_state`. I'll change it to overflow_state. I should be consistent. Fixed. > src/hotspot/share/runtime/stackOverflow.cpp line 2: > >> 1: /* >> 2: * Copyright (c) 1997, 2020, Oracle and/or its affiliates. All rights reserved. > > You have this copyright as a range of years while stackOverflow.hpp is just 2020. oops. fixed. > src/hotspot/share/runtime/stackOverflow.hpp line 37: > >> 35: friend class JavaThread; >> 36: public: >> 37: // State of the stack guard pages for this thread. > > The "this thread" part no longer reads as well... > I don't have a suggested rewording... 
// State of the stack guard pages for the containing thread. ? > src/hotspot/share/runtime/thread.cpp line 2954: > >> 2952: >> 2953: void JavaThread::frames_do(void f(frame*, const RegisterMap* map)) { >> 2954: // ignore is there is no stack > > typo - s/is there/if there/ Fixed. That code block showed up in the diff since I moved the function just above the one that called it. ------------- PR: https://git.openjdk.java.net/jdk/pull/522 From redestad at openjdk.java.net Tue Oct 6 18:54:16 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Tue, 6 Oct 2020 18:54:16 GMT Subject: RFR: 8254084: Remove TemplateTable::pd_initialize [v2] In-Reply-To: References: Message-ID: > TemplateTable::pd_initialize is empty on all platforms, and has been so for some time. I suggest removing them. Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: Cleanup templateTable.hpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/529/files - new: https://git.openjdk.java.net/jdk/pull/529/files/b50edaf9..0575812c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=529&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=529&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/529.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/529/head:pull/529 PR: https://git.openjdk.java.net/jdk/pull/529 From redestad at openjdk.java.net Tue Oct 6 18:54:17 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Tue, 6 Oct 2020 18:54:17 GMT Subject: RFR: 8254084: Remove TemplateTable::pd_initialize [v2] In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 18:16:06 GMT, Martin Doerr wrote: > Thanks for cleaning this up. 
Should get removed from templateTable.hpp, too. Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/529 From neliasso at openjdk.java.net Tue Oct 6 18:55:11 2020 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Tue, 6 Oct 2020 18:55:11 GMT Subject: RFR: 8252847: Optimize primitive arrayCopy stubs using AVX-512 masked instructions [v5] In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 20:48:10 GMT, Vladimir Kozlov wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> 8252847 : Modifying file permission to resolve jcheck failure. > > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 1264: > >> 1262: } >> 1263: >> 1264: #ifndef PRODUCT > > macroAssembler_x86.hpp has become big. Maybe we should start thinking about splitting arraycopy stubs into a separate file. But let's do that in another change. It is good that the AVX3 case is separated out in this change - makes it easy to follow. ------------- PR: https://git.openjdk.java.net/jdk/pull/61 From coleenp at openjdk.java.net Tue Oct 6 19:03:07 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 6 Oct 2020 19:03:07 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 16:44:52 GMT, Daniel D. Daugherty wrote: >> This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a >> new file. Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where >> needed. All functions are moved and not modified except for qualification. I also added a delegating constructor to >> JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize. >> Tested with tier1-6 and builds on arm32, ppc, s390 and zero. > > Really nice refactoring and cleanup. Thumbs up.
> > Any idea whether the change in compilation unit will have any performance effects? To answer the performance question, since it's not an actual indirection, the compiler should be smart enough to adjust the offset of the field to reflect its offset in stack_overflow_state. So this wouldn't make a difference in generated code. Even if it did, there are few places where stack overflow is accessed via thread and these are generally during exception handling. The stack overflow checking code mostly uses the static shadow, yellow and red sizes. ------------- PR: https://git.openjdk.java.net/jdk/pull/522 From coleenp at openjdk.java.net Tue Oct 6 19:18:24 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 6 Oct 2020 19:18:24 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v2] In-Reply-To: References: Message-ID: > This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a > new file. Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where > needed. All functions are moved and not modified except for qualification. I also added a delegating constructor to > JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize. > Tested with tier1-6 and builds on arm32, ppc, s390 and zero. 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: 8253717: Relocate stack overflow code out of thread.hpp/cpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/522/files - new: https://git.openjdk.java.net/jdk/pull/522/files/9905e355..75ddb562 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=522&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=522&range=00-01 Stats: 7 lines in 4 files changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/522.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/522/head:pull/522 PR: https://git.openjdk.java.net/jdk/pull/522 From dcubed at openjdk.java.net Tue Oct 6 19:42:10 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Tue, 6 Oct 2020 19:42:10 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v2] In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 19:18:24 GMT, Coleen Phillimore wrote: >> This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a >> new file. Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where >> needed. All functions are moved and not modified except for qualification. I also added a delegating constructor to >> JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize. >> Tested with tier1-6 and builds on arm32, ppc, s390 and zero. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > 8253717: Relocate stack overflow code out of thread.hpp/cpp Thumbs up! ------------- Marked as reviewed by dcubed (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/522 From mdoerr at openjdk.java.net Tue Oct 6 20:00:06 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Tue, 6 Oct 2020 20:00:06 GMT Subject: RFR: 8254084: Remove TemplateTable::pd_initialize [v2] In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 18:54:16 GMT, Claes Redestad wrote: >> TemplateTable::pd_initialize is empty on all platforms, and has been so for some time. I suggest removing them. > > Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: > > Cleanup templateTable.hpp Marked as reviewed by mdoerr (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/529 From akozlov at openjdk.java.net Tue Oct 6 20:13:16 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 6 Oct 2020 20:13:16 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v3] In-Reply-To: References: Message-ID: > Please review an updated RFR from https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041463.html > > On macOS, MAP_JIT cannot be used with MAP_FIXED[1]. So pd_reserve_memory have to provide MAP_JIT for mmap(NULL, > PROT_NONE), the function was made aware of exec permissions. > For executable and data regions, pd_commit_memory only unlocks the memory with mprotect, this should make no difference > compared with old code. > For data regions, pd_uncommit_memory still uses a new overlapping anonymous mmap which returns pages to the OS and > immediately reflects this in diagnostic tools like ps. For executable regions it would require MAP_FIXED|MAP_JIT, so > instead madvise(MADV_FREE)+mprotect(PROT_NONE) are used. They should also allow OS to reclaim pages, but apparently > this does not happen immediately. In practice, it should not be a problem for executable regions, as codecache does not > shrink (if I haven't missed anything, by the implementation and in principle). 
Tested: > * local tier1 > * jdk-submit > * codesign[2] with hardened runtime and allow-jit but without > allow-unsigned-executable-memory entitlements[3] produce a working bundle. > > (adding GC group as suggested by @dholmes-ora) > > > [1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_mman.c#L227 > [2] > > codesign \ > --sign - \ > --options runtime \ > --entitlements ents.plist \ > --timestamp \ > $J/bin/* $J/lib/server/*.dylib $J/lib/*.dylib > [3] > > > > > com.apple.security.cs.allow-jit > > com.apple.security.cs.disable-library-validation > > com.apple.security.cs.allow-dyld-environment-variables > > > Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: Bookkeeping without interface changes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/294/files - new: https://git.openjdk.java.net/jdk/pull/294/files/f8664ca7..0016bc4a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=294&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=294&range=01-02 Stats: 105 lines in 6 files changed: 75 ins; 12 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/294.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/294/head:pull/294 PR: https://git.openjdk.java.net/jdk/pull/294 From neliasso at openjdk.java.net Tue Oct 6 20:21:09 2020 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Tue, 6 Oct 2020 20:21:09 GMT Subject: RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v4] In-Reply-To: References: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com> Message-ID: On Fri, 2 Oct 2020 08:40:58 GMT, Jason Tatton wrote: >> This is an implementation of the indexOf(char) intrinsic for StringLatin1 (1 byte encoded Strings). It is provided for >> x86 and ARM64. The implementation is greatly inspired by the indexOf(char) intrinsic for StringUTF16. 
To incorporate it >> I had to make a small change to StringLatin1.java (refactor of functionality to intrinsified private method) as well as >> code for C2. Submitted to: hotspot-compiler-dev and core-libs-dev as this patch contains a change to hotspot and >> java/lang/StringLatin1.java https://bugs.openjdk.java.net/browse/JDK-8173585 >> >> Details of testing: >> ============ >> I have created a jtreg test "compiler/intrinsics/string/TestStringLatin1IndexOfChar" to cover this new intrinsic. Note >> that, particularly for the x86 implementation of the intrinsic, the code path taken is dependent upon the length of the >> input String. Hence the test has been designed to cover all these cases. In summary they are: >> - A "short" string of < 16 characters. >> - A SIMD String of 16 - 31 characters. >> - An AVX2 SIMD String of 32 characters+. >> >> Hardware used for testing: >> ----------------------------- >> >> - Intel Xeon CPU E5-2680 (JVM did not recognize this as having AVX2 support) - Intel i7 processor (with AVX2 support). >> - AWS Graviton 2 (ARM 64 processor). >> >> I also ran "run-test-tier1" and "run-test-tier2" for: x86_64 and aarch64. >> >> Possible future enhancements: >> ==================== >> For the x86 implementation there may be two further improvements we can make in order to improve performance of both >> the StringUTF16 and StringLatin1 indexOf(char) intrinsics: >> 1. Make use of AVX-512 instructions. >> 2. For "short" Strings (see below), I think it may be possible to modify the existing algorithm to still use SSE SIMD >> instructions instead of a loop. >> Benchmark results: >> ============ >> **Without** the new StringLatin1 indexOf(char) intrinsic: >> >> | Benchmark | Mode | Cnt | Score | Error | Units | >> | ------------- | ------------- |------------- |------------- |------------- |------------- | >> | IndexOfBenchmark.latin1_mixed_char | avgt | 5 | **26,389.129** | ± 
182.581 | ns/op | >> | IndexOfBenchmark.utf16_mixed_char | avgt | 5 | 17,885.383 | ± 435.933 | ns/op | >> >> >> **With** the new StringLatin1 indexOf(char) intrinsic: >> >> | Benchmark | Mode | Cnt | Score | Error | Units | >> | ------------- | ------------- |------------- |------------- |------------- |------------- | >> | IndexOfBenchmark.latin1_mixed_char | avgt | 5 | **17,875.185** | ± 407.716 | ns/op | >> | IndexOfBenchmark.utf16_mixed_char | avgt | 5 | 18,292.802 | ± 167.306 | ns/op | >> >> >> The objective of the patch is to bring the performance of StringLatin1 indexOf(char) in line with StringUTF16 >> indexOf(char) for x86 and ARM64. We can see above that this has been achieved. Similar results were obtained when >> running on ARM. > > Jason Tatton has updated the pull request incrementally with one additional commit since the last revision: > > 8173585: Intrinsify StringLatin1.indexOf(char) > > Rewrite of unit test and newlines added to end of files > > Changes to unit test: > - main test adjusted such that Strings generated are much longer (up to > 2048 characters) and of the form: azaza, aazaazaa, aaazaaazaaa, etc with > 'z' being the character searched for. Multiple instances of the > search character are included in the String in order to validate that > the starting offset is correctly handled. Results are compared to non > intrinsified version of the code. Longer strings mean that the looping > functionality of the various paths is entered into. > - Run configurations introduced such that it checks behaviour where use > of SSE and AVX instructions is restricted. > - Tier4InvocationThreshold adjusted so as to ensure C2 code is invoked. 
> > Other changes: > - newlines added at end of files test/hotspot/jtreg/compiler/intrinsics/string/TestStringLatin1IndexOfChar.java line 25: > 23: import jdk.test.lib.Asserts; > 24: > 25: public class TestStringLatin1IndexOfChar{ Can you please add testing for these edge cases: - when the search char is the first char - when the search char is the last char - when the string has length 1 ------------- PR: https://git.openjdk.java.net/jdk/pull/71 From redestad at openjdk.java.net Tue Oct 6 20:27:14 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Tue, 6 Oct 2020 20:27:14 GMT Subject: Integrated: 8254084: Remove TemplateTable::pd_initialize In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 17:14:42 GMT, Claes Redestad wrote: > TemplateTable::pd_initialize is empty on all platforms, and has been so for some time. I suggest removing them. This pull request has now been integrated. Changeset: 6712f8ca Author: Claes Redestad URL: https://git.openjdk.java.net/jdk/commit/6712f8ca Stats: 35 lines in 7 files changed: 0 ins; 34 del; 1 mod 8254084: Remove TemplateTable::pd_initialize Reviewed-by: mdoerr, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/529 From iignatyev at openjdk.java.net Tue Oct 6 20:38:09 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Tue, 6 Oct 2020 20:38:09 GMT Subject: Integrated: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR In-Reply-To: References: Message-ID: On Tue, 29 Sep 2020 00:06:02 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review the patch which updates `jdk.test.lib.Utils` to use md5 hash-sum of `java.vm.version` property > as default seed for `Utils.RANDOM_GENERATOR`? > from JBS: >> using the same seed for all runs of a build will make it possible (easier) to compare results from different test runs >> (e.g. on different platforms, w/ different flags) and consequently will make test results analysis easier. 
the proposed >> solution is to use the seed based on Runtime.version() / "java.vm.version", which are different from build to build, if >> there is no seed specified by "jdk.test.lib.random.seed" property. > > the patch also updates `RandomGeneratorTest` test, so it expects now that the same values are generated if no seed is > provided. > testing: > - tier1 > - `test/lib-test/jdk/test/lib/` against personal build on linux,windows,macos-x64 > - `test/lib-test/jdk/test/lib/` against CI build on linux,windows,macos-x64 This pull request has now been integrated. Changeset: ac772cd9 Author: Igor Ignatyev URL: https://git.openjdk.java.net/jdk/commit/ac772cd9 Stats: 58 lines in 2 files changed: 44 ins; 5 del; 9 mod 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR Reviewed-by: rriggs ------------- PR: https://git.openjdk.java.net/jdk/pull/391 From iignatyev at openjdk.java.net Tue Oct 6 20:38:08 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Tue, 6 Oct 2020 20:38:08 GMT Subject: RFR: 8253750: use build-stable default seed for Utils.RANDOM_GENERATOR [v3] In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 13:35:14 GMT, Roger Riggs wrote: >> Igor Ignatyev has updated the pull request incrementally with one additional commit since the last revision: >> >> use random seed for personal/internal builds > > Marked as reviewed by rriggs (Reviewer). Thanks, Roger. ------------- PR: https://git.openjdk.java.net/jdk/pull/391 From minqi at openjdk.java.net Tue Oct 6 20:46:17 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Tue, 6 Oct 2020 20:46:17 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v12] In-Reply-To: References: Message-ID: <9emWKl6fr-GA5LN0uHhuEd5D123QcoCiHQR1M9bAbag=.cc4b6129-8b33-47e4-a421-9e6b4817933b@github.com> > This patch is reorganized after 8252725, which is separated from this patch to refactor jlink plugin code. 
The previous > webrev with hg can be found at: http://cr.openjdk.java.net/~minqi/2020/8247536/webrev-05. With 8252725 integrated, the > regeneration of holder classes is simply to call the newly added GenerateJLIClassesHelper.cdsGenerateHolderClasses > function. Tests: tier1-4 Yumin Qi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 23 commits: - Added new separate function to CDS for logging species and modified the existing function to log lambda form invokers. Changed isDumpLoadedClassList to a reasonable name isDumpingClassList as read only in CDS. - Merge branch 'master' of https://github.com/openjdk/jdk into jdk-8247536 - Removed unused imports. - Fixed comments with correct class and method name in CDS, removed unused variables after last change. - Moved and renamed cdsGenerateHolderClasses from GenerateJLIClassesHelp to CDS as generateLambdaFormHolderClasses. Added input verification function in CDS before class generation. Added more test scenarios. Removed trailing unused ending words for output of lambda form trace line in case of DumpLoadedClassList. - Move the check work to java, restore code in VM. Modified test code according to the changes. The invoke name verification is not implemented since not all the holder classes are processed, and not all the functions of processed holder classes are added. For a holder class with DirectMethodHandle in its name, only the name in the DMH_METHOD_TYPE_MAP keyset is added; the line with other names just gets skipped silently. This makes the verification on invoke names difficult; a name not in the keyset should not fail the test. Also add a boolean to cdsGenerateHolderClasses to indicate call path. - Remove trailing word of line which is not used in holder class regeneration. There is a trailing LF (Line Feed) so trim white spaces from both front and end of the line or it will fail method type validation. 
- In case of exception happens during reloading class, CHECK will return without free the allocated buffer for class bytes so moved the buffer allocation and freeing to caller. Also removed test 6 since there is not guarantee that we can give a signature which will always fail. Additional changes to GenerateJLIClassesHelper according to review suggestion. - Merge branch 'master' of https://github.com/openjdk/jdk into jdk-8247536 - Merge branch 'master' of https://git.openjdk.java.net/jdk into jdk-8247536 - ... and 13 more: https://git.openjdk.java.net/jdk/compare/82fe023b...f5584dcf ------------- Changes: https://git.openjdk.java.net/jdk/pull/193/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=193&range=11 Stats: 567 lines in 21 files changed: 545 ins; 14 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/193.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/193/head:pull/193 PR: https://git.openjdk.java.net/jdk/pull/193 From iignatyev at openjdk.java.net Tue Oct 6 20:53:12 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Tue, 6 Oct 2020 20:53:12 GMT Subject: RFR: 8254095: remove jdk.test.lib.Utils::distro() method Message-ID: Hi all, could you please review this trivial cleanup? from JBS: > jdk.test.lib.Utils::distro() is not used by any of the tests and can be removed. 
Thanks, -- Igor ------------- Commit messages: - 8254095: remove jdk.test.lib.Utils::distro() method Changes: https://git.openjdk.java.net/jdk/pull/532/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=532&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254095 Stats: 11 lines in 1 file changed: 0 ins; 11 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/532.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/532/head:pull/532 PR: https://git.openjdk.java.net/jdk/pull/532 From iignatyev at openjdk.java.net Tue Oct 6 21:01:11 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Tue, 6 Oct 2020 21:01:11 GMT Subject: RFR: 8254096: remove jdk.test.lib.Utils::getMandatoryProperty(String) method Message-ID: Hi all, could you please review this small and trivial cleanup that removes `getMandatoryProperty` method from `jdk.test.lib.Utils` as it's unused? Thanks, -- Igor ------------- Commit messages: - 8254096: remove jdk.test.lib.Utils::getMandatoryProperty(String) method Changes: https://git.openjdk.java.net/jdk/pull/533/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=533&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254096 Stats: 13 lines in 1 file changed: 0 ins; 13 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/533.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/533/head:pull/533 PR: https://git.openjdk.java.net/jdk/pull/533 From akozlov at openjdk.java.net Tue Oct 6 21:06:10 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 6 Oct 2020 21:06:10 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v3] In-Reply-To: <_XaA5cQEInPMn5Q5gj2y7AFCRprFQiYfI6BeUN49FhA=.9f17ae05-b37e-4f40-a83f-fd34aa812575@github.com> References: <6iVRP-20baz0_46SouR-dj9SyspR5QvaL9iJMdeipDE=.92688b4e-ebd3-4681-8e63-a4aee752c407@github.com> <_XaA5cQEInPMn5Q5gj2y7AFCRprFQiYfI6BeUN49FhA=.9f17ae05-b37e-4f40-a83f-fd34aa812575@github.com> Message-ID: On Tue, 6 Oct 2020 
16:38:47 GMT, Thomas Stuefe wrote: >> Hi @tstuefe, >> >> Recent refactors interfered with the previous version of the patch, I found it is a bit simpler to start from scratch. >> https://github.com/openjdk/jdk/pull/294/commits/f8664ca7dcc1cfdfb9a1f032035f2cde77048649 is a minimal patch that allows >> MAP_JIT. I cannot see any way to simplify the interface now. Please tell me if I miss something. Also, `VirtualSpace` >> now choose from `reserve_memory_with_fd` and anything that is used for reserve with exec. What I found in the AIX >> implementation, the commit/uncommit are for memory previously reserved >> https://github.com/openjdk/jdk/blob/master/src/hotspot/os/aix/os_aix.cpp#L2227. If we attribute reserved region with >> exec, then uncommit can query bookkeeping info and get the type of memory. From options of internal bookkeeping info >> or an asking the system for the mmap flags, I prefer the first one. It is faster and we can bookkeep only minimal info, >> like a list of executable regions. That should reduce the amount of data and make a check for the type in >> commit/uncommit faster. After looking around, it does look like every place where commit with exec is used, we know >> enough to use reserve/uncommit with exec. Taking this to the extreme, I still think a specialized set of >> reserve/commit/uncommit for executable regions would look natural. For example, commit with exec is used in less than >> five places. I'll do a little research there. > >> Hi @tstuefe, >> >> Recent refactors interfered with the previous version of the patch, I found it is a bit simpler to start from scratch. >> [f8664ca](https://github.com/openjdk/jdk/commit/f8664ca7dcc1cfdfb9a1f032035f2cde77048649) is a minimal patch that >> allows MAP_JIT. I cannot see any way to simplify the interface now. Please tell me if I miss something. Also, >> `VirtualSpace` now choose from `reserve_memory_with_fd` and anything that is used for reserve with exec. 
What I found >> in the AIX implementation, the commit/uncommit are for memory previously reserved >> https://github.com/openjdk/jdk/blob/master/src/hotspot/os/aix/os_aix.cpp#L2227. If we attribute reserved region with >> exec, then uncommit can query bookkeeping info and get the type of memory. > > On AIX there is no explicit commit, committing happens on touch and uncommitting unfortunately does not really work > (there is a disclaim() but it's buggy). All commit does is to check the input parameters and maybe explicitly touch the > area. >> >> From options of internal bookkeeping info or an asking the system for the mmap flags, I prefer the first one. It is >> faster and we can bookkeep only minimal info, like a list of executable regions. That should reduce the amount of data >> and make a check for the type in commit/uncommit faster. After looking around, it does look like every place where >> commit with exec is used, we know enough to use reserve/uncommit with exec. Taking this to the extreme, I still think a >> specialized set of reserve/commit/uncommit for executable regions would look natural. For example, commit with exec is >> used in less than five places. I'll do a little research there. > > I thought a lot about this. > > As you said, I believe now that specifying exec (or other parameters, see below) is not necessary on a per-commit basis; on > a per-mapping level it should be enough. > I also see at least three separate cases where we establish a mapping and later need mapping-specific information > somewhere until the next interaction - be it commit/uncommit or release: > 1) On AIX, where we decide at os::reserve_memory time to use either SystemV or mmap and later need to know > 2) On Linux when THP are active the information "Use THP" is handed down to os::commit via the weird alignment_hint > parameter and that os::realign_memory() function. Following that parameter flow, it is just basically a way to flag the > code to do one extra madvise at commit time. 
3) Your case now, where we need to know if the mapping was supposed to be > executable at commit time. I really think a generic solution would be best. One simple variant would be to return not > a pointer but a handle and let the platform store behind that handle whatever it needs to keep on a per-mapping basis: > handle_t os::reserve_memory(size, input flags); > void* start_address = os::get_reserved_memory_start_address(handle_t) > bool os::commit_memory(handle_t, address, size) > > and so on. The only problem with that is that it would cause a lot of call sites to change, and the callers need to > hold on to that mapping handle. > To be clear, I do not think you should do this with this patch, but I would like your opinion, since you looked at the > code closely. > I'll review the current version presently. Sorry, I had not highlighted that it was a proof-of-concept patch to show API changes. I've pushed another PoC with bookkeeping and no API changes at all. But I don't like the new one either. In the new patch, there is a list of (potentially) executable regions that is updated on commit, when the actual desired (non)exec mode becomes known. If we support mixed exec/non-exec commits in a mapping, then after a non-exec commit a part of the mapping cannot be reverted to a potentially executable one (as we've lost MAP_JIT). Then it can produce some unexpected results under _some_ conditions at runtime, while API users can be unaware of potential issues. A good API should not allow that. > specifying exec ... on per-mapping level it should be enough. With this, it is possible to simplify the implementation without API changes. But it will still be 1) reserve and be prepared for the first exec or non-exec commit 2) on commit, finish reserve and turn the mapping into an exec or non-exec one. All this instead of taking a direct parameter "this is an executable mapping" on reserve. 
The current "commit only knows about exec" is just a leak of implementation details, as before it was only required to know executable mode. Providing exec parameter to reserve will just bring consistency to the interface. Or, a separate interface for exec (code) mappings will serve the same and will be better, as it will simplify the general non-code reserve/commit interface. > I also see at least three separate cases where we establish a mapping and later need mapping-specific information > somewhere until the next interaction - be it commit/uncommit or release: [ AIX SystemV or mmap, Linux THP, macOS > MAP_JIT for code ] Could you explain how the choice between SysV and mmap is made on AIX? It looks like develop(uintx, Use64KPagesThreshold, 0, \ "4K/64K page allocation threshold.") \ ... if (os::vm_page_size() == 4*K) { return reserve_mmaped_memory(bytes, NULL /* requested_addr */); } else { if (bytes >= Use64KPagesThreshold) { return reserve_shmated_memory(bytes, NULL /* requested_addr */); } else { return reserve_mmaped_memory(bytes, NULL /* requested_addr */); } } (there only two calls to reserve_shmated_memory and both of them are like above. Is SysV SHM used in product builds?) For now, the AIX case looks a bit different. The choice is made by the platform and the shared code cannot control this. So yes, I cannot see how to avoid handle_t or similar. In contrast, THP and MAP_JIT are the way to implement a request from the shared code. Even for THP, shared code seems to know why it should "realign" (not sure why commit has an alignment_hint parameter, while it is possible to realign after a regular commit). I assume there is enough context in the shared code that can be provided for platform functions, without a handle_t. And the same context should anyway be provided to reserve function, so handle_t can be filled with all necessary information. 
------------- PR: https://git.openjdk.java.net/jdk/pull/294 From dholmes at openjdk.java.net Tue Oct 6 22:49:09 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 6 Oct 2020 22:49:09 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v2] In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 19:18:24 GMT, Coleen Phillimore wrote: >> This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a >> new file. Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where >> needed. All functions are moved and not modified except for qualification. I also added a delegating constructor to >> JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize. >> Tested with tier1-6 and builds on arm32, ppc, s390 and zero. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > 8253717: Relocate stack overflow code out of thread.hpp/cpp Hi Coleen, The code reorganisation seems okay though I'm not clear on the motivation as stackoverflow protection is a feature of JavaThreads. I have a couple of minor comments. Thanks, David src/hotspot/share/jvmci/vmStructs_jvmci.cpp line 183: > 181: nonstatic_field(JavaThread, _should_post_on_exceptions_flag, > int) \ 182: nonstatic_field(JavaThread, > _jni_environment, JNIEnv) \ 183: > nonstatic_field(JavaThread, _stack_overflow_state._reserved_stack_activation, > address) \ This doesn't look right. I thought these had to be direct entries for fields in given classes - not indirections?? src/hotspot/share/runtime/thread.cpp line 1669: > 1667: _on_thread_list(false), > 1668: DEBUG_ONLY(_java_call_counter(0) COMMA) > 1669: _entry_point(nullptr), I didn't realize we had made the NULL -> nullptr switch yet ?? 
src/hotspot/share/runtime/thread.cpp line 2953: > 2951: // Verification > 2952: > 2953: void JavaThread::frames_do(void f(frame*, const RegisterMap* map)) { Where does this come from ?? src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/GraalHotSpotVMConfig.java line 398: > 396: public final int jvmciCountersThreadOffset = getFieldOffset("JavaThread::_jvmci_counters", Integer.class, > "jlong*"); 397: public final int doingUnsafeAccessOffset = getFieldOffset("JavaThread::_doing_unsafe_access", > Integer.class, "bool", Integer.MAX_VALUE, JVMCI || JDK >= 14); 398: public final int > javaThreadReservedStackActivationOffset = JDK <= 8 ? 0 : > getFieldOffset("JavaThread::_stack_overflow_state._reserved_stack_activation", Integer.class, "address"); // JDK-8046936 Again unclear this can actually work. src/hotspot/share/runtime/stackOverflow.hpp line 33: > 31: class JavaThread; > 32: > 33: class StackOverflow { Can we add a descriptive comment of what this class actually represents please - and the fact it is tied to JavaThreads only. Ideally this would be within the logical namespace of JavaThread so that it doesn't appear to be a standalone/independent entity. Also the name isn't quite right as it doesn't represent an actual stack-overflow but the protection/state mechanism around that - so perhaps StackOverflowState or StackOverflowProtection ? ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/522 From bchristi at openjdk.java.net Tue Oct 6 22:51:06 2020 From: bchristi at openjdk.java.net (Brent Christian) Date: Tue, 6 Oct 2020 22:51:06 GMT Subject: RFR: 8254095: remove jdk.test.lib.Utils::distro() method In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 20:47:10 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this trivial cleanup? from JBS: >> jdk.test.lib.Utils::distro() is not used by any of the tests and can be removed. 
> > Thanks, > -- Igor Looks good to me. ------------- PR: https://git.openjdk.java.net/jdk/pull/532 From bchristi at openjdk.java.net Tue Oct 6 23:01:06 2020 From: bchristi at openjdk.java.net (Brent Christian) Date: Tue, 6 Oct 2020 23:01:06 GMT Subject: RFR: 8254095: remove jdk.test.lib.Utils::distro() method In-Reply-To: References: Message-ID: <_vHhJCsLIGcTr4cEV7zZ-F54HCPs_dApITgRI0SphH8=.ab94bb15-1534-4cfc-8fad-60b2db81ee03@github.com> On Tue, 6 Oct 2020 20:47:10 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this trivial cleanup? from JBS: >> jdk.test.lib.Utils::distro() is not used by any of the tests and can be removed. > > Thanks, > -- Igor Marked as reviewed by bchristi (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/532 From iignatyev at openjdk.java.net Tue Oct 6 23:01:06 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Tue, 6 Oct 2020 23:01:06 GMT Subject: RFR: 8254095: remove jdk.test.lib.Utils::distro() method In-Reply-To: <_vHhJCsLIGcTr4cEV7zZ-F54HCPs_dApITgRI0SphH8=.ab94bb15-1534-4cfc-8fad-60b2db81ee03@github.com> References: <_vHhJCsLIGcTr4cEV7zZ-F54HCPs_dApITgRI0SphH8=.ab94bb15-1534-4cfc-8fad-60b2db81ee03@github.com> Message-ID: On Tue, 6 Oct 2020 22:56:03 GMT, Brent Christian wrote: >> Hi all, >> >> could you please review this trivial cleanup? from JBS: >>> jdk.test.lib.Utils::distro() is not used by any of the tests and can be removed. >> >> Thanks, >> -- Igor > > Marked as reviewed by bchristi (Reviewer). Thanks, Brent. ------------- PR: https://git.openjdk.java.net/jdk/pull/532 From iignatyev at openjdk.java.net Tue Oct 6 23:01:07 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Tue, 6 Oct 2020 23:01:07 GMT Subject: Integrated: 8254095: remove jdk.test.lib.Utils::distro() method In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 20:47:10 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this trivial cleanup? 
from JBS: >> jdk.test.lib.Utils::distro() is not used by any of the tests and can be removed. > > Thanks, > -- Igor This pull request has now been integrated. Changeset: 2a0389a8 Author: Igor Ignatyev URL: https://git.openjdk.java.net/jdk/commit/2a0389a8 Stats: 11 lines in 1 file changed: 0 ins; 11 del; 0 mod 8254095: remove jdk.test.lib.Utils::distro() method Reviewed-by: bchristi ------------- PR: https://git.openjdk.java.net/jdk/pull/532 From coleenp at openjdk.java.net Tue Oct 6 23:08:17 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 6 Oct 2020 23:08:17 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v2] In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 22:07:55 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> 8253717: Relocate stack overflow code out of thread.hpp/cpp > > src/hotspot/share/jvmci/vmStructs_jvmci.cpp line 183: > >> 181: nonstatic_field(JavaThread, _should_post_on_exceptions_flag, >> int) \ 182: nonstatic_field(JavaThread, >> _jni_environment, JNIEnv) \ 183: >> nonstatic_field(JavaThread, _stack_overflow_state._reserved_stack_activation, >> address) \ > > This doesn't look right. I thought these had to be direct entries for fields in given classes - not indirections?? Yes, this does work. I confirmed it on the graal internal slack page with Doug Simon. And there are tests that use this. > src/hotspot/share/runtime/thread.cpp line 1669: > >> 1667: _on_thread_list(false), >> 1668: DEBUG_ONLY(_java_call_counter(0) COMMA) >> 1669: _entry_point(nullptr), > > I didn't realize we had made the NULL -> nullptr switch yet ?? Kim said I should use it: Kim Barrett 10:12 AM Quoting the style guide: nullptr Prefer nullptr (n2431) to NULL. Don't use (constexpr or literal) 0 for pointers. For historical reasons there are widespread uses of both NULL and of integer 0 as a pointer value. 
I don't know how many people actually read the changes I made. > src/hotspot/share/runtime/thread.cpp line 2953: > >> 2951: // Verification >> 2952: >> 2953: void JavaThread::frames_do(void f(frame*, const RegisterMap* map)) { > > Where does this come from ?? It was moved from code above that was in the middle of the stack overflow code. I moved it to right before it's called. > src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/GraalHotSpotVMConfig.java > line 398: >> 396: public final int jvmciCountersThreadOffset = getFieldOffset("JavaThread::_jvmci_counters", Integer.class, >> "jlong*"); 397: public final int doingUnsafeAccessOffset = getFieldOffset("JavaThread::_doing_unsafe_access", >> Integer.class, "bool", Integer.MAX_VALUE, JVMCI || JDK >= 14); 398: public final int >> javaThreadReservedStackActivationOffset = JDK <= 8 ? 0 : >> getFieldOffset("JavaThread::_stack_overflow_state._reserved_stack_activation", Integer.class, "address"); // JDK-8046936 > > Again unclear this can actually work. Also code that I discussed with graal folks. ------------- PR: https://git.openjdk.java.net/jdk/pull/522 From iignatyev at openjdk.java.net Tue Oct 6 23:23:11 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Tue, 6 Oct 2020 23:23:11 GMT Subject: RFR: 8254102: use ProcessHandle::pid instead of ManagementFactory::getRuntimeMXBean to get pid in tests Message-ID: Hi all, could you please review this small cleanup which replaces `ManagementFactory.getRuntimeMXBean().getName().split("@")[0]` w/ `ProcessHandle.current().pid()` to get current process pid? 
Thanks, -- Igor ------------- Commit messages: - update copyright - use ProcessHandle::pid instead of ManagementFactory::getRuntimeMXBean to get pid Changes: https://git.openjdk.java.net/jdk/pull/534/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=534&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254102 Stats: 55 lines in 8 files changed: 0 ins; 41 del; 14 mod Patch: https://git.openjdk.java.net/jdk/pull/534.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/534/head:pull/534 PR: https://git.openjdk.java.net/jdk/pull/534 From coleenp at openjdk.java.net Tue Oct 6 23:50:22 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 6 Oct 2020 23:50:22 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v3] In-Reply-To: References: Message-ID: > This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a > new file. Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where > needed. All functions are moved and not modified except for qualification. I also added a delegating constructor to > JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize. > Tested with tier1-6 and builds on arm32, ppc, s390 and zero. 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: fix comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/522/files - new: https://git.openjdk.java.net/jdk/pull/522/files/75ddb562..722eb6f2 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=522&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=522&range=01-02 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/522.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/522/head:pull/522 PR: https://git.openjdk.java.net/jdk/pull/522 From coleenp at openjdk.java.net Tue Oct 6 23:50:23 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 6 Oct 2020 23:50:23 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v2] In-Reply-To: References: Message-ID: <6CZ-nZKr_-ss6SOndXzE5x1X5Xo_qgRbKU2ZvxnDr8A=.4f525343-7e46-416b-9b01-8bac6fc89978@github.com> On Tue, 6 Oct 2020 22:42:53 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> 8253717: Relocate stack overflow code out of thread.hpp/cpp > > src/hotspot/share/runtime/stackOverflow.hpp line 33: > >> 31: class JavaThread; >> 32: >> 33: class StackOverflow { > > Can we add a descriptive comment of what this class actually represents please - and the fact it is tied to JavaThreads > only. Ideally this would be within the logical namespace of JavaThread so that it doesn't appear to be a > standalone/independent entity. Also the name isn't quite right as it doesn't represent an actual stack-overflow but the > protection/state mechanism around that - so perhaps StackOverflowState or StackOverflowProtection ? I could rename it StackOverflowState but I'd rather not. 
Just StackOverflow looks better in the static names, and the places where it refers to the state variables of JavaThread uses the stack_overflow_state() name. The static names StackOverflow::stack_shadow_zone_size() for example look really nice in the code without any further qualification. How about this as a comment: // StackOverflow handling is encapsulated in this class. This class contains state variables // for each JavaThread that implement stack overflow checking and guard page implementation. ------------- PR: https://git.openjdk.java.net/jdk/pull/522 From ysuenaga at openjdk.java.net Wed Oct 7 00:51:06 2020 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 7 Oct 2020 00:51:06 GMT Subject: RFR: 8253757: Add LLVM-based backend for hsdis In-Reply-To: References: <91erxiMDb4ftvSomuJYHPi9SX-v8Z2VLD2qEwCbz5tk=.b9ed01b5-f0e0-4ed7-9c1a-b06bc0e64640@github.com> Message-ID: <8Eqswd7tsVaGEXHdKDncXqKpW2tBsSeuY0PV6aTB9_c=.a6cf4957-9d31-4e89-bf44-e7b7852205d5@github.com> On Fri, 2 Oct 2020 11:44:51 GMT, Magnus Ihse Bursie wrote: >> @navyxliu I've merged the sources into `src/utils/hsdis` and added support to build it in the Makefile. > > This is an interesting suggestion. There is a similar attempt at replacing binutils with capstone in > https://bugs.openjdk.java.net/browse/JDK-8188073, which unfortunately has not seen much progress due to lack of > resources; I don't know if you are aware of that? There is also a (extremely low priority) effort to rewrite the hsdis > makefile to be part of the normal build system, see e.g. https://bugs.openjdk.java.net/browse/JDK-8208495. Neither of > these should be any blocker for your change, but I think it might be good if you know about them. I have couple of > concerns with your patch. One is the method in which LLVM is selected instead of binutils; afaict this depends on > having the `LLVM` variable set when executing the makefile. At the very least, this should be documented in the README. 
> I don't think any more complicated configuration is really necessary at this point. With full integration with the > build system, a more user-friendly way of selecting hsdis backend should be implemented, though. Second, and I don't > know if this is an artifact of git/github/the new skara tooling, but if you renamed hsdis.c to hsdis.cpp, this > relationship does not show up, not even in the generated webrevs. Instead they are considered a new + a deleted file. > This makes it hard to see what code changes you have done in that file. And third; have you tested that your changes > (both changing the main file from C to C++, and any code changes in it) does not break the old binutils functionality? > Afaic there are no test suites for exercising hsdis :-( so manual ad-hoc testing is likely needed. Can you separate LLVM and binutils from hsdis.cpp? I guess you say that the problem is both GCC and binutils are not available on Windows AArch64. Is it right? 1 question: binutils seems to support Windows AArch64. Did you try recently binutils? If we can use binutils on Windows AArch64, you can fix makefile only. https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=binutils/dlltool.c;h=ed016b97dc38cdb1b85d2f6df676b9c9750f0d41;hb=HEAD#l248 ------------- PR: https://git.openjdk.java.net/jdk/pull/392 From dholmes at openjdk.java.net Wed Oct 7 01:14:08 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 7 Oct 2020 01:14:08 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v3] In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 23:50:22 GMT, Coleen Phillimore wrote: >> This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a >> new file. Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where >> needed. All functions are moved and not modified except for qualification. 
I also added a delegating constructor to >> JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize. >> Tested with tier1-6 and builds on arm32, ppc, s390 and zero. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > fix comments Thanks for the updates and feedback on other queries! ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/522 From rriggs at openjdk.java.net Wed Oct 7 02:02:05 2020 From: rriggs at openjdk.java.net (Roger Riggs) Date: Wed, 7 Oct 2020 02:02:05 GMT Subject: RFR: 8254102: use ProcessHandle::pid instead of ManagementFactory::getRuntimeMXBean to get pid in tests In-Reply-To: References: Message-ID: <5OaD6gBX9nmmX_N690gLnHefPo4bk9z3D-WqGwD3kXA=.32b936f9-c795-49c6-b158-90845eadcd45@github.com> On Tue, 6 Oct 2020 23:08:40 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this small cleanup which replaces > `ManagementFactory.getRuntimeMXBean().getName().split("@")[0]` w/ `ProcessHandle.current().pid()` to get current > process pid? Thanks, > -- Igor All of these changes can call `ProcessHandle.current().toString()` to return pid of the current process. test/failure_handler/test/sanity/Suicide.java line 36: > 34: String osName = System.getProperty("os.name"); > 35: if (osName.contains("Windows")) { > 36: cmd = "taskkill.exe /F /PID " + pidStr; This can be simplified to ProcessHandle.current().toString(). It returns the pid of the process as a string. Explicitly converting it to a string is not necessary. The "+" concatenation would convert the number to a string. ------------- Changes requested by rriggs (Reviewer). 
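The point about string concatenation in the review comment above can be sketched as follows. This is a minimal illustration, not the code from the patch: the class name `KillCommand` and the method `build` are hypothetical, and the non-Windows `kill -9` branch is an assumption, since the quoted hunk shows only the Windows command.

```java
// Hypothetical sketch; only the Windows command string appears in the quoted hunk.
class KillCommand {
    // Builds the command line used to kill the process with the given pid.
    static String build(String osName, long pid) {
        if (osName.contains("Windows")) {
            // The "+" operator converts the long pid to its decimal string form,
            // so no explicit String conversion is needed here.
            return "taskkill.exe /F /PID " + pid;
        }
        // Assumption: a POSIX-style kill on every other platform.
        return "kill -9 " + pid;
    }

    public static void main(String[] args) {
        long pid = ProcessHandle.current().pid();
        System.out.println(build(System.getProperty("os.name"), pid));
    }
}
```

Keeping the pid as a `long` until the concatenation avoids depending on any particular `toString()` format.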
PR: https://git.openjdk.java.net/jdk/pull/534 From sspitsyn at openjdk.java.net Wed Oct 7 04:31:16 2020 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Wed, 7 Oct 2020 04:31:16 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v3] In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 14:06:35 GMT, Richard Reingruber wrote: >> Hi Serguei, >> >> thanks for providing feedback! I've pushed the changes based on it now but I >> have not yet merged master again. This needs a little work... >> >> Please find my replies to your comments below. >> >> Thanks, Richard. >> >>> Could you consider to place the classes EscapeBarrier and JvmtiDeferredUpdates >>> into theyr own .hpp/.cpp files? The class JvmtiDeferredUpdates would be better >>> to put into the folder 'prims' then. >> >> Done. In addition I moved preexisting class jvmtiDeferredLocalVariableSet and >> class jvmtiDeferredLocalVariable from runtime/vframe_hp.hpp to >> prims/jvmtiDeferredUpdates.hpp. Please let me know if not ok. >> >>> src/hotspot/share/opto/macro.cpp: >>> >>> ``` >>> @@ -1091,11 +1091,11 @@ >>> bool PhaseMacroExpand::eliminate_allocate_node(AllocateNode *alloc) { >>> // Don't do scalar replacement if the frame can be popped by JVMTI: >>> // if reallocation fails during deoptimization we'll pop all >>> // interpreter frames for this compiled frame and that won't play >>> // nice with JVMTI popframe. >>> - if (!EliminateAllocations || JvmtiExport::can_pop_frame() || !alloc->_is_non_escaping) { >>> + if (!EliminateAllocations || !alloc->_is_non_escaping) { >>> return false; >>> } >>> ``` >>> >>> I wonder if the comment is still correct after you removed the check for JvmtiExport::can_pop_frame(). >> >> Good catch. 
I fixed it previously with >> https://github.com/openjdk/jdk/pull/119/commits/18dd54b4e6f17ca723e4ae1a1e8dc57e81878dd3 >> >>> src/hotspot/share/runtime/deoptimization.hpp: >>> >>> ``` >>> + EscapeBarrier(JavaThread* calling_thread, JavaThread* deoptee_thread, bool barrier_active) >>> + : _calling_thread(calling_thread), _deoptee_thread(deoptee_thread), >>> + _barrier_active(barrier_active && (JVMCI_ONLY(UseJVMCICompiler) NOT_JVMCI(false) >>> + COMPILER2_PRESENT(|| DoEscapeAnalysis))) >>> . . . . . . . . . >>> + >>> + // Revert ea based optimizations for all java threads >>> + EscapeBarrier(JavaThread* calling_thread, bool barrier_active) >>> + : _calling_thread(calling_thread), _deoptee_thread(NULL), >>> ``` >>> >>> Nit: would better to make the parameter deoptee_thread to be the 3rd to better mach the seconf constructor. >> >> I have shuffled the parameters and moved barrier_active at first position. Would >> that be ok? >> >>> >>> ``` >>> + bool all_threads() const { return _deoptee_thread == NULL; } // Should revert optimizations for all >>> threads. + bool self_deopt() const { return _calling_thread == _deoptee_thread; } // Current thread deoptimizes >>> its own objects. + bool barrier_active() const { return _barrier_active; } // Inactive barriers are >>> created if no local objects can escape. ``` >>> >>> I'd suggest to put comments in a line before function definitions as it is done for other declarations/definitions. >> >> Done. 
// Note that there are quite a few locations with the comment on the same line ;) >> >>> src/hotspot/share/runtime/deoptimization.cpp: >>> >>> ``` >>> @@ -349,12 +408,12 @@ >>> >>> // Now that the vframeArray has been created if we have any deferred local writes >>> // added by jvmti then we can free up that structure as the data is now in the >>> // vframeArray >>> >>> - if (thread->deferred_locals() != NULL) { >>> - GrowableArray* list = thread->deferred_locals(); >>> + if (JvmtiDeferredUpdates::deferred_locals(thread) != NULL) { >>> + GrowableArray* list = JvmtiDeferredUpdates::deferred_locals(thread); >>> int i = 0; >>> do { >>> // Because of inlining we could have multiple vframes for a single frame >>> // and several of the vframes could have deferred writes. Find them all. >>> if (list->at(i)->id() == array->original().id()) { >>> >>> @@ -365,13 +424,14 @@ >>> } else { >>> i++; >>> } >>> } while ( i < list->length() ); >>> if (list->length() == 0) { >>> - thread->set_deferred_locals(NULL); >>> - // free the list and elements back to C heap. >>> - delete list; >>> + JvmtiDeferredUpdates* updates = thread->deferred_updates(); >>> + thread->set_deferred_updates(NULL); >>> + // free deferred updates. >>> + delete updates; >>> } >>> ``` >>> >>> It is not clear why the 'list' is not deleted anymore. If it is intentional then could you, please, add a comment with >>> an explanation? >> >> 'list' is now embedded in JvmtiDeferredUpdates. It es deleted as part of the >> JvmtiDeferredUpdates instance when there are no more deferred updates. >> >> class JvmtiDeferredUpdates : public CHeapObj { >> >> [...] >> >> // Deferred updates of locals, expressions, and monitors >> GrowableArray _deferred_locals_updates; >> >> [...] >> >> }; >> >> I introduced JvmtiDeferredUpdates because this patch introduces a new type of >> deferred update: _relock_count_after_wait. 
>> >> I tried to improve the encapsulation of class JvmtiDeferredUpdates and >> simplified the location you are referring to. >> >> So when is memory for deferred updates freed? >> >> (A) Deferred local variable updates are deleted when the compiled target frame is >> replaced with corresponding interpreter frames. >> See JvmtiDeferredUpdates::delete_updates_for_frame(). >> >> (B) A thread's JvmtiDeferredUpdates instance is deleted if all updates where >> delivered. All updates where delivered when JvmtiDeferredUpdates::count() >> returns 0. This is checked whenever updates are delivered. See call sites in >> JvmtiDeferredUpdates::delete_updates_for_frame() and >> JvmtiDeferredUpdates::get_and_reset_relock_count_after_wait(). >> >> (C) Besides (B) a thread's JvmtiDeferredUpdates instance is also deleted when >> the thread is destroyed. All not yet delivered updates are deleted then >> too. See JavaThread::~JavaThread() and JvmtiDeferredUpdates::~JvmtiDeferredUpdates(). >> >>> If you are okay to separate the EscapeBarrier class into its own hpp/cpp files >>> then the class EscapeBarrierSuspendHandshake is better to be colocated with >>> it. >> >> Done. >> >>> The below functions EscapeBarrier::sync_and_suspend_one() and do_thread() make a call to the set_obj_deopt_flag() which >>> seems to be a duplication. At least, it is not clear why this duplication exist and so, needs to be explained in a >>> comment. 
``` >>> +void EscapeBarrier::sync_and_suspend_one() { >>> + assert(_calling_thread != NULL, "calling thread must not be NULL"); >>> + assert(_deoptee_thread != NULL, "deoptee thread must not be NULL"); >>> + assert(barrier_active(), "should not call"); >>> + >>> + // Sync with other threads that might be doing deoptimizations >>> + { >>> + // Need to switch to _thread_blocked for the wait() call >>> + ThreadBlockInVM tbivm(_calling_thread); >>> + MonitorLocker ml(_calling_thread, EscapeBarrier_lock, Mutex::_no_safepoint_check_flag); >>> + while (_self_deoptimization_in_progress || _deoptee_thread->is_obj_deopt_suspend()) { >>> + ml.wait(); >>> + } >>> + >>> + if (self_deopt()) { >>> + _self_deoptimization_in_progress = true; >>> + return; >>> + } >>> + >>> + // set suspend flag for target thread >>> + _deoptee_thread->set_obj_deopt_flag(); >>> + } >>> + >>> + // suspend target thread >>> + EscapeBarrierSuspendHandshake sh(NULL, "EscapeBarrierSuspendOne"); >>> + Handshake::execute_direct(&sh, _deoptee_thread); >>> + assert(!_deoptee_thread->has_last_Java_frame() || _deoptee_thread->frame_anchor()->walkable(), >>> + "stack should be walkable now"); >>> +} >>> . . . . . >>> +class EscapeBarrierSuspendHandshake : public HandshakeClosure { >>> + JavaThread* _excluded_thread; >>> + public: >>> + EscapeBarrierSuspendHandshake(JavaThread* excluded_thread, const char* name) : >>> + HandshakeClosure(name), >>> + _excluded_thread(excluded_thread) {} >>> + void do_thread(Thread* th) { >>> + if (th->is_Java_thread() && !th->is_hidden_from_external_view() && (th != _excluded_thread)) { >>> + th->set_obj_deopt_flag(); >>> + } >>> + } >>> +}; >>> ``` >> >> I previously removed the set_obj_deopt_flag() call from >> EscapeBarrierSuspendHandshake::do_thread() in [1]. For synchronization it is >> better to set_obj_deopt_flag() before the handshake (see comment in >> EscapeBarrier::sync_and_suspend_all()). 
>> >> [1] https://github.com/openjdk/jdk/pull/119/commits/18dd54b4e6f17ca723e4ae1a1e8dc57e81878dd3 >> >>> /src/hotspot/share/prims/jvmtiImpl.cpp: >>> >>> ``` >>> 421 // Constructor for non-object getter >>> 422 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, jint depth, jint index, BasicType type) >>> 423 : _thread(thread) >>> 424 , _calling_thread(NULL) >>> 425 , _depth(depth) >>> 426 , _index(index) >>> 427 , _type(type) >>> 428 , _jvf(NULL) >>> 429 , _set(false) >>> 430 , _eb(NULL, NULL, type == T_OBJECT) >>> 431 , _result(JVMTI_ERROR_NONE) >>> 432 { >>> 433 } >>> 434 >>> 435 // Constructor for object or non-object setter >>> 436 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, jint depth, jint index, BasicType type, jvalue value) >>> 437 : _thread(thread) >>> 438 , _calling_thread(NULL) >>> 439 , _depth(depth) >>> 440 , _index(index) >>> 441 , _type(type) >>> 442 , _value(value) >>> 443 , _jvf(NULL) >>> 444 , _set(true) >>> 445 , _eb(JavaThread::current(), thread, type == T_OBJECT) >>> 446 , _result(JVMTI_ERROR_NONE) >>> 447 { >>> 448 } >>> 449 >>> 450 // Constructor for object getter >>> 451 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, JavaThread* calling_thread, jint depth, int index) >>> 452 : _thread(thread) >>> 453 , _calling_thread(calling_thread) >>> 454 , _depth(depth) >>> 455 , _index(index) >>> 456 , _type(T_OBJECT) >>> 457 , _jvf(NULL) >>> 458 , _set(false) >>> 459 , _eb(calling_thread, thread, true) >>> 460 , _result(JVMTI_ERROR_NONE) >>> 461 { >>> 462 } >>> ``` >>> >>> I think, false has to be passed to the constructors of non-object getters instead of expression: >>> "type == T_OBJECT". >>> The type can not be T_OBJECT for non-object getters. >> >> I used to do that. Then I changed it because the c++ compiler can fold the >> comparison to "false" and if somebody changes the non-object getter to get >> objects too then it would still be correct. >> >> Let me know if you still think it is better to pass false. 
Maybe add an >> assertion type == T_OBJECT then? >> >>> Q: Is an EscapeBarrier useful if false is passed as the barrier_active parameter? >> >> The EscapeBarrier is not needed then. In the case of the non-object getter above >> I'd hope that most of the constructor/desctructor of EscapeBarrier is eliminated >> by the c++ compiler then. >> >> Besides the changes you suggested I have made a bugfix in >> test/jdk/com/sun/jdi/EATests.java to prevent ObjectCollectedException. >> >> Thanks, Richard. > > Hi Serguei > (@sspitsyn) > > are you ok with the changes I made based on your comments? > Will you further review the change? > > Thanks, Richard. Hi Richard, Thank you for making the refactoring. I like it more now. :) So, the fix looks good to me in general. But could I ask you, to adjust some formatting, please? There are several things that can be done to improve the code readability. src/hotspot/share/prims/jvmtiDeferredUpdates.hpp: I'd suggest to add an empty line before lines 40, 71, 73, 93, 95, 109 to make class definitions and function declarations/definitions with comments more catchable by eyes. The following lines can be removed: 81, 82, 103 Also, there is inconsistency in function definitions formatting: - some functions have big indent between the type and name - some functions have no indent between the type and name but a big indent between name and body I'd suggest to either to remove all indents or make it reasonably smaller but consistent. It seems, there is no reason to keep these class declarations: 38 class jvmtiDeferredLocalVariable; 108 class jvmtiDeferredLocalVariableSet; src/hotspot/share/prims/jvmtiDeferredUpdates.cpp: 82 // Free deferred updates. 83 // (Note the 'list' of local variable updates is embedded in 'updates') A suggestion to change the line 83 as follows: ` 83 // Note, the 'list' of local variable updates is embedded in 'updates'.` src/hotspot/share/runtime/escapeBarrier.hpp: Add dots at the end of comments at lines 97, 99, 103. 
I'd suggest to add an empty line before lines 39, 40, 80, 81, 93, 94, 99, 119, 121. src/hotspot/share/runtime/escapeBarrier.cpp: The following class declaration is not needed: ` 49 class jvmtiDeferredLocalVariableSet;` because you already added this line: ` 29 #include "prims/jvmtiDeferredUpdates.hpp"` The lines below deserve a refactoring. It can be separate functions for locals, expressions and monitors, or just one function for the whole fragment: 345 GrowableArray* scopeLocals = cvf->scope()->locals(); 346 StackValueCollection* locals = cvf->locals(); 347 if (locals != NULL) { 348 for (int i2 = 0; i2 < locals->size(); i2++) { 349 StackValue* var = locals->at(i2); 350 if (var->type() == T_OBJECT && scopeLocals->at(i2)->is_object()) { 351 jvalue val; 352 val.l = cast_from_oop(locals->at(i2)->get_obj()()); 353 cvf->update_local(T_OBJECT, i2, val); 354 } 355 } 356 } 357 358 // expressions 359 GrowableArray* scopeExpressions = cvf->scope()->expressions(); 360 StackValueCollection* expressions = cvf->expressions(); 361 if (expressions != NULL) { 362 for (int i2 = 0; i2 < expressions->size(); i2++) { 363 StackValue* var = expressions->at(i2); 364 if (var->type() == T_OBJECT && scopeExpressions->at(i2)->is_object()) { 365 jvalue val; 366 val.l = cast_from_oop(expressions->at(i2)->get_obj()()); 367 cvf->update_stack(T_OBJECT, i2, val); 368 } 369 } 370 } 371 372 // monitors 373 GrowableArray* monitors = cvf->monitors(); 374 if (monitors != NULL) { 375 for (int i2 = 0; i2 < monitors->length(); i2++) { 376 if (monitors->at(i2)->eliminated()) { 377 assert(!monitors->at(i2)->owner_is_scalar_replaced(), 378 "reallocation failure, should not update"); 379 cvf->update_monitor(i2, monitors->at(i2)); 380 } 381 } 382 } src/hotspot/share/prims/jvmtiImpl.cpp: 420 // Constructor for non-object getter 421 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, jint depth, jint index, BasicType type) 422 : _thread(thread) 423 , _calling_thread(NULL) 424 , _depth(depth) 425 , 
_index(index) 426 , _type(type) 427 , _jvf(NULL) 428 , _set(false) 429 , _eb(type == T_OBJECT, NULL, NULL) 430 , _result(JVMTI_ERROR_NONE) 431 { 432 } I still think that line 429 is going to cause confusion. It is a non-object getter, so the type should never be T_OBJECT. It won't change in the future to allow the T_OBJECT types. The only way to allow it is to merge the constructors for object and non-object getters. So, I'm suggesting to replace this line with: ` 429 , _eb(false, NULL, NULL)` ------------- PR: https://git.openjdk.java.net/jdk/pull/119 From david.holmes at oracle.com Wed Oct 7 06:39:57 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 7 Oct 2020 16:39:57 +1000 Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v8] In-Reply-To: References: <4sawJHiIuc7oH5ETjrwJtJE3gkB1U2VBMVJdPmxJrg4=.e4e9b4d3-a118-4870-9b5b-f23b351093e2@github.com> Message-ID: Hi Erik, On 6/10/2020 5:37 pm, Erik Österlund wrote: > On Tue, 6 Oct 2020 02:57:00 GMT, David Holmes wrote: > >> Hi Erik, >> Can you give an overview of the use of the "poll word" and its relation to the "poll page" please? >> Thanks, >> David > > Hi David, > > Thanks for reviewing this code. > > There are various polls in the VM. We have runtime transitions, interpreter transitions, transitions at returns, native > wrappers, transitions in nmethods... and sometimes they are a bit different. > > The "poll word" encapsulates enough information to be able to poll for returns (stack watermark barrier), or poll for > normal handshakes/safepoints, with a conditional branch. So really, we could use the "poll word" for every single poll. > A low order bit is a boolean saying if handshake/safepoint is armed, and the rest of the word denotes the watermark for > which frame has armed returns. > > The "poll page" is for polls that do not use conditional branches, but instead uses an indirect load.
It is used still > in nmethod loop polls, because I experimentally found it to perform worse with conditional branches on one machine, and > did not want to risk regressions. It is also used for VM configurations that do not yet support stack watermark > barriers, such as Graal, PPC, S390 and 32 bit platforms. They will hopefully eventually support this mechanism, but > having the poll page allows a more smooth transition. And unless it is crystal clear that the performance of the > conditional branch loop poll really is fast enough on sufficiently many machines, we might keep it until that changes. > > Hope this makes sense. Yes but I am somewhat surprised. The conventional wisdom has always been that polling based on the "poison page" approach far outperforms explicit load-test-branch approaches. Cheers, David > Thanks, > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/296 > From shade at openjdk.java.net Wed Oct 7 06:40:10 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 7 Oct 2020 06:40:10 GMT Subject: RFR: 8254102: use ProcessHandle::pid instead of ManagementFactory::getRuntimeMXBean to get pid in tests In-Reply-To: References: Message-ID: <1vxFjySSXroErkRB9oePKlg09vRS5iKeuGjU5bLlvV0=.95760bdb-865f-4a50-8186-c95073ecec48@github.com> On Tue, 6 Oct 2020 23:08:40 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this small cleanup which replaces > `ManagementFactory.getRuntimeMXBean().getName().split("@")[0]` w/ `ProcessHandle.current().pid()` to get current > process pid? Thanks, > -- Igor I think this is fine, but you might consider a little improvement below. ------------- Marked as reviewed by shade (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/534 From shade at openjdk.java.net Wed Oct 7 06:40:11 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 7 Oct 2020 06:40:11 GMT Subject: RFR: 8254102: use ProcessHandle::pid instead of ManagementFactory::getRuntimeMXBean to get pid in tests In-Reply-To: <5OaD6gBX9nmmX_N690gLnHefPo4bk9z3D-WqGwD3kXA=.32b936f9-c795-49c6-b158-90845eadcd45@github.com> References: <5OaD6gBX9nmmX_N690gLnHefPo4bk9z3D-WqGwD3kXA=.32b936f9-c795-49c6-b158-90845eadcd45@github.com> Message-ID: On Wed, 7 Oct 2020 01:47:23 GMT, Roger Riggs wrote: >> Hi all, >> >> could you please review this small cleanup which replaces >> `ManagementFactory.getRuntimeMXBean().getName().split("@")[0]` w/ `ProcessHandle.current().pid()` to get current >> process pid? Thanks, >> -- Igor > > test/failure_handler/test/sanity/Suicide.java line 36: > >> 34: String osName = System.getProperty("os.name"); >> 35: if (osName.contains("Windows")) { >> 36: cmd = "taskkill.exe /F /PID " + pidStr; > > This can be simplified to ProcessHandle.current().toString(). It returns the pid of the process as a string. > > Explicitly converting it to a string is not necessary. The "+" concatenation would convert the number to a string. Yes, can just have `long pid` in this case. I don't see that `ProcessHandle.toString` is *specified* to return the string with pid, so relying on that is brittle. We might call `Long.toString` here directly, to avoid jumping through a few calls. But seeing how all this is a test code, that does not seem necessary. 
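To make the two approaches discussed in this thread concrete, the sketch below contrasts parsing the pid out of the runtime MXBean name with asking `ProcessHandle` directly. The class `PidDemo` and its methods are hypothetical names used only for illustration; they are not part of the patch. The `pid@hostname` shape of the runtime name is a common convention, not something the `RuntimeMXBean` specification guarantees, which is one reason the cleanup is attractive.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper class for illustration only; not part of the patch.
class PidDemo {
    // Old approach: assumes the runtime name looks like "<pid>@<hostname>".
    static long pidFromRuntimeName(String runtimeName) {
        return Long.parseLong(runtimeName.split("@")[0]);
    }

    // New approach: ProcessHandle::pid returns the native process id as a long.
    static long pidFromProcessHandle() {
        return ProcessHandle.current().pid();
    }

    // Explicit conversion that does not rely on the format of ProcessHandle.toString().
    static String pidAsString() {
        return Long.toString(pidFromProcessHandle());
    }

    public static void main(String[] args) {
        String name = ManagementFactory.getRuntimeMXBean().getName();
        System.out.println("runtime name:      " + name);
        System.out.println("ProcessHandle pid: " + pidFromProcessHandle());
    }
}
```

Going through `pid()` and `Long.toString` keeps the result well defined even if a `ProcessHandle` implementation prints something other than the bare pid from `toString()`.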
------------- PR: https://git.openjdk.java.net/jdk/pull/534 From eosterlund at openjdk.java.net Wed Oct 7 07:00:15 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Wed, 7 Oct 2020 07:00:15 GMT Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v8] In-Reply-To: References: <4sawJHiIuc7oH5ETjrwJtJE3gkB1U2VBMVJdPmxJrg4=.e4e9b4d3-a118-4870-9b5b-f23b351093e2@github.com> Message-ID: On Tue, 6 Oct 2020 12:18:39 GMT, Erik Österlund wrote: >>> Hi Erik, >>> Can you give an overview of the use of the "poll word" and its relation to the "poll page" please? >>> Thanks, >>> David >> >> Hi David, >> >> Thanks for reviewing this code. >> >> There are various polls in the VM. We have runtime transitions, interpreter transitions, transitions at returns, native >> wrappers, transitions in nmethods... and sometimes they are a bit different. >> The "poll word" encapsulates enough information to be able to poll for returns (stack watermark barrier), or poll for >> normal handshakes/safepoints, with a conditional branch. So really, we could use the "poll word" for every single poll. >> A low order bit is a boolean saying if handshake/safepoint is armed, and the rest of the word denotes the watermark for >> which frame has armed returns. The "poll page" is for polls that do not use conditional branches, but instead uses an >> indirect load. It is used still in nmethod loop polls, because I experimentally found it to perform worse with >> conditional branches on one machine, and did not want to risk regressions. It is also used for VM configurations that >> do not yet support stack watermark barriers, such as Graal, PPC, S390 and 32 bit platforms. They will hopefully >> eventually support this mechanism, but having the poll page allows a more smooth transition.
And unless it is crystal >> clear that the performance of the conditional branch loop poll really is fast enough on sufficiently many machines, we >> might keep it until that changes. Hope this makes sense. Thanks, > >> _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ >> >> On 06/10/2020 08:22, Erik Österlund wrote: >> >> > > This PR contains the implementation of "JEP 376: ZGC: Concurrent Thread-Stack Processing" (cf. >> > > https://openjdk.java.net/jeps/376). >> >> One small thing: the couple of uses of lea(InternalAddress) should really be adr; >> this generates much better code. > > Hi Andrew, > > Thanks for having a look. I applied your patch. Having said that, this is run on the safepoint slow path, so should be > a rather cold path, where threads have to wear coats and gloves. But it does not hurt to optimize the encoding further, > I suppose. Thanks, > > *Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on > [serviceability-dev](mailto:serviceability-dev at openjdk.java.net):* > > Hi Erik, > > On 6/10/2020 5:37 pm, Erik Österlund wrote: > > On Tue, 6 Oct 2020 02:57:00 GMT, David Holmes wrote: > > > >> Hi Erik, > >> Can you give an overview of the use of the "poll word" and its relation to the "poll page" please? > >> Thanks, > >> David > > > > Hi David, > > > > Thanks for reviewing this code. > > > > There are various polls in the VM. We have runtime transitions, interpreter transitions, transitions at returns, native > > wrappers, transitions in nmethods... and sometimes they are a bit different. > > > > The "poll word" encapsulates enough information to be able to poll for returns (stack watermark barrier), or poll for > > normal handshakes/safepoints, with a conditional branch. So really, we could use the "poll word" for every single poll.
> > A low order bit is a boolean saying if handshake/safepoint is armed, and the rest of the word denotes the watermark for > > which frame has armed returns. > > > > The "poll page" is for polls that do not use conditional branches, but instead uses an indirect load. It is used still > > in nmethod loop polls, because I experimentally found it to perform worse with conditional branches on one machine, and > > did not want to risk regressions. It is also used for VM configurations that do not yet support stack watermark > > barriers, such as Graal, PPC, S390 and 32 bit platforms. They will hopefully eventually support this mechanism, but > > having the poll page allows a more smooth transition. And unless it is crystal clear that the performance of the > > conditional branch loop poll really is fast enough on sufficiently many machines, we might keep it until that changes. > > > > Hope this makes sense. > > Yes but I am somewhat surprised. The conventional wisdom has always been > that polling based on the "poison page" approach far outperforms > explicit load-test-branch approaches. > > Cheers, > David When thread-local handshakes were built, both a branch-based and an indirect-load-based prototype were implemented. I had a branch-based solution and Mikael Gerdin built an indirect-load-based solution, so we could compare them. He compared them on many machines and found that sometimes branches are a bit faster and sometimes a bit slower, depending on CPU model. But the results with indirect loads were more stable from machine to machine, while the branch-based solution depended a bit more on what CPU model was being used. That is why the indirect load solution was chosen: it was not always best but it was never bad on any machine. Since then, we got loop strip mining in C2, which makes polls less frequent. My hypothesis was that with that in place, a new evaluation would show that branching is fine now. However, one machine did not agree with that.
To be fair, that machine is not giving very stable results at all right now, so I am not sure if there is a real problem or not. But I thought I'll keep the old behaviour for now anyway as I do not yet have a good reason to change it, and not good enough proof that it is okay... yet. ------------- PR: https://git.openjdk.java.net/jdk/pull/296 From xliu at openjdk.java.net Wed Oct 7 08:06:14 2020 From: xliu at openjdk.java.net (Xin Liu) Date: Wed, 7 Oct 2020 08:06:14 GMT Subject: RFR: 8253757: Add LLVM-based backend for hsdis In-Reply-To: <8Eqswd7tsVaGEXHdKDncXqKpW2tBsSeuY0PV6aTB9_c=.a6cf4957-9d31-4e89-bf44-e7b7852205d5@github.com> References: <91erxiMDb4ftvSomuJYHPi9SX-v8Z2VLD2qEwCbz5tk=.b9ed01b5-f0e0-4ed7-9c1a-b06bc0e64640@github.com> <8Eqswd7tsVaGEXHdKDncXqKpW2tBsSeuY0PV6aTB9_c=.a6cf4957-9d31-4e89-bf44-e7b7852205d5@github.com> Message-ID: On Wed, 7 Oct 2020 00:48:24 GMT, Yasumasa Suenaga wrote: >> This is an interesting suggestion. There is a similar attempt at replacing binutils with capstone in >> https://bugs.openjdk.java.net/browse/JDK-8188073, which unfortunately has not seen much progress due to lack of >> resources; I don't know if you are aware of that? There is also a (extremely low priority) effort to rewrite the hsdis >> makefile to be part of the normal build system, see e.g. https://bugs.openjdk.java.net/browse/JDK-8208495. Neither of >> these should be any blocker for your change, but I think it might be good if you know about them. I have couple of >> concerns with your patch. One is the method in which LLVM is selected instead of binutils; afaict this depends on >> having the `LLVM` variable set when executing the makefile. At the very least, this should be documented in the README. >> I don't think any more complicated configuration is really necessary at this point. With full integration with the >> build system, a more user-friendly way of selecting hsdis backend should be implemented, though. 
Second, and I don't >> know if this is an artifact of git/github/the new skara tooling, but if you renamed hsdis.c to hsdis.cpp, this >> relationship does not show up, not even in the generated webrevs. Instead they are considered a new + a deleted file. >> This makes it hard to see what code changes you have done in that file. And third; have you tested that your changes >> (both changing the main file from C to C++, and any code changes in it) does not break the old binutils functionality? >> Afaic there are no test suites for exercising hsdis :-( so manual ad-hoc testing is likely needed. > > Can you separate LLVM and binutils from hsdis.cpp? > > I guess you say that the problem is both GCC and binutils are not available on Windows AArch64. Is it right? > 1 question: binutils seems to support Windows AArch64. Did you try recently binutils? If we can use binutils on Windows > AArch64, you can fix makefile only. > https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=binutils/dlltool.c;h=ed016b97dc38cdb1b85d2f6df676b9c9750f0d41;hb=HEAD#l248 IMHO, it's great to have an alternative disassembler. I personally had better experience using llvm MC when I decoded aarch64 and AVX instructions than BFD. Another argument is that LLVM toolchain is supposed to provide the premium experience on non-gnu platforms such as FreeBSD. @luhenry I tried to build it with LLVM10.0.1 on my x86_64, ubuntu, I ran into a small problem. here is how I build. `$make ARCH=amd64 CC=/opt/llvm/bin/clang CXX=/opt/llvm/bin/clang++ LLVM=/opt/llvm/` I can't meet this condition because Makefile defines LIBOS_linux. #elif defined(LIBOS_Linux) && defined(LIBARCH_amd64) return "x86_64-pc-linux-gnu"; Actually, Makefile assigns OS to windows/linux/aix/macosx (all lower case)and then `CPPFLAGS += -DLIBOS_$(OS) -DLIBOS="$(OS)" -DLIBARCH_$(LIBARCH) -DLIBARCH="$(LIBARCH)" -DLIB_EXT="$(LIB_EXT)"` In hsdis.cpp, `native_target_triple` needs to match whatever Makefile defined. 
With that fix, I generate llvm version hsdis-amd64.so and it works flawlessly ------------- PR: https://git.openjdk.java.net/jdk/pull/392 From stuefe at openjdk.java.net Wed Oct 7 08:14:13 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 7 Oct 2020 08:14:13 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v3] In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 23:50:22 GMT, Coleen Phillimore wrote: >> This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a >> new file. Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where >> needed. All functions are moved and not modified except for qualification. I also added a delegating constructor to >> JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize. >> Tested with tier1-6 and builds on arm32, ppc, s390 and zero. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > fix comments Hi Coleen, this is a nice cleanup. Not a full review yet. Also some of my remarks are probably follow ups (if that), I leave that up to you how much you want to take from my review. Cheers, Thomas src/hotspot/share/runtime/stackOverflow.hpp line 135: > 133: return _stack_red_zone_size; > 134: } > 135: static void set_stack_red_zone_size(size_t s) { Could we merge these individual setters for set_xxx_zone_size() into one function: static initialize_dimensions(size, size, ..) ? Would make it clearer that these are not to be touched outside initialization. That function also should live in the .cpp file. src/hotspot/share/runtime/stackOverflow.hpp line 1: > 1: /* There are a number of newline issues throughout this file. Mostly code too tight (newlines missing between method blocks) acc. to hotspot style guide. 
src/hotspot/share/runtime/stackOverflow.hpp line 125: > 123: // These values are derived from flags StackRedPages, StackYellowPages, > 124: // StackReservedPages and StackShadowPages. The zone size is determined > 125: // ergonomically if page_size > 4K. I think the last sentence is superfluous (the first says it all) and slightly wrong too. src/hotspot/share/runtime/stackOverflow.hpp line 145: > 143: } > 144: bool in_stack_red_zone(address a) { > 145: return a <= stack_red_zone_base() && a >= stack_end(); Note that this method conflicts with in_stack_yellow_reserved_zone() since both in_stack_yellow_reserved_zone() and is_stack_red_zone() return true for a=red zone base. Strictly speaking it is wrong here since red zone base is not in red zone. src/hotspot/share/runtime/stackOverflow.hpp line 181: > 179: return _stack_yellow_zone_size + _stack_reserved_zone_size; > 180: } > 181: bool in_stack_yellow_reserved_zone(address a) { I always have to look what this actually does since there is no "yellow reserved" zone. A slight rename would help, e.g.: "is_stack_yellow_or_reserved_zone()". src/hotspot/share/runtime/stackOverflow.hpp line 121: > 119: // (large addresses) > 120: // > 121: Nice ascii art :) I wish though it could communicate better the openness of the ranges. E.g.: +----------------+ | | <-- stack_end() | red zone | | | +----------------+ | | <-- red_zone_base() | yellow zone | | | .... | | +----------------+ <-- stack_base() Maybe its just me but I always have to think a bit more here. With downward growing stacks normal range thinking is reversed wrt to openness, so stack_base() points outside the stack and stack_end() is in the stack. This is true for all base values - they point to locations outside the zone they base. Maybe that is clear to all others but it sometimes surprises me. 
src/hotspot/share/runtime/stackOverflow.hpp line 176: > 174: return (a <= stack_reserved_zone_base()) && > 175: (a >= (address)((intptr_t)stack_reserved_zone_base() - stack_reserved_zone_size())); > 176: } Same here, a==reserved_zone_base is strictly speaking outside the reserved zone. src/hotspot/share/runtime/stackOverflow.hpp line 141: > 139: _stack_red_zone_size = s; > 140: } > 141: address stack_red_zone_base() { could be const (as could a couple of others, I won't mark them individually) src/hotspot/share/runtime/stackOverflow.hpp line 131: > 129: static size_t _stack_shadow_zone_size; > 130: public: > 131: static size_t stack_red_zone_size() { Naming: since you moved these formerly-Thread-methods into this enclosing class all the "_stack" and "stack" prefixes for members and methods are not needed anymore. src/hotspot/share/runtime/stackOverflow.hpp line 182: > 180: } > 181: bool in_stack_yellow_reserved_zone(address a) { > 182: return (a <= stack_reserved_zone_base()) && (a >= stack_red_zone_base()); Strictly speaking stack_reserved_zone_base is outside the reserved zone. src/hotspot/share/runtime/stackOverflow.hpp line 79: > 77: // Stack overflow support > 78: // > 79: // (small addresses) s/small/low ? src/hotspot/share/runtime/stackOverflow.hpp line 119: > 117: // -- <-- stack_base() > 118: // > 119: // (large addresses) s/large/high ? src/hotspot/share/runtime/stackOverflow.hpp line 95: > 93: // | | > 94: // | reserved pages | > 95: // | | Can we use use "zone" instead of "pages" since that term is used e.g. in https://openjdk.java.net/jeps/270 ? src/hotspot/share/runtime/stackOverflow.hpp line 75: > 73: > 74: address stack_end() const { return _stack_end; } > 75: address stack_base() const { assert(_stack_base != nullptr,"Sanity check"); return _stack_base; } We now keep stack base and stack size/end both here and in Thread. Could we merge this and use this as a data holder for Thread? 
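To make the openness remarks above concrete, here is a small sketch in Java that models addresses as plain `long` values. It is not the HotSpot code: the class `StackZones`, its fields, and the assumption that the reserved zone base equals `stack_end` plus the red, yellow and reserved sizes are illustrative only. The "inclusive" methods copy the comparisons quoted in the review; the other two show a half-open alternative in which each zone base lies outside the zone it bounds.

```java
// Hypothetical model of the zone checks; addresses are small positive longs.
final class StackZones {
    final long stackEnd;          // lowest address that belongs to the stack
    final long redZoneBase;       // first address above the red zone
    final long reservedZoneBase;  // first address above the yellow and reserved zones

    StackZones(long stackEnd, long redSize, long yellowSize, long reservedSize) {
        this.stackEnd = stackEnd;
        this.redZoneBase = stackEnd + redSize;
        // Assumption: zones are laid out red, yellow, reserved from low to high addresses.
        this.reservedZoneBase = redZoneBase + yellowSize + reservedSize;
    }

    // Comparisons as quoted in the review: both bounds inclusive.
    boolean inRedZoneInclusive(long a) {
        return a <= redZoneBase && a >= stackEnd;
    }

    boolean inYellowReservedZoneInclusive(long a) {
        return a <= reservedZoneBase && a >= redZoneBase;
    }

    // Half-open alternative: [stackEnd, redZoneBase) and [redZoneBase, reservedZoneBase).
    boolean inRedZone(long a) {
        return a >= stackEnd && a < redZoneBase;
    }

    boolean inYellowReservedZone(long a) {
        return a >= redZoneBase && a < reservedZoneBase;
    }
}
```

With the inclusive checks, the address equal to the red zone base is reported as being in both zones, which is the conflict noted above; with the half-open checks every address falls in at most one zone.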
------------- PR: https://git.openjdk.java.net/jdk/pull/522 From stuefe at openjdk.java.net Wed Oct 7 08:38:11 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 7 Oct 2020 08:38:11 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v3] In-Reply-To: References: Message-ID: <75OTiPawS2oERj0JtGlZxdPiwWdFKo9STqJTK3TKRVA=.cc434741-1995-4a42-b9aa-5badfbe6976d@github.com> On Wed, 7 Oct 2020 08:11:12 GMT, Thomas Stuefe wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> fix comments > > Hi Coleen, > > this is a nice cleanup. Not a full review yet. > > Also some of my remarks are probably follow ups (if that), I leave that up to you how much you want to take from my > review. > Cheers, Thomas Github seems to order code comments by time, not by line number; this is not ideal :( ------------- PR: https://git.openjdk.java.net/jdk/pull/522 From aph at redhat.com Wed Oct 7 09:34:52 2020 From: aph at redhat.com (Andrew Haley) Date: Wed, 7 Oct 2020 10:34:52 +0100 Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build In-Reply-To: References: Message-ID: On 06/10/2020 19:17, Bernhard Urban-Forster wrote: > I organized this PR so that each commit contains the warning emitted by MSVC as commit message and its relevant fix. > > Verified on > * Linux+ARM64: `{hotspot,jdk,langtools}:tier1`, no failures. > * Windows+ARM64: `{hotspot,jdk,langtools}:tier1`, no (new) failures. > * internal macOS+ARM64 port: build without `--disable-warnings-as-errors` still works. Just mentioning this here, because > it's yet another toolchain (Xcode / clang) that needs to be kept happy [going > forward](https://openjdk.java.net/jeps/391). Some of these don't look right. @@ -69,7 +69,7 @@ int compare_immediate_pair(const void *i1, const void *i2) // for i = 1, ... 
N result = 1 other bits are zero static inline uint64_t ones(int N) { - return (N == 64 ? -1ULL : (1ULL << N) - 1); + return (N == 64 ? ~0 : (1ULL << N) - 1); } Turns out this does work because ~0 is a signed quantity which is then sign extended to 64 bits, then converted to unsigned. But this is obscure and therefore risky coding style, worse than what it replaces. IMO this warning: warning C4146: unary minus operator applied to unsigned type, result still unsigned should not be used. There is nothing wrong with negating an unsigned value: doing so is well defined in all cases. Do the authors of MSVC not understand the language? Or do they think their users do not understand the language? Please have a look to see how many of these diffs would go away with that particular warning disabled. @@ -1524,7 +1524,7 @@ nmethod* SharedRuntime::generate_native_wrapper(MacroAssembler* masm, // Generate stack overflow check if (UseStackBanging) { - __ bang_stack_with_offset(JavaThread::stack_shadow_zone_size()); + __ bang_stack_with_offset((int)JavaThread::stack_shadow_zone_size()); } else { Unimplemented(); Could this one be fixed by changing stack_shadow_zone_size() or bang_stack_with_offset()? I would have thought that whatever type stack_shadow_zone_size() returns should be compatible with bang_stack_with_offset(). @@ -1309,7 +1309,7 @@ class StubGenerator: public StubCodeGenerator { __ ldrw(r16, Address(a, rscratch2, Address::lsl(exact_log2(size)))); __ decode_heap_oop(temp); // calls verify_oop } - __ add(rscratch2, rscratch2, size); + __ add(rscratch2, rscratch2, (int)size); __ b(loop); __ bind(end); Definitely not.
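As an aside on the ones() hunk earlier in this message, the sign-extension argument can be illustrated with a rough Java analogue. This is not the C++ from the diff: Java has no unsigned types, and the class name `OnesMask` and both method names are invented for the sketch. What carries over is that an `int`-typed `~0` only becomes a 64-bit all-ones value because the conversion sign-extends it.

```java
// Hypothetical Java analogue of ones(int N); for illustration only.
final class OnesMask {
    // N low-order bits set. N == 64 needs its own case because Java masks a
    // long shift count to 6 bits, so 1L << 64 is 1L rather than 0.
    static long ones(int n) {
        return n == 64 ? -1L : (1L << n) - 1;
    }

    // Same result, but only because the int-typed ~0 is promoted to long by
    // sign extension inside the conditional expression.
    static long onesViaIntComplement(int n) {
        return n == 64 ? ~0 : (1L << n) - 1;
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(ones(64)));
        System.out.println(Long.toHexString(onesViaIntComplement(64)));
        // Zero extension of the same int would give only 32 one bits.
        System.out.println(Long.toHexString(Integer.toUnsignedLong(~0)));
    }
}
```

Spelling the constant with an explicit 64-bit type (`-1L` or `~0L` here) avoids relying on the promotion at all, which is the readability concern raised about the `~0` version.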
@@ -1367,7 +1367,7 @@ class StubGenerator: public StubCodeGenerator { // UnsafeCopyMemory page error: continue after ucm bool add_entry = !is_oop && (!aligned || sizeof(jlong) == size); UnsafeCopyMemoryMark ucmm(this, add_entry, true); - copy_memory(aligned, s, d, count, rscratch1, size); + copy_memory(aligned, s, d, count, rscratch1, (int)size); } Better to fix the type of size. The problem seems to be that it's passed as a size_t to generate_conjoint_copy() but it's used as an int elsewhere. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From coleenp at openjdk.java.net Wed Oct 7 11:44:12 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 7 Oct 2020 11:44:12 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v3] In-Reply-To: References: Message-ID: On Wed, 7 Oct 2020 06:03:20 GMT, Thomas Stuefe wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> fix comments > > src/hotspot/share/runtime/stackOverflow.hpp line 135: > >> 133: return _stack_red_zone_size; >> 134: } >> 135: static void set_stack_red_zone_size(size_t s) { > > Could we merge these individual setters for set_xxx_zone_size() into one function: > > static initialize_dimensions(size, size, ..) > ? > > Would make it clearer that these are not to be touched outside initialization. > > That function also should live in the .cpp file. I was focused primarily on moving the code not cleaning it up, which can come later. That said, since I touched this code and the caller, I took your suggestion and will add an initialize_stack_zone_sizes() and inlined the set_stack_x_zone_size() functions in that. This is better here. > src/hotspot/share/runtime/stackOverflow.hpp line 1: > >> 1: /* > > There are a number of newline issues throughout this file. 
Mostly code too tight (newlines missing between method > blocks) acc. to hotspot style guide. Changing above to remove the set_*zone_size function added newlines, so this looks better now. > src/hotspot/share/runtime/stackOverflow.hpp line 125: > >> 123: // These values are derived from flags StackRedPages, StackYellowPages, >> 124: // StackReservedPages and StackShadowPages. The zone size is determined >> 125: // ergonomically if page_size > 4K. > > I think the last sentence is superfluous (the first says it all) and slightly wrong too. Yes, fixed. ------------- PR: https://git.openjdk.java.net/jdk/pull/522 From coleenp at openjdk.java.net Wed Oct 7 12:01:19 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 7 Oct 2020 12:01:19 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v3] In-Reply-To: References: Message-ID: On Wed, 7 Oct 2020 06:20:37 GMT, Thomas Stuefe wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> fix comments > > src/hotspot/share/runtime/stackOverflow.hpp line 145: > >> 143: } >> 144: bool in_stack_red_zone(address a) { >> 145: return a <= stack_red_zone_base() && a >= stack_end(); > > Note that this method conflicts with > > in_stack_yellow_reserved_zone() > > since both in_stack_yellow_reserved_zone() and is_stack_red_zone() return true for a=red zone base. > > Strictly speaking it is wrong here since red zone base is not in red zone. I moved this code and didn't change the zone calculations. The code that calls this is in the signal handler and is irritatingly similar for all platforms. I'll file an RFE to consolidate this and check that this boundary makes sense. I'm not going to change this here. 
https://bugs.openjdk.java.net/browse/JDK-8254158 > src/hotspot/share/runtime/stackOverflow.hpp line 181: > >> 179: return _stack_yellow_zone_size + _stack_reserved_zone_size; >> 180: } >> 181: bool in_stack_yellow_reserved_zone(address a) { > > I always have to look what this actually does since there is no "yellow reserved" zone. A slight rename would help, > e.g.: "is_stack_yellow_or_reserved_zone()". I'm not going to rename this in this change. > src/hotspot/share/runtime/stackOverflow.hpp line 182: > >> 180: } >> 181: bool in_stack_yellow_reserved_zone(address a) { >> 182: return (a <= stack_reserved_zone_base()) && (a >= stack_red_zone_base()); > > Strictly speaking stack_reserved_zone_base is outside the reserved zone. We'll have to work on these ranges in a future change. > src/hotspot/share/runtime/stackOverflow.hpp line 79: > >> 77: // Stack overflow support >> 78: // >> 79: // (small addresses) > > s/small/low ? Fixed. > src/hotspot/share/runtime/stackOverflow.hpp line 119: > >> 117: // -- <-- stack_base() >> 118: // >> 119: // (large addresses) > > s/large/high ? fixed. > src/hotspot/share/runtime/stackOverflow.hpp line 95: > >> 93: // | | >> 94: // | reserved pages | >> 95: // | | > > Can we use use "zone" instead of "pages" since that term is used e.g. in https://openjdk.java.net/jeps/270 ? Sure, I changed this to zone. > src/hotspot/share/runtime/stackOverflow.hpp line 75: > >> 73: >> 74: address stack_end() const { return _stack_end; } >> 75: address stack_base() const { assert(_stack_base != nullptr,"Sanity check"); return _stack_base; } > > We now keep stack base and stack size/end both here and in Thread. Could we merge this and use this as a data holder > for Thread? I thought about doing this but I'd rather carry the information than a pointer back to Thread. It's only one extra word but doesn't invite abuse. 
------------- PR: https://git.openjdk.java.net/jdk/pull/522 From stuefe at openjdk.java.net Wed Oct 7 12:21:18 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 7 Oct 2020 12:21:18 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v3] In-Reply-To: References: Message-ID: On Wed, 7 Oct 2020 11:58:20 GMT, Coleen Phillimore wrote: >> src/hotspot/share/runtime/stackOverflow.hpp line 75: >> >>> 73: >>> 74: address stack_end() const { return _stack_end; } >>> 75: address stack_base() const { assert(_stack_base != nullptr,"Sanity check"); return _stack_base; } >> >> We now keep stack base and stack size/end both here and in Thread. Could we merge this and use this as a data holder >> for Thread? > > I thought about doing this but I'd rather carry the information than a pointer back to Thread. It's only one extra > word but doesn't invite abuse. Okay! ------------- PR: https://git.openjdk.java.net/jdk/pull/522 From rriggs at openjdk.java.net Wed Oct 7 13:38:21 2020 From: rriggs at openjdk.java.net (Roger Riggs) Date: Wed, 7 Oct 2020 13:38:21 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> Message-ID: On Mon, 5 Oct 2020 18:29:58 GMT, CoreyAshford wrote: >> This patch set encompasses the following commits: >> >> - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for >> the intrinsic. The API is similar to the existing encodeBlock intrinsic. >> - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic >> implementation >> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. >> - Adds a JMH microbenchmark for both Base64 encoding and decoding.
>> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. > > CoreyAshford has updated the pull request with a new target base due to a merge or a rebase. The pull request now > contains ten commits: > - AOT: Revert change to aotCodeHeap.cpp for decodeBlock > > Don't add the SET_AOT_GLOBAL_SYMBOL_VALUE macro for decode block until all > arches that implement AOT, implement the decodeBlock intrinsic. > - Base64.java decodeBlock: Changes from PR review > > * Make comparison safer and consistent with the while loop > * Update comment about the decodeBlock intrinsic so that it matches the new structure > * Add comment about the lack of a length check on the destination buffer > * As per issue 8138732, change HotSpotIntrinsicCandidate to IntrinsicCandidate > - stubGenerator_ppc.cpp: Changes from PR review > > * Fix clearing of upper bits to clear 32 bits instead of 31 (due to misreading of clrldi instruction) > * change and document loop_unrolls setting from 8 to 2 after re-running the benchmark > * align unrolled loop on a 32-byte boundary > * replace instruction used for checking isURL from a double word to single > word instruction since the register is effectively 32 bits wide > * cosmetic change to realign register comments. > - TestBase64.java: Changes from PR review > > * Use Utils.toByteArrays() method instead of a locally-defined method > * Generate the two non-Base64 tables dynamically rather than use static initialization > * Added comments describing the two above-mentioned arrays > - Expand the Base64 intrinsic regression test to cover decodeBlock > > This patch makes four significant changes: > > 1) The Power implementation of the decodeBlock intrinsic, at least, > requires a decode length of at least 128 bytes, but the existing test cases > are much shorter, maxing out at 111 bytes. So the patch adds a new input > data file which has longer test cases in it. 
> > 2) The original test cases only covers the encoding of just the printable > subset of the 7-bit ASCII characters. However, Base64 encoding requires > being able to encode arbitrary binary data, i.e. it must handle all 256 > 8-bit byte encodings. To remedy this, but keep the original line-oriented > style of the input data, I added another input file type that uses a simple > ASCII hexadecimal encoding - two ASCII hex characters per 8-bit byte. When > test0 is called, a new parameter is passed that specifies the type of the > input file, which is either the original ASCII type or the hexadecimal > format. So to test both longer input data and arbitrary 8-bit data, the > newly added input test file has test cases which are both longer and > encoded in ASCII hex so as to give full 8-bit capability. When reading > this type of file, test0 calls a newly-added function to translate the > ASCII hex to binary data. Except for the first line of input data, which > contains all possible 8-bit values sequentially, the input data was > generated using a random length (between 111 and 520 bytes) buffer filled > with random 8-bit data, which should give adequate coverage. > > 3) The original test did not test that the decoder detects illegal Base64 > bytes. This change chooses a random location in the encoded data to > corrupt with a randomly-chosen byte which is illegal for the specific > Base64 encoding that is chosen (i.e. standard or URLsafe). It then calls > the decode function to verify that the illegal byte is detected and the > proper exception is thrown. > > 4) The test iteration count was originally 100K, but that is far more than > enough iterations to test the intrinsic. It takes 20K iterations on each > instrinsic for HotSpot C2 to begin calling it. The test originally had > three types of encodings to test and called the encode intrinsic four times > for each iteration, which works out to 100K * 3 * 4 = 1.2M calls just to > encode. 
Decode was called four times as well (now five because of the > illegal byte test). I believe this is excessive and with the extra test > data I have added, the test was timing out after ten minutes of execution. > It appears that it is timing out, not because the intrinsics take a long > time to run, but because test0 generates an enormous number of discarded > data buffers for the GC system to recover (the test runs at about 39GB of > virtual memory on my test machine). To remedy the timeout problem, I have > changed the code so that a warmup function of 20K repetitions is performed > on a fixed buffer, to activate the intrinsic(s). After the warmup, I have > reduced the number of iterations to 5K on each test0 call. This should > give adequate coverage. > - Add JMH benchmark for Base64 variable length buffer decoding > - Add Power9+ intrinsic implementation for Base64 decoding > - Add HotSpot code to implement Base64 decodeBlock API > - Add HotSpotIntrinsicCandidate and API for Base64 decoding src/java.base/share/classes/java/util/Base64.java line 776: > 774: * @param sl > 775: * the total length of source array > 776: * @param dst Please update the comment for `sl`. sl is the offset (exclusive) past the last byte to be converted.
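As a rough illustration of items 2) and 3) in the quoted commit message, the sketch below uses only the public `java.util.Base64` API. It is not the code from TestBase64.java: the class `Base64CorruptionCheck` and its methods `hexToBytes`, `allByteValues` and `rejects` are hypothetical names, and the corrupted position is fixed rather than random to keep the example short.

```java
import java.util.Base64;

// Hypothetical helpers, for illustration only.
final class Base64CorruptionCheck {
    // Translate "two ASCII hex characters per 8-bit byte" text into binary data.
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // All 256 byte values in sequence, so the full 8-bit range is exercised.
    static byte[] allByteValues() {
        byte[] data = new byte[256];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        return data;
    }

    // True if the decoder rejects the input with IllegalArgumentException.
    static boolean rejects(Base64.Decoder decoder, byte[] encoded) {
        try {
            decoder.decode(encoded);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] encoded = Base64.getEncoder().encode(allByteValues());
        encoded[10] = (byte) '-'; // '-' is outside the standard Base64 alphabet
        System.out.println(rejects(Base64.getDecoder(), encoded));
    }
}
```

The same idea applies to the URL-safe variant, where a byte such as '+' plays the role of the illegal character.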
------------- PR: https://git.openjdk.java.net/jdk/pull/293 From mdoerr at openjdk.java.net Wed Oct 7 14:06:20 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 7 Oct 2020 14:06:20 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> Message-ID: On Mon, 5 Oct 2020 18:29:58 GMT, CoreyAshford wrote: >> This patch set encompasses the following commits: >> >> - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for >> the intrinsic. The API is similar to the existing encodeBlock intrinsic. >> - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic >> implementation >> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. >> - Adds a JMH microbenchmark for both Base64 encoding and decoding. >> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. > > CoreyAshford has updated the pull request with a new target base due to a merge or a rebase. The pull request now > contains ten commits: > - AOT: Revert change to aotCodeHeap.cpp for decodeBlock > > Don't add the SET_AOT_GLOBAL_SYMBOL_VALUE macro for decode block until all > arches that implement AOT, implement the decodeBlock intrinsic.
> - Base64.java decodeBlock: Changes from PR review > > * Make comparison safer and consistent with the while loop > * Update comment about the decodeBlock intrinsic so that it matches the new structure > * Add comment about the lack of a length check on the destination buffer > * As per issue 8138732, change HotSpotIntrinsicCandidate to IntrinsicCandidate > - stubGenerator_ppc.cpp: Changes from PR review > > * Fix clearing of upper bits to clear 32 bits instead of 31 (due to misreading of clrldi instruction) > * change and document loop_unrolls setting from 8 to 2 after re-running the benchmark > * align unrolled loop on a 32-byte boundary > * replace instruction used for checking isURL from a double word to single > word instruction since the register is effectively 32 bits wide > * cosmetic change to realign register comments. > - TestBase64.java: Changes from PR review > > * Use Utils.toByteArrays() method instead of a locally-defined method > * Generate the two non-Base64 tables dynamically rather than use static initialization > * Added comments describing the two above-mentioned arrays > - Expand the Base64 intrinsic regression test to cover decodeBlock > > This patch makes four significant changes: > > 1) The Power implementation of the decodeBlock intrinsic, at least, > requires a decode length of at least 128 bytes, but the existing test cases > are much shorter, maxing out at 111 bytes. So the patch adds a new input > data file which has longer test cases in it. > > 2) The original test cases cover only the encoding of the printable > subset of the 7-bit ASCII characters. However, Base64 encoding requires > being able to encode arbitrary binary data, i.e. it must handle all 256 > 8-bit byte encodings. To remedy this, but keep the original line-oriented > style of the input data, I added another input file type that uses a simple > ASCII hexadecimal encoding - two ASCII hex characters per 8-bit byte.
When > test0 is called, a new parameter is passed that specifies the type of the > input file, which is either the original ASCII type or the hexadecimal > format. So to test both longer input data and arbitrary 8-bit data, the > newly added input test file has test cases which are both longer and > encoded in ASCII hex so as to give full 8-bit capability. When reading > this type of file, test0 calls a newly-added function to translate the > ASCII hex to binary data. Except for the first line of input data, which > contains all possible 8-bit values sequentially, the input data was > generated using a random length (between 111 and 520 bytes) buffer filled > with random 8-bit data, which should give adequate coverage. > > 3) The original test did not test that the decoder detects illegal Base64 > bytes. This change chooses a random location in the encoded data to > corrupt with a randomly-chosen byte which is illegal for the specific > Base64 encoding that is chosen (i.e. standard or URLsafe). It then calls > the decode function to verify that the illegal byte is detected and the > proper exception is thrown. > > 4) The test iteration count was originally 100K, but that is far more than > enough iterations to test the intrinsic. It takes 20K iterations on each > intrinsic for HotSpot C2 to begin calling it. The test originally had > three types of encodings to test and called the encode intrinsic four times > for each iteration, which works out to 100K * 3 * 4 = 1.2M calls just to > encode. Decode was called four times as well (now five because of the > illegal byte test). I believe this is excessive and with the extra test > data I have added, the test was timing out after ten minutes of execution. > It appears that it is timing out, not because the intrinsics take a long > time to run, but because test0 generates an enormous number of discarded > data buffers for the GC system to recover (the test runs at about 39GB of > virtual memory on my test machine).
To remedy the timeout problem, I have > changed the code so that a warmup function of 20K repetitions is performed > on a fixed buffer, to activate the intrinsic(s). After the warmup, I have > reduced the number of iterations to 5K on each test0 call. This should > give adequate coverage. > - Add JMH benchmark for Base64 variable length buffer decoding > - Add Power9+ intrinsic implementation for Base64 decoding > - Add HotSpot code to implement Base64 decodeBlock API > - Add HotSpotIntrinsicCandidate and API for Base64 decoding src/java.base/share/classes/java/util/Base64.java line 812: > 810: > 811: while (sp < sl) { > 812: if (shiftto == 18 && sp + 4 < sl) { // fast path Please change to sp < sl - 4. The current version is sensitive to integer overflow in sp + 4. That's not a real problem in the current code, because the next check catches that, but we had better avoid this with the new intrinsics. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From stuefe at openjdk.java.net Wed Oct 7 14:19:24 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 7 Oct 2020 14:19:24 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v3] In-Reply-To: References: Message-ID: <9iGF5lTN5Qbif99byG3yslS5CGURVCM9POxodGTWsqU=.927d106e-e08e-4702-b8bd-beef06872ac1@github.com> On Tue, 6 Oct 2020 23:50:22 GMT, Coleen Phillimore wrote: >> This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a >> new file. Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where >> needed. All functions are moved and not modified except for qualification. I also added a delegating constructor to >> JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize. >> Tested with tier1-6 and builds on arm32, ppc, s390 and zero.
> > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > fix comments Thanks Coleen for the cleanup and for taking my input. This looks good to me now. ------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/522 From mdoerr at openjdk.java.net Wed Oct 7 14:31:18 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 7 Oct 2020 14:31:18 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> Message-ID: On Mon, 5 Oct 2020 18:29:58 GMT, CoreyAshford wrote: >> This patch set encompasses the following commits: >> >> - Adds a new HotSpot intrinsic candidate to the java.lang.Base64 class - decodeBlock(), and provides a flexible API for >> the intrinsic. The API is similar to the existing encodeBlock intrinsic. >> - Adds the code in HotSpot to check and martial the new intrinsic's arguments to the arch-specific intrinsic >> implementation >> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. >> - Adds a JMH microbenchmark for both Base64 encoding and encoding. >> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. > > CoreyAshford has updated the pull request with a new target base due to a merge or a rebase. The pull request now > contains ten commits: > - AOT: Revert change to aotCodeHeap.cpp for decodeBlock > > Don't add the SET_AOT_GLOBAL_SYMBOL_VALUE macro for decode block until all > arches that implement AOT, implement the decodeBlock intrinsic. 
> - Base64.java decodeBlock: Changes from PR review > > * Make comparison safer and consistent with the while loop > * Update comment about the decodeBlock intrinsic so that it matches the new structure > * Add comment about the lack of a length check on the destination buffer > * As per issue 8138732, change HotSpotIntrinsicCandidate to IntrinsicCandidate > - stubGenerator_ppc.cpp: Changes from PR review > > * Fix clearing of upper bits to clear 32 bits instead of 31 (due to misreading of clrldi instruction) > * change and document loop_unrolls setting from 8 to 2 after re-running the benchmark > * align unrolled loop on a 32-byte boundary > * replace instruction used for checking isURL from a double word to single > word instruction since the register is effectively 32 bits wide > * cosmetic change to realign register comments. > - TestBase64.java: Changes from PR review > > * Use Utils.toByteArrays() method instead of a locally-defined method > * Generate the two non-Base64 tables dynamically rather than use static initialization > * Added comments describing the two above-mentioned arrays > - Expand the Base64 intrinsic regression test to cover decodeBlock > > This patch makes four significant changes: > > 1) The Power implementation of the decodeBlock intrinsic, at least, > requires a decode length of at least 128 bytes, but the existing test cases > are much shorter, maxing out at 111 bytes. So the patch adds a new input > data file which has longer test cases in it. > > 2) The original test cases only covers the encoding of just the printable > subset of the 7-bit ASCII characters. However, Base64 encoding requires > being able to encode arbitrary binary data, i.e. it must handle all 256 > 8-bit byte encodings. To remedy this, but keep the original line-oriented > style of the input data, I added another input file type that uses a simple > ASCII hexadecimal encoding - two ASCII hex characters per 8-bit byte. 
When > test0 is called, a new parameter is passed that specifies the type of the > input file, which is either the original ASCII type or the hexadecimal > format. So to test both longer input data and arbitrary 8-bit data, the > newly added input test file has test cases which are both longer and > encoded in ASCII hex so as to give full 8-bit capability. When reading > this type of file, test0 calls a newly-added function to translate the > ASCII hex to binary data. Except for the first line of input data, which > contains all possible 8-bit values sequentially, the input data was > generated using a random length (between 111 and 520 bytes) buffer filled > with random 8-bit data, which should give adequate coverage. > > 3) The original test did not test that the decoder detects illegal Base64 > bytes. This change chooses a random location in the encoded data to > corrupt with a randomly-chosen byte which is illegal for the specific > Base64 encoding that is chosen (i.e. standard or URLsafe). It then calls > the decode function to verify that the illegal byte is detected and the > proper exception is thrown. > > 4) The test iteration count was originally 100K, but that is far more than > enough iterations to test the intrinsic. It takes 20K iterations on each > instrinsic for HotSpot C2 to begin calling it. The test originally had > three types of encodings to test and called the encode intrinsic four times > for each iteration, which works out to 100K * 3 * 4 = 1.2M calls just to > encode. Decode was called four times as well (now five because of the > illegal byte test). I believe this is excessive and with the extra test > data I have added, the test was timing out after ten minutes of execution. > It appears that it is timing out, not because the intrinsics take a long > time to run, but because test0 generates an enormous number of discarded > data buffers for the GC system to recover (the test runs at about 39GB of > virtual memory on my test machine). 
To remedy the timeout problem, I have > changed the code so that a warmup function of 20K repetitions is performed > on a fixed buffer, to activate the instrinsic(s). After the warmup, I have > reduced the number of iterations to 5K on each test0 call. This should > give adequate coverage. > - Add JMH benchmark for Base64 variable length buffer decoding > - Add Power9+ intrinsic implementation for Base64 decoding > - Add HotSpot code to implement Base64 decodeBlock API > - Add HotSpotIntrinsicCandidate and API for Base64 decoding src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3818: > 3816: __ cmpd(CCR0, end, in); > 3817: __ blt_predict_not_taken(CCR0, unrolled_loop_exit); > 3818: __ align(32); align should be before bind(unrolled_loop_start) src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3: > 1: /* > 2: * Copyright (c) 1997, 2020, Oracle and/or its affiliates. All rights reserved. > 3: * Copyright (c) 2012, 2020, SAP SE. All rights reserved. No comma before SAP SE, please! (See https://bugs.openjdk.java.net/browse/JDK-8252837) ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From coleenp at openjdk.java.net Wed Oct 7 14:41:33 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 7 Oct 2020 14:41:33 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v4] In-Reply-To: References: Message-ID: > This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a > new file. Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where > needed. All functions are moved and not modified except for qualification. I also added a delegating constructor to > JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize. > Tested with tier1-6 and builds on arm32, ppc, s390 and zero. 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Add initialize_stack_zone_sizes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/522/files - new: https://git.openjdk.java.net/jdk/pull/522/files/722eb6f2..24f8534f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=522&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=522&range=02-03 Stats: 74 lines in 3 files changed: 29 ins; 36 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/522.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/522/head:pull/522 PR: https://git.openjdk.java.net/jdk/pull/522 From coleenp at openjdk.java.net Wed Oct 7 14:45:22 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 7 Oct 2020 14:45:22 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v3] In-Reply-To: References: Message-ID: On Wed, 7 Oct 2020 06:37:57 GMT, Thomas Stuefe wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> fix comments > > src/hotspot/share/runtime/stackOverflow.hpp line 121: > >> 119: // (large addresses) >> 120: // >> 121: > > Nice ascii art :) > > I wish though it could communicate better the openness of the ranges. E.g.: > > +----------------+ > | | <-- stack_end() > | red zone | > | | > +----------------+ > | | <-- red_zone_base() > | yellow zone | > | | > .... > | | > +----------------+ > <-- stack_base() > > Maybe it's just me but I always have to think a bit more here. With downward growing stacks normal range thinking is > reversed with respect to openness, so stack_base() points outside the stack and stack_end() is in the stack. This is true for > all base values - they point to locations outside the zone they base. Maybe that is clear to all others but it > sometimes surprises me. I just found this comment. I think the ascii art was added by @GoeLin. I just moved it.
Your picture is upside down but it sorta makes sense that the 'base' addresses point to the first address in the range, which is what I think they do. > src/hotspot/share/runtime/stackOverflow.hpp line 131: > >> 129: static size_t _stack_shadow_zone_size; >> 130: public: >> 131: static size_t stack_red_zone_size() { > > Naming: since you moved these formerly-Thread-methods into this enclosing class all the "_stack" and "stack" prefixes > for members and methods are not needed anymore. I don't want to rename these here. > src/hotspot/share/runtime/stackOverflow.hpp line 141: > >> 139: _stack_red_zone_size = s; >> 140: } >> 141: address stack_red_zone_base() { > > could be const (as could a couple of others, I won't mark them individually) The static member functions can't have const. Let me see if there are others. ------------- PR: https://git.openjdk.java.net/jdk/pull/522 From mdoerr at openjdk.java.net Wed Oct 7 14:47:20 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 7 Oct 2020 14:47:20 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> Message-ID: On Mon, 5 Oct 2020 18:29:58 GMT, CoreyAshford wrote: >> This patch set encompasses the following commits: >> >> - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for >> the intrinsic. The API is similar to the existing encodeBlock intrinsic. >> - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic >> implementation >> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. >> - Adds a JMH microbenchmark for both Base64 encoding and decoding.
>> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. > > CoreyAshford has updated the pull request with a new target base due to a merge or a rebase. The pull request now > contains ten commits: > - AOT: Revert change to aotCodeHeap.cpp for decodeBlock > > Don't add the SET_AOT_GLOBAL_SYMBOL_VALUE macro for decode block until all > arches that implement AOT, implement the decodeBlock intrinsic. > - Base64.java decodeBlock: Changes from PR review > > * Make comparison safer and consistent with the while loop > * Update comment about the decodeBlock intrinsic so that it matches the new structure > * Add comment about the lack of a length check on the destination buffer > * As per issue 8138732, change HotSpotIntrinsicCandidate to IntrinsicCandidate > - stubGenerator_ppc.cpp: Changes from PR review > > * Fix clearing of upper bits to clear 32 bits instead of 31 (due to misreading of clrldi instruction) > * change and document loop_unrolls setting from 8 to 2 after re-running the benchmark > * align unrolled loop on a 32-byte boundary > * replace instruction used for checking isURL from a double word to single > word instruction since the register is effectively 32 bits wide > * cosmetic change to realign register comments. > - TestBase64.java: Changes from PR review > > * Use Utils.toByteArrays() method instead of a locally-defined method > * Generate the two non-Base64 tables dynamically rather than use static initialization > * Added comments describing the two above-mentioned arrays > - Expand the Base64 intrinsic regression test to cover decodeBlock > > This patch makes four significant changes: > > 1) The Power implementation of the decodeBlock intrinsic, at least, > requires a decode length of at least 128 bytes, but the existing test cases > are much shorter, maxing out at 111 bytes. So the patch adds a new input > data file which has longer test cases in it. 
> > 2) The original test cases only covers the encoding of just the printable > subset of the 7-bit ASCII characters. However, Base64 encoding requires > being able to encode arbitrary binary data, i.e. it must handle all 256 > 8-bit byte encodings. To remedy this, but keep the original line-oriented > style of the input data, I added another input file type that uses a simple > ASCII hexadecimal encoding - two ASCII hex characters per 8-bit byte. When > test0 is called, a new parameter is passed that specifies the type of the > input file, which is either the original ASCII type or the hexadecimal > format. So to test both longer input data and arbitrary 8-bit data, the > newly added input test file has test cases which are both longer and > encoded in ASCII hex so as to give full 8-bit capability. When reading > this type of file, test0 calls a newly-added function to translate the > ASCII hex to binary data. Except for the first line of input data, which > contains all possible 8-bit values sequentially, the input data was > generated using a random length (between 111 and 520 bytes) buffer filled > with random 8-bit data, which should give adequate coverage. > > 3) The original test did not test that the decoder detects illegal Base64 > bytes. This change chooses a random location in the encoded data to > corrupt with a randomly-chosen byte which is illegal for the specific > Base64 encoding that is chosen (i.e. standard or URLsafe). It then calls > the decode function to verify that the illegal byte is detected and the > proper exception is thrown. > > 4) The test iteration count was originally 100K, but that is far more than > enough iterations to test the intrinsic. It takes 20K iterations on each > instrinsic for HotSpot C2 to begin calling it. The test originally had > three types of encodings to test and called the encode intrinsic four times > for each iteration, which works out to 100K * 3 * 4 = 1.2M calls just to > encode. 
Decode was called four times as well (now five because of the > illegal byte test). I believe this is excessive and with the extra test > data I have added, the test was timing out after ten minutes of execution. > It appears that it is timing out, not because the intrinsics take a long > time to run, but because test0 generates an enormous number of discarded > data buffers for the GC system to recover (the test runs at about 39GB of > virtual memory on my test machine). To remedy the timeout problem, I have > changed the code so that a warmup function of 20K repetitions is performed > on a fixed buffer, to activate the intrinsic(s). After the warmup, I have > reduced the number of iterations to 5K on each test0 call. This should > give adequate coverage. > - Add JMH benchmark for Base64 variable length buffer decoding > - Add Power9+ intrinsic implementation for Base64 decoding > - Add HotSpot code to implement Base64 decodeBlock API > - Add HotSpotIntrinsicCandidate and API for Base64 decoding src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3812: > 3810: // Address of the last byte of the source is (in + sl - 1) > 3811: __ add(end, in, sl); > 3812: __ subi(end, end, 1); Looks a bit complicated, but ok. I'd have loaded the number of iterations into ctr and used the bdnz instruction for the loop. src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3671: > 3669: // an advantage to keeping loop_unrolls small (to be able to process > 3670: // smaller buffers), 2 is clearly the best choice. > 3671: const unsigned loop_unrolls = 2; Unrolling should be re-evaluated after alignment is fixed. align(32) is currently at the wrong place (see my comment below).
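The two loop-control styles mentioned for line 3812 can be compared in a small Java sketch. This is only an analogy, not the PPC stub: the class `LoopControlSketch`, both method names, and the `block` parameter (bytes consumed per iteration) are assumptions made for illustration. The first method mirrors the end-address comparison, the second mirrors a precomputed count that is decremented until it reaches zero, as a ctr/bdnz loop would do.

```java
public class LoopControlSketch {
    // End-address style: compute the offset of the last source byte and
    // keep going while a whole block still fits before it.
    static int blocksByEndCompare(int start, int length, int block) {
        int end = start + length - 1;          // offset of the last source byte
        int in = start;
        int iterations = 0;
        while (in + block - 1 <= end) {
            iterations++;                      // a block would be decoded here
            in += block;
        }
        return iterations;
    }

    // Counter style: compute the iteration count once, then decrement
    // and branch while it is non-zero.
    static int blocksByCounter(int length, int block) {
        int iterations = 0;
        for (int ctr = length / block; ctr != 0; ctr--) {
            iterations++;                      // a block would be decoded here
        }
        return iterations;
    }

    public static void main(String[] args) {
        // Both styles agree on the number of whole blocks processed.
        System.out.println(blocksByEndCompare(100, 130, 16));
        System.out.println(blocksByCounter(130, 16));
    }
}
```

For non-negative lengths both methods return `length / block`; the counter form needs no end address and no per-iteration comparison against it.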
------------- PR: https://git.openjdk.java.net/jdk/pull/293 From mdoerr at openjdk.java.net Wed Oct 7 15:18:30 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 7 Oct 2020 15:18:30 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> Message-ID: On Mon, 5 Oct 2020 18:29:58 GMT, CoreyAshford wrote: >> This patch set encompasses the following commits: >> >> - Adds a new HotSpot intrinsic candidate to the java.lang.Base64 class - decodeBlock(), and provides a flexible API for >> the intrinsic. The API is similar to the existing encodeBlock intrinsic. >> - Adds the code in HotSpot to check and martial the new intrinsic's arguments to the arch-specific intrinsic >> implementation >> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. >> - Adds a JMH microbenchmark for both Base64 encoding and encoding. >> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. > > CoreyAshford has updated the pull request with a new target base due to a merge or a rebase. The pull request now > contains ten commits: > - AOT: Revert change to aotCodeHeap.cpp for decodeBlock > > Don't add the SET_AOT_GLOBAL_SYMBOL_VALUE macro for decode block until all > arches that implement AOT, implement the decodeBlock intrinsic. 
> - Base64.java decodeBlock: Changes from PR review > > * Make comparison safer and consistent with the while loop > * Update comment about the decodeBlock intrinsic so that it matches the new structure > * Add comment about the lack of a length check on the destination buffer > * As per issue 8138732, change HotSpotIntrinsicCandidate to IntrinsicCandidate > - stubGenerator_ppc.cpp: Changes from PR review > > * Fix clearing of upper bits to clear 32 bits instead of 31 (due to misreading of clrldi instruction) > * change and document loop_unrolls setting from 8 to 2 after re-running the benchmark > * align unrolled loop on a 32-byte boundary > * replace instruction used for checking isURL from a double word to single > word instruction since the register is effectively 32 bits wide > * cosmetic change to realign register comments. > - TestBase64.java: Changes from PR review > > * Use Utils.toByteArrays() method instead of a locally-defined method > * Generate the two non-Base64 tables dynamically rather than use static initialization > * Added comments describing the two above-mentioned arrays > - Expand the Base64 intrinsic regression test to cover decodeBlock > > This patch makes four significant changes: > > 1) The Power implementation of the decodeBlock intrinsic, at least, > requires a decode length of at least 128 bytes, but the existing test cases > are much shorter, maxing out at 111 bytes. So the patch adds a new input > data file which has longer test cases in it. > > 2) The original test cases only covers the encoding of just the printable > subset of the 7-bit ASCII characters. However, Base64 encoding requires > being able to encode arbitrary binary data, i.e. it must handle all 256 > 8-bit byte encodings. To remedy this, but keep the original line-oriented > style of the input data, I added another input file type that uses a simple > ASCII hexadecimal encoding - two ASCII hex characters per 8-bit byte. 
When > test0 is called, a new parameter is passed that specifies the type of the > input file, which is either the original ASCII type or the hexadecimal > format. So to test both longer input data and arbitrary 8-bit data, the > newly added input test file has test cases which are both longer and > encoded in ASCII hex so as to give full 8-bit capability. When reading > this type of file, test0 calls a newly-added function to translate the > ASCII hex to binary data. Except for the first line of input data, which > contains all possible 8-bit values sequentially, the input data was > generated using a random length (between 111 and 520 bytes) buffer filled > with random 8-bit data, which should give adequate coverage. > > 3) The original test did not test that the decoder detects illegal Base64 > bytes. This change chooses a random location in the encoded data to > corrupt with a randomly-chosen byte which is illegal for the specific > Base64 encoding that is chosen (i.e. standard or URLsafe). It then calls > the decode function to verify that the illegal byte is detected and the > proper exception is thrown. > > 4) The test iteration count was originally 100K, but that is far more than > enough iterations to test the intrinsic. It takes 20K iterations on each > instrinsic for HotSpot C2 to begin calling it. The test originally had > three types of encodings to test and called the encode intrinsic four times > for each iteration, which works out to 100K * 3 * 4 = 1.2M calls just to > encode. Decode was called four times as well (now five because of the > illegal byte test). I believe this is excessive and with the extra test > data I have added, the test was timing out after ten minutes of execution. > It appears that it is timing out, not because the intrinsics take a long > time to run, but because test0 generates an enormous number of discarded > data buffers for the GC system to recover (the test runs at about 39GB of > virtual memory on my test machine). 
To remedy the timeout problem, I have > changed the code so that a warmup function of 20K repetitions is performed > on a fixed buffer, to activate the intrinsic(s). After the warmup, I have > reduced the number of iterations to 5K on each test0 call. This should > give adequate coverage. > - Add JMH benchmark for Base64 variable length buffer decoding > - Add Power9+ intrinsic implementation for Base64 decoding > - Add HotSpot code to implement Base64 decodeBlock API > - Add HotSpotIntrinsicCandidate and API for Base64 decoding src/hotspot/cpu/ppc/vm_version_ppc.cpp line 160: > 158: if (UseBASE64Intrinsics) { > 159: warning("UseBASE64Intrinsics specified, but needs at least Power9."); > 160: FLAG_SET_DEFAULT(UseCharacterCompareIntrinsics, false); Copy & paste bug: the warning is about UseBASE64Intrinsics, but the flag being reset is UseCharacterCompareIntrinsics. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From mdoerr at openjdk.java.net Wed Oct 7 15:21:11 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 7 Oct 2020 15:21:11 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> Message-ID: On Mon, 5 Oct 2020 18:29:58 GMT, CoreyAshford wrote: >> This patch set encompasses the following commits: >> >> - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for >> the intrinsic. The API is similar to the existing encodeBlock intrinsic. >> - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic >> implementation >> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. >> - Adds a JMH microbenchmark for both Base64 encoding and decoding. >> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding.
> > CoreyAshford has updated the pull request with a new target base due to a merge or a rebase. The pull request now > contains ten commits: > - AOT: Revert change to aotCodeHeap.cpp for decodeBlock > > Don't add the SET_AOT_GLOBAL_SYMBOL_VALUE macro for decode block until all > arches that implement AOT, implement the decodeBlock intrinsic. > - Base64.java decodeBlock: Changes from PR review > > * Make comparison safer and consistent with the while loop > * Update comment about the decodeBlock intrinsic so that it matches the new structure > * Add comment about the lack of a length check on the destination buffer > * As per issue 8138732, change HotSpotIntrinsicCandidate to IntrinsicCandidate > - stubGenerator_ppc.cpp: Changes from PR review > > * Fix clearing of upper bits to clear 32 bits instead of 31 (due to misreading of clrldi instruction) > * change and document loop_unrolls setting from 8 to 2 after re-running the benchmark > * align unrolled loop on a 32-byte boundary > * replace instruction used for checking isURL from a double word to single > word instruction since the register is effectively 32 bits wide > * cosmetic change to realign register comments. > - TestBase64.java: Changes from PR review > > * Use Utils.toByteArrays() method instead of a locally-defined method > * Generate the two non-Base64 tables dynamically rather than use static initialization > * Added comments describing the two above-mentioned arrays > - Expand the Base64 intrinsic regression test to cover decodeBlock > > This patch makes four significant changes: > > 1) The Power implementation of the decodeBlock intrinsic, at least, > requires a decode length of at least 128 bytes, but the existing test cases > are much shorter, maxing out at 111 bytes. So the patch adds a new input > data file which has longer test cases in it. > > 2) The original test cases only covers the encoding of just the printable > subset of the 7-bit ASCII characters. 
However, Base64 encoding requires > being able to encode arbitrary binary data, i.e. it must handle all 256 > 8-bit byte encodings. To remedy this, but keep the original line-oriented > style of the input data, I added another input file type that uses a simple > ASCII hexadecimal encoding - two ASCII hex characters per 8-bit byte. When > test0 is called, a new parameter is passed that specifies the type of the > input file, which is either the original ASCII type or the hexadecimal > format. So to test both longer input data and arbitrary 8-bit data, the > newly added input test file has test cases which are both longer and > encoded in ASCII hex so as to give full 8-bit capability. When reading > this type of file, test0 calls a newly-added function to translate the > ASCII hex to binary data. Except for the first line of input data, which > contains all possible 8-bit values sequentially, the input data was > generated using a random length (between 111 and 520 bytes) buffer filled > with random 8-bit data, which should give adequate coverage. > > 3) The original test did not test that the decoder detects illegal Base64 > bytes. This change chooses a random location in the encoded data to > corrupt with a randomly-chosen byte which is illegal for the specific > Base64 encoding that is chosen (i.e. standard or URLsafe). It then calls > the decode function to verify that the illegal byte is detected and the > proper exception is thrown. > > 4) The test iteration count was originally 100K, but that is far more than > enough iterations to test the intrinsic. It takes 20K iterations on each > instrinsic for HotSpot C2 to begin calling it. The test originally had > three types of encodings to test and called the encode intrinsic four times > for each iteration, which works out to 100K * 3 * 4 = 1.2M calls just to > encode. Decode was called four times as well (now five because of the > illegal byte test). 
I believe this is excessive and with the extra test > data I have added, the test was timing out after ten minutes of execution. > It appears that it is timing out, not because the intrinsics take a long > time to run, but because test0 generates an enormous number of discarded > data buffers for the GC system to recover (the test runs at about 39GB of > virtual memory on my test machine). To remedy the timeout problem, I have > changed the code so that a warmup function of 20K repetitions is performed > on a fixed buffer, to activate the instrinsic(s). After the warmup, I have > reduced the number of iterations to 5K on each test0 call. This should > give adequate coverage. > - Add JMH benchmark for Base64 variable length buffer decoding > - Add Power9+ intrinsic implementation for Base64 decoding > - Add HotSpot code to implement Base64 decodeBlock API > - Add HotSpotIntrinsicCandidate and API for Base64 decoding src/hotspot/share/classfile/vmIntrinsics.cpp line 491: > 489: if (!UseGHASHIntrinsics) return true; > 490: break; > 491: case vmIntrinsics::_base64_decodeBlock: I'd prefer to use consistent order. You have inserted decode after encode at other places. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From coleenp at openjdk.java.net Wed Oct 7 15:22:31 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 7 Oct 2020 15:22:31 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v5] In-Reply-To: References: Message-ID: <8lfHcGKHmQfPsBmJIPr9_gWW6IYTShT5ByyCVOxYawg=.3ae0d350-53e5-43b0-81f7-2521e4afdfdd@github.com> > This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a > new file. Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where > needed. All functions are moved and not modified except for qualification. 
I also added a delegating constructor to > JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize. > Tested with tier1-6 and builds on arm32, ppc, s390 and zero. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Added some const to StackOverflow declarations ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/522/files - new: https://git.openjdk.java.net/jdk/pull/522/files/24f8534f..3191748b Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=522&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=522&range=03-04 Stats: 11 lines in 2 files changed: 2 ins; 0 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/522.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/522/head:pull/522 PR: https://git.openjdk.java.net/jdk/pull/522 From coleenp at openjdk.java.net Wed Oct 7 15:22:32 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 7 Oct 2020 15:22:32 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v3] In-Reply-To: References: Message-ID: On Wed, 7 Oct 2020 14:42:29 GMT, Coleen Phillimore wrote: >> src/hotspot/share/runtime/stackOverflow.hpp line 141: >> >>> 139: _stack_red_zone_size = s; >>> 140: } >>> 141: address stack_red_zone_base() { >> >> could be const (as could a couple of others, I won't mark them individually) > > The static member functions can't have const. Let me see if there are others. I added some 'const's and pushed up the commit. It is perfect now. 
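For readers following the const exchange above: a `const` qualifier on a member function applies to the implicit `this` object, so only non-static member functions can carry it. A minimal sketch of that rule, using a hypothetical `RedZone` class (its name, fields, and the 4096-byte size are illustrative assumptions, not the real StackOverflow declarations):

```cpp
#include <cstddef>

// Hypothetical stand-in for a class that mixes static and per-instance state.
// It does not reproduce the real StackOverflow class from stackOverflow.hpp.
class RedZone {
  static std::size_t _zone_size;   // shared by all instances
  char*              _stack_base;  // per-instance state

 public:
  explicit RedZone(char* base) : _stack_base(base) {}

  // Static member function: there is no `this`, so `const` is not allowed.
  // Declaring it as `static std::size_t zone_size() const` would not compile.
  static std::size_t zone_size() { return _zone_size; }

  // Non-static accessor that only reads state: it can be const-qualified,
  // which also makes it callable through a const reference.
  char* zone_base() const { return _stack_base - _zone_size; }
};

std::size_t RedZone::_zone_size = 4096;
```

With this shape, only accessors like `zone_base()` are candidates for an added `const`; the static ones have to stay as they are.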
------------- PR: https://git.openjdk.java.net/jdk/pull/522 From mdoerr at openjdk.java.net Wed Oct 7 15:27:09 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 7 Oct 2020 15:27:09 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> Message-ID: On Mon, 5 Oct 2020 18:29:58 GMT, CoreyAshford wrote: src/hotspot/share/opto/library_call.cpp line 310: > 308: bool inline_base64_decodeBlock(); > 309: bool inline_digestBase_implCompress(vmIntrinsics::ID id); > 310: bool inline_sha_implCompress(vmIntrinsics::ID id); Why is that in this change? ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From redestad at openjdk.java.net Wed Oct 7 15:31:16 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 7 Oct 2020 15:31:16 GMT Subject: RFR: 8254168: Remove TemplateTable::count_calls Message-ID: <-4q2TWTzGtNKf7ilZ0VaxuPGBZSxy9rbOpi-QgCyhAI=.a57e7dd0-145d-4d2f-89a4-31f23777b4b8@github.com> Method TemplateTable::count_calls is never called, and only has a dummy definition on some platforms. I suggest removing it.
------------- Commit messages: - Merge branch 'master' into remove_count_calls - Remove TemplateTable::count_calls Changes: https://git.openjdk.java.net/jdk/pull/544/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=544&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254168 Stats: 17 lines in 4 files changed: 0 ins; 17 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/544.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/544/head:pull/544 PR: https://git.openjdk.java.net/jdk/pull/544 From mdoerr at openjdk.java.net Wed Oct 7 15:40:15 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 7 Oct 2020 15:40:15 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> Message-ID: On Mon, 5 Oct 2020 18:29:58 GMT, CoreyAshford wrote: src/hotspot/share/opto/runtime.cpp line 1211: > 1209: // result type needed > 1210: fields = TypeTuple::fields(1); > 1211: fields[TypeFunc::Parms + 0] = TypeInt::INT; // dst ofs, or -1 Why "- or -1" in the comment? ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From shade at openjdk.java.net Wed Oct 7 15:43:14 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 7 Oct 2020 15:43:14 GMT Subject: RFR: 8254166: Zero: return-type warning in zeroInterpreter_zero.cpp Message-ID: This breaks 11u without disabled warnings as errors, but the issue exists in head JDK as well. Might as well fix it everywhere.
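A minimal sketch of how a return-type warning of this kind can arise and be silenced, assuming (as the review later in this thread suggests) a `default:` branch that calls an abort-style helper which the compiler cannot treat as non-returning. The names `should_not_reach_here`, `Kind`, and `size_in_bytes` are hypothetical and are not taken from zeroInterpreter_zero.cpp:

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for an assertion helper such as ShouldNotReachHere().
// It aborts, but it is deliberately not declared [[noreturn]] here, so the
// compiler has to assume that control may come back from the call.
void should_not_reach_here() {
  std::fprintf(stderr, "should not reach here\n");
  std::abort();
}

enum class Kind { Int, Long };

// Every case returns directly. Without the `return 0;` in the default branch,
// control could (as far as the compiler knows) fall off the end of a non-void
// function, which is what -Wreturn-type reports and -Werror turns into a
// build failure.
int size_in_bytes(Kind k) {
  switch (k) {
    case Kind::Int:  return 4;
    case Kind::Long: return 8;
    default:
      should_not_reach_here();
      return 0;  // silence compiler
  }
}
```

The extra return is never executed in practice; it only gives the compiler a value on the path it cannot prove dead, which is why a short comment next to it helps later readers.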
Testing: - [x] Linux ARM32 Zero build on HEAD JDK - [x] Linux ARM32 Zero build on 11u ------------- Commit messages: - 8254166: Zero: return-type warning in zeroInterpreter_zero.cpp Changes: https://git.openjdk.java.net/jdk/pull/545/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=545&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254166 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/545.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/545/head:pull/545 PR: https://git.openjdk.java.net/jdk/pull/545 From coleenp at openjdk.java.net Wed Oct 7 15:49:14 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 7 Oct 2020 15:49:14 GMT Subject: RFR: 8254168: Remove TemplateTable::count_calls In-Reply-To: <-4q2TWTzGtNKf7ilZ0VaxuPGBZSxy9rbOpi-QgCyhAI=.a57e7dd0-145d-4d2f-89a4-31f23777b4b8@github.com> References: <-4q2TWTzGtNKf7ilZ0VaxuPGBZSxy9rbOpi-QgCyhAI=.a57e7dd0-145d-4d2f-89a4-31f23777b4b8@github.com> Message-ID: <7viD4kywhzxYTnnhSMbzIvP89-tbGwE7B6cy47Zf_0I=.81d31d8b-aefe-4c60-828a-f56e3cfdd7e8@github.com> On Wed, 7 Oct 2020 15:26:57 GMT, Claes Redestad wrote: > Method TemplateTable::count_calls is never called, and only has a dummy definition on some platforms. I suggest > removing it. Looks good + trivial! Thanks for the cleanup! ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/544 From mdoerr at openjdk.java.net Wed Oct 7 15:52:11 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 7 Oct 2020 15:52:11 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> Message-ID: On Mon, 5 Oct 2020 18:29:58 GMT, CoreyAshford wrote: test/hotspot/jtreg/compiler/intrinsics/base64/TestBase64.java line 89: > 87: ran.nextBytes(srcBuf); > 88: > 89: // This should be enough to get both the decoder and encoder intrinsic loaded up and running. Better: ... get encode() and decode() compiled on highest tier. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From shade at openjdk.java.net Wed Oct 7 15:59:22 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 7 Oct 2020 15:59:22 GMT Subject: RFR: 8254166: Zero: return-type warning in zeroInterpreter_zero.cpp [v2] In-Reply-To: References: Message-ID: > This breaks 11u without disabled warnings as errors, but the issue exists in head JDK as well. Might as well fix it > everywhere.
> Testing: > - [x] Linux ARM32 Zero build on HEAD JDK > - [x] Linux ARM32 Zero build on 11u Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Added comment ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/545/files - new: https://git.openjdk.java.net/jdk/pull/545/files/b5167bde..95d62436 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=545&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=545&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/545.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/545/head:pull/545 PR: https://git.openjdk.java.net/jdk/pull/545 From sgehwolf at openjdk.java.net Wed Oct 7 15:59:23 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Wed, 7 Oct 2020 15:59:23 GMT Subject: RFR: 8254166: Zero: return-type warning in zeroInterpreter_zero.cpp [v2] In-Reply-To: References: Message-ID: <_eFBj1vrNPC-LZy22THhAj2cC2Q2qOcD1ryefb1TlXY=.3302be3b-65b0-4b15-96a5-9918144c06b6@github.com> On Wed, 7 Oct 2020 15:57:10 GMT, Aleksey Shipilev wrote: >> This breaks 11u without disabled warnings as errors, but the issue exists in head JDK as well. Might as well fix it >> everywhere. >> Testing: >> - [x] Linux ARM32 Zero build on HEAD JDK >> - [x] Linux ARM32 Zero build on 11u > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Added comment Looks OK to me. I'd suggest adding a comment as the return after `ShouldNotReachHere()` looks odd otherwise. src/hotspot/cpu/zero/zeroInterpreter_zero.cpp line 143: > 141: default: > 142: ShouldNotReachHere(); > 143: return result; Perhaps add a `// silence compiler` comment to the return? ------------- Marked as reviewed by sgehwolf (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/545 From shade at openjdk.java.net Wed Oct 7 15:59:24 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 7 Oct 2020 15:59:24 GMT Subject: RFR: 8254166: Zero: return-type warning in zeroInterpreter_zero.cpp [v2] In-Reply-To: <_eFBj1vrNPC-LZy22THhAj2cC2Q2qOcD1ryefb1TlXY=.3302be3b-65b0-4b15-96a5-9918144c06b6@github.com> References: <_eFBj1vrNPC-LZy22THhAj2cC2Q2qOcD1ryefb1TlXY=.3302be3b-65b0-4b15-96a5-9918144c06b6@github.com> Message-ID: <9v1fNnhqTl-XJhXqx2omLZ3xipAxZBzYl8Oq8oETT64=.a0c5e76d-6dc8-4d64-89d3-c200a9b991c8@github.com> On Wed, 7 Oct 2020 15:52:11 GMT, Severin Gehwolf wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Added comment > > src/hotspot/cpu/zero/zeroInterpreter_zero.cpp line 143: > >> 141: default: >> 142: ShouldNotReachHere(); >> 143: return result; > > Perhaps add a `// silence compiler` comment to the return? Right. Added in new revision. ------------- PR: https://git.openjdk.java.net/jdk/pull/545 From dcubed at openjdk.java.net Wed Oct 7 16:03:13 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 7 Oct 2020 16:03:13 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v5] In-Reply-To: <8lfHcGKHmQfPsBmJIPr9_gWW6IYTShT5ByyCVOxYawg=.3ae0d350-53e5-43b0-81f7-2521e4afdfdd@github.com> References: <8lfHcGKHmQfPsBmJIPr9_gWW6IYTShT5ByyCVOxYawg=.3ae0d350-53e5-43b0-81f7-2521e4afdfdd@github.com> Message-ID: On Wed, 7 Oct 2020 15:22:31 GMT, Coleen Phillimore wrote: >> This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a >> new file. Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where >> needed. All functions are moved and not modified except for qualification. 
I also added a delegating constructor to >> JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize. >> Tested with tier1-6 and builds on arm32, ppc, s390 and zero. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Added some const to StackOverflow declarations Based on what you wrote here in the invite: `All functions are moved and not modified except for qualification.` `I also added a delegating constructor to JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize.` I didn't do my usual crawl through review of the old code and the new code to make sure things "were the same". However, based on comments posted by other folks there have been changes to the code after movement, e.g., the change from `NULL` to `nullptr`. I'm going to have to assume that other folks did a crawl through reviews. Thumbs up. src/hotspot/share/runtime/stackOverflow.hpp line 34: > 32: > 33: // StackOverflow handling is encapsulated in this class. This class contains state variables > 34: // for each JavaThread that implement stack overflow checking and guard page implementation. This clause doesn't read quite right... Perhaps: `// for each JavaThread that implement stack overflow checking and guard page functionality.` The "that implement" and "guard page implementation." phrases in the same clause just don't read right... Your call on making a change here. ------------- Marked as reviewed by dcubed (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/522

From sgehwolf at openjdk.java.net Wed Oct 7 16:05:17 2020
From: sgehwolf at openjdk.java.net (Severin Gehwolf)
Date: Wed, 7 Oct 2020 16:05:17 GMT
Subject: RFR: 8254166: Zero: return-type warning in zeroInterpreter_zero.cpp [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 7 Oct 2020 15:59:22 GMT, Aleksey Shipilev wrote:

>> This breaks 11u without disabled warnings as errors, but the issue exists in head JDK as well. Might as well fix it
>> everywhere.
>> Testing:
>> - [x] Linux ARM32 Zero build on HEAD JDK
>> - [x] Linux ARM32 Zero build on 11u
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
>
> Added comment

Looks fine.

-------------

Marked as reviewed by sgehwolf (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/545

From mdoerr at openjdk.java.net Wed Oct 7 16:29:13 2020
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Wed, 7 Oct 2020 16:29:13 GMT
Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2]
In-Reply-To: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com>
References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com>
Message-ID:

On Mon, 5 Oct 2020 18:29:58 GMT, CoreyAshford wrote:

>> This patch set encompasses the following commits:
>>
>> - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for
>> the intrinsic. The API is similar to the existing encodeBlock intrinsic.
>> - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic
>> implementation
>> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic.
>> - Adds a JMH microbenchmark for both Base64 encoding and decoding.
>> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding.
>
> CoreyAshford has updated the pull request with a new target base due to a merge or a rebase. The pull request now
> contains ten commits:
> - AOT: Revert change to aotCodeHeap.cpp for decodeBlock
>
> Don't add the SET_AOT_GLOBAL_SYMBOL_VALUE macro for decode block until all
> arches that implement AOT, implement the decodeBlock intrinsic.
> - Base64.java decodeBlock: Changes from PR review
>
> * Make comparison safer and consistent with the while loop
> * Update comment about the decodeBlock intrinsic so that it matches the new structure
> * Add comment about the lack of a length check on the destination buffer
> * As per issue 8138732, change HotSpotIntrinsicCandidate to IntrinsicCandidate
> - stubGenerator_ppc.cpp: Changes from PR review
>
> * Fix clearing of upper bits to clear 32 bits instead of 31 (due to misreading of clrldi instruction)
> * change and document loop_unrolls setting from 8 to 2 after re-running the benchmark
> * align unrolled loop on a 32-byte boundary
> * replace instruction used for checking isURL from a double word to single
> word instruction since the register is effectively 32 bits wide
> * cosmetic change to realign register comments.
> - TestBase64.java: Changes from PR review
>
> * Use Utils.toByteArrays() method instead of a locally-defined method
> * Generate the two non-Base64 tables dynamically rather than use static initialization
> * Added comments describing the two above-mentioned arrays
> - Expand the Base64 intrinsic regression test to cover decodeBlock
>
> This patch makes four significant changes:
>
> 1) The Power implementation of the decodeBlock intrinsic, at least,
> requires a decode length of at least 128 bytes, but the existing test cases
> are much shorter, maxing out at 111 bytes. So the patch adds a new input
> data file which has longer test cases in it.
>
> 2) The original test cases only cover the encoding of just the printable
> subset of the 7-bit ASCII characters. However, Base64 encoding requires
> being able to encode arbitrary binary data, i.e. it must handle all 256
> 8-bit byte encodings. To remedy this, but keep the original line-oriented
> style of the input data, I added another input file type that uses a simple
> ASCII hexadecimal encoding - two ASCII hex characters per 8-bit byte. When
> test0 is called, a new parameter is passed that specifies the type of the
> input file, which is either the original ASCII type or the hexadecimal
> format. So to test both longer input data and arbitrary 8-bit data, the
> newly added input test file has test cases which are both longer and
> encoded in ASCII hex so as to give full 8-bit capability. When reading
> this type of file, test0 calls a newly-added function to translate the
> ASCII hex to binary data. Except for the first line of input data, which
> contains all possible 8-bit values sequentially, the input data was
> generated using a random length (between 111 and 520 bytes) buffer filled
> with random 8-bit data, which should give adequate coverage.
>
> 3) The original test did not test that the decoder detects illegal Base64
> bytes. This change chooses a random location in the encoded data to
> corrupt with a randomly-chosen byte which is illegal for the specific
> Base64 encoding that is chosen (i.e. standard or URLsafe). It then calls
> the decode function to verify that the illegal byte is detected and the
> proper exception is thrown.
>
> 4) The test iteration count was originally 100K, but that is far more than
> enough iterations to test the intrinsic. It takes 20K iterations on each
> intrinsic for HotSpot C2 to begin calling it. The test originally had
> three types of encodings to test and called the encode intrinsic four times
> for each iteration, which works out to 100K * 3 * 4 = 1.2M calls just to
> encode. Decode was called four times as well (now five because of the
> illegal byte test). I believe this is excessive and with the extra test
> data I have added, the test was timing out after ten minutes of execution.
> It appears that it is timing out, not because the intrinsics take a long
> time to run, but because test0 generates an enormous number of discarded
> data buffers for the GC system to recover (the test runs at about 39GB of
> virtual memory on my test machine). To remedy the timeout problem, I have
> changed the code so that a warmup function of 20K repetitions is performed
> on a fixed buffer, to activate the intrinsic(s). After the warmup, I have
> reduced the number of iterations to 5K on each test0 call. This should
> give adequate coverage.
> - Add JMH benchmark for Base64 variable length buffer decoding
> - Add Power9+ intrinsic implementation for Base64 decoding
> - Add HotSpot code to implement Base64 decodeBlock API
> - Add HotSpotIntrinsicCandidate and API for Base64 decoding

Hi Corey,

thanks for contributing this change. Looks basically good. Please address the inline comments from Roger and me. Core libs part is reviewed by Roger and the whole change by me. The shared hotspot part is straightforward because it's very similar to the encode intrinsic. So I think we only need a 2nd review for the PPC64 algorithm implementation. I can sponsor the change when this is completed.

-------------

Changes requested by mdoerr (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/293

From shade at openjdk.java.net Wed Oct 7 16:58:11 2020
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Wed, 7 Oct 2020 16:58:11 GMT
Subject: RFR: 8253901: ARM32: SIGSEGV during monitorexit due to incorrect register use (after JDK-8253540)
In-Reply-To:
References:
Message-ID:

On Mon, 5 Oct 2020 10:38:45 GMT, Boris Ulasevich wrote:

> [JDK-8253540](https://bugs.openjdk.java.net/browse/JDK-8253540) changed InterpreterRuntime::monitorexit call from
> call_VM to call_VM_leaf. This requires additional arrangement for ARM32: the parameter must be in R0.
This looks sensible to me. I read the code around the changes, and they seem fine. No obvious problems there. I assume `tier1`, `tier2` pass with these changes?

-------------

Marked as reviewed by shade (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/503

From redestad at openjdk.java.net Wed Oct 7 17:13:17 2020
From: redestad at openjdk.java.net (Claes Redestad)
Date: Wed, 7 Oct 2020 17:13:17 GMT
Subject: Integrated: 8254168: Remove TemplateTable::count_calls
In-Reply-To: <-4q2TWTzGtNKf7ilZ0VaxuPGBZSxy9rbOpi-QgCyhAI=.a57e7dd0-145d-4d2f-89a4-31f23777b4b8@github.com>
References: <-4q2TWTzGtNKf7ilZ0VaxuPGBZSxy9rbOpi-QgCyhAI=.a57e7dd0-145d-4d2f-89a4-31f23777b4b8@github.com>
Message-ID: <57AnjMPvCImRX-e57Rifg4GlwTj0P12B6IpUdQhCqm8=.8b33e017-f988-43fc-9b44-07078eb48b5a@github.com>

On Wed, 7 Oct 2020 15:26:57 GMT, Claes Redestad wrote:

> Method TemplateTable::count_calls is never called, and only has a dummy definition on some platforms. I suggest
> removing it.

This pull request has now been integrated.

Changeset: 739347f0
Author: Claes Redestad
URL: https://git.openjdk.java.net/jdk/commit/739347f0
Stats: 17 lines in 4 files changed: 0 ins; 17 del; 0 mod

8254168: Remove TemplateTable::count_calls

Reviewed-by: coleenp

-------------

PR: https://git.openjdk.java.net/jdk/pull/544

From github.com+51754783+coreyashford at openjdk.java.net Wed Oct 7 17:24:12 2020
From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford)
Date: Wed, 7 Oct 2020 17:24:12 GMT
Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2]
In-Reply-To:
References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com>
Message-ID:

On Wed, 7 Oct 2020 13:35:35 GMT, Roger Riggs wrote:

> src/java.base/share/classes/java/util/Base64.java line 776:
>
>> 774: * @param sl
>> 775: * the total length of source array
>> 776: * @param dst
>
> Please update the comment for `sl`. sl is the offset (exclusive) past the last byte to be converted.

Ok, will change.

-------------

PR: https://git.openjdk.java.net/jdk/pull/293

From github.com+51754783+coreyashford at openjdk.java.net Wed Oct 7 17:24:17 2020
From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford)
Date: Wed, 7 Oct 2020 17:24:17 GMT
Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2]
In-Reply-To:
References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com>
Message-ID:

On Wed, 7 Oct 2020 14:03:14 GMT, Martin Doerr wrote:

> src/java.base/share/classes/java/util/Base64.java line 812:
>
>> 810:
>> 811: while (sp < sl) {
>> 812: if (shiftto == 18 && sp + 4 < sl) { // fast path
>
> Please change to sp < sl - 4. Current version is sensitive to integer overflow. That's not a real problem in the
> current code, because the next check catches that, but we should avoid this with the new intrinsics.

Good catch. Will fix.

> src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3818:
>
>> 3816: __ cmpd(CCR0, end, in);
>> 3817: __ blt_predict_not_taken(CCR0, unrolled_loop_exit);
>> 3818: __ align(32);
>
> align should be before bind(unrolled_loop_start)

Oops, yes, that was dumb. Good catch! Will fix and re-benchmark.

> src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3:
>
>> 1: /*
>> 2: * Copyright (c) 1997, 2020, Oracle and/or its affiliates. All rights reserved.
>> 3: * Copyright (c) 2012, 2020, SAP SE. All rights reserved.
>
> No comma before SAP SE, please! (See https://bugs.openjdk.java.net/browse/JDK-8252837)

Interesting. I was trying to make it consistent with the Oracle copyright. Ok, will fix.

> src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3671:
>
>> 3669: // an advantage to keeping loop_unrolls small (to be able to process
>> 3670: // smaller buffers), 2 is clearly the best choice.
>> 3671: const unsigned loop_unrolls = 2;
>
> Unrolling should be re-evaluated after alignment is fixed. align(32) is currently at the wrong place (see my comment
> below).

Agreed.

> src/hotspot/cpu/ppc/vm_version_ppc.cpp line 160:
>
>> 158: if (UseBASE64Intrinsics) {
>> 159: warning("UseBASE64Intrinsics specified, but needs at least Power9.");
>> 160: FLAG_SET_DEFAULT(UseCharacterCompareIntrinsics, false);
>
> Copy & paste bug.

Oops, yes, will fix.

> src/hotspot/share/classfile/vmIntrinsics.cpp line 491:
>
>> 489: if (!UseGHASHIntrinsics) return true;
>> 490: break;
>> 491: case vmIntrinsics::_base64_decodeBlock:
>
> I'd prefer to use consistent order. You have inserted decode after encode at other places.

I agree. Will fix.

> src/hotspot/share/opto/library_call.cpp line 310:
>
>> 308: bool inline_base64_decodeBlock();
>> 309: bool inline_digestBase_implCompress(vmIntrinsics::ID id);
>> 310: bool inline_sha_implCompress(vmIntrinsics::ID id);
>
> Why is that in this change?

Good catch! I'm not sure what happened there. Will investigate.

> src/hotspot/share/opto/runtime.cpp line 1211:
>
>> 1209: // result type needed
>> 1210: fields = TypeTuple::fields(1);
>> 1211: fields[TypeFunc::Parms + 0] = TypeInt::INT; // dst ofs, or -1
>
> Why ", or -1" in the comment?

At one point the intrinsic would return -1 in the event of encountering a non-base64 byte. I will update the comment to be correct for the revised semantics.
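As a side note on the `sp + 4 < sl` comment above, here is a minimal sketch of why the two forms differ. It assumes `int` indices and a non-negative `sl`; the class and method names are hypothetical and are not taken from Base64.java.

```java
public class FastPathCheck {
    // Overflow-prone form: sp + 4 wraps to a negative value
    // once sp exceeds Integer.MAX_VALUE - 4, so the test can pass wrongly.
    static boolean overflowProne(int sp, int sl) {
        return sp + 4 < sl;
    }

    // Form suggested in the review: assuming sl >= 0, sl - 4 cannot wrap.
    static boolean overflowSafe(int sp, int sl) {
        return sp < sl - 4;
    }

    public static void main(String[] args) {
        int sl = Integer.MAX_VALUE;
        int sp = Integer.MAX_VALUE - 2;
        // Near the top of the int range the two forms disagree.
        System.out.println("overflow-prone: " + overflowProne(sp, sl));
        System.out.println("overflow-safe:  " + overflowSafe(sp, sl));
    }
}
```

For ordinary small indices both forms give the same answer; they only diverge when `sp` is within four of `Integer.MAX_VALUE`.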
-------------

PR: https://git.openjdk.java.net/jdk/pull/293

From mcimadamore at openjdk.java.net Wed Oct 7 17:30:38 2020
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Wed, 7 Oct 2020 17:30:38 GMT
Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator)
Message-ID:

This patch contains the changes associated with the third incubation round of the foreign memory access API (see JEP 393 [1]). This iteration focuses on improving the usability of the API in three main ways:

* first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from multiple threads
* second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee that the memory will be deallocated, eventually
* third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class has been added, which defines several useful dereference routines; these are really just thin wrappers around memory access var handles, but they make the barrier of entry for using this API somewhat lower.

A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as dereferencing memory is concerned. You cannot dereference memory if you don't have a segment.
This improves usability in a number of ways: first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; second, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done by calling `MemoryAddress::asSegmentRestricted`).

A list of the API, implementation and test changes is provided below. If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be happy to point at existing discussions, and/or to provide the feedback required.

A big thanks to Erik Osterlund, Vladimir Ivanov and David Holmes, without whom the work on shared memory segments would not have been possible; I'd also like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey.

Thanks
Maurizio

Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html

Specdiff: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html

CSR: https://bugs.openjdk.java.net/browse/JDK-8254163

### API Changes

* `MemorySegment`
  * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below)
  * add a no-arg factory for a native restricted segment representing the entire native heap
  * rename `withOwnerThread` to `handoff`
  * add new `share` method, to create shared segments
  * add new `registerCleaner` method, to register a segment against a cleaner
  * add more helpers to create arrays from a segment, e.g. `toIntArray`
  * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors)
  * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`)
* `MemoryAddress`
  * drop `segment` accessor
  * drop `rebase` method and replace it with `segmentOffset`, which returns the offset (a `long`) of this address relative to a given segment
* `MemoryAccess`
  * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a carrier is one of the usual suspects (a Java primitive, minus `boolean`); the access mode can be simple (e.g. access the base address of a given segment), or indexed, in which case the accessor takes a segment and either a low-level byte offset, or a high-level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. `getByteAtOffset` vs. `getByteAtIndex`).
* `MemoryHandles`
  * drop `withOffset` combinator
  * drop `withStride` combinator
  * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long`, from which it is easy to derive all the other handles using plain var handle combinators.
* `Addressable`
  * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2], to add more implementations. Clients can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`.
* `MemoryLayouts`
  * A new layout, for machine addresses, has been added to the mix.

### Implementation changes

There are two main things to discuss here: support for shared segments, and the general simplification of the memory access var handle support.

#### Shared segments

The support for shared segments cuts pretty deep into the VM.
Support for shared segments is notoriously hard to achieve, at least in a way that guarantees optimal access performance. This is caused by the fact that, if a segment is shared, it would be possible for a thread to close it while another is accessing it.

After considering several options (see [3]), we zeroed in on an approach which is inspired by a happy idea that Andrew Haley had (and that he reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world (e.g. with a GC pause) while a segment is closed, we could then prevent segments from being accessed concurrently to a close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and the access itself (otherwise it would be possible for the accessing thread to stop right before an unsafe call). It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints.

Sadly, none of these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]).

The question is, then, once we detect that a thread is accessing the very segment we're about to close, what should happen? We first experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread is accessing the segment being closed.

As written in the javadoc, this doesn't mean that clients should just catch and try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should be treated as such.

In terms of gritty implementation, we needed to centralize memory access routines in a single place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` annotation, which tells the VM that something important is going on.

To achieve this, we created a new (autogenerated) class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during access (which is important when registering segments against cleaners).

Of course, to make memory access safe, memory access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead of unsafe, so that a liveness check can be triggered (in case a scope is present).

`ScopedMemoryAccess` has a `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed successfully.
`MemoryScope` (now significantly simplified from what we had before) has two implementations, one for confined segments and one for shared segments; the main difference between the two is what happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail.

#### Memory access var handles overhaul

The key realization here was that if all memory access var handles took a coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle form.

This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that e.g. additional offset is injected into a base memory access var handle.

This also helped in simplifying the implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level access on the innards of the memory access var handle. All that code is now gone.

#### Test changes

Not much to see here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the microbenchmarks also needed some tweaks - and some of them were updated to also test performance in the shared segment case.
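To make the shared-scope close protocol described under "Shared segments" easier to follow, here is a small self-contained model of the `ALIVE` / `CLOSING` / `CLOSED` transitions. This is only a sketch under stated assumptions and not the actual `MemoryScope` code: the class name `SharedScopeSketch`, the integer state encoding, and the `BooleanSupplier` standing in for the thread-local handshake are all hypothetical, and a failed close is reported as `false` here rather than as an exception.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class SharedScopeSketch {
    static final int ALIVE = 0, CLOSING = 1, CLOSED = 2;

    private final AtomicInteger state = new AtomicInteger(ALIVE);

    // Models the isAlive query as described: still true while CLOSING.
    boolean isAlive() {
        return state.get() != CLOSED;
    }

    // Models the liveness check done before each access: only ALIVE passes.
    void checkValidState() {
        if (state.get() != ALIVE) {
            throw new IllegalStateException("scope is closing or closed");
        }
    }

    // handshake is a stand-in for the VM handshake; it returns true when
    // no other thread was found in the middle of an access to this scope.
    boolean close(BooleanSupplier handshake) {
        if (!state.compareAndSet(ALIVE, CLOSING)) {
            throw new IllegalStateException("already closing or closed");
        }
        boolean success = handshake.getAsBoolean();
        // Either finish the close, or roll back so the scope stays usable.
        state.set(success ? CLOSED : ALIVE);
        return success;
    }

    public static void main(String[] args) {
        SharedScopeSketch scope = new SharedScopeSketch();
        System.out.println("failed close returns " + scope.close(() -> false));
        System.out.println("alive after failed close: " + scope.isAlive());
        System.out.println("successful close returns " + scope.close(() -> true));
        System.out.println("alive after successful close: " + scope.isAlive());
    }
}
```

The point of the intermediate `CLOSING` state in this model is that new accesses are rejected as soon as a close begins, while the scope can still be rolled back to `ALIVE` if the handshake finds a thread mid-access.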
[1] - https://openjdk.java.net/jeps/393 [2] - https://openjdk.java.net/jeps/389 [3] - https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html [4] - https://openjdk.java.net/jeps/312 ------------- Commit messages: - Add modified files - RFR 8254162: Implementation of Foreign-Memory Access API (Third Incubator) Changes: https://git.openjdk.java.net/jdk/pull/548/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254162 Stats: 7467 lines in 75 files changed: 5024 ins; 1373 del; 1070 mod Patch: https://git.openjdk.java.net/jdk/pull/548.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548 PR: https://git.openjdk.java.net/jdk/pull/548 From github.com+51754783+coreyashford at openjdk.java.net Wed Oct 7 17:52:16 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Wed, 7 Oct 2020 17:52:16 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> Message-ID: On Wed, 7 Oct 2020 17:19:50 GMT, CoreyAshford wrote: >> src/hotspot/share/opto/library_call.cpp line 310: >> >>> 308: bool inline_base64_decodeBlock(); >>> 309: bool inline_digestBase_implCompress(vmIntrinsics::ID id); >>> 310: bool inline_sha_implCompress(vmIntrinsics::ID id); >> >> Why is that in this change? > > Good catch! I'm not sure what happened there. Will investigate. This seems to be a mistake I made during a rebase with a conflict. Will fix. 
------------- PR: https://git.openjdk.java.net/jdk/pull/293 From github.com+51754783+coreyashford at openjdk.java.net Wed Oct 7 17:52:16 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Wed, 7 Oct 2020 17:52:16 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> Message-ID: On Wed, 7 Oct 2020 15:48:37 GMT, Martin Doerr wrote: >> CoreyAshford has updated the pull request with a new target base due to a merge or a rebase. The pull request now >> contains ten commits: >> - AOT: Revert change to aotCodeHeap.cpp for decodeBlock >> >> Don't add the SET_AOT_GLOBAL_SYMBOL_VALUE macro for decode block until all >> arches that implement AOT, implement the decodeBlock intrinsic. >> - Base64.java decodeBlock: Changes from PR review >> >> * Make comparison safer and consistent with the while loop >> * Update comment about the decodeBlock intrinsic so that it matches the new structure >> * Add comment about the lack of a length check on the destination buffer >> * As per issue 8138732, change HotSpotIntrinsicCandidate to IntrinsicCandidate >> - stubGenerator_ppc.cpp: Changes from PR review >> >> * Fix clearing of upper bits to clear 32 bits instead of 31 (due to misreading of clrldi instruction) >> * change and document loop_unrolls setting from 8 to 2 after re-running the benchmark >> * align unrolled loop on a 32-byte boundary >> * replace instruction used for checking isURL from a double word to single >> word instruction since the register is effectively 32 bits wide >> * cosmetic change to realign register comments. 
>> - TestBase64.java: Changes from PR review >> >> * Use Utils.toByteArrays() method instead of a locally-defined method >> * Generate the two non-Base64 tables dynamically rather than use static initialization >> * Added comments describing the two above-mentioned arrays >> - Expand the Base64 intrinsic regression test to cover decodeBlock >> >> This patch makes four significant changes: >> >> 1) The Power implementation of the decodeBlock intrinsic, at least, >> requires a decode length of at least 128 bytes, but the existing test cases >> are much shorter, maxing out at 111 bytes. So the patch adds a new input >> data file which has longer test cases in it. >> >> 2) The original test cases only covers the encoding of just the printable >> subset of the 7-bit ASCII characters. However, Base64 encoding requires >> being able to encode arbitrary binary data, i.e. it must handle all 256 >> 8-bit byte encodings. To remedy this, but keep the original line-oriented >> style of the input data, I added another input file type that uses a simple >> ASCII hexadecimal encoding - two ASCII hex characters per 8-bit byte. When >> test0 is called, a new parameter is passed that specifies the type of the >> input file, which is either the original ASCII type or the hexadecimal >> format. So to test both longer input data and arbitrary 8-bit data, the >> newly added input test file has test cases which are both longer and >> encoded in ASCII hex so as to give full 8-bit capability. When reading >> this type of file, test0 calls a newly-added function to translate the >> ASCII hex to binary data. Except for the first line of input data, which >> contains all possible 8-bit values sequentially, the input data was >> generated using a random length (between 111 and 520 bytes) buffer filled >> with random 8-bit data, which should give adequate coverage. >> >> 3) The original test did not test that the decoder detects illegal Base64 >> bytes. 
This change chooses a random location in the encoded data to >> corrupt with a randomly-chosen byte which is illegal for the specific >> Base64 encoding that is chosen (i.e. standard or URLsafe). It then calls >> the decode function to verify that the illegal byte is detected and the >> proper exception is thrown. >> >> 4) The test iteration count was originally 100K, but that is far more than >> enough iterations to test the intrinsic. It takes 20K iterations on each >> instrinsic for HotSpot C2 to begin calling it. The test originally had >> three types of encodings to test and called the encode intrinsic four times >> for each iteration, which works out to 100K * 3 * 4 = 1.2M calls just to >> encode. Decode was called four times as well (now five because of the >> illegal byte test). I believe this is excessive and with the extra test >> data I have added, the test was timing out after ten minutes of execution. >> It appears that it is timing out, not because the intrinsics take a long >> time to run, but because test0 generates an enormous number of discarded >> data buffers for the GC system to recover (the test runs at about 39GB of >> virtual memory on my test machine). To remedy the timeout problem, I have >> changed the code so that a warmup function of 20K repetitions is performed >> on a fixed buffer, to activate the instrinsic(s). After the warmup, I have >> reduced the number of iterations to 5K on each test0 call. This should >> give adequate coverage. >> - Add JMH benchmark for Base64 variable length buffer decoding >> - Add Power9+ intrinsic implementation for Base64 decoding >> - Add HotSpot code to implement Base64 decodeBlock API >> - Add HotSpotIntrinsicCandidate and API for Base64 decoding > > test/hotspot/jtreg/compiler/intrinsics/base64/TestBase64.java line 89: > >> 87: ran.nextBytes(srcBuf); >> 88: >> 89: // This should be enough to get both the decoder and encoder intrinsic loaded up and running. > > Better: ... 
get encode() and decode() compiled on highest tier. Thanks. I'm still learning the lingo :) Will fix. > src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3812: > >> 3810: // Address of the last byte of the source is (in + sl - 1) >> 3811: __ add(end, in, sl); >> 3812: __ subi(end, end, 1); > > Looks a bit complicated, but ok. I'd have loaded the number of iterations into ctr and used bdnz instruction for the > loop. (bdz / bdnz may also be slighly faster depending on the CPU implementation.) I agree that a count would be preferable, because it's easier to read. Will fix. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From rrich at openjdk.java.net Wed Oct 7 17:52:27 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Wed, 7 Oct 2020 17:52:27 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v7] In-Reply-To: References: Message-ID: <6Scp6XjVCcdJN0tUKionVwGKoiBG8UeA-OpBXHrCYqk=.01170b1e-9722-4461-84e4-77e8fd447ac4@github.com> > Hi, > > this is the continuation of the review of the implementation for: > > https://bugs.openjdk.java.net/browse/JDK-8227745 > https://bugs.openjdk.java.net/browse/JDK-8233915 > > It allows for JIT optimizations based on escape analysis even if JVMTI agents acquire capabilities to access references > to objects that are subject to such optimizations, e.g. scalar replacement. The implementation reverts such > optimizations just before access very much as when switching from JIT compiled execution to the interpreter, aka > "deoptimization". Webrev.8 was the last one before before the transition to Git/Github: > > http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ > > Thanks, Richard. Richard Reingruber has updated the pull request incrementally with five additional commits since the last revision: - Factorized fragment out of EscapeBarrier::deoptimize_objects_internal into new method in compiledVFrame. - More smaller changes proposed by Serguei. 
- jvmtiDeferredUpdates.hpp: remove forward declarations. - jvmtiDeferredLocalVariable: move member variables to the beginning of the class definition. - jvmtiDeferredUpdates.hpp: add/remove empty lines and improve indentation. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/119/files - new: https://git.openjdk.java.net/jdk/pull/119/files/1c586cfb..03f751eb Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=119&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=119&range=05-06 Stats: 183 lines in 7 files changed: 93 ins; 66 del; 24 mod Patch: https://git.openjdk.java.net/jdk/pull/119.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/119/head:pull/119 PR: https://git.openjdk.java.net/jdk/pull/119 From rrich at openjdk.java.net Wed Oct 7 17:55:11 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Wed, 7 Oct 2020 17:55:11 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v3] In-Reply-To: References: Message-ID: On Wed, 7 Oct 2020 04:28:16 GMT, Serguei Spitsyn wrote: >> Hi Serguei >> (@sspitsyn) >> >> are you ok with the changes I made based on your comments? >> Will you further review the change? >> >> Thanks, Richard. > > Hi Richard, > > Thank you for making the refactoring. I like it more now. :) > So, the fix looks good to me in general. > > But could I ask you, to adjust some formatting, please? > There are several things that can be done to improve the code readability. > > src/hotspot/share/prims/jvmtiDeferredUpdates.hpp: > > I'd suggest to add an empty line before lines 40, 71, 73, 93, 95, 109 to make class definitions and function > declarations/definitions with comments more catchable by eyes. 
The following lines can be removed: 81, 82, 103 > > Also, there is inconsistency in function definitions formatting: > - some functions have big indent between the type and name > - some functions have no indent between the type and name but a big indent between name and body > I'd suggest to either to remove all indents or make it reasonably smaller but consistent. > > It seems, there is no reason to keep these class declarations: > 38 class jvmtiDeferredLocalVariable; > 108 class jvmtiDeferredLocalVariableSet; > > src/hotspot/share/prims/jvmtiDeferredUpdates.cpp: > > 82 // Free deferred updates. > 83 // (Note the 'list' of local variable updates is embedded in 'updates') > > A suggestion to change the line 83 as follows: > ` 83 // Note, the 'list' of local variable updates is embedded in 'updates'.` > > src/hotspot/share/runtime/escapeBarrier.hpp: > > Add dots at the end of comments at lines 97, 99, 103. > I'd suggest to add an empty line before lines 39, 40, 80, 81, 93, 94, 99, 119, 121. > > src/hotspot/share/runtime/escapeBarrier.cpp: > > The following class declaration is not needed: > ` 49 class jvmtiDeferredLocalVariableSet;` > > because you already added this line: > ` 29 #include "prims/jvmtiDeferredUpdates.hpp"` > > The lines below deserve a refactoring. 
It can be separate functions for locals, expressions and monitors, or just one
> function for the whole fragment:
> 345     GrowableArray<ScopeValue*>* scopeLocals = cvf->scope()->locals();
> 346     StackValueCollection* locals = cvf->locals();
> 347     if (locals != NULL) {
> 348       for (int i2 = 0; i2 < locals->size(); i2++) {
> 349         StackValue* var = locals->at(i2);
> 350         if (var->type() == T_OBJECT && scopeLocals->at(i2)->is_object()) {
> 351           jvalue val;
> 352           val.l = cast_from_oop<jobject>(locals->at(i2)->get_obj()());
> 353           cvf->update_local(T_OBJECT, i2, val);
> 354         }
> 355       }
> 356     }
> 357
> 358     // expressions
> 359     GrowableArray<ScopeValue*>* scopeExpressions = cvf->scope()->expressions();
> 360     StackValueCollection* expressions = cvf->expressions();
> 361     if (expressions != NULL) {
> 362       for (int i2 = 0; i2 < expressions->size(); i2++) {
> 363         StackValue* var = expressions->at(i2);
> 364         if (var->type() == T_OBJECT && scopeExpressions->at(i2)->is_object()) {
> 365           jvalue val;
> 366           val.l = cast_from_oop<jobject>(expressions->at(i2)->get_obj()());
> 367           cvf->update_stack(T_OBJECT, i2, val);
> 368         }
> 369       }
> 370     }
> 371
> 372     // monitors
> 373     GrowableArray<MonitorInfo*>* monitors = cvf->monitors();
> 374     if (monitors != NULL) {
> 375       for (int i2 = 0; i2 < monitors->length(); i2++) {
> 376         if (monitors->at(i2)->eliminated()) {
> 377           assert(!monitors->at(i2)->owner_is_scalar_replaced(),
> 378                  "reallocation failure, should not update");
> 379           cvf->update_monitor(i2, monitors->at(i2));
> 380         }
> 381       }
> 382     }
>
>
> src/hotspot/share/prims/jvmtiImpl.cpp:
>
> 420 // Constructor for non-object getter
> 421 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, jint depth, jint index, BasicType type)
> 422   : _thread(thread)
> 423   , _calling_thread(NULL)
> 424   , _depth(depth)
> 425   , _index(index)
> 426   , _type(type)
> 427   , _jvf(NULL)
> 428   , _set(false)
> 429   , _eb(type == T_OBJECT, NULL, NULL)
> 430   , _result(JVMTI_ERROR_NONE)
> 431 {
> 432 }
>
> I still think that line 429 is going to cause confusion.
> It is a non-object getter, so the type should never be T_OBJECT. > It won't change in the future to allow the T_OBJECT types. > The only way to allow it is to merge the constructors for object and non-object getters. > So, I'm suggesting to replace this line with: > ` 429 , _eb(false, NULL, NULL)` Hi Serguei, > Thank you for making the refactoring. I like it more now. :) > So, the fix looks good to me in general. Good :) > But could I ask you, to adjust some formatting, please? > There are several things that can be done to improve the code readability. > > src/hotspot/share/prims/jvmtiDeferredUpdates.hpp: > > I'd suggest to add an empty line before lines 40, 71, 73, 93, 95, 109 to make class definitions and function > declarations/definitions with comments more catchable by eyes. The following lines can be removed: 81, 82, 103 Sure. I've made the changes. > Also, there is inconsistency in function definitions formatting: > > * some functions have big indent between the type and name > > * some functions have no indent between the type and name but a big indent between name and body > I'd suggest to either to remove all indents or make it reasonably smaller but consistent. > I've made the indents smaller. I also moved private members jvmtiDeferredLocalVariable at the beginning. Looks better now. > It seems, there is no reason to keep these class declarations: > > ``` > 38 class jvmtiDeferredLocalVariable; > 108 class jvmtiDeferredLocalVariableSet; > ``` Removed. > src/hotspot/share/prims/jvmtiDeferredUpdates.cpp: > > ``` > 82 // Free deferred updates. > 83 // (Note the 'list' of local variable updates is embedded in 'updates') > ``` > > A suggestion to change the line 83 as follows: > ` 83 // Note, the 'list' of local variable updates is embedded in 'updates'.` Done. > src/hotspot/share/runtime/escapeBarrier.hpp: > > Add dots at the end of comments at lines 97, 99, 103. > I'd suggest to add an empty line before lines 39, 40, 80, 81, 93, 94, 99, 119, 121. Done. 
> src/hotspot/share/runtime/escapeBarrier.cpp:
>
> The following class declaration is not needed:
> ` 49 class jvmtiDeferredLocalVariableSet;`
>
> because you already added this line:
> ` 29 #include "prims/jvmtiDeferredUpdates.hpp"`

You're right. Thanks.

> The lines below deserve a refactoring. It can be separate functions for locals, expressions and monitors, or just one
> function for the whole fragment:
> ```
> 345     GrowableArray<ScopeValue*>* scopeLocals = cvf->scope()->locals();
> 346     StackValueCollection* locals = cvf->locals();
> 347     if (locals != NULL) {
> 348       for (int i2 = 0; i2 < locals->size(); i2++) {
> 349         StackValue* var = locals->at(i2);
> 350         if (var->type() == T_OBJECT && scopeLocals->at(i2)->is_object()) {
> 351           jvalue val;
> 352           val.l = cast_from_oop<jobject>(locals->at(i2)->get_obj()());
> 353           cvf->update_local(T_OBJECT, i2, val);
> 354         }
> 355       }
> 356     }
> 357
> 358     // expressions
> 359     GrowableArray<ScopeValue*>* scopeExpressions = cvf->scope()->expressions();
> 360     StackValueCollection* expressions = cvf->expressions();
> 361     if (expressions != NULL) {
> 362       for (int i2 = 0; i2 < expressions->size(); i2++) {
> 363         StackValue* var = expressions->at(i2);
> 364         if (var->type() == T_OBJECT && scopeExpressions->at(i2)->is_object()) {
> 365           jvalue val;
> 366           val.l = cast_from_oop<jobject>(expressions->at(i2)->get_obj()());
> 367           cvf->update_stack(T_OBJECT, i2, val);
> 368         }
> 369       }
> 370     }
> 371
> 372     // monitors
> 373     GrowableArray<MonitorInfo*>* monitors = cvf->monitors();
> 374     if (monitors != NULL) {
> 375       for (int i2 = 0; i2 < monitors->length(); i2++) {
> 376         if (monitors->at(i2)->eliminated()) {
> 377           assert(!monitors->at(i2)->owner_is_scalar_replaced(),
> 378                  "reallocation failure, should not update");
> 379           cvf->update_monitor(i2, monitors->at(i2));
> 380         }
> 381       }
> 382     }
> ```

I moved the fragment into a new method in compiledVFrame.
Please note that an equal fragment exists here too: https://github.com/openjdk/jdk/blob/1e8e543b264bb985bfee535fedc9ffe7db5ad482/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp#L1524-L1558 Actually this location could be implemented on top of EscapeBarrier. Maybe (maybe not?) in a follow-up... > src/hotspot/share/prims/jvmtiImpl.cpp: > > ``` > 420 // Constructor for non-object getter > 421 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, jint depth, jint index, BasicType type) > 422 : _thread(thread) > 423 , _calling_thread(NULL) > 424 , _depth(depth) > 425 , _index(index) > 426 , _type(type) > 427 , _jvf(NULL) > 428 , _set(false) > 429 , _eb(type == T_OBJECT, NULL, NULL) > 430 , _result(JVMTI_ERROR_NONE) > 431 { > 432 } > ``` > > I still think, that the line 429 is going to cause confusions. > It is a non-object getter, so the type should never be T_OBJECT. > It won't change in the future to allow the T_OBJECT types. > The only way to allow it is to merge the constructors for object and non-object getters. > So, I'm suggesting to replace this line with: > ` 429 , _eb(false, NULL, NULL)` Ok, done. ------------- PR: https://git.openjdk.java.net/jdk/pull/119 From iklam at openjdk.java.net Wed Oct 7 17:56:13 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 7 Oct 2020 17:56:13 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v12] In-Reply-To: <9emWKl6fr-GA5LN0uHhuEd5D123QcoCiHQR1M9bAbag=.cc4b6129-8b33-47e4-a421-9e6b4817933b@github.com> References: <9emWKl6fr-GA5LN0uHhuEd5D123QcoCiHQR1M9bAbag=.cc4b6129-8b33-47e4-a421-9e6b4817933b@github.com> Message-ID: On Tue, 6 Oct 2020 20:46:17 GMT, Yumin Qi wrote: >> This patch is reorganized after 8252725, which is separated from this patch to refactor jlink glugin code. The previous >> webrev with hg can be found at: http://cr.openjdk.java.net/~minqi/2020/8247536/webrev-05. 
With 8252725 integrated, the >> regeneration of holder classes is simply to call the newly added GenerateJLIClassesHelper.cdsGenerateHolderClasses >> function. Tests: tier1-4 > > Yumin Qi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains > 23 commits: > - Added new separate function to CDS for logging species and modified the existing function to log lambda form invokers. > Changed isDumpLoadedClassList to a reasonable name isDumpingClassList as read only in CDS. > - Merge branch 'master' of https://github.com/openjdk/jdk into jdk-8247536 > - Removed unused imports. > - Fixed comments with correct class and method name in CDS, removed unused variables after last change. > - Moved and renamed cdsGenerateHolderClasses from GenerateJLIClassesHelper to CDS as generateLambdaFormHolderClasses. Added > input verification function in CDS before class generation. Added more test scenarios. Removed trailing unused ending > words for output of lambda form trace line in case of DumpLoadedClassList. > - Move the check work to java, restore code in VM. Modified test code according to the changes. The invoke name > verification is not implemented since not all the holder classes are processed, and not all the functions of processed > holder classes are added. For a holder class with DirectMethodHandle in its name, only the name in the > DMH_METHOD_TYPE_MAP keyset is added; the line with other names just gets skipped silently. This makes the verification > on invoke names difficult; a name not in the keyset should not fail the test. Also add a boolean to > cdsGenerateHolderClasses to indicate call path. > - Remove trailing word of line which is not used in holder class regeneration. There is a trailing LF (Line Feed) so trim > white spaces from both front and end of the line or it will fail method type validation.
> - If an exception happens while reloading a class, CHECK will return without freeing the allocated buffer for class > bytes, so moved the buffer allocation and freeing to the caller. Also removed test 6 since there is no guarantee that we > can give a signature which will always fail. Additional changes to GenerateJLIClassesHelper according to review > suggestion. > - Merge branch 'master' of https://github.com/openjdk/jdk into jdk-8247536 > - Merge branch 'master' of https://git.openjdk.java.net/jdk into jdk-8247536 > - ... and 13 more: https://git.openjdk.java.net/jdk/compare/82fe023b...f5584dcf Marked as reviewed by iklam (Reviewer). src/hotspot/share/classfile/lambdaFormInvokers.cpp line 133: > 131: log_info(cds)("Class %s not present, skip", name); > 132: return; > 133: } `assert(klass->is_instance_klass(), "Should be");` should be after the NULL check of `klass` src/hotspot/share/prims/jvm.cpp line 3872: > 3870: JVM_ENTRY(jboolean, JVM_IsDumpingClassList(JNIEnv *env)) > 3871: JVMWrapper("JVM_IsDumpingClassList"); > 3872: return DumpLoadedClassList != NULL && classlist_file->is_open(); For sanity, it's better to add `classlist_file != NULL` ------------- PR: https://git.openjdk.java.net/jdk/pull/193 From smonteith at openjdk.java.net Wed Oct 7 18:06:15 2020 From: smonteith at openjdk.java.net (Stuart Monteith) Date: Wed, 7 Oct 2020 18:06:15 GMT Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v12] In-Reply-To: References: Message-ID: <19ts4LYwyvtljcsVAjrinz6Jx2esPUWeUdByyX0CUUo=.2b78dff1-8d73-453e-95f5-4362cc3635f3@github.com> On Tue, 6 Oct 2020 12:17:05 GMT, Erik Österlund wrote: >> This PR is the implementation of "JEP 376: ZGC: Concurrent Thread-Stack Processing" (cf. >> https://openjdk.java.net/jeps/376). >> Basically, this patch modifies the epilog safepoint when returning from a frame (supporting interpreter frames, c1, c2, >> and native wrapper frames), to compare the stack pointer against a thread-local value.
This turns return polls into >> more of a swiss army knife that can be used to poll for safepoints, handshakes, but also returns into frames that are not yet safe to >> expose, denoted by a "stack watermark". ZGC will leave frames (and other thread oops) in a messy state in >> the GC checkpoint safepoints, rather than processing all threads and their stacks. Processing is initialized >> automagically when threads wake up for a safepoint, or get poked by a handshake or safepoint. Said initialization >> processes a few (3) frames and other thread oops. The rest, the bulk of the frame processing, is deferred until it is >> actually needed. It is needed when a frame is exposed to either 1) execution (returns or unwinding due to exception >> handling), or 2) stack walker APIs. A hook is then run to go and finish the lazy processing of frames. Mutator and GC >> threads can compete for processing. The processing is therefore performed under a per-thread lock. Note that disarming >> of the poll word (that the returns are comparing against) is only performed by the thread itself. So sliding the >> watermark up will require one runtime call for a thread to note that nothing needs to be done, and then update the poll >> word accordingly. Downgrading the poll word concurrently by other threads was simply not worth the complexity it >> brought (and is only possible on TSO machines). So I left that one out. > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Review: Andrew CR 1 I've been reviewing this and stepping through the debugger. It looks OK to me. ------------- Marked as reviewed by smonteith (Author).
PR: https://git.openjdk.java.net/jdk/pull/296 From shade at openjdk.java.net Wed Oct 7 18:12:20 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 7 Oct 2020 18:12:20 GMT Subject: RFR: 8254102: use ProcessHandle::pid instead of ManagementFactory::getRuntimeMXBean to get pid in tests [v2] In-Reply-To: References: Message-ID: On Wed, 7 Oct 2020 18:09:28 GMT, Igor Ignatyev wrote: >> Hi all, >> >> could you please review this small cleanup which replaces >> `ManagementFactory.getRuntimeMXBean().getName().split("@")[0]` w/ `ProcessHandle.current().pid()` to get current >> process pid? Thanks, >> -- Igor > > Igor Ignatyev has updated the pull request incrementally with two additional commits since the last revision: > > - use Long.toString instead of String.valueOf > - remove explicit coversion to String in Suicide.java Still looks good to me. ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/534 From iignatyev at openjdk.java.net Wed Oct 7 18:12:19 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Wed, 7 Oct 2020 18:12:19 GMT Subject: RFR: 8254102: use ProcessHandle::pid instead of ManagementFactory::getRuntimeMXBean to get pid in tests [v2] In-Reply-To: References: Message-ID: > Hi all, > > could you please review this small cleanup which replaces > `ManagementFactory.getRuntimeMXBean().getName().split("@")[0]` w/ `ProcessHandle.current().pid()` to get current > process pid? 
Thanks, > -- Igor Igor Ignatyev has updated the pull request incrementally with two additional commits since the last revision: - use Long.toString instead of String.valueOf - remove explicit coversion to String in Suicide.java ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/534/files - new: https://git.openjdk.java.net/jdk/pull/534/files/b6f4b94a..bc6e3515 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=534&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=534&range=00-01 Stats: 8 lines in 6 files changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/534.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/534/head:pull/534 PR: https://git.openjdk.java.net/jdk/pull/534 From iignatyev at openjdk.java.net Wed Oct 7 18:12:20 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Wed, 7 Oct 2020 18:12:20 GMT Subject: RFR: 8254102: use ProcessHandle::pid instead of ManagementFactory::getRuntimeMXBean to get pid in tests [v2] In-Reply-To: References: <5OaD6gBX9nmmX_N690gLnHefPo4bk9z3D-WqGwD3kXA=.32b936f9-c795-49c6-b158-90845eadcd45@github.com> Message-ID: On Wed, 7 Oct 2020 06:30:43 GMT, Aleksey Shipilev wrote: >> test/failure_handler/test/sanity/Suicide.java line 36: >> >>> 34: String osName = System.getProperty("os.name"); >>> 35: if (osName.contains("Windows")) { >>> 36: cmd = "taskkill.exe /F /PID " + pidStr; >> >> This can be simplified to ProcessHandle.current().toString(). It returns the pid of the process as a string. >> >> Explicitly converting it to a string is not necessary. The "+" concatenation would convert the number to a string. > > Yes, can just have `long pid` in this case. > > I don't see that `ProcessHandle.toString` is *specified* to return the string with pid, so relying on that is brittle. > We might call `Long.toString` here directly, to avoid jumping through a few calls. But seeing how all this is a test > code, that does not seem necessary. 
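For reference, a minimal sketch of the idiom under discussion. The class and method names (`PidSketch`, `currentPid`) are made up for illustration and are not part of the patch; only `ProcessHandle.current().pid()` and `Long.toString` come from the thread. It goes through the specified `pid()` accessor rather than relying on the unspecified output of `ProcessHandle::toString`.

```java
final class PidSketch {

    // Returns the current process id as a string via the specified pid() accessor.
    static String currentPid() {
        long pid = ProcessHandle.current().pid();
        return Long.toString(pid);
    }

    public static void main(String[] args) {
        // String concatenation would convert a long pid as well; the explicit
        // conversion above is only needed where a String is required directly.
        System.out.println("current pid: " + currentPid());
    }
}
```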
@RogerRiggs , as Aleksey pointed out `ProcessHandle::toString` isn't specified to return the string which contains only `pid` (in fact it's not specified at all), so I don't think we should rely on the current implementation. although it doesn't matter much, I've removed explicit conversion here and replaced `String::valueOf` with `Long::toString` in other places. ------------- PR: https://git.openjdk.java.net/jdk/pull/534 From stuefe at openjdk.java.net Wed Oct 7 18:19:14 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 7 Oct 2020 18:19:14 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v3] In-Reply-To: References: Message-ID: On Wed, 7 Oct 2020 14:41:25 GMT, Coleen Phillimore wrote: >> src/hotspot/share/runtime/stackOverflow.hpp line 121: >> >>> 119: // (large addresses) >>> 120: // >>> 121: >> >> Nice ascii art :) >> >> I wish though it could communicate better the openness of the ranges. E.g.: >> >> +----------------+ >> | | <-- stack_end() >> | red zone | >> | | >> +----------------+ >> | | <-- red_zone_base() >> | yellow zone | >> | | >> .... >> | | >> +----------------+ >> <-- stack_base() >> >> Maybe its just me but I always have to think a bit more here. With downward growing stacks normal range thinking is >> reversed wrt to openness, so stack_base() points outside the stack and stack_end() is in the stack. This is true for >> all base values - they point to locations outside the zone they base. Maybe that is clear to all others but it >> sometimes surprises me. > > I just found this comment. I think the ascii art was added by @GoeLin. I just moved it. Your picture is upside down > but it sorta makes sense that the 'base' addresses point to the first address in the range, which is what I think they > do. I'm quite sure they don't. stack_base() points to one-beyond-the-highest address in stack and therefore outside the stack. If the stack is 8 pages, stack_base points to the start of the 9th page. 
Therefore stack_base may actually point into a different memory region, eg the stack of a neighboring thread, should they happen to be allocated without gap. stack_red_zone_base() points to one-beyond-the-highest address in the red zone resp. the lowest address in the yellow zone. So it points outside the red zone. And so forth, for all other "base" values. All are one-beyond pointers. But nothing that needs to be addressed with your patch, of course. ------------- PR: https://git.openjdk.java.net/jdk/pull/522 From stuefe at openjdk.java.net Wed Oct 7 18:25:19 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 7 Oct 2020 18:25:19 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v5] In-Reply-To: <8lfHcGKHmQfPsBmJIPr9_gWW6IYTShT5ByyCVOxYawg=.3ae0d350-53e5-43b0-81f7-2521e4afdfdd@github.com> References: <8lfHcGKHmQfPsBmJIPr9_gWW6IYTShT5ByyCVOxYawg=.3ae0d350-53e5-43b0-81f7-2521e4afdfdd@github.com> Message-ID: On Wed, 7 Oct 2020 15:22:31 GMT, Coleen Phillimore wrote: >> This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a >> new file. Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where >> needed. All functions are moved and not modified except for qualification. I also added a delegating constructor to >> JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize. >> Tested with tier1-6 and builds on arm32, ppc, s390 and zero. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Added some const to StackOverflow declarations One small nit, otherwise okay. 
src/hotspot/share/runtime/stackOverflow.cpp line 38: > 36: size_t StackOverflow::_stack_shadow_zone_size = 0; > 37: > 38: void StackOverflow::initialize_stack_zone_sizes(size_t alignment) { I think you can remove the parameter and hard-code 4K inside this function. The reason is that the "StackXXXPages" parameters are defined as "number of 4k units", not "number of pages". ------------- Changes requested by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/522 From rriggs at openjdk.java.net Wed Oct 7 18:41:12 2020 From: rriggs at openjdk.java.net (Roger Riggs) Date: Wed, 7 Oct 2020 18:41:12 GMT Subject: RFR: 8254102: use ProcessHandle::pid instead of ManagementFactory::getRuntimeMXBean to get pid in tests [v2] In-Reply-To: References: Message-ID: On Wed, 7 Oct 2020 18:12:19 GMT, Igor Ignatyev wrote: >> Hi all, >> >> could you please review this small cleanup which replaces >> `ManagementFactory.getRuntimeMXBean().getName().split("@")[0]` w/ `ProcessHandle.current().pid()` to get current >> process pid? Thanks, >> -- Igor > > Igor Ignatyev has updated the pull request incrementally with two additional commits since the last revision: > > - use Long.toString instead of String.valueOf > - remove explicit coversion to String in Suicide.java Marked as reviewed by rriggs (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/534 From stuefe at openjdk.java.net Wed Oct 7 18:45:06 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 7 Oct 2020 18:45:06 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v3] In-Reply-To: References: <6iVRP-20baz0_46SouR-dj9SyspR5QvaL9iJMdeipDE=.92688b4e-ebd3-4681-8e63-a4aee752c407@github.com> <_XaA5cQEInPMn5Q5gj2y7AFCRprFQiYfI6BeUN49FhA=.9f17ae05-b37e-4f40-a83f-fd34aa812575@github.com> Message-ID: On Tue, 6 Oct 2020 21:03:34 GMT, Anton Kozlov wrote: > > Could you explain how the choice between SysV and mmap is made on AIX? 
It looks like > > ``` > develop(uintx, Use64KPagesThreshold, 0, \ > "4K/64K page allocation threshold.") \ > ... > if (os::vm_page_size() == 4*K) { > return reserve_mmaped_memory(bytes, NULL /* requested_addr */); > } else { > if (bytes >= Use64KPagesThreshold) { > return reserve_shmated_memory(bytes, NULL /* requested_addr */); > } else { > return reserve_mmaped_memory(bytes, NULL /* requested_addr */); > } > } > ``` > > (there only two calls to reserve_shmated_memory and both of them are like above. Is SysV SHM used in product builds?) > For now, the AIX case looks a bit different. The choice is made by the platform and the shared code cannot control > this. So yes, I cannot see how to avoid handle_t or similar. On AIX we have 4K and 64K pages (actually more, but those are the interesting ones). 64K pages are desirable for larger areas like heap. 64K pages can only be allocated with SystemV shared memory. mmap'ed memory is always 4K paged. But SystemV shared mem has a number of disadvantages, like inability to protect the memory, and a large attach alignment (256M). So it is cumbersome. os::vm_page_size() on AIX is a fake. The hotspot code assumes that the underlying Operating System has some sort of "base page size" (usually what is returned by sysconf(_SC_PAGESIZE)), and then optionally some sort of huge page size which follows different rules (e.g. pinned). On AIX things are more fluid. When investigating 64K page support on AIX I decided eventually to fool hotspot into thinking that the base page size is 64k. Long story, this was way before the OpenJDK existed and this was a proprietary code base with no possibility of changing things upstream. Therefore os::vm_page_size returns 64K ("64K fake mode"). This can be disabled. So the above code fragment uses mmaped memory if 64K fake mode is disabled, and if it is enabled, it uses mmap for smaller regions and shmget for larger ones. > > In contrast, THP and MAP_JIT are the way to implement a request from the shared code. 
Even for THP, shared code seems > to know why it should "realign" (not sure why commit has an alignment_hint parameter, while it is possible to realign > after a regular commit). I assume there is enough context in the shared code that can be provided for platform > functions, without a handle_t. And the same context should anyway be provided to reserve function, so handle_t can be > filled with all necessary information. I believe the alignment hint and the THP code had their roots in Solaris code. So its current form (I guess) is heavily warped by history. A new implementation would maybe just have a "os::set_tph(start, size)" function and leave it at that. And yes, I do not think it is necessary for os::commit to do this. In fact, Linux could probably set THP unconditionally when UseTransparentHugePages is active. That would remove the need for the alignment_hint parameter and the realign function. I opened https://bugs.openjdk.java.net/browse/JDK-8253890 to follow up on this. ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From coleenp at openjdk.java.net Wed Oct 7 18:48:13 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 7 Oct 2020 18:48:13 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v5] In-Reply-To: References: <8lfHcGKHmQfPsBmJIPr9_gWW6IYTShT5ByyCVOxYawg=.3ae0d350-53e5-43b0-81f7-2521e4afdfdd@github.com> Message-ID: On Wed, 7 Oct 2020 15:52:58 GMT, Daniel D. Daugherty wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Added some const to StackOverflow declarations > > src/hotspot/share/runtime/stackOverflow.hpp line 34: > >> 32: >> 33: // StackOverflow handling is encapsulated in this class. This class contains state variables >> 34: // for each JavaThread that implement stack overflow checking and guard page implementation. > > This clause doesn't read quite right... 
Perhaps: > > `// for each JavaThread that implement stack overflow checking and guard page functionality.` > > The "that implement" and "guard page implementation." phrases in the same clause > just don't read right... Your call on making a change here. // StackOverflow handling is encapsulated in this class. This class contains state variables // for each JavaThread that are used to detect stack overflow through explicit checks or through // checks in the signal handler when stack banging into guard pages causes a trap. // The state variables also record whether guard pages are enabled or disabled. Does this sound better? It's more descriptive but hopefully still summarizing. ------------- PR: https://git.openjdk.java.net/jdk/pull/522 From coleenp at openjdk.java.net Wed Oct 7 18:48:13 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 7 Oct 2020 18:48:13 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v5] In-Reply-To: References: <8lfHcGKHmQfPsBmJIPr9_gWW6IYTShT5ByyCVOxYawg=.3ae0d350-53e5-43b0-81f7-2521e4afdfdd@github.com> Message-ID: On Wed, 7 Oct 2020 18:41:09 GMT, Coleen Phillimore wrote: >> src/hotspot/share/runtime/stackOverflow.hpp line 34: >> >>> 32: >>> 33: // StackOverflow handling is encapsulated in this class. This class contains state variables >>> 34: // for each JavaThread that implement stack overflow checking and guard page implementation. >> >> This clause doesn't read quite right... Perhaps: >> >> `// for each JavaThread that implement stack overflow checking and guard page functionality.` >> >> The "that implement" and "guard page implementation." phrases in the same clause >> just don't read right... Your call on making a change here. > > // StackOverflow handling is encapsulated in this class. 
This class contains state variables > // for each JavaThread that are used to detect stack overflow though explicit checks or through > // checks in the signal handler when stack banging into guard pages causes a trap. > // The state variables also record whether guard pages are enabled or disabled. > > Does this sound better? It's more descriptive but hopefully still summarizing. Re: your comment above. I did just move the code but some of the review comments are on the existing code. I did end up adding a function initialize_stack_zone_sizes and removing the set*zone functions as a result and changed some comments. And added some consts. ------------- PR: https://git.openjdk.java.net/jdk/pull/522 From coleenp at openjdk.java.net Wed Oct 7 18:48:14 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 7 Oct 2020 18:48:14 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v5] In-Reply-To: References: <8lfHcGKHmQfPsBmJIPr9_gWW6IYTShT5ByyCVOxYawg=.3ae0d350-53e5-43b0-81f7-2521e4afdfdd@github.com> Message-ID: On Wed, 7 Oct 2020 18:20:54 GMT, Thomas Stuefe wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Added some const to StackOverflow declarations > > src/hotspot/share/runtime/stackOverflow.cpp line 38: > >> 36: size_t StackOverflow::_stack_shadow_zone_size = 0; >> 37: >> 38: void StackOverflow::initialize_stack_zone_sizes(size_t alignment) { > > I think you can remove the parameter and hard-code 4K inside this function. The reason is that the "StackXXXPages" > parameters are defined as "number of 4k units", not "number of pages". Ok, I'll move the comment in os.cpp also that tells you why 4k is a thing. 
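For readers following the "number of 4k units" point, the arithmetic can be sketched roughly as below. This is only an illustration in Java, not the HotSpot code: the class `StackZoneSizes`, the constant `UNIT`, and the method names are hypothetical, and it assumes the zone size is the configured unit count times 4K, rounded up to the OS page size (which must be a power of two).

```java
// Illustrative sketch only; names are hypothetical and not taken from HotSpot.
final class StackZoneSizes {
    // Assumption from the review comment: one "StackXXXPages" unit is 4K,
    // regardless of the actual OS page size.
    static final long UNIT = 4L * 1024;

    // Rounds value up to the next multiple of alignment.
    // alignment must be a power of two.
    static long alignUp(long value, long alignment) {
        return (value + alignment - 1) & ~(alignment - 1);
    }

    // Size in bytes of a zone configured as `units` 4K units on a system
    // whose page size is `pageSize` bytes.
    static long zoneSizeBytes(long units, long pageSize) {
        return alignUp(units * UNIT, pageSize);
    }

    public static void main(String[] args) {
        // With 4K pages the unit count maps directly to pages.
        System.out.println(zoneSizeBytes(2, 4096));
        // With 64K pages a single 4K unit still occupies one full page.
        System.out.println(zoneSizeBytes(1, 65536));
    }
}
```

With 64K pages, any configured value from 1 to 16 units ends up as one page under this assumption, which is why the unit and the page size have to be kept apart.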
------------- PR: https://git.openjdk.java.net/jdk/pull/522 From iignatyev at openjdk.java.net Wed Oct 7 18:55:08 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Wed, 7 Oct 2020 18:55:08 GMT Subject: Integrated: 8254102: use ProcessHandle::pid instead of ManagementFactory::getRuntimeMXBean to get pid in tests In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 23:08:40 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this small cleanup which replaces > `ManagementFactory.getRuntimeMXBean().getName().split("@")[0]` w/ `ProcessHandle.current().pid()` to get current > process pid? Thanks, > -- Igor This pull request has now been integrated. Changeset: 5a9bd41e Author: Igor Ignatyev URL: https://git.openjdk.java.net/jdk/commit/5a9bd41e Stats: 57 lines in 8 files changed: 0 ins; 41 del; 16 mod 8254102: use ProcessHandle::pid instead of ManagementFactory::getRuntimeMXBean to get pid in tests Reviewed-by: rriggs, shade ------------- PR: https://git.openjdk.java.net/jdk/pull/534 From iignatyev at openjdk.java.net Wed Oct 7 18:55:07 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Wed, 7 Oct 2020 18:55:07 GMT Subject: RFR: 8254102: use ProcessHandle::pid instead of ManagementFactory::getRuntimeMXBean to get pid in tests [v2] In-Reply-To: References: Message-ID: On Wed, 7 Oct 2020 18:38:07 GMT, Roger Riggs wrote: >> Igor Ignatyev has updated the pull request incrementally with two additional commits since the last revision: >> >> - use Long.toString instead of String.valueOf >> - remove explicit coversion to String in Suicide.java > > Marked as reviewed by rriggs (Reviewer). Roger, Aleksey, thanks for your reviews. 
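As a rough illustration of the cleanup discussed in this thread, the two ways of getting the pid might look like the sketch below. The class `PidExample` and its method names are hypothetical. The helper `pidFromRuntimeName` assumes the `pid@hostname` name format that the old test idiom relied on, which the RuntimeMXBean API does not guarantee.

```java
import java.lang.management.ManagementFactory;

// Illustrative sketch only; class and method names are hypothetical.
final class PidExample {
    // Preferred form: ProcessHandle (Java 9+) returns the pid as a long.
    static long currentPid() {
        return ProcessHandle.current().pid();
    }

    // Explicit conversion with Long.toString, rather than relying on
    // the unspecified output of ProcessHandle::toString.
    static String currentPidAsString() {
        return Long.toString(ProcessHandle.current().pid());
    }

    // Old idiom: parse the part before '@' in the runtime name.
    // Assumes a "pid@hostname" format, which is implementation-specific.
    static long pidFromRuntimeName(String runtimeName) {
        return Long.parseLong(runtimeName.split("@")[0]);
    }

    public static void main(String[] args) {
        System.out.println("ProcessHandle pid: " + currentPidAsString());
        // The runtime name is printed rather than parsed here, since its
        // format is not specified.
        System.out.println("Runtime name: "
                + ManagementFactory.getRuntimeMXBean().getName());
    }
}
```

The `ProcessHandle` form avoids both the string parsing and the dependency on the management API.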
------------- PR: https://git.openjdk.java.net/jdk/pull/534 From dcubed at openjdk.java.net Wed Oct 7 18:59:08 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 7 Oct 2020 18:59:08 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v5] In-Reply-To: References: <8lfHcGKHmQfPsBmJIPr9_gWW6IYTShT5ByyCVOxYawg=.3ae0d350-53e5-43b0-81f7-2521e4afdfdd@github.com> Message-ID: On Wed, 7 Oct 2020 18:43:40 GMT, Coleen Phillimore wrote: >> // StackOverflow handling is encapsulated in this class. This class contains state variables >> // for each JavaThread that are used to detect stack overflow though explicit checks or through >> // checks in the signal handler when stack banging into guard pages causes a trap. >> // The state variables also record whether guard pages are enabled or disabled. >> >> Does this sound better? It's more descriptive but hopefully still summarizing. > > Re: your comment above. I did just move the code but some of the review comments are on the existing code. I did end > up adding a function initialize_stack_zone_sizes and removing the set*zone functions as a result and changed some > comments. And added some consts. Re: the revised StackOverflow comment Looks good to me. ------------- PR: https://git.openjdk.java.net/jdk/pull/522 From coleenp at openjdk.java.net Wed Oct 7 19:05:09 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 7 Oct 2020 19:05:09 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v3] In-Reply-To: References: Message-ID: On Wed, 7 Oct 2020 18:15:59 GMT, Thomas Stuefe wrote: >> I just found this comment. I think the ascii art was added by @GoeLin. I just moved it. Your picture is upside down >> but it sorta makes sense that the 'base' addresses point to the first address in the range, which is what I think they >> do. > > I'm quite sure they don't. 
> > stack_base() points to one-beyond-the-highest address in stack and therefore outside the stack. If the stack is 8 > pages, stack_base points to the start of the 9th page. Therefore stack_base may actually point into a different memory > region, eg the stack of a neighboring thread, should they happen to be allocated without gap. stack_red_zone_base() > points to one-beyond-the-highest address in the red zone resp. the lowest address in the yellow zone. So it points > outside the red zone. And so forth, for all other "base" values. All are one-beyond pointers. > > But nothing that needs to be addressed with your patch, of course. in Thread::record_stack_base_and_size -> set_stack_base(os::current_stack_base()); which has different implementations in os_cpu files. You're saying that these set stack_base to one word beyond? Which makes all the calculations off by one. We should file a bug or rfe to clean this up. I haven't worked out how it would manifest itself as a bug. I'll file it but you might need to fill in some details. Right, I'm not going to address it with this patch, which is supposed to be a cleanup. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/522 From minqi at openjdk.java.net Wed Oct 7 19:44:27 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Wed, 7 Oct 2020 19:44:27 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v12] In-Reply-To: References: <9emWKl6fr-GA5LN0uHhuEd5D123QcoCiHQR1M9bAbag=.cc4b6129-8b33-47e4-a421-9e6b4817933b@github.com> Message-ID: On Wed, 7 Oct 2020 17:48:41 GMT, Ioi Lam wrote: >> Yumin Qi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains >> 23 commits: >> - Added new separate function to CDS for logging species and modified the existing function to log lambda form invokers. >> Changed isDumpLoadedClassList to a reasonable name isDumpingClassList as read only in CDS. 
>> - Merge branch 'master' of https://github.com/openjdk/jdk into jdk-8247536 >> - Removed unused imports. >> - Fixed comments with correct class and method name in CDS, removed unused variables after last change. >> - Moved and renamed cdsGenerateHolderClasses from GenerateJLIClassesHelp to CDS as generateLambdaFormHolderClasses. Added >> input verification function in CDS before class generation. Added more test scenarios. Removed trailing unused ending >> words for output of lambda form trace line in case of DumpLoadedClassList. >> - Move the check work to java, restore code in VM. Modified test code according to the changes. The invoke name >> verififcation is not implemented since not all the holder class are processed, not all the functions of processed >> holder classes are added. For holder class with DirectMethodHandle in its name, only the name in the >> DMH_METHOD_TYPE_MAP keyset is added, ithe line with other names just gets skipped silently. This makes the verification >> on invoke names difficul, a name not in the keyset should not fail the test. Also add a boolean to >> cdsGenerateHolderClasses to indicate call path. >> - Remove trailing word of line which is not used in holder class regeneration. There is a trailing LF (Line Feed) so trim >> white spaces from both front and end of the line or it will fail method type validation. >> - In case of exception happens during reloading class, CHECK will return without free the allocated buffer for class >> bytes so moved the buffer allocation and freeing to caller. Also removed test 6 since there is not guarantee that we >> can give a signature which will always fail. Additional changes to GenerateJLIClassesHelper according to review >> suggestion. >> - Merge branch 'master' of https://github.com/openjdk/jdk into jdk-8247536 >> - Merge branch 'master' of https://git.openjdk.java.net/jdk into jdk-8247536 >> - ... 
and 13 more: https://git.openjdk.java.net/jdk/compare/82fe023b...f5584dcf > > src/hotspot/share/prims/jvm.cpp line 3872: > >> 3870: JVM_ENTRY(jboolean, JVM_IsDumpingClassList(JNIEnv *env)) >> 3871: JVMWrapper("JVM_IsDumpingClassList"); >> 3872: return DumpLoadedClassList != NULL && classlist_file->is_open(); > > For sanity, it's better to add `classlist_file != NULL` done ------------- PR: https://git.openjdk.java.net/jdk/pull/193 From minqi at openjdk.java.net Wed Oct 7 19:44:24 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Wed, 7 Oct 2020 19:44:24 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v13] In-Reply-To: References: Message-ID: > This patch is reorganized after 8252725, which is separated from this patch to refactor jlink glugin code. The previous > webrev with hg can be found at: http://cr.openjdk.java.net/~minqi/2020/8247536/webrev-05. With 8252725 integrated, the > regeneration of holder classes is simply to call the new added GenerateJLIClassesHelper.cdsGenerateHolderClasses > function. Tests: tier1-4 Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: Move assert on klass is InstanceKlass after its NULL check. Added sanity check on class_list_file is not NULL before check on it is open. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/193/files - new: https://git.openjdk.java.net/jdk/pull/193/files/f5584dcf..107192f3 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=193&range=12 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=193&range=11-12 Stats: 3 lines in 2 files changed: 1 ins; 1 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/193.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/193/head:pull/193 PR: https://git.openjdk.java.net/jdk/pull/193 From minqi at openjdk.java.net Wed Oct 7 19:44:29 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Wed, 7 Oct 2020 19:44:29 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v11] In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 18:12:50 GMT, Mandy Chung wrote: >> Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed unused imports. > > src/java.base/share/classes/jdk/internal/misc/CDS.java line 83: > >> 81: * check if -XX:+DumpLoadedClassList and given file is open >> 82: */ >> 83: public static boolean isDumpLoadedClassList() { > > I agree with Ioi's suggestion to rename this to `isDumpingClassList` which describes what the VM is doing. Done > src/java.base/share/classes/java/lang/invoke/GenerateJLIClassesHelper.java line 74: > >> 72: System.out.println(traceSP + (salvage != null ? " (salvaged)" : " (generated)")); >> 73: } >> 74: CDS.traceLambdaFormInvoker(traceSP); > > I suggest leaving the existing code unchanged. Instead, add the following: > if (CDS.isDumpingClassList()) { > CDS.traceSpeciesType(cn); > } > > The above uses Ioi's suggested method name which reads better. Done > src/java.base/share/classes/java/lang/invoke/GenerateJLIClassesHelper.java line 63: > >> 61: if (TRACE_RESOLVE) { >> 62: System.out.println(traceLF + (resolvedMember != null ? 
" (success)" : " (fail)")); >> 63: } > > I suggest not to change the existing code. Instead, have `CDS::traceLambdaFormInvoker` > to take individual parameters `Class holder, String name, String shortenSignature` > (rather than the formatted string). Something like: > > if (CDS.isDumpLoadedClassList()) { > CDS.traceLambdaFormInvoker(holder, name, shortenSignature(basicTypeSignature(type)); > } > > This also gives flexibility to CDS to decide on what format to write to the class list (like this case, you drop the > text "success/fail") > In addition, the conditional check on `CDS.isDumpLoadedClassList()` is hard to relate to why CDS traces these events. > I see Ioi's comment on this method name too. I agree with Ioi that `isDumpingClassList` makes more sense. Done ------------- PR: https://git.openjdk.java.net/jdk/pull/193 From sspitsyn at openjdk.java.net Wed Oct 7 19:50:11 2020 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Wed, 7 Oct 2020 19:50:11 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v7] In-Reply-To: <6Scp6XjVCcdJN0tUKionVwGKoiBG8UeA-OpBXHrCYqk=.01170b1e-9722-4461-84e4-77e8fd447ac4@github.com> References: <6Scp6XjVCcdJN0tUKionVwGKoiBG8UeA-OpBXHrCYqk=.01170b1e-9722-4461-84e4-77e8fd447ac4@github.com> Message-ID: On Wed, 7 Oct 2020 17:52:27 GMT, Richard Reingruber wrote: >> Hi, >> >> this is the continuation of the review of the implementation for: >> >> https://bugs.openjdk.java.net/browse/JDK-8227745 >> https://bugs.openjdk.java.net/browse/JDK-8233915 >> >> It allows for JIT optimizations based on escape analysis even if JVMTI agents acquire capabilities to access references >> to objects that are subject to such optimizations, e.g. scalar replacement. The implementation reverts such >> optimizations just before access very much as when switching from JIT compiled execution to the interpreter, aka >> "deoptimization". 
Webrev.8 was the last one before before the transition to Git/Github: >> >> http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ >> >> Thanks, Richard. > > Richard Reingruber has updated the pull request incrementally with five additional commits since the last revision: > > - Factorized fragment out of EscapeBarrier::deoptimize_objects_internal into new method in compiledVFrame. > - More smaller changes proposed by Serguei. > - jvmtiDeferredUpdates.hpp: remove forward declarations. > - jvmtiDeferredLocalVariable: move member variables to the beginning of the class definition. > - jvmtiDeferredUpdates.hpp: add/remove empty lines and improve indentation. Richard, Thank you for the formatting and refactoring changes. The fix looks good to me. ------------- Marked as reviewed by sspitsyn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/119 From bulasevich at openjdk.java.net Thu Oct 8 06:49:23 2020 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Thu, 8 Oct 2020 06:49:23 GMT Subject: RFR: 8253901: ARM32: SIGSEGV during monitorexit due to incorrect register use (after JDK-8253540) In-Reply-To: References: Message-ID: On Wed, 7 Oct 2020 16:55:29 GMT, Aleksey Shipilev wrote: > This looks sensible to me. Thank you! > I assume `tier1`, `tier2` pass with these changes? Yes! ------------- PR: https://git.openjdk.java.net/jdk/pull/503 From erikj at openjdk.java.net Thu Oct 8 06:50:04 2020 From: erikj at openjdk.java.net (Erik Joelsson) Date: Thu, 8 Oct 2020 06:50:04 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) In-Reply-To: References: Message-ID: <_GsbJIUKoZcZhBL3g5LIazayukM61sEJOqQcKpp7Hzw=.12a07196-262a-411e-bb04-70e1321564dd@github.com> On Wed, 7 Oct 2020 17:13:22 GMT, Maurizio Cimadamore wrote: > This patch contains the changes associated with the third incubation round of the foreign memory access API incubation > (see JEP 393 [1]). 
This iteration focus on improving the usability of the API in 3 main ways: > * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from > multiple threads > * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee > that the memory will be deallocated, eventually > * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class > has been added, which defines several useful dereference routines; these are really just thin wrappers around memory > access var handles, but they make the barrier of entry for using this API somewhat lower. > > A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not > the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link > to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit > of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which > wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as > dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability > in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; > secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can > use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done > by calling `MemoryAddress::asSegmentRestricted`). A list of the API, implementation and test changes is provided > below. 
If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be > happy to point at existing discussions, and/or to provide the feedback required. A big thank to Erik Osterlund, > Vladimir Ivanov and David Holmes, without whom the work on shared memory segment would not have been possible; also I'd > like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. Thanks Maurizio > Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html > Specdiff: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254163 > > > > ### API Changes > > * `MemorySegment` > * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) > * added a no-arg factory for a native restricted segment representing entire native heap > * rename `withOwnerThread` to `handoff` > * add new `share` method, to create shared segments > * add new `registerCleaner` method, to register a segment against a cleaner > * add more helpers to create arrays from a segment e.g. `toIntArray` > * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) > * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) > * `MemoryAddress` > * drop `segment` accessor > * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative > to a given segment > * `MemoryAccess` > * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a > carrier is one of the usual suspect (a Java primitive, minus `boolean`); the access mode can be simple (e.g. 
access > base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte > offset,or a high level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. > `getByteAtOffset` vs `getByteAtIndex`). > * `MemoryHandles` > * drop `withOffset` combinator > * drop `withStride` combinator > * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which > it is easy to derive all the other handles using plain var handle combinators. > * `Addressable` > * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both > `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients > can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. > * `MemoryLayouts` > * A new layout, for machine addresses, has been added to the mix. > > > > ### Implementation changes > > There are two main things to discuss here: support for shared segments, and the general simplification of the memory > access var handle support. > #### Shared segments > > The support for shared segments cuts in pretty deep in the VM. Support for shared segments is notoriously hard to > achieve, at least in a way that guarantees optimal access performances. This is caused by the fact that, if a segment > is shared, it would be possible for a thread to close it while another is accessing it. After considering several > options (see [3]), we zeroed onto an approach which is inspired by an happy idea that Andrew Haley had (and that he > reminded me of at this year OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world > (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently to a > close operation. 
For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and > the access itself (otherwise it would be possible for the accessing thread to stop just right before an unsafe call). > It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. Sadly, none of > these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, > we noted that, if we could mark so called *scoped* method with an annotation, it would be very simply to check as to > whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of > stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). The question is, then, > once we detect that a thread is accessing the very segment we're about to close, what should happen? We first > experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it > fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the > machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to > minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread > is accessing the segment being closed. As written in the javadoc, this doesn't mean that clients should just catch and > try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should > be treated as such. In terms of gritty implementation, we needed to centralize memory access routines in a single > place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in > addition, also provided a liveness check. 
This way we could mark all these routines with the special `@Scoped` > annotation, which tells the VM that something important is going on. To achieve this, we created a new (autogenerated) > class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, > like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is > tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during > access (which is important when registering segments against cleaners). Of course, to make memory access safe, memory > access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead > of unsafe, so that a liveness check can be triggered (in case a scope is present). `ScopedMemoryAccess` has a > `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed > successfully. The implementation of `MemoryScope` (now significantly simplified from what we had before), has two > implementations, one for confined segments and one for shared segments; the main difference between the two is what > happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared > segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or > `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` > state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. #### > Memory access var handles overhaul The key realization here was that if all memory access var handles took a > coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle > form. 
This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var > handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that > e.g. additional offset is injected into a base memory access var handle. This also helped in simplifying the > implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level > access on the innards of the memory access var handle. All that code is now gone. #### Test changes Not much to see > here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, > since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` > functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the > microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared > segment case. [1] - https://openjdk.java.net/jeps/393 [2] - https://openjdk.java.net/jeps/389 [3] - > https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html [4] - https://openjdk.java.net/jeps/312 Build changes look pretty good, just a few minor nits. make/modules/java.base/gensrc/GensrcScopedMemoryAccess.gmk line 33: > 31: > 32: SCOPED_MEMORY_ACCESS_TEMPLATE := > $(TOPDIR)/src/java.base/share/classes/jdk/internal/misc/X-ScopedMemoryAccess.java.template 33: > SCOPED_MEMORY_ACCESS_BIN_TEMPLATE := > $(TOPDIR)/src/java.base/share/classes/jdk/internal/misc/X-ScopedMemoryAccess-bin.java.template Should these variables be defined based on SCOPED_MEMORY_ACCESS_SRC_DIR to avoid repeating that path? 
make/modules/java.base/gensrc/GensrcScopedMemoryAccess.gmk line 151: > 149: $(CP) $(SCOPED_MEMORY_ACCESS_TEMPLATE) $(DEST) > 150: $(foreach t, $(SCOPE_MEMORY_ACCESS_TYPES), \ > 151: $(TOOL_SPP) -nel -K$(BIN_$t_type) -Dtype=$(BIN_$t_type) -DType=$(BIN_$t_Type) $(BIN_$t_ARGS) \ Please indent with and 2 spaces for logical indent of the foreach body. make/modules/java.base/gensrc/GensrcScopedMemoryAccess.gmk line 155: > 153: $(PRINTF) "}\n" >> $(DEST) > 154: > 155: TARGETS += $(DEST) Missing newline. ------------- PR: https://git.openjdk.java.net/jdk/pull/548 From github.com+51754783+coreyashford at openjdk.java.net Thu Oct 8 06:53:42 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Thu, 8 Oct 2020 06:53:42 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2] In-Reply-To: References: <6P-wrlA7c9wlXEhcoAYW5J9W2jwTmTTj3PkDObPh0LA=.c11aa048-8853-4e1c-a689-c34b89c15f12@github.com> Message-ID: On Wed, 7 Oct 2020 16:26:04 GMT, Martin Doerr wrote: >> CoreyAshford has updated the pull request with a new target base due to a merge or a rebase. The pull request now >> contains ten commits: >> - AOT: Revert change to aotCodeHeap.cpp for decodeBlock >> >> Don't add the SET_AOT_GLOBAL_SYMBOL_VALUE macro for decode block until all >> arches that implement AOT, implement the decodeBlock intrinsic. 
>> - Base64.java decodeBlock: Changes from PR review >> >> * Make comparison safer and consistent with the while loop >> * Update comment about the decodeBlock intrinsic so that it matches the new structure >> * Add comment about the lack of a length check on the destination buffer >> * As per issue 8138732, change HotSpotIntrinsicCandidate to IntrinsicCandidate >> - stubGenerator_ppc.cpp: Changes from PR review >> >> * Fix clearing of upper bits to clear 32 bits instead of 31 (due to misreading of clrldi instruction) >> * change and document loop_unrolls setting from 8 to 2 after re-running the benchmark >> * align unrolled loop on a 32-byte boundary >> * replace instruction used for checking isURL from a double word to single >> word instruction since the register is effectively 32 bits wide >> * cosmetic change to realign register comments. >> - TestBase64.java: Changes from PR review >> >> * Use Utils.toByteArrays() method instead of a locally-defined method >> * Generate the two non-Base64 tables dynamically rather than use static initialization >> * Added comments describing the two above-mentioned arrays >> - Expand the Base64 intrinsic regression test to cover decodeBlock >> >> This patch makes four significant changes: >> >> 1) The Power implementation of the decodeBlock intrinsic, at least, >> requires a decode length of at least 128 bytes, but the existing test cases >> are much shorter, maxing out at 111 bytes. So the patch adds a new input >> data file which has longer test cases in it. >> >> 2) The original test cases only covers the encoding of just the printable >> subset of the 7-bit ASCII characters. However, Base64 encoding requires >> being able to encode arbitrary binary data, i.e. it must handle all 256 >> 8-bit byte encodings. To remedy this, but keep the original line-oriented >> style of the input data, I added another input file type that uses a simple >> ASCII hexadecimal encoding - two ASCII hex characters per 8-bit byte. 
When >> test0 is called, a new parameter is passed that specifies the type of the >> input file, which is either the original ASCII type or the hexadecimal >> format. So to test both longer input data and arbitrary 8-bit data, the >> newly added input test file has test cases which are both longer and >> encoded in ASCII hex so as to give full 8-bit capability. When reading >> this type of file, test0 calls a newly-added function to translate the >> ASCII hex to binary data. Except for the first line of input data, which >> contains all possible 8-bit values sequentially, the input data was >> generated using a random length (between 111 and 520 bytes) buffer filled >> with random 8-bit data, which should give adequate coverage. >> >> 3) The original test did not test that the decoder detects illegal Base64 >> bytes. This change chooses a random location in the encoded data to >> corrupt with a randomly-chosen byte which is illegal for the specific >> Base64 encoding that is chosen (i.e. standard or URLsafe). It then calls >> the decode function to verify that the illegal byte is detected and the >> proper exception is thrown. >> >> 4) The test iteration count was originally 100K, but that is far more than >> enough iterations to test the intrinsic. It takes 20K iterations on each >> instrinsic for HotSpot C2 to begin calling it. The test originally had >> three types of encodings to test and called the encode intrinsic four times >> for each iteration, which works out to 100K * 3 * 4 = 1.2M calls just to >> encode. Decode was called four times as well (now five because of the >> illegal byte test). I believe this is excessive and with the extra test >> data I have added, the test was timing out after ten minutes of execution. 
>> It appears that it is timing out, not because the intrinsics take a long >> time to run, but because test0 generates an enormous number of discarded >> data buffers for the GC system to recover (the test runs at about 39GB of >> virtual memory on my test machine). To remedy the timeout problem, I have >> changed the code so that a warmup function of 20K repetitions is performed >> on a fixed buffer, to activate the instrinsic(s). After the warmup, I have >> reduced the number of iterations to 5K on each test0 call. This should >> give adequate coverage. >> - Add JMH benchmark for Base64 variable length buffer decoding >> - Add Power9+ intrinsic implementation for Base64 decoding >> - Add HotSpot code to implement Base64 decodeBlock API >> - Add HotSpotIntrinsicCandidate and API for Base64 decoding > > Hi Corey, > thanks for contributing this change. Looks basically good. Please address the inline comments from Roger and me. > Core libs part is reviewed by Roger and the whole change by me. The shared hotspot part is straight forward because > it's very similar to the encode intrinsic. So I think we only need a 2nd review for the PPC64 algorithm implementation. > I can sponsor the change when this is completed. Ready for another review. I hope I addressed all of the issues raised. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From github.com+51754783+coreyashford at openjdk.java.net Thu Oct 8 06:53:26 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Thu, 8 Oct 2020 06:53:26 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v3] In-Reply-To: References: Message-ID: > This patch set encompasses the following commits: > > - Adds a new HotSpot intrinsic candidate to the java.lang.Base64 class - decodeBlock(), and provides a flexible API for > the intrinsic. The API is similar to the existing encodeBlock intrinsic. 
> - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic
> implementation
> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic.
> - Adds a JMH microbenchmark for both Base64 decoding and encoding.
> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding.

CoreyAshford has updated the pull request incrementally with seven additional commits since the last revision:

- stubGenerator_ppc.cpp: Fix multiple issues as per Martin Doerr's v2 review
  * Remove extraneous comma from SAP copyright notice
  * Move align(32) to the head of the loop rather than the beginning of the unwound code
  * Simplified looping condition to use a loop counter instead of a final address. This eliminated the need for the "end" variable, and essentially replaced it with CTR, which is computed using a simple bitwise shift of the size.
  * Re-ran benchmarks against loop_unrolls values: 1, 2, 4, 8, 16 to find optimal value, now 4.
  * Corrected a typo in the word "elements"
- vm_version_ppc.cpp: per Martin Doerr's review of v2: fix copy/paste error
- vmIntrinsics.cpp: Per Martin Doerr's v2 review: rearrange order of case statement to be consistent with others.
- runtime.cpp: per Martin Doerr's review of v2, correct comment as per current semantics of decodeBlock()
  * The reference to "ofs" seems to be a copy/paste error.
  * -1 is no longer returned from decodeBlock() in the event of a non-base64 character being encountered; only a count of bytes written to dst.
- TestBase64.java: Change comment as per Martin Doerr's v2 review - Base64.java: Make changes as per Roger Riggs and Martin Doerr's v2 Review * Make comment about the sl parameter more precise * Fix comparison to avoid possible integer overflow of sp - library_call.cpp: Fix rebase merge error ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/293/files - new: https://git.openjdk.java.net/jdk/pull/293/files/e42ac7db..8932c233 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=293&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=293&range=01-02 Stats: 42 lines in 7 files changed: 9 ins; 11 del; 22 mod Patch: https://git.openjdk.java.net/jdk/pull/293.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/293/head:pull/293 PR: https://git.openjdk.java.net/jdk/pull/293 From shade at openjdk.java.net Thu Oct 8 06:54:21 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 8 Oct 2020 06:54:21 GMT Subject: RFR: 8254096: remove jdk.test.lib.Utils::getMandatoryProperty(String) method In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 20:54:50 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this small and trivial cleanup that removes `getMandatoryProperty` method from > `jdk.test.lib.Utils` as it's unused? > Thanks, > -- Igor Marked as reviewed by shade (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/533 From iignatyev at openjdk.java.net Thu Oct 8 06:54:31 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Thu, 8 Oct 2020 06:54:31 GMT Subject: RFR: 8254096: remove jdk.test.lib.Utils::getMandatoryProperty(String) method In-Reply-To: References: Message-ID: On Thu, 8 Oct 2020 05:32:17 GMT, Aleksey Shipilev wrote: >> Hi all, >> >> could you please review this small and trivial cleanup that removes `getMandatoryProperty` method from >> `jdk.test.lib.Utils` as it's unused? >> Thanks, >> -- Igor > > Marked as reviewed by shade (Reviewer). Thanks Aleksey. 
-------------

PR: https://git.openjdk.java.net/jdk/pull/533

From github.com+70893615+jasontatton-aws at openjdk.java.net Thu Oct 8 06:55:15 2020
From: github.com+70893615+jasontatton-aws at openjdk.java.net (Jason Tatton)
Date: Thu, 8 Oct 2020 06:55:15 GMT
Subject: RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v5]
In-Reply-To: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com>
References: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com>
Message-ID:

> This is an implementation of the indexOf(char) intrinsic for StringLatin1 (1 byte encoded Strings). It is provided for
> x86 and ARM64. The implementation is greatly inspired by the indexOf(char) intrinsic for StringUTF16. To incorporate it
> I had to make a small change to StringLatin1.java (refactor of functionality to an intrinsified private method) as well as
> code for C2. Submitted to: hotspot-compiler-dev and core-libs-dev as this patch contains a change to hotspot and
> java/lang/StringLatin1.java https://bugs.openjdk.java.net/browse/JDK-8173585
>
> Details of testing:
> ============
> I have created a jtreg test "compiler/intrinsics/string/TestStringLatin1IndexOfChar" to cover this new intrinsic. Note
> that, particularly for the x86 implementation of the intrinsic, the code path taken is dependent upon the length of the
> input String. Hence the test has been designed to cover all these cases. In summary they are:
> - A "short" string of < 16 characters.
> - A SIMD String of 16 to 31 characters.
> - An AVX2 SIMD String of 32 characters+.
>
> Hardware used for testing:
> -----------------------------
>
> - Intel Xeon CPU E5-2680 (JVM did not recognize this as having AVX2 support)
> - Intel i7 processor (with AVX2 support).
> - AWS Graviton 2 (ARM 64 processor).
>
> I also ran "run-test-tier1" and "run-test-tier2" for x86_64 and aarch64.
>
> Possible future enhancements:
> ====================
> For the x86 implementation there may be two further improvements we can make in order to improve performance of both
> the StringUTF16 and StringLatin1 indexOf(char) intrinsics:
> 1. Make use of AVX-512 instructions.
> 2. For "short" Strings (see below), I think it may be possible to modify the existing algorithm to still use SSE SIMD
> instructions instead of a loop.
>
> Benchmark results:
> ============
> **Without** the new StringLatin1 indexOf(char) intrinsic:
>
> | Benchmark | Mode | Cnt | Score | Error | Units |
> | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
> | IndexOfBenchmark.latin1_mixed_char | avgt | 5 | **26,389.129** | ± 182.581 | ns/op |
> | IndexOfBenchmark.utf16_mixed_char | avgt | 5 | 17,885.383 | ± 435.933 | ns/op |
>
> **With** the new StringLatin1 indexOf(char) intrinsic:
>
> | Benchmark | Mode | Cnt | Score | Error | Units |
> | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
> | IndexOfBenchmark.latin1_mixed_char | avgt | 5 | **17,875.185** | ± 407.716 | ns/op |
> | IndexOfBenchmark.utf16_mixed_char | avgt | 5 | 18,292.802 | ± 167.306 | ns/op |
>
> The objective of the patch is to bring the performance of StringLatin1 indexOf(char) in line with StringUTF16
> indexOf(char) for x86 and ARM64. We can see above that this has been achieved. Similar results were obtained when
> running on ARM.

Jason Tatton has updated the pull request incrementally with one additional commit since the last revision:

  8173585: Intrinsify StringLatin1.indexOf(char)

  Added new unit test: findOneItem. This tests strings of varying length, ensuring that for every length one instance of the search char can be found. We check what happens when the search character is in each position of the search string (including first and last positions).
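The shape of the findOneItem test described above can be sketched roughly as follows. This is only an illustration, not the code from the PR: the class name, the `MAX_LENGTH` bound, and the filler and search characters are assumptions chosen for the example.

```java
import java.util.Arrays;

public class FindOneItemSketch {
    // Hypothetical bound; long enough to exercise the <16, 16-31 and 32+ length paths.
    static final int MAX_LENGTH = 100;

    // Builds a Latin-1 string of the given length with 'z' at pos and 'a' everywhere else.
    static String makeString(int length, int pos) {
        char[] chars = new char[length];
        Arrays.fill(chars, 'a');
        chars[pos] = 'z';
        return new String(chars);
    }

    // Checks every (length, position) combination and returns how many were checked.
    static int findOneItem() {
        int checked = 0;
        for (int length = 1; length <= MAX_LENGTH; length++) {
            for (int pos = 0; pos < length; pos++) {
                String s = makeString(length, pos);
                int found = s.indexOf('z');
                if (found != pos) {
                    throw new AssertionError("length=" + length + " expected=" + pos + " found=" + found);
                }
                checked++;
            }
        }
        return checked;
    }

    public static void main(String[] args) {
        System.out.println("combinations checked: " + findOneItem());
    }
}
```

Because the loops start at length 1 and position 0 and run to the last index, the single-character string and the first and last positions are covered without special cases.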
-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/71/files
  - new: https://git.openjdk.java.net/jdk/pull/71/files/c8a2849e..8ead02ab

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=71&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=71&range=03-04

  Stats: 34 lines in 1 file changed: 26 ins; 0 del; 8 mod
  Patch: https://git.openjdk.java.net/jdk/pull/71.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/71/head:pull/71

PR: https://git.openjdk.java.net/jdk/pull/71

From github.com+70893615+jasontatton-aws at openjdk.java.net Thu Oct 8 06:55:53 2020
From: github.com+70893615+jasontatton-aws at openjdk.java.net (Jason Tatton)
Date: Thu, 8 Oct 2020 06:55:53 GMT
Subject: RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v4]
In-Reply-To:
References: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com>
Message-ID:

On Tue, 6 Oct 2020 20:18:02 GMT, Nils Eliasson wrote:

>> Jason Tatton has updated the pull request incrementally with one additional commit since the last revision:
>>
>> 8173585: Intrinsify StringLatin1.indexOf(char)
>>
>> Rewrite of unit test and newlines added to end of files
>>
>> Changes to unit test:
>> - main test adjusted such that Strings generated are much longer (up to
>> 2048 characters) and of the form: azaza, aazaazaa, aaazaaazaaa, etc with
>> 'z' being the character searched for. Multiple instances of the
>> search character are included in the String in order to validate that
>> the starting offset is correctly handled. Results are compared to the non
>> intrinsified version of the code. Longer strings mean that the looping
>> functionality of the various paths is entered into.
>> - Run configurations introduced such that it checks behaviour where use
>> of SSE and AVX instructions is restricted.
>> - Tier4InvocationThreshold adjusted so as to ensure C2 code is invoked.
>> >> Other changes: >> - newlines added at end of files > > test/hotspot/jtreg/compiler/intrinsics/string/TestStringLatin1IndexOfChar.java line 25: > >> 23: import jdk.test.lib.Asserts; >> 24: >> 25: public class TestStringLatin1IndexOfChar{ > > Can you please add testing for these edge cases: > - when the search char is the first char > - when the search char is the last char > - when the string has length 1 Thanks for reviewing this. I have added a new test: `findOneItem` which covers these edge cases ------------- PR: https://git.openjdk.java.net/jdk/pull/71 From iignatyev at openjdk.java.net Thu Oct 8 06:58:42 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Thu, 8 Oct 2020 06:58:42 GMT Subject: Integrated: 8254096: remove jdk.test.lib.Utils::getMandatoryProperty(String) method In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 20:54:50 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this small and trivial cleanup that removes `getMandatoryProperty` method from > `jdk.test.lib.Utils` as it's unused? > Thanks, > -- Igor This pull request has now been integrated. Changeset: 9cdfd0fa Author: Igor Ignatyev URL: https://git.openjdk.java.net/jdk/commit/9cdfd0fa Stats: 13 lines in 1 file changed: 0 ins; 13 del; 0 mod 8254096: remove jdk.test.lib.Utils::getMandatoryProperty(String) method Reviewed-by: shade ------------- PR: https://git.openjdk.java.net/jdk/pull/533 From coleenp at openjdk.java.net Thu Oct 8 06:56:31 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 8 Oct 2020 06:56:31 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v6] In-Reply-To: References: Message-ID: > This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a > new file. Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where > needed. All functions are moved and not modified except for qualification. 
I also added a delegating constructor to > JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize. > Tested with tier1-6 and builds on arm32, ppc, s390 and zero. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Revised comment ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/522/files - new: https://git.openjdk.java.net/jdk/pull/522/files/3191748b..6e3d070b Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=522&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=522&range=04-05 Stats: 14 lines in 3 files changed: 7 ins; 3 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/522.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/522/head:pull/522 PR: https://git.openjdk.java.net/jdk/pull/522 From dholmes at openjdk.java.net Thu Oct 8 06:56:48 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 8 Oct 2020 06:56:48 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v6] In-Reply-To: References: Message-ID: On Thu, 8 Oct 2020 06:52:05 GMT, Coleen Phillimore wrote: >> This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a >> new file. Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where >> needed. All functions are moved and not modified except for qualification. I also added a delegating constructor to >> JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize. >> Tested with tier1-6 and builds on arm32, ppc, s390 and zero. 
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>
> Revised comment

With regard to the discussion on whether "*_base()" functions are inclusive or exclusive, the stack_base() function is exclusive and we fixed a number of checks that were incorrectly checking <= stack_base() rather than < stack_base(). See: https://bugs.openjdk.java.net/browse/JDK-8234372 and related issues.

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/522

From stuefe at openjdk.java.net Thu Oct 8 06:57:12 2020
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Thu, 8 Oct 2020 06:57:12 GMT
Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v6]
In-Reply-To:
References:
Message-ID: <1_g79Qy5B8dtub9THxK25LWJ5_Md7cqZ6HgRgB7mNso=.ffcd7e46-c0bb-4248-9da0-a9bf1be549c8@github.com>

On Thu, 8 Oct 2020 06:52:05 GMT, Coleen Phillimore wrote:

>> This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a
>> new file. Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where
>> needed. All functions are moved and not modified except for qualification. I also added a delegating constructor to
>> JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize.
>> Tested with tier1-6 and builds on arm32, ppc, s390 and zero.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>
> Revised comment

@coleenp: Looks all good to me now. Thanks for that work!

@dholmes-ora: I now remember the discussion as well. Seeing that it was only in February this is embarrassing :) I still find the fact that the base values are off by one odd, but having them at crooked values (page ends) would be odd too, so maybe it's just a matter of brushing up comments.
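To make the inclusive/exclusive point concrete: with an exclusive base, a downward-growing stack occupies the half-open range [base - size, base). The sketch below models addresses as plain `long` values. The class and method names are hypothetical and exist only to illustrate the comparison; this is not HotSpot code.

```java
public class StackRangeSketch {
    // base is exclusive (one past the highest stack address);
    // base - size is the lowest address that still belongs to the stack.
    static boolean isInStack(long addr, long base, long size) {
        return addr < base && addr >= base - size;
    }

    public static void main(String[] args) {
        long base = 0x1000;
        long size = 0x100;
        System.out.println(isInStack(base, base, size));            // the base itself
        System.out.println(isInStack(base - 1, base, size));        // highest stack address
        System.out.println(isInStack(base - size, base, size));     // lowest stack address
        System.out.println(isInStack(base - size - 1, base, size)); // just below the stack
    }
}
```

Under this convention a check written as `addr <= base` admits one address too many, which is the kind of off-by-one the checks mentioned above had.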
------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/522 From coleenp at openjdk.java.net Thu Oct 8 06:57:29 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 8 Oct 2020 06:57:29 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v5] In-Reply-To: References: <8lfHcGKHmQfPsBmJIPr9_gWW6IYTShT5ByyCVOxYawg=.3ae0d350-53e5-43b0-81f7-2521e4afdfdd@github.com> Message-ID: On Wed, 7 Oct 2020 18:56:03 GMT, Daniel D. Daugherty wrote: >> Re: your comment above. I did just move the code but some of the review comments are on the existing code. I did end >> up adding a function initialize_stack_zone_sizes and removing the set*zone functions as a result and changed some >> comments. And added some consts. > > Re: the revised StackOverflow comment > > Looks good to me. Thanks. Updated the PR. ------------- PR: https://git.openjdk.java.net/jdk/pull/522 From bulasevich at openjdk.java.net Thu Oct 8 06:59:00 2020 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Thu, 8 Oct 2020 06:59:00 GMT Subject: Integrated: 8253901: ARM32: SIGSEGV during monitorexit due to incorrect register use (after JDK-8253540) In-Reply-To: References: Message-ID: On Mon, 5 Oct 2020 10:38:45 GMT, Boris Ulasevich wrote: > [JDK-8253540](https://bugs.openjdk.java.net/browse/JDK-8253540) changed InterpreterRuntime::monitorexit call from > call_VM to call_VM_leaf. This requires additional arrangement for ARM32: the parameter must be in R0. This pull request has now been integrated. 
Changeset: fd0cb98e Author: Boris Ulasevich URL: https://git.openjdk.java.net/jdk/commit/fd0cb98e Stats: 9 lines in 3 files changed: 2 ins; 0 del; 7 mod 8253901: ARM32: SIGSEGV during monitorexit due to incorrect register use (after JDK-8253540) Reviewed-by: shade ------------- PR: https://git.openjdk.java.net/jdk/pull/503 From coleenp at openjdk.java.net Thu Oct 8 06:57:49 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 8 Oct 2020 06:57:49 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v5] In-Reply-To: References: <8lfHcGKHmQfPsBmJIPr9_gWW6IYTShT5ByyCVOxYawg=.3ae0d350-53e5-43b0-81f7-2521e4afdfdd@github.com> Message-ID: On Wed, 7 Oct 2020 18:45:14 GMT, Coleen Phillimore wrote: >> src/hotspot/share/runtime/stackOverflow.cpp line 38: >> >>> 36: size_t StackOverflow::_stack_shadow_zone_size = 0; >>> 37: >>> 38: void StackOverflow::initialize_stack_zone_sizes(size_t alignment) { >> >> I think you can remove the parameter and hard-code 4K inside this function. The reason is that the "StackXXXPages" >> parameters are defined as "number of 4k units", not "number of pages". > > Ok, I'll move the comment in os.cpp also that tells you why 4k is a thing. Ok, I made the suggested change and updated the PR. 
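The suggestion above rests on the StackXXXPages flags counting 4K units rather than OS pages. A minimal sketch of the resulting arithmetic, assuming each zone size is the unit count times 4K rounded up to the actual page size; the class name, method names and the rounding step are assumptions made for illustration, not the code in stackOverflow.cpp:

```java
public class StackZoneSizeSketch {
    // The flags are expressed in 4K units regardless of the real page size.
    static final long UNIT = 4 * 1024;

    // Rounds value up to a multiple of alignment; alignment must be a power of two.
    static long alignUp(long value, long alignment) {
        return (value + alignment - 1) & -alignment;
    }

    // Hypothetical helper: zone size in bytes for a flag value given in 4K units.
    static long zoneSize(long units, long pageSize) {
        return alignUp(units * UNIT, pageSize);
    }

    public static void main(String[] args) {
        // The same flag value on a 4K-page system and on a 64K-page system.
        System.out.println(zoneSize(20, 4 * 1024));
        System.out.println(zoneSize(20, 64 * 1024));
    }
}
```

With the unit fixed inside the function, a caller no longer has to pass an alignment parameter, which is the simplification the review comment asks for.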
------------- PR: https://git.openjdk.java.net/jdk/pull/522 From coleenp at openjdk.java.net Thu Oct 8 06:58:40 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 8 Oct 2020 06:58:40 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v3] In-Reply-To: References: Message-ID: <_c_nbhTOsNLY-NlTtKaMOt1nHqsOoxNkK5Ho9ihySEk=.859b87a4-2148-45dc-8a0a-246d84cbbefa@github.com> On Wed, 7 Oct 2020 06:44:08 GMT, Thomas Stuefe wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> fix comments > > src/hotspot/share/runtime/stackOverflow.hpp line 176: > >> 174: return (a <= stack_reserved_zone_base()) && >> 175: (a >= (address)((intptr_t)stack_reserved_zone_base() - stack_reserved_zone_size())); >> 176: } > > Same here, a==reserved_zone_base is strictly speaking outside the reserved zone. I added this to https://bugs.openjdk.java.net/browse/JDK-8254189. ------------- PR: https://git.openjdk.java.net/jdk/pull/522 From eosterlund at openjdk.java.net Thu Oct 8 07:00:16 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 8 Oct 2020 07:00:16 GMT Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v12] In-Reply-To: <19ts4LYwyvtljcsVAjrinz6Jx2esPUWeUdByyX0CUUo=.2b78dff1-8d73-453e-95f5-4362cc3635f3@github.com> References: <19ts4LYwyvtljcsVAjrinz6Jx2esPUWeUdByyX0CUUo=.2b78dff1-8d73-453e-95f5-4362cc3635f3@github.com> Message-ID: On Wed, 7 Oct 2020 18:03:10 GMT, Stuart Monteith wrote: > I've been reviewing this and stepping through the debugger. It looks OK to me. Thanks for the review Stuart. 
------------- PR: https://git.openjdk.java.net/jdk/pull/296 From minqi at openjdk.java.net Thu Oct 8 06:59:23 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Thu, 8 Oct 2020 06:59:23 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v14] In-Reply-To: References: Message-ID: > This patch is reorganized after 8252725, which is separated from this patch to refactor jlink glugin code. The previous > webrev with hg can be found at: http://cr.openjdk.java.net/~minqi/2020/8247536/webrev-05. With 8252725 integrated, the > regeneration of holder classes is simply to call the new added GenerateJLIClassesHelper.cdsGenerateHolderClasses > function. Tests: tier1-4 Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: Made isDumpingClassList a private final and add a public function to access it. Changed function validateInputLines to isValidInputLines and return a boolean to indicate its valid. Added more comments for review concern. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/193/files - new: https://git.openjdk.java.net/jdk/pull/193/files/107192f3..f163fe4c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=193&range=13 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=193&range=12-13 Stats: 28 lines in 2 files changed: 9 ins; 6 del; 13 mod Patch: https://git.openjdk.java.net/jdk/pull/193.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/193/head:pull/193 PR: https://git.openjdk.java.net/jdk/pull/193 From mchung at openjdk.java.net Thu Oct 8 06:59:55 2020 From: mchung at openjdk.java.net (Mandy Chung) Date: Thu, 8 Oct 2020 06:59:55 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v12] In-Reply-To: <9emWKl6fr-GA5LN0uHhuEd5D123QcoCiHQR1M9bAbag=.cc4b6129-8b33-47e4-a421-9e6b4817933b@github.com> References: <9emWKl6fr-GA5LN0uHhuEd5D123QcoCiHQR1M9bAbag=.cc4b6129-8b33-47e4-a421-9e6b4817933b@github.com> Message-ID: <-y8lEorT3v4i2G_yBbrmd6faeU584CXFnPEiM8GcPec=.d9e8a152-c319-43dc-a6d1-f745b7eccb20@github.com> On Tue, 6 Oct 2020 20:46:17 GMT, Yumin Qi wrote: >> This patch is reorganized after 8252725, which is separated from this patch to refactor jlink glugin code. The previous >> webrev with hg can be found at: http://cr.openjdk.java.net/~minqi/2020/8247536/webrev-05. With 8252725 integrated, the >> regeneration of holder classes is simply to call the new added GenerateJLIClassesHelper.cdsGenerateHolderClasses >> function. Tests: tier1-4 > > Yumin Qi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains > 23 commits: > - Added new separate function to CDS for logging species and modified the existing function to log lambda form invokers. > Changed isDumpLoadedClassList to a reasonable name isDumpingClassList as read only in CDS. > - Merge branch 'master' of https://github.com/openjdk/jdk into jdk-8247536 > - Removed unused imports. 
> - Fixed comments with correct class and method name in CDS, removed unused variables after last change. > - Moved and renamed cdsGenerateHolderClasses from GenerateJLIClassesHelp to CDS as generateLambdaFormHolderClasses. Added > input verification function in CDS before class generation. Added more test scenarios. Removed trailing unused ending > words for output of lambda form trace line in case of DumpLoadedClassList. > - Move the check work to java, restore code in VM. Modified test code according to the changes. The invoke name > verififcation is not implemented since not all the holder class are processed, not all the functions of processed > holder classes are added. For holder class with DirectMethodHandle in its name, only the name in the > DMH_METHOD_TYPE_MAP keyset is added, ithe line with other names just gets skipped silently. This makes the verification > on invoke names difficul, a name not in the keyset should not fail the test. Also add a boolean to > cdsGenerateHolderClasses to indicate call path. > - Remove trailing word of line which is not used in holder class regeneration. There is a trailing LF (Line Feed) so trim > white spaces from both front and end of the line or it will fail method type validation. > - In case of exception happens during reloading class, CHECK will return without free the allocated buffer for class > bytes so moved the buffer allocation and freeing to caller. Also removed test 6 since there is not guarantee that we > can give a signature which will always fail. Additional changes to GenerateJLIClassesHelper according to review > suggestion. > - Merge branch 'master' of https://github.com/openjdk/jdk into jdk-8247536 > - Merge branch 'master' of https://git.openjdk.java.net/jdk into jdk-8247536 > - ... and 13 more: https://git.openjdk.java.net/jdk/compare/82fe023b...f5584dcf src/java.base/share/classes/jdk/internal/misc/CDS.java line 40: > 38: * indicator for dumping class list. 
> 39: */ > 40: static public final boolean isDumpingClassList; What about making this a private static field and adding a public static `isDumpingClassList()` method (which was in the previous version)? src/java.base/share/classes/jdk/internal/misc/CDS.java line 144: > 142: String line = s.trim(); > 143: if (!line.startsWith("[LF_RESOLVE]") && !line.startsWith("[SPECIES_RESOLVE]")) { > 144: System.out.println("Wrong prefix: " + line); Should this throw an exception instead? src/java.base/share/classes/jdk/internal/misc/CDS.java line 155: > 153: System.out.println("Incorrecct number of items in the line: " + parts.length); > 154: System.out.println("line: " + line); > 155: return null; I think these error cases should throw `IllegalArgumentException` and let the VM decide how to handle the exception. src/java.base/share/classes/jdk/internal/misc/CDS.java line 140: > 138: // return null for invalid input > 139: private static Stream validateInputLines(String[] lines) { > 140: ArrayList list = new ArrayList(lines.length); Nit: this can use the diamond operator like this: `new ArrayList<>(lines.length)`. src/java.base/share/classes/jdk/internal/misc/CDS.java line 184: > 182: Objects.requireNonNull(lines); > 183: try { > 184: Stream lineStream = validateInputLines(lines); It seems clearer to have `validateInputLines` do validation only and convert this line into: validateInputLines(lines); Stream lineStream = Arrays.stream(lines); src/java.base/share/classes/jdk/internal/misc/CDS.java line 178: > 176: /** > 177: * called from vm to generate MethodHandle holder classes > 178: * @return @code { Object[] } if holder classes can be generated. typo: `s/@code { Object[]/{@code Object[]}` src/java.base/share/classes/jdk/internal/misc/CDS.java line 198: > 196: return retArray; > 197: } catch (Exception e) { > 198: e.printStackTrace(); Is this a debugging statement? If CDS swallows the exception thrown, I think the VM should emit the warning message and print the stack trace if appropriate.
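The review suggestions above (validation only, `IllegalArgumentException` instead of printing and returning null, the diamond operator) could be combined roughly as follows. This is only a sketch, not the code in the pull request: the class name `LineValidationSketch` and the `List<String>` return type are assumptions made for illustration, and only the prefix check quoted from `CDS.java` is reproduced. The item-count check is left out because its expected count is not shown in the quoted lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for the validation helper discussed in the review.
class LineValidationSketch {
    // Prefixes as they appear in the quoted CDS.java snippet.
    private static final String LF_PREFIX = "[LF_RESOLVE]";
    private static final String SPECIES_PREFIX = "[SPECIES_RESOLVE]";

    /**
     * Trims every line and checks its prefix.
     *
     * @return the trimmed lines, in input order
     * @throws IllegalArgumentException for the first line with an unexpected prefix
     */
    static List<String> validateInputLines(String[] lines) {
        Objects.requireNonNull(lines);
        // Diamond operator, as suggested in the review.
        List<String> trimmed = new ArrayList<>(lines.length);
        for (String s : lines) {
            // String.trim() also strips a trailing form feed or line feed.
            String line = s.trim();
            if (!line.startsWith(LF_PREFIX) && !line.startsWith(SPECIES_PREFIX)) {
                // Let the caller decide how to report the problem.
                throw new IllegalArgumentException("Wrong prefix: " + line);
            }
            trimmed.add(line);
        }
        return trimmed;
    }

    public static void main(String[] args) {
        String[] input = { "[LF_RESOLVE] some trace text\f", " [SPECIES_RESOLVE] other text " };
        for (String line : validateInputLines(input)) {
            System.out.println("'" + line + "'");
        }
    }
}
```

Whether throwing is acceptable at dump time is exactly the point under discussion in this thread; the sketch only shows the shape of the suggested alternative.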
------------- PR: https://git.openjdk.java.net/jdk/pull/193 From minqi at openjdk.java.net Thu Oct 8 07:00:27 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Thu, 8 Oct 2020 07:00:27 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v12] In-Reply-To: <-y8lEorT3v4i2G_yBbrmd6faeU584CXFnPEiM8GcPec=.d9e8a152-c319-43dc-a6d1-f745b7eccb20@github.com> References: <9emWKl6fr-GA5LN0uHhuEd5D123QcoCiHQR1M9bAbag=.cc4b6129-8b33-47e4-a421-9e6b4817933b@github.com> <-y8lEorT3v4i2G_yBbrmd6faeU584CXFnPEiM8GcPec=.d9e8a152-c319-43dc-a6d1-f745b7eccb20@github.com> Message-ID: On Wed, 7 Oct 2020 20:36:18 GMT, Mandy Chung wrote: >> Yumin Qi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains >> 23 commits: >> - Added new separate function to CDS for logging species and modified the existing function to log lambda form invokers. >> Changed isDumpLoadedClassList to a reasonable name isDumpingClassList as read only in CDS. >> - Merge branch 'master' of https://github.com/openjdk/jdk into jdk-8247536 >> - Removed unused imports. >> - Fixed comments with correct class and method name in CDS, removed unused variables after last change. >> - Moved and renamed cdsGenerateHolderClasses from GenerateJLIClassesHelp to CDS as generateLambdaFormHolderClasses. Added >> input verification function in CDS before class generation. Added more test scenarios. Removed trailing unused ending >> words for output of lambda form trace line in case of DumpLoadedClassList. >> - Move the check work to java, restore code in VM. Modified test code according to the changes. The invoke name >> verififcation is not implemented since not all the holder class are processed, not all the functions of processed >> holder classes are added. For holder class with DirectMethodHandle in its name, only the name in the >> DMH_METHOD_TYPE_MAP keyset is added, ithe line with other names just gets skipped silently. 
This makes the verification >> on invoke names difficul, a name not in the keyset should not fail the test. Also add a boolean to >> cdsGenerateHolderClasses to indicate call path. >> - Remove trailing word of line which is not used in holder class regeneration. There is a trailing LF (Line Feed) so trim >> white spaces from both front and end of the line or it will fail method type validation. >> - In case of exception happens during reloading class, CHECK will return without free the allocated buffer for class >> bytes so moved the buffer allocation and freeing to caller. Also removed test 6 since there is not guarantee that we >> can give a signature which will always fail. Additional changes to GenerateJLIClassesHelper according to review >> suggestion. >> - Merge branch 'master' of https://github.com/openjdk/jdk into jdk-8247536 >> - Merge branch 'master' of https://git.openjdk.java.net/jdk into jdk-8247536 >> - ... and 13 more: https://git.openjdk.java.net/jdk/compare/82fe023b...f5584dcf > > src/java.base/share/classes/jdk/internal/misc/CDS.java line 144: > >> 142: String line = s.trim(); >> 143: if (!line.startsWith("[LF_RESOLVE]") && !line.startsWith("[SPECIES_RESOLVE]")) { >> 144: System.out.println("Wrong prefix: " + line); > > Should this throw an exception instead? This part is for check the format only, throw exceptions will lead more objects generated which should not be archived in shared heap. Since this is only called from VM, so decide not to throw exception here. > src/java.base/share/classes/jdk/internal/misc/CDS.java line 40: > >> 38: * indicator for dumping class list. >> 39: */ >> 40: static public final boolean isDumpingClassList; > > what about making this a private static field and adding a public static `isDumpingClassList()` method (which was in > the previous version). That will have a name for two properties, if you that is OK, I will use the previous version. 
> src/java.base/share/classes/jdk/internal/misc/CDS.java line 155: > >> 153: System.out.println("Incorrecct number of items in the line: " + parts.length); >> 154: System.out.println("line: " + line); >> 155: return null; > > I think these error cases should throw `IllegalArgumentException` and VM decides how to handle the exception. Same reason as above. > src/java.base/share/classes/jdk/internal/misc/CDS.java line 140: > >> 138: // return null for invalid input >> 139: private static Stream validateInputLines(String[] lines) { >> 140: ArrayList list = new ArrayList(lines.length); > > Nit: this can use diamond operator like this: `new ArrayList<>(lines.length)`. Will update. > src/java.base/share/classes/jdk/internal/misc/CDS.java line 184: > >> 182: Objects.requireNonNull(lines); >> 183: try { >> 184: Stream lineStream = validateInputLines(lines); > > It seems clearer to have `validateInputLines` do validation only and convert this line into: > > validateInputLines(lines); > Stream lineStream = Arrays.stream(lines); Somewhere in the testing framework, a '\f' was appended to the line in ExtraClassList, which needs trimming. I did not dig deep into the root cause, so here I just return a new List with the trailing white space trimmed instead. I will file a bug for it, and when it is fixed, we can do it this way. > src/java.base/share/classes/jdk/internal/misc/CDS.java line 178: > >> 176: /** >> 177: * called from vm to generate MethodHandle holder classes >> 178: * @return @code { Object[] } if holder classes can be generated. > > typo: `s/@code { Object[]/{@code Object[]}` Will update. > src/java.base/share/classes/jdk/internal/misc/CDS.java line 198: > >> 196: return retArray; >> 197: } catch (Exception e) { >> 198: e.printStackTrace(); > > Is this a debugging statement? If CDS swallows the exception thrown, I think VM should emit the warning message and > print the stack trace if appropriate. I want to print this in Java since doing it in the VM is a little more complex than in Java.
It is just used to give the stack trace where the exception happened, for debugging purposes. Maybe I should add a comment for it. ------------- PR: https://git.openjdk.java.net/jdk/pull/193 From shade at openjdk.java.net Thu Oct 8 07:43:00 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 8 Oct 2020 07:43:00 GMT Subject: RFR: 8254166: Zero: return-type warning in zeroInterpreter_zero.cpp [v3] In-Reply-To: References: Message-ID: > This breaks 11u without disabled warnings as errors, but the issue exists in head JDK as well. Might as well fix it > everywhere. > Testing: > - [x] Linux ARM32 Zero build on HEAD JDK > - [x] Linux ARM32 Zero build on 11u Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Merge branch 'master' into JDK-8254166-zero-narrow-return - Added comment - 8254166: Zero: return-type warning in zeroInterpreter_zero.cpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/545/files - new: https://git.openjdk.java.net/jdk/pull/545/files/95d62436..2ab022fc Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=545&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=545&range=01-02 Stats: 8688 lines in 123 files changed: 4495 ins; 3691 del; 502 mod Patch: https://git.openjdk.java.net/jdk/pull/545.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/545/head:pull/545 PR: https://git.openjdk.java.net/jdk/pull/545 From shade at openjdk.java.net Thu Oct 8 07:44:43 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 8 Oct 2020 07:44:43 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) In-Reply-To: References: Message-ID: <2m-ZUtCAOMDADZVduYjzmcNHYziHMIC3axl1CNolulo=.4e19ccf9-4320-434d-b8c3-b186f74f1b5e@github.com> On Wed, 7 Oct 2020 17:13:22 GMT,
Maurizio Cimadamore wrote: > This patch contains the changes associated with the third incubation round of the foreign memory access API incubation > (see JEP 393 [1]). This iteration focus on improving the usability of the API in 3 main ways: > * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from > multiple threads > * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee > that the memory will be deallocated, eventually > * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class > has been added, which defines several useful dereference routines; these are really just thin wrappers around memory > access var handles, but they make the barrier of entry for using this API somewhat lower. > > A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not > the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link > to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit > of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which > wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as > dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. 
This improves usability > in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; > secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can > use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done > by calling `MemoryAddress::asSegmentRestricted`). A list of the API, implementation and test changes is provided > below. If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be > happy to point at existing discussions, and/or to provide the feedback required. A big thank to Erik Osterlund, > Vladimir Ivanov and David Holmes, without whom the work on shared memory segment would not have been possible; also I'd > like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. Thanks Maurizio > Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html > Specdiff: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254163 > > > > ### API Changes > > * `MemorySegment` > * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) > * added a no-arg factory for a native restricted segment representing entire native heap > * rename `withOwnerThread` to `handoff` > * add new `share` method, to create shared segments > * add new `registerCleaner` method, to register a segment against a cleaner > * add more helpers to create arrays from a segment e.g. 
`toIntArray` > * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) > * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) > * `MemoryAddress` > * drop `segment` accessor > * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative > to a given segment > * `MemoryAccess` > * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a > carrier is one of the usual suspect (a Java primitive, minus `boolean`); the access mode can be simple (e.g. access > base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte > offset,or a high level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. > `getByteAtOffset` vs `getByteAtIndex`). > * `MemoryHandles` > * drop `withOffset` combinator > * drop `withStride` combinator > * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which > it is easy to derive all the other handles using plain var handle combinators. > * `Addressable` > * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both > `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients > can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. > * `MemoryLayouts` > * A new layout, for machine addresses, has been added to the mix. > > > > ### Implementation changes > > There are two main things to discuss here: support for shared segments, and the general simplification of the memory > access var handle support. > #### Shared segments > > The support for shared segments cuts in pretty deep in the VM. 
Support for shared segments is notoriously hard to > achieve, at least in a way that guarantees optimal access performances. This is caused by the fact that, if a segment > is shared, it would be possible for a thread to close it while another is accessing it. After considering several > options (see [3]), we zeroed onto an approach which is inspired by an happy idea that Andrew Haley had (and that he > reminded me of at this year OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world > (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently to a > close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and > the access itself (otherwise it would be possible for the accessing thread to stop just right before an unsafe call). > It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. Sadly, none of > these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, > we noted that, if we could mark so called *scoped* method with an annotation, it would be very simply to check as to > whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of > stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). The question is, then, > once we detect that a thread is accessing the very segment we're about to close, what should happen? We first > experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it > fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the > machinery for async exceptions is a bit fragile (e.g. 
not all the code in hotspot checks for async exceptions); to > minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread > is accessing the segment being closed. As written in the javadoc, this doesn't mean that clients should just catch and > try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should > be treated as such. In terms of gritty implementation, we needed to centralize memory access routines in a single > place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in > addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` > annotation, which tells the VM that something important is going on. To achieve this, we created a new (autogenerated) > class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, > like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is > tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during > access (which is important when registering segments against cleaners). Of course, to make memory access safe, memory > access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead > of unsafe, so that a liveness check can be triggered (in case a scope is present). `ScopedMemoryAccess` has a > `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed > successfully. 
The implementation of `MemoryScope` (now significantly simplified from what we had before), has two > implementations, one for confined segments and one for shared segments; the main difference between the two is what > happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared > segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or > `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` > state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. #### > Memory access var handles overhaul The key realization here was that if all memory access var handles took a > coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle > form. This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var > handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that > e.g. additional offset is injected into a base memory access var handle. This also helped in simplifying the > implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level > access on the innards of the memory access var handle. All that code is now gone. #### Test changes Not much to see > here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, > since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` > functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the > microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared > segment case. 
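The shared-scope life cycle described above (`ALIVE`, then `CLOSING`, then either `CLOSED` or back to `ALIVE`) can be illustrated with a small stand-alone model. This is a sketch under stated assumptions, not the `MemoryScope` implementation: the class name `SharedScopeSketch`, the method `checkValidState`, the use of `IllegalStateException`, and the `BooleanSupplier` standing in for the thread-local handshake are all hypothetical and chosen only to mirror the prose.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical model of the shared-scope state machine described in the text.
class SharedScopeSketch {
    static final int ALIVE = 0, CLOSING = 1, CLOSED = 2;

    private final AtomicInteger state = new AtomicInteger(ALIVE);

    // Stand-in for the handshake: returns true when no other thread
    // was found accessing the segment (assumption for illustration).
    private final BooleanSupplier handshake;

    SharedScopeSketch(BooleanSupplier handshake) {
        this.handshake = handshake;
    }

    // As described in the text, this still reports true while CLOSING.
    boolean isAlive() {
        return state.get() != CLOSED;
    }

    // Liveness check before a memory access: only ALIVE passes.
    void checkValidState() {
        if (state.get() != ALIVE) {
            throw new IllegalStateException("Scope is closed or closing");
        }
    }

    void close() {
        if (!state.compareAndSet(ALIVE, CLOSING)) {
            throw new IllegalStateException("Already closed or closing");
        }
        boolean success = handshake.getAsBoolean();
        // Either finish the close or roll back to ALIVE.
        state.set(success ? CLOSED : ALIVE);
        if (!success) {
            throw new IllegalStateException("Segment is in use by another thread");
        }
    }

    public static void main(String[] args) {
        SharedScopeSketch idle = new SharedScopeSketch(() -> true);
        idle.close();
        System.out.println("idle scope alive after close: " + idle.isAlive());

        SharedScopeSketch busy = new SharedScopeSketch(() -> false);
        try {
            busy.close();
        } catch (IllegalStateException e) {
            System.out.println("close failed, scope alive: " + busy.isAlive());
        }
    }
}
```

The model makes the asymmetry visible: during the handshake an access check already fails, while `isAlive()` keeps returning true until the close has actually succeeded. A failed close leaves the scope usable, which matches the statement that an exception on `close` signals a synchronization bug in the client rather than something to retry.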
[1] - https://openjdk.java.net/jeps/393 [2] - https://openjdk.java.net/jeps/389 [3] - > https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html [4] - https://openjdk.java.net/jeps/312 Drive-by review. src/hotspot/share/prims/scopedMemoryAccess.cpp line 1: > 1: Misses copyright header. src/hotspot/share/prims/scopedMemoryAccess.cpp line 81: > 79: CompiledMethod* cm = last_frame.cb()->as_compiled_method(); > 80: > 81: //FIXME: this doesn't work if reachability fences are violated by C2 Maybe turn this into /* */ block, so that it is obvious the whole thing relates to the `FIXME`? src/hotspot/share/prims/scopedMemoryAccess.cpp line 96: > 94: int depth = 0; > 95: vframeStream stream(jt); > 96: for (; !stream.at_end(); stream.next()) { Can move `stream` initialization right into the `for` statement? src/hotspot/share/prims/scopedMemoryAccess.cpp line 138: > 136: /// JVM_RegisterUnsafeMethods > 137: > 138: #define LANG "Ljdk/internal/misc/" This is weirdly defined as `LANG`. I suppose this would change to `Ljava/lang` soon. But maybe `PACKAGE` is better. src/hotspot/share/prims/scopedMemoryAccess.cpp line 130: > 128: * Top frames containg obj will be deoptimized. > 129: */ > 130: JVM_ENTRY(jboolean, ScopedMemoryAccess_closeScope(JNIEnv *env, jobject receiver, jobject deopt, jobject > exception)) { `JVM_ENTRY` does not require a brace, it is braced already. See existing uses of `JVM_ENTRY`. src/hotspot/share/prims/scopedMemoryAccess.cpp line 134: > 132: Handshake::execute(&cl); > 133: return !cl._found; > 134: } JVM_END Ditto for `JVM_END`. src/hotspot/share/prims/scopedMemoryAccess.cpp line 166: > 164: int ok = env->RegisterNatives(scopedMemoryAccessClass, jdk_internal_misc_ScopedMemoryAccess_methods, > sizeof(jdk_internal_misc_ScopedMemoryAccess_methods)/sizeof(JNINativeMethod)); 165: guarantee(ok == 0, "register > jdk.internal.misc.ScopedMemoryAccess natives"); 166: } JVM_END `JVM_ENTRY`/`JVM_END` braces again. 
src/java.base/share/classes/java/lang/invoke/MemoryAccessVarHandleBase.java line 45: > 43: final boolean skipAlignmentMaskCheck; > 44: > 45: MemoryAccessVarHandleBase(VarForm form, boolean skipOffetCheck, boolean be, long length, long alignmentMask) { Typo: `skipOff*s*etCheck`. Should it be `skipAlignmentMaskCheck` to begin with? test/jdk/java/foreign/TestMismatch.java line 26: > 24: /* > 25: * @test > 26: * @run testng/othervm -XX:MaxDirectMemorySize=5000000000 TestMismatch Whoa, allocating 5 GB? That might fail on 32-bit platforms... Anyhow, this flag accepts suffixes, so `-XX:MaxDirectMemorySize=5g`. ------------- PR: https://git.openjdk.java.net/jdk/pull/548 From shade at openjdk.java.net Thu Oct 8 08:13:49 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 8 Oct 2020 08:13:49 GMT Subject: Integrated: 8254166: Zero: return-type warning in zeroInterpreter_zero.cpp In-Reply-To: References: Message-ID: On Wed, 7 Oct 2020 15:37:45 GMT, Aleksey Shipilev wrote: > This breaks 11u without disabled warnings as errors, but the issue exists in head JDK as well. Might as well fix it > everywhere. > Testing: > - [x] Linux ARM32 Zero build on HEAD JDK > - [x] Linux ARM32 Zero build on 11u This pull request has now been integrated. 
Changeset: 7952c06b Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/7952c06b Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod 8254166: Zero: return-type warning in zeroInterpreter_zero.cpp Reviewed-by: sgehwolf ------------- PR: https://git.openjdk.java.net/jdk/pull/545 From rrich at openjdk.java.net Thu Oct 8 08:54:59 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Thu, 8 Oct 2020 08:54:59 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v8] In-Reply-To: References: Message-ID: > Hi, > > this is the continuation of the review of the implementation for: > > https://bugs.openjdk.java.net/browse/JDK-8227745 > https://bugs.openjdk.java.net/browse/JDK-8233915 > > It allows for JIT optimizations based on escape analysis even if JVMTI agents acquire capabilities to access references > to objects that are subject to such optimizations, e.g. scalar replacement. The implementation reverts such > optimizations just before access, much as when switching from JIT compiled execution to the interpreter, aka > "deoptimization". Webrev.8 was the last one before the transition to Git/GitHub: > > http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ > > Thanks, Richard. Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 18 commits: - Merge branch 'master' into JDK-8227745 - Factorized fragment out of EscapeBarrier::deoptimize_objects_internal into new method in compiledVFrame. - More smaller changes proposed by Serguei. - jvmtiDeferredUpdates.hpp: remove forward declarations. - jvmtiDeferredLocalVariable: move member variables to the beginning of the class definition. - jvmtiDeferredUpdates.hpp: add/remove empty lines and improve indentation.
- Merge branch 'master' into JDK-8227745 - Merge branch 'master' into JDK-8227745 - Make parameter current_thread of JvmtiEnvBase::check_top_frame() a JavaThread* again. With Asynchronous handshakes the type was changed from JavaThread* to Thread* but this is not necessary as check_top_frame() is not executed during a handshake / safepoint (robehn confirmed). - Merge branch 'master' into JDK-8227745 - ... and 8 more: https://git.openjdk.java.net/jdk/compare/8f9e4792...94c89691 ------------- Changes: https://git.openjdk.java.net/jdk/pull/119/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=119&range=07 Stats: 5815 lines in 52 files changed: 5595 ins; 116 del; 104 mod Patch: https://git.openjdk.java.net/jdk/pull/119.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/119/head:pull/119 PR: https://git.openjdk.java.net/jdk/pull/119 From rrich at openjdk.java.net Thu Oct 8 08:55:00 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Thu, 8 Oct 2020 08:55:00 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v7] In-Reply-To: References: <6Scp6XjVCcdJN0tUKionVwGKoiBG8UeA-OpBXHrCYqk=.01170b1e-9722-4461-84e4-77e8fd447ac4@github.com> Message-ID: On Wed, 7 Oct 2020 19:47:07 GMT, Serguei Spitsyn wrote: > > > Richard, > Thank you for the formatting and refactoring changes. > The fix looks good to me. Thank you very much Serguei! ------------- PR: https://git.openjdk.java.net/jdk/pull/119 From rrich at openjdk.java.net Thu Oct 8 09:13:45 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Thu, 8 Oct 2020 09:13:45 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v7] In-Reply-To: References: <6Scp6XjVCcdJN0tUKionVwGKoiBG8UeA-OpBXHrCYqk=.01170b1e-9722-4461-84e4-77e8fd447ac4@github.com> Message-ID: On Thu, 8 Oct 2020 08:50:08 GMT, Richard Reingruber wrote: >> Richard, >> Thank you for the formatting and refactoring changes. 
>> The fix looks good to me. > >> >> >> Richard, >> Thank you for the formatting and refactoring changes. >> The fix looks good to me. > > Thank you very much Serguei! I'm planning to integrate this pull request on Monday 2020-10-12. There was one failing test on Windows x64: [tools/javac/launcher/SourceLauncherTest.java](https://github.com/reinrich/jdk/runs/1222200722). This was most likely caused by [JDK-8249095](https://bugs.openjdk.java.net/browse/JDK-8249095). I've merged master to verify this. ------------- PR: https://git.openjdk.java.net/jdk/pull/119 From kim.barrett at oracle.com Thu Oct 8 09:24:27 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 8 Oct 2020 05:24:27 -0400 Subject: CFV: New HotSpot Group Member: Erik Österlund Message-ID: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> I hereby nominate Erik Österlund to Membership in the HotSpot Group. Erik has been a JDK Reviewer and member of the Oracle GC team for several years, currently working on ZGC, though his reach and influence extends significantly beyond that project. He has made many substantial contributions [1] including (most recently) JEP 376: ZGC: Concurrent Thread-Stack Processing. Votes are due by Friday, 23-Oct-2020 at 12h00 UTC. Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [3].
Kim Barrett [1] https://github.com/search?q=author-name%3A%22Erik+%C3%96sterlund%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits [2] https://openjdk.java.net/census [3] https://openjdk.java.net/groups/#member-vote From thomas.schatzl at oracle.com Thu Oct 8 09:40:45 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 8 Oct 2020 11:40:45 +0200 Subject: Re: CFV: New HotSpot Group Member: Erik Österlund In-Reply-To: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> References: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> Message-ID: Vote: yes Thomas On 08.10.20 11:24, Kim Barrett wrote: > I hereby nominate Erik Österlund to Membership in the HotSpot Group. > > Erik has been a JDK Reviewer and member of the Oracle GC team for several > years, currently working on ZGC, though his reach and influence extends > significantly beyond that project. He has made many substantial > contributions [1] including (most recently) JEP 376: ZGC: Concurrent > Thread-Stack Processing. > > Votes are due by Friday, 23-Oct-2020 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3].
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Erik+%C3%96sterlund%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From martin.doerr at sap.com Thu Oct 8 09:42:48 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 8 Oct 2020 09:42:48 +0000 Subject: RE: CFV: New HotSpot Group Member: Erik Österlund In-Reply-To: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> References: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> Message-ID: Vote: yes Best regards, Martin > -----Original Message----- > From: hotspot-dev On Behalf Of Kim > Barrett > Sent: Thursday, 8 October 2020 11:24 > To: hotspot-dev Source Developers > Subject: CFV: New HotSpot Group Member: Erik Österlund > > I hereby nominate Erik Österlund to Membership in the HotSpot Group. > > Erik has been a JDK Reviewer and member of the Oracle GC team for several > years, currently working on ZGC, though his reach and influence extends > significantly beyond that project. He has made many substantial > contributions [1] including (most recently) JEP 376: ZGC: Concurrent > Thread-Stack Processing. > > Votes are due by Friday, 23-Oct-2020 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3].
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Erik+%C3%96sterlund%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote From shade at redhat.com Thu Oct 8 09:53:27 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 8 Oct 2020 11:53:27 +0200 Subject: Re: CFV: New HotSpot Group Member: Erik Österlund In-Reply-To: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> References: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> Message-ID: Vote: yes On 10/8/20 11:24 AM, Kim Barrett wrote: > I hereby nominate Erik Österlund to Membership in the HotSpot Group. -- Thanks, -Aleksey From mcimadamore at openjdk.java.net Thu Oct 8 10:29:24 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 8 Oct 2020 10:29:24 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v2] In-Reply-To: References: Message-ID: <-K8a4gV16AZ7Se7-G2DWZrSEMr5FjLPzlUlo4nXnTE0=.c33f2a73-6ee1-4a9f-b992-d51fc1f2f481@github.com> > This patch contains the changes associated with the third incubation round of the foreign memory access API > (see JEP 393 [1]). This iteration focuses on improving the usability of the API in 3 main ways: > * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from > multiple threads > * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee > that the memory will be deallocated, eventually > * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class > has been added, which defines several useful dereference routines; these are really just thin wrappers around memory > access var handles, but they make the barrier of entry for using this API somewhat lower. 
> > A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not > the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link > to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit > of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which > wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as > dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability > in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; > secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can > use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done > by calling `MemoryAddress::asSegmentRestricted`). A list of the API, implementation and test changes is provided > below. If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be > happy to point at existing discussions, and/or to provide the feedback required. A big thank you to Erik Osterlund, > Vladimir Ivanov and David Holmes, without whom the work on shared memory segments would not have been possible; also I'd > like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. 
Thanks Maurizio > Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html > Specdiff: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254163 > > > > ### API Changes > > * `MemorySegment` > * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) > * added a no-arg factory for a native restricted segment representing entire native heap > * rename `withOwnerThread` to `handoff` > * add new `share` method, to create shared segments > * add new `registerCleaner` method, to register a segment against a cleaner > * add more helpers to create arrays from a segment e.g. `toIntArray` > * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) > * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) > * `MemoryAddress` > * drop `segment` accessor > * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative > to a given segment > * `MemoryAccess` > * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a > carrier is one of the usual suspects (a Java primitive, minus `boolean`); the access mode can be simple (e.g. access > base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte > offset, or a high level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. > `getByteAtOffset` vs `getByteAtIndex`). 
> * `MemoryHandles` > * drop `withOffset` combinator > * drop `withStride` combinator > * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which > it is easy to derive all the other handles using plain var handle combinators. > * `Addressable` > * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both > `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients > can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. > * `MemoryLayouts` > * A new layout, for machine addresses, has been added to the mix. > > > > ### Implementation changes > > There are two main things to discuss here: support for shared segments, and the general simplification of the memory > access var handle support. > #### Shared segments > > The support for shared segments cuts in pretty deep in the VM. Support for shared segments is notoriously hard to > achieve, at least in a way that guarantees optimal access performance. This is caused by the fact that, if a segment > is shared, it would be possible for a thread to close it while another is accessing it. After considering several > options (see [3]), we zeroed in on an approach which is inspired by a happy idea that Andrew Haley had (and that he > reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world > (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently with a > close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and > the access itself (otherwise it would be possible for the accessing thread to stop right before an unsafe call). 
> It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. Sadly, none of > these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, > we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check > whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of > stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). The question is, then, > once we detect that a thread is accessing the very segment we're about to close, what should happen? We first > experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it > fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the > machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to > minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread > is accessing the segment being closed. As written in the javadoc, this doesn't mean that clients should just catch and > try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should > be treated as such. In terms of gritty implementation, we needed to centralize memory access routines in a single > place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in > addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` > annotation, which tells the VM that something important is going on. To achieve this, we created a new (autogenerated) > class, called `ScopedMemoryAccess`. 
This class contains all the main memory access primitives (including bulk access, > like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, a scope object, which is > tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during > access (which is important when registering segments against cleaners). Of course, to make memory access safe, memory > access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead > of unsafe, so that a liveness check can be triggered (in case a scope is present). `ScopedMemoryAccess` has a > `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed > successfully. `MemoryScope` (now significantly simplified from what we had before) has two > implementations, one for confined segments and one for shared segments; the main difference between the two is what > happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared > segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or > `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` > state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. #### > Memory access var handles overhaul The key realization here was that if all memory access var handles took a > coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle > form. This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var > handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that > e.g. 
additional offset is injected into a base memory access var handle. This also helped in simplifying the > implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level > access on the innards of the memory access var handle. All that code is now gone. #### Test changes Not much to see > here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, > since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` > functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the > microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared > segment case. [1] - https://openjdk.java.net/jeps/393 [2] - https://openjdk.java.net/jeps/389 [3] - > https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html [4] - https://openjdk.java.net/jeps/312 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Address review comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/548/files - new: https://git.openjdk.java.net/jdk/pull/548/files/e4eb2c74..fa051abf Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=00-01 Stats: 67 lines in 5 files changed: 33 ins; 3 del; 31 mod Patch: https://git.openjdk.java.net/jdk/pull/548.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548 PR: https://git.openjdk.java.net/jdk/pull/548 From mcimadamore at openjdk.java.net Thu Oct 8 10:29:25 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 8 Oct 2020 10:29:25 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v2] In-Reply-To: 
<2m-ZUtCAOMDADZVduYjzmcNHYziHMIC3axl1CNolulo=.4e19ccf9-4320-434d-b8c3-b186f74f1b5e@github.com> References: <2m-ZUtCAOMDADZVduYjzmcNHYziHMIC3axl1CNolulo=.4e19ccf9-4320-434d-b8c3-b186f74f1b5e@github.com> Message-ID: On Thu, 8 Oct 2020 06:53:41 GMT, Aleksey Shipilev wrote: >> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: >> >> Address review comments > > test/jdk/java/foreign/TestMismatch.java line 26: > >> 24: /* >> 25: * @test >> 26: * @run testng/othervm -XX:MaxDirectMemorySize=5000000000 TestMismatch > > Whoa, allocating 5 GB? That might fail on 32-bit platforms... Anyhow, this flag accepts suffixes, so > `-XX:MaxDirectMemorySize=5g`. I've done three things here: * the limit isn't really doing much in this test, so I've removed it * I moved the limit to TestSegments; the limit is set to a much lower threshold (2M) which should work regardless of 32/64 * For TestMismatch, which needs to allocate a segment bigger than 2^32 in one of the tests, I've added a guard in the offending test which verifies that we're indeed on a 64-bit platform ------------- PR: https://git.openjdk.java.net/jdk/pull/548 From shade at openjdk.java.net Thu Oct 8 10:32:49 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 8 Oct 2020 10:32:49 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v2] In-Reply-To: <-K8a4gV16AZ7Se7-G2DWZrSEMr5FjLPzlUlo4nXnTE0=.c33f2a73-6ee1-4a9f-b992-d51fc1f2f481@github.com> References: <-K8a4gV16AZ7Se7-G2DWZrSEMr5FjLPzlUlo4nXnTE0=.c33f2a73-6ee1-4a9f-b992-d51fc1f2f481@github.com> Message-ID: On Thu, 8 Oct 2020 10:29:24 GMT, Maurizio Cimadamore wrote: >> This patch contains the changes associated with the third incubation round of the foreign memory access API incubation >> (see JEP 393 [1]). 
>> [...] > > Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments Thanks, these changes make sense to me. 
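The close protocol for shared segments described in the PR text (`ALIVE` to `CLOSING`, then a handshake, then `CLOSED` or back to `ALIVE`) can be sketched as a small state machine. This is only an illustration under stated assumptions: the class name `SharedScopeSketch`, the method `checkValidState`, and the `BooleanSupplier` standing in for the thread-local handshake are all hypothetical and are not the JDK implementation.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of the shared-scope states named in the PR description.
final class SharedScopeSketch {
    static final int ALIVE = 0, CLOSING = 1, CLOSED = 2;

    private final AtomicInteger state = new AtomicInteger(ALIVE);

    // Mirrors the described isAlive behaviour: still true while CLOSING.
    boolean isAlive() {
        return state.get() != CLOSED;
    }

    // Mirrors the liveness check on memory access: fails once closing has started.
    void checkValidState() {
        if (state.get() != ALIVE) {
            throw new IllegalStateException("Already closed");
        }
    }

    // 'handshake' is an assumed stand-in for the VM handshake; it returns true
    // when no other thread was found inside a scoped access.
    void close(BooleanSupplier handshake) {
        if (!state.compareAndSet(ALIVE, CLOSING)) {
            throw new IllegalStateException("Already closed");
        }
        boolean success = handshake.getAsBoolean();
        state.set(success ? CLOSED : ALIVE);
        if (!success) {
            throw new IllegalStateException("Segment is being accessed by another thread");
        }
    }

    public static void main(String[] args) {
        SharedScopeSketch scope = new SharedScopeSketch();
        try {
            scope.close(() -> false); // simulate a failed handshake
        } catch (IllegalStateException e) {
            System.out.println("close failed, alive = " + scope.isAlive());
        }
        scope.close(() -> true);      // simulate a successful handshake
        System.out.println("closed, alive = " + scope.isAlive());
    }
}
```

A failed `close` leaves the scope usable, which matches the statement that an exception on `close` signals a synchronization bug in user code rather than a condition to retry.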
------------- PR: https://git.openjdk.java.net/jdk/pull/548 From mdoerr at openjdk.java.net Thu Oct 8 10:46:51 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Thu, 8 Oct 2020 10:46:51 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v3] In-Reply-To: References: Message-ID: On Thu, 8 Oct 2020 06:53:26 GMT, CoreyAshford wrote: >> This patch set encompasses the following commits: >> >> - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for >> the intrinsic. The API is similar to the existing encodeBlock intrinsic. >> - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic >> implementation >> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. >> - Adds a JMH microbenchmark for both Base64 encoding and decoding. >> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. > > CoreyAshford has updated the pull request incrementally with seven additional commits since the last revision: > > - stubGenerator_ppc.cpp: Fix multiple issues as per Martin Doerr's v2 review > > * Remove extraneous comma from SAP copyright notice > * Move align(32) to the head of the loop rather than the beginning of the unwound code > * Simplified looping condition to use a loop counter instead of a final > address. This eliminated the need for the "end" variable, and > essentially replaced it with CTR, which is computed using a simple > bitwise shift of the size. > * Re-ran benchmarks against loop_unrolls values: 1, 2, 4, 8, 16 to find > optimal value, now 4. > * Corrected a typo in the word "elements" > - vm_version_ppc.cpp: per Martin Doerr's review of v2: fix copy/paste error > - vmIntrinsics.cpp: Per Martin Doerr's v2 review: rearrange order of case statement to be consistent with others. 
> - runtime.cpp: per Martin Doerr's review of v2, correct comment as per current semantics of decodeBlock() > > * The reference to "ofs" seems to be a copy/paste error. > * -1 is no longer returned from decodeBlock() in the event of a > non-base64 character being encountered; only a count of bytes written > to dst. > - TestBase64.java: Change comment as per Martin Doerr's v2 review > - Base64.java: Make changes as per Roger Riggs and Martin Doerr's v2 Review > > * Make comment about the sl parameter more precise > * Fix comparison to avoid possible integer overflow of sp > - library_call.cpp: Fix rebase merge error Changes requested by mdoerr (Reviewer). src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3806: > 3804: // Load CTR with the number of passes through the unrolled loop > 3805: // = sl >> block_size_shift > 3806: __ srawi(sl, sl, block_size_shift); Thanks, this is simpler. Unfortunately, sl can become 0. So I think we should move these 2 lines down before the align and use: `srawi_`, `beq(CCR0, unrolled_loop_exit)`, `mtctr`. test/hotspot/jtreg/compiler/intrinsics/base64/TestBase64.java line 90: > 88: > 89: // This should be enough to get both encodeBlock() and > 90: // decodeBlock() compiled on the highest tier. It's actually encode() and decode() which should get compiled. You should see them when testing with -XX:+PrintCompilation. And you should see usage of the intrinsics by -XX:+PrintInlining. 
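To make the last review comment concrete, here is a minimal warm-up sketch that drives the public `encode()` and `decode()` entry points of `java.util.Base64` often enough that a JIT would normally compile them. It is not the code of TestBase64.java; the class name `Base64WarmupSketch`, the iteration count and the buffer size are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.Base64;
import java.util.Random;

// Hypothetical warm-up loop; the real jtreg test is structured differently.
final class Base64WarmupSketch {

    // Round-trips random data through encode()/decode() and returns the
    // number of iterations whose decoded bytes matched the input.
    static int roundTrips(int iterations, int size) {
        Random rnd = new Random(42);          // fixed seed for repeatability
        byte[] src = new byte[size];
        Base64.Encoder encoder = Base64.getEncoder();
        Base64.Decoder decoder = Base64.getDecoder();
        int matches = 0;
        for (int i = 0; i < iterations; i++) {
            rnd.nextBytes(src);
            byte[] encoded = encoder.encode(src);
            byte[] decoded = decoder.decode(encoded);
            if (Arrays.equals(src, decoded)) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // 20_000 iterations is an assumed count, not a value from the test.
        System.out.println(roundTrips(20_000, 1024));
    }
}
```

Running this with `-XX:+PrintCompilation` should list the compiled methods; `-XX:+PrintInlining` typically also needs `-XX:+UnlockDiagnosticVMOptions` on product builds. Whether an intrinsic is actually used depends on the platform and build.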
------------- PR: https://git.openjdk.java.net/jdk/pull/293 From erik.helin at oracle.com Thu Oct 8 10:51:16 2020 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 8 Oct 2020 12:51:16 +0200 Subject: Re: CFV: New HotSpot Group Member: Erik Österlund In-Reply-To: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> References: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> Message-ID: <503e9e46-8b41-5a54-53ae-18e79f295e13@oracle.com> Vote: yes Thanks, Erik On 10/8/20 11:24 AM, Kim Barrett wrote: > I hereby nominate Erik Österlund to Membership in the HotSpot Group. > > Erik has been a JDK Reviewer and member of the Oracle GC team for several > years, currently working on ZGC, though his reach and influence extend > significantly beyond that project. He has made many substantial > contributions [1] including (most recently) JEP 376: ZGC: Concurrent > Thread-Stack Processing. > > Votes are due by Friday, 23-Oct-2020 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. > > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Erik+%C3%96sterlund%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From avoitylov at openjdk.java.net Thu Oct 8 10:52:53 2020 From: avoitylov at openjdk.java.net (Aleksei Voitylov) Date: Thu, 8 Oct 2020 10:52:53 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port [v6] In-Reply-To: References: Message-ID: > continuing the review thread from here https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/068546.html > >> The downside of using JNI in these tests is that it complicates the >> setup a bit for those that run jtreg directly and/or just build the JDK >> and not the test libraries. 
You could reduce this burden a bit by >> limiting the load library/isMusl check to Linux only, meaning isMusl >> would not be called on other platforms. >> >> The alternative you suggest above might indeed be better. I assume you >> don't mean splitting the tests but rather just adding a second @test >> description so that the vm.musl case runs the test with a system >> property that allows the test to know the expected load library path behavior. > > I have updated the PR to split the two tests into multiple `@test` descriptions. > >> The updated comment in java_md.c in this looks good. A minor comment on >> Platform.isBusybox is Files.isSymbolicLink returning true implies that >> the link exists so no need to check for exists too. Also the >> if-then-else style for the new class in ProcessBuilder/Basic.java is >> inconsistent with the rest of the test so it stands out. > > Thank you, these changes are done in the updated PR. > >> Given the repo transition this weekend then I assume you'll create a PR >> for the final review at least. Also I see JEP 386 hasn't been targeted >> yet but I assume Boris, as owner, will propose-to-target and wait for it >> to be targeted before it is integrated. > > Yes. How can this be best accomplished with the new git workflow? > - we can continue the review process till the end and I will request the integration to happen only after the JEP is > targeted. I guess this step is now done by typing "slash integrate" in a comment. > - we can pause the review process now until the JEP is targeted. > > In the first case I'm kindly asking the Reviewers who already chimed in on that to re-confirm the review here. Aleksei Voitylov has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains three commits: - Merge branch 'master' into JDK-8247589 - JDK-8247589: Implementation of Alpine Linux/x64 Port - JDK-8247589: Implementation of Alpine Linux/x64 Port ------------- Changes: https://git.openjdk.java.net/jdk/pull/49/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=49&range=05 Stats: 403 lines in 30 files changed: 348 ins; 17 del; 38 mod Patch: https://git.openjdk.java.net/jdk/pull/49.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/49/head:pull/49 PR: https://git.openjdk.java.net/jdk/pull/49 From avoitylov at openjdk.java.net Thu Oct 8 11:03:48 2020 From: avoitylov at openjdk.java.net (Aleksei Voitylov) Date: Thu, 8 Oct 2020 11:03:48 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port [v2] In-Reply-To: References: <6jqlCPXe69fPRvYFrytJsECkaa9tJ1hYWISNgyPP4Eg=.40944ef5-93b0-4db4-948b-80bb7898e9e8@github.com> Message-ID: On Thu, 8 Oct 2020 10:58:56 GMT, Aleksei Voitylov wrote: >> @voitylov For future reference please don't force-push commits on open PRs as it breaks the commit history. I can no >> longer just look at the two most recent commits and see what they added relative to what I had previously reviewed. >> Thanks. > > @dholmes-ora yes, sorry about that. I updated the branch to pull the recent changes to enable pre-submit tests and > ensure everything is well before integration and though the branch looked good it confused the pull request. So I had > to force push it back to the original state. @iignatev I resolved the conflict in whitebox.cpp and fixed a minor style nit on the way. Could you take a look? 
------------- PR: https://git.openjdk.java.net/jdk/pull/49 From avoitylov at openjdk.java.net Thu Oct 8 11:03:48 2020 From: avoitylov at openjdk.java.net (Aleksei Voitylov) Date: Thu, 8 Oct 2020 11:03:48 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port [v2] In-Reply-To: References: <6jqlCPXe69fPRvYFrytJsECkaa9tJ1hYWISNgyPP4Eg=.40944ef5-93b0-4db4-948b-80bb7898e9e8@github.com> Message-ID: On Tue, 6 Oct 2020 02:00:06 GMT, David Holmes wrote: >> I added the contributors that could be found in the portola project commits. If anyone knows some other contributors I >> missed, I'll be happy to stand corrected. > > @voitylov For future reference please don't force-push commits on open PRs as it breaks the commit history. I can no > longer just look at the two most recent commits and see what they added relative to what I had previously reviewed. > Thanks. @dholmes-ora yes, sorry about that. I updated the branch to pull the recent changes to enable pre-submit tests and ensure everything is well before integration and though the branch looked good it confused the pull request. So I had to force push it back to the original state. ------------- PR: https://git.openjdk.java.net/jdk/pull/49 From coleenp at openjdk.java.net Thu Oct 8 11:29:47 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 8 Oct 2020 11:29:47 GMT Subject: RFR: 8253717: Relocate stack overflow code out of thread.hpp/cpp [v6] In-Reply-To: <1_g79Qy5B8dtub9THxK25LWJ5_Md7cqZ6HgRgB7mNso=.ffcd7e46-c0bb-4248-9da0-a9bf1be549c8@github.com> References: <1_g79Qy5B8dtub9THxK25LWJ5_Md7cqZ6HgRgB7mNso=.ffcd7e46-c0bb-4248-9da0-a9bf1be549c8@github.com> Message-ID: On Thu, 8 Oct 2020 05:03:04 GMT, Thomas Stuefe wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Revised comment > > @coleenp: Looks all good to me now. Thanks for that work! > > @dholmes-ora : I knew remember now the discussion as well. 
Seeing that it was only in February this is embarrassing :) > I still find the fact the base values are off by one odd but having them at crooked values (page ends) would be odd > too, so maybe its just a matter of brushing up comments. Thank you for all the reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/522 From coleenp at openjdk.java.net Thu Oct 8 11:29:50 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 8 Oct 2020 11:29:50 GMT Subject: Integrated: 8253717: Relocate stack overflow code out of thread.hpp/cpp In-Reply-To: References: Message-ID: <1Evq7DrRpxiqEl6T3Opi9gSDSUx2d5PxUn1Tn2cDLRk=.a128c5eb-d031-4028-97c0-618c5911cb3d@github.com> On Tue, 6 Oct 2020 12:13:00 GMT, Coleen Phillimore wrote: > This change moves the significant amount of stack overflow related code (with ascii art!) out of thread files into a > new file. Many of the functions are static functions and some go through JavaThread::_stack_overflow_state where > needed. All functions are moved and not modified except for qualification. I also added a delegating constructor to > JavaThread::JavaThread so reordered the assignments as initializers from JavaThread::initialize. > Tested with tier1-6 and builds on arm32, ppc, s390 and zero. This pull request has now been integrated. 
Changeset: 6bc49318 Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/6bc49318 Stats: 1276 lines in 50 files changed: 615 ins; 503 del; 158 mod 8253717: Relocate stack overflow code out of thread.hpp/cpp Reviewed-by: rehn, dcubed, dholmes, stuefe ------------- PR: https://git.openjdk.java.net/jdk/pull/522 From coleen.phillimore at oracle.com Thu Oct 8 11:35:09 2020 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Thu, 8 Oct 2020 07:35:09 -0400 Subject: Re: CFV: New HotSpot Group Member: Erik Österlund In-Reply-To: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> References: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> Message-ID: Vote: yes On 10/8/20 5:24 AM, Kim Barrett wrote: > I hereby nominate Erik Österlund to Membership in the HotSpot Group. > > Erik has been a JDK Reviewer and member of the Oracle GC team for several > years, currently working on ZGC, though his reach and influence extends > significantly beyond that project. He has made many substantial > contributions [1] including (most recently) JEP 376: ZGC: Concurrent > Thread-Stack Processing. > > Votes are due by Friday, 23-Oct-2020 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3].
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Erik+%C3%96sterlund%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From stefan.karlsson at oracle.com Thu Oct 8 11:39:54 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 8 Oct 2020 13:39:54 +0200 Subject: Re: CFV: New HotSpot Group Member: Erik Österlund In-Reply-To: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> References: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> Message-ID: Vote: yes StefanK On 2020-10-08 11:24, Kim Barrett wrote: > I hereby nominate Erik Österlund to Membership in the HotSpot Group. > > Erik has been a JDK Reviewer and member of the Oracle GC team for several > years, currently working on ZGC, though his reach and influence extends > significantly beyond that project. He has made many substantial > contributions [1] including (most recently) JEP 376: ZGC: Concurrent > Thread-Stack Processing. > > Votes are due by Friday, 23-Oct-2020 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3].
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Erik+%C3%96sterlund%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From burban at openjdk.java.net Thu Oct 8 12:33:46 2020 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Thu, 8 Oct 2020 12:33:46 GMT Subject: RFR: 8253757: Add LLVM-based backend for hsdis In-Reply-To: References: <91erxiMDb4ftvSomuJYHPi9SX-v8Z2VLD2qEwCbz5tk=.b9ed01b5-f0e0-4ed7-9c1a-b06bc0e64640@github.com> <8Eqswd7tsVaGEXHdKDncXqKpW2tBsSeuY0PV6aTB9_c=.a6cf4957-9d31-4e89-bf44-e7b7852205d5@github.com> Message-ID: On Wed, 7 Oct 2020 08:02:59 GMT, Xin Liu wrote: >> Can you separate LLVM and binutils from hsdis.cpp? >> >> I guess you say that the problem is both GCC and binutils are not available on Windows AArch64. Is it right? >> 1 question: binutils seems to support Windows AArch64. Did you try a recent binutils? If we can use binutils on Windows >> AArch64, you can fix makefile only. >> https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=binutils/dlltool.c;h=ed016b97dc38cdb1b85d2f6df676b9c9750f0d41;hb=HEAD#l248 > > IMHO, it's great to have an alternative disassembler. I personally had better experience using llvm MC when I decoded > aarch64 and AVX instructions than BFD. Another argument is that LLVM toolchain is supposed to provide the premium > experience on non-gnu platforms such as FreeBSD. @luhenry I tried to build it with LLVM 10.0.1 > on my x86_64 Ubuntu machine and ran into a small problem. Here is how I build: > `$make ARCH=amd64 CC=/opt/llvm/bin/clang CXX=/opt/llvm/bin/clang++ LLVM=/opt/llvm/` > > I can't meet this condition because Makefile defines LIBOS_linux.
> #elif defined(LIBOS_Linux) && defined(LIBARCH_amd64) > return "x86_64-pc-linux-gnu"; > > Actually, Makefile assigns OS to windows/linux/aix/macosx (all lower case) and then > `CPPFLAGS += -DLIBOS_$(OS) -DLIBOS="$(OS)" -DLIBARCH_$(LIBARCH) -DLIBARCH="$(LIBARCH)" -DLIB_EXT="$(LIB_EXT)"` > > In hsdis.cpp, `native_target_triple` needs to match whatever Makefile defined. With that fix, I generated the LLVM version of > hsdis-amd64.so and it works flawlessly. > 1 question: binutils seems to support Windows AArch64. Did you try a recent binutils? If we can use binutils on Windows > AArch64, you can fix makefile only. > https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=binutils/dlltool.c;h=ed016b97dc38cdb1b85d2f6df676b9c9750f0d41;hb=HEAD#l248 This is armv7, I don't see any support for armv8/AArch64 in `dlltool.c`. ------------- PR: https://git.openjdk.java.net/jdk/pull/392 From zgu at redhat.com Thu Oct 8 12:40:28 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 8 Oct 2020 08:40:28 -0400 Subject: Re: CFV: New HotSpot Group Member: Erik Österlund In-Reply-To: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> References: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> Message-ID: <107c3897-aedd-1dd8-a50c-959f68c39617@redhat.com> Vote: yes -Zhengyu On 10/8/20 5:24 AM, Kim Barrett wrote: > I hereby nominate Erik Österlund to Membership in the HotSpot Group. > > Erik has been a JDK Reviewer and member of the Oracle GC team for several > years, currently working on ZGC, though his reach and influence extends > significantly beyond that project. He has made many substantial > contributions [1] including (most recently) JEP 376: ZGC: Concurrent > Thread-Stack Processing. > > Votes are due by Friday, 23-Oct-2020 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3].
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Erik+%C3%96sterlund%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From ChrisPhi at LGonQn.Org Thu Oct 8 12:52:38 2020 From: ChrisPhi at LGonQn.Org (Chris Phillips) Date: Thu, 8 Oct 2020 08:52:38 -0400 Subject: Re: CFV: New HotSpot Group Member: Erik Österlund In-Reply-To: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> References: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> Message-ID: Vote: yes Cheers! ChrisPhi On 2020-10-08 05:24, Kim Barrett wrote: > I hereby nominate Erik Österlund to Membership in the HotSpot Group. > > Erik has been a JDK Reviewer and member of the Oracle GC team for several > years, currently working on ZGC, though his reach and influence extends > significantly beyond that project. He has made many substantial > contributions [1] including (most recently) JEP 376: ZGC: Concurrent > Thread-Stack Processing. > > Votes are due by Friday, 23-Oct-2020 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3].
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Erik+%C3%96sterlund%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > > > From erikj at openjdk.java.net Thu Oct 8 12:56:43 2020 From: erikj at openjdk.java.net (Erik Joelsson) Date: Thu, 8 Oct 2020 12:56:43 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v2] In-Reply-To: <-K8a4gV16AZ7Se7-G2DWZrSEMr5FjLPzlUlo4nXnTE0=.c33f2a73-6ee1-4a9f-b992-d51fc1f2f481@github.com> References: <-K8a4gV16AZ7Se7-G2DWZrSEMr5FjLPzlUlo4nXnTE0=.c33f2a73-6ee1-4a9f-b992-d51fc1f2f481@github.com> Message-ID: On Thu, 8 Oct 2020 10:29:24 GMT, Maurizio Cimadamore wrote: >> This patch contains the changes associated with the third incubation round of the foreign memory access API incubation >> (see JEP 393 [1]). This iteration focuses on improving the usability of the API in 3 main ways: >> * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from >> multiple threads >> * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee >> that the memory will be deallocated, eventually >> * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class >> has been added, which defines several useful dereference routines; these are really just thin wrappers around memory >> access var handles, but they make the barrier of entry for using this API somewhat lower.
>> >> A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not >> the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link >> to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit >> of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which >> wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as >> dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability >> in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; >> secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can >> use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done >> by calling `MemoryAddress::asSegmentRestricted`). A list of the API, implementation and test changes is provided >> below. If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be >> happy to point at existing discussions, and/or to provide the feedback required. A big thanks to Erik Osterlund, >> Vladimir Ivanov and David Holmes, without whom the work on shared memory segments would not have been possible; also I'd >> like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey.
Thanks Maurizio >> Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html >> Specdiff: >> >> http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html >> >> CSR: >> >> https://bugs.openjdk.java.net/browse/JDK-8254163 >> >> >> >> ### API Changes >> >> * `MemorySegment` >> * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) >> * added a no-arg factory for a native restricted segment representing entire native heap >> * rename `withOwnerThread` to `handoff` >> * add new `share` method, to create shared segments >> * add new `registerCleaner` method, to register a segment against a cleaner >> * add more helpers to create arrays from a segment e.g. `toIntArray` >> * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) >> * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) >> * `MemoryAddress` >> * drop `segment` accessor >> * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative >> to a given segment >> * `MemoryAccess` >> * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a >> carrier is one of the usual suspects (a Java primitive, minus `boolean`); the access mode can be simple (e.g. access >> base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte >> offset, or a high level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. >> `getByteAtOffset` vs `getByteAtIndex`).
>> * `MemoryHandles` >> * drop `withOffset` combinator >> * drop `withStride` combinator >> * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which >> it is easy to derive all the other handles using plain var handle combinators. >> * `Addressable` >> * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both >> `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients >> can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. >> * `MemoryLayouts` >> * A new layout, for machine addresses, has been added to the mix. >> >> >> >> ### Implementation changes >> >> There are two main things to discuss here: support for shared segments, and the general simplification of the memory >> access var handle support. >> #### Shared segments >> >> The support for shared segments cuts pretty deep into the VM. Support for shared segments is notoriously hard to >> achieve, at least in a way that guarantees optimal access performance. This is caused by the fact that, if a segment >> is shared, it would be possible for a thread to close it while another is accessing it. After considering several >> options (see [3]), we zeroed in on an approach which is inspired by a happy idea that Andrew Haley had (and that he >> reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world >> (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently with a >> close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and >> the access itself (otherwise it would be possible for the accessing thread to stop just right before an unsafe call).
>> It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. Sadly, none of >> these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, >> we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check >> whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of >> stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). The question is, then, >> once we detect that a thread is accessing the very segment we're about to close, what should happen? We first >> experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it >> fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the >> machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to >> minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread >> is accessing the segment being closed. As written in the javadoc, this doesn't mean that clients should just catch and >> try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should >> be treated as such. In terms of gritty implementation, we needed to centralize memory access routines in a single >> place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in >> addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` >> annotation, which tells the VM that something important is going on. To achieve this, we created a new (autogenerated) >> class, called `ScopedMemoryAccess`.
This class contains all the main memory access primitives (including bulk access, >> like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is >> tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during >> access (which is important when registering segments against cleaners). Of course, to make memory access safe, memory >> access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead >> of unsafe, so that a liveness check can be triggered (in case a scope is present). `ScopedMemoryAccess` has a >> `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed >> successfully. The implementation of `MemoryScope` (now significantly simplified from what we had before), has two >> implementations, one for confined segments and one for shared segments; the main difference between the two is what >> happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared >> segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or >> `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` >> state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. #### >> Memory access var handles overhaul The key realization here was that if all memory access var handles took a >> coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle >> form. This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var >> handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that >> e.g. 
additional offset is injected into a base memory access var handle. This also helped in simplifying the >> implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level >> access on the innards of the memory access var handle. All that code is now gone. #### Test changes Not much to see >> here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, >> since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` >> functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the >> microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared >> segment case. [1] - https://openjdk.java.net/jeps/393 [2] - https://openjdk.java.net/jeps/389 [3] - >> https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html [4] - https://openjdk.java.net/jeps/312 > > Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments make/modules/java.base/gensrc/GensrcScopedMemoryAccess.gmk line 145: > 143: SCOPE_MEMORY_ACCESS_TYPES := Byte Short Char Int Long Float Double > 144: $(foreach t, $(SCOPE_MEMORY_ACCESS_TYPES), \ > 145: $(eval $(call GenerateScopedOp,BIN_$t,$t))) This indent was fine at 2 spaces. I meant the one below inside the recipe. 
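The shared-segment close protocol described in the quoted PR text above (a scope moves from `ALIVE` to `CLOSING`, runs the handshake, then ends up `CLOSED` or back at `ALIVE`) can be sketched in plain Java. This is only an illustration under stated assumptions, not the JDK's `MemoryScope` code: the class name `SharedScopeSketch`, the `BooleanSupplier` standing in for the thread-local handshake, and the exception messages are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of the shared-scope state machine; not the real MemoryScope.
final class SharedScopeSketch {
    static final int ALIVE = 0, CLOSING = 1, CLOSED = 2;

    private final AtomicInteger state = new AtomicInteger(ALIVE);

    // Assumption: stands in for the thread-local handshake. It returns true
    // when no other thread was caught in the middle of a scoped access.
    private final BooleanSupplier handshake;

    SharedScopeSketch(BooleanSupplier handshake) {
        this.handshake = handshake;
    }

    // Mirrors the described isAlive behaviour: still true while CLOSING.
    boolean isAlive() {
        return state.get() != CLOSED;
    }

    // Liveness check done before each access: fails once a close has started.
    void checkValidState() {
        if (state.get() != ALIVE) {
            throw new IllegalStateException("scope is closing or closed");
        }
    }

    // close() may fail: if the handshake finds a concurrent access, the scope
    // goes back to ALIVE and the caller gets an exception.
    void close() {
        if (!state.compareAndSet(ALIVE, CLOSING)) {
            throw new IllegalStateException("scope is already closing or closed");
        }
        boolean noConcurrentAccess = handshake.getAsBoolean();
        state.set(noConcurrentAccess ? CLOSED : ALIVE);
        if (!noConcurrentAccess) {
            throw new IllegalStateException("another thread is accessing the segment");
        }
    }
}
```

A confined scope would need none of this: as the text says, it only flips a boolean flag, since just the owner thread can access or close it.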
------------- PR: https://git.openjdk.java.net/jdk/pull/548 From david.holmes at oracle.com Thu Oct 8 13:00:04 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 8 Oct 2020 23:00:04 +1000 Subject: Re: CFV: New HotSpot Group Member: Erik Österlund In-Reply-To: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> References: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> Message-ID: Vote: yes Thanks, David On 8/10/2020 7:24 pm, Kim Barrett wrote: > I hereby nominate Erik Österlund to Membership in the HotSpot Group. > > Erik has been a JDK Reviewer and member of the Oracle GC team for several > years, currently working on ZGC, though his reach and influence extends > significantly beyond that project. He has made many substantial > contributions [1] including (most recently) JEP 376: ZGC: Concurrent > Thread-Stack Processing. > > Votes are due by Friday, 23-Oct-2020 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. > > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Erik+%C3%96sterlund%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From hohensee at amazon.com Thu Oct 8 13:48:52 2020 From: hohensee at amazon.com (Hohensee, Paul) Date: Thu, 8 Oct 2020 13:48:52 +0000 Subject: RE: CFV: New HotSpot Group Member: Erik Österlund Message-ID: <5979AF11-00BF-4CB9-B959-5BE48AF212BB@amazon.com> Vote: yes On 10/8/20, 2:25 AM, "hotspot-dev on behalf of Kim Barrett" wrote: I hereby nominate Erik Österlund to Membership in the HotSpot Group.
Erik has been a JDK Reviewer and member of the Oracle GC team for several years, currently working on ZGC, though his reach and influence extends significantly beyond that project. He has made many substantial contributions [1] including (most recently) JEP 376: ZGC: Concurrent Thread-Stack Processing. Votes are due by Friday, 23-Oct-2020 at 12h00 UTC. Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list For Lazy Consensus voting instructions, see [3]. Kim Barrett [1] https://github.com/search?q=author-name%3A%22Erik+%C3%96sterlund%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits [2] https://openjdk.java.net/census [3] https://openjdk.java.net/groups/#member-vote From mcimadamore at openjdk.java.net Thu Oct 8 13:49:47 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 8 Oct 2020 13:49:47 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v2] In-Reply-To: References: <-K8a4gV16AZ7Se7-G2DWZrSEMr5FjLPzlUlo4nXnTE0=.c33f2a73-6ee1-4a9f-b992-d51fc1f2f481@github.com> Message-ID: On Thu, 8 Oct 2020 12:54:12 GMT, Erik Joelsson wrote: >> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: >> >> Address review comments > > make/modules/java.base/gensrc/GensrcScopedMemoryAccess.gmk line 145: > >> 143: SCOPE_MEMORY_ACCESS_TYPES := Byte Short Char Int Long Float Double >> 144: $(foreach t, $(SCOPE_MEMORY_ACCESS_TYPES), \ >> 145: $(eval $(call GenerateScopedOp,BIN_$t,$t))) > > This indent was fine at 2 spaces. I meant the one below inside the recipe. Gotcha - I fixed the wrong foreach... 
------------- PR: https://git.openjdk.java.net/jdk/pull/548 From mcimadamore at openjdk.java.net Thu Oct 8 13:59:20 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 8 Oct 2020 13:59:20 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v3] In-Reply-To: References: Message-ID: > This patch contains the changes associated with the third incubation round of the foreign memory access API incubation > (see JEP 393 [1]). This iteration focuses on improving the usability of the API in 3 main ways: > * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from > multiple threads > * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee > that the memory will be deallocated, eventually > * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class > has been added, which defines several useful dereference routines; these are really just thin wrappers around memory > access var handles, but they make the barrier of entry for using this API somewhat lower. > > A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not > the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link > to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit > of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which > wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as > dereferencing memory is concerned. You cannot dereference memory if you don't have a segment.
This improves usability > in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; > secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can > use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done > by calling `MemoryAddress::asSegmentRestricted`). A list of the API, implementation and test changes is provided > below. If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be > happy to point at existing discussions, and/or to provide the feedback required. A big thanks to Erik Osterlund, > Vladimir Ivanov and David Holmes, without whom the work on shared memory segments would not have been possible; also I'd > like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. Thanks Maurizio > Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html > Specdiff: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254163 > > > > ### API Changes > > * `MemorySegment` > * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) > * added a no-arg factory for a native restricted segment representing entire native heap > * rename `withOwnerThread` to `handoff` > * add new `share` method, to create shared segments > * add new `registerCleaner` method, to register a segment against a cleaner > * add more helpers to create arrays from a segment e.g.
`toIntArray` > * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) > * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) > * `MemoryAddress` > * drop `segment` accessor > * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative > to a given segment > * `MemoryAccess` > * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a > carrier is one of the usual suspect (a Java primitive, minus `boolean`); the access mode can be simple (e.g. access > base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte > offset,or a high level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. > `getByteAtOffset` vs `getByteAtIndex`). > * `MemoryHandles` > * drop `withOffset` combinator > * drop `withStride` combinator > * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which > it is easy to derive all the other handles using plain var handle combinators. > * `Addressable` > * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both > `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients > can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. > * `MemoryLayouts` > * A new layout, for machine addresses, has been added to the mix. > > > > ### Implementation changes > > There are two main things to discuss here: support for shared segments, and the general simplification of the memory > access var handle support. > #### Shared segments > > The support for shared segments cuts in pretty deep in the VM. 
Support for shared segments is notoriously hard to > achieve, at least in a way that guarantees optimal access performances. This is caused by the fact that, if a segment > is shared, it would be possible for a thread to close it while another is accessing it. After considering several > options (see [3]), we zeroed onto an approach which is inspired by an happy idea that Andrew Haley had (and that he > reminded me of at this year OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world > (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently to a > close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and > the access itself (otherwise it would be possible for the accessing thread to stop just right before an unsafe call). > It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. Sadly, none of > these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, > we noted that, if we could mark so called *scoped* method with an annotation, it would be very simply to check as to > whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of > stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). The question is, then, > once we detect that a thread is accessing the very segment we're about to close, what should happen? We first > experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it > fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the > machinery for async exceptions is a bit fragile (e.g. 
not all the code in hotspot checks for async exceptions); to > minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread > is accessing the segment being closed. As written in the javadoc, this doesn't mean that clients should just catch and > try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should > be treated as such. In terms of gritty implementation, we needed to centralize memory access routines in a single > place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in > addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` > annotation, which tells the VM that something important is going on. To achieve this, we created a new (autogenerated) > class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, > like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is > tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during > access (which is important when registering segments against cleaners). Of course, to make memory access safe, memory > access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead > of unsafe, so that a liveness check can be triggered (in case a scope is present). `ScopedMemoryAccess` has a > `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed > successfully. 
The implementation of `MemoryScope` (now significantly simplified from what we had before), has two > implementations, one for confined segments and one for shared segments; the main difference between the two is what > happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared > segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or > `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` > state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. #### > Memory access var handles overhaul The key realization here was that if all memory access var handles took a > coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle > form. This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var > handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that > e.g. additional offset is injected into a base memory access var handle. This also helped in simplifying the > implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level > access on the innards of the memory access var handle. All that code is now gone. #### Test changes Not much to see > here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, > since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` > functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the > microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared > segment case. 
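The close protocol described above for shared segments (`ALIVE` to `CLOSING`, then either `CLOSED` or back to `ALIVE`) can be sketched in plain Java. This is only an illustration under stated assumptions, not the actual `MemoryScope` code: the class name `SharedScopeSketch`, the `checkValidState` method name, the exception type, and the `BooleanSupplier` standing in for the thread-local handshake are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of a shared scope; names and exception types are assumptions.
class SharedScopeSketch {
    static final int ALIVE = 0, CLOSING = 1, CLOSED = 2;

    private final AtomicInteger state = new AtomicInteger(ALIVE);

    // Stand-in for the thread-local handshake: returns true when no other
    // thread was found in the middle of a scoped access.
    private final BooleanSupplier handshake;

    SharedScopeSketch(BooleanSupplier handshake) {
        this.handshake = handshake;
    }

    // Mirrors the description: still reports true while in CLOSING.
    boolean isAlive() {
        return state.get() != CLOSED;
    }

    // Liveness check performed before each access: fails unless fully ALIVE.
    void checkValidState() {
        if (state.get() != ALIVE) {
            throw new IllegalStateException("Segment is not alive");
        }
    }

    void close() {
        if (!state.compareAndSet(ALIVE, CLOSING)) {
            throw new IllegalStateException("Already closed or closing");
        }
        boolean noConcurrentAccess = handshake.getAsBoolean();
        // Either finish the close, or roll back so the segment stays usable.
        state.set(noConcurrentAccess ? CLOSED : ALIVE);
        if (!noConcurrentAccess) {
            throw new IllegalStateException("Segment is being accessed by another thread");
        }
    }

    public static void main(String[] args) {
        SharedScopeSketch idle = new SharedScopeSketch(() -> true);
        idle.close();
        System.out.println("idle scope alive after close: " + idle.isAlive());

        SharedScopeSketch busy = new SharedScopeSketch(() -> false);
        try {
            busy.close();
        } catch (IllegalStateException e) {
            System.out.println("close failed, scope alive: " + busy.isAlive());
        }
    }
}
```

The rollback to `ALIVE` is what makes a failed `close` observable as an exception in the caller rather than as a failure in the accessing thread, which matches the "simpler strategy" chosen over asynchronous exceptions.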
> [1] - https://openjdk.java.net/jeps/393
> [2] - https://openjdk.java.net/jeps/389
> [3] - https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html
> [4] - https://openjdk.java.net/jeps/312

Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision:

  Fix indent in GensrcScopedMemoryAccess.gmk

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/548/files
  - new: https://git.openjdk.java.net/jdk/pull/548/files/fa051abf..b941c4a2

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=01-02

  Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/548.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548

PR: https://git.openjdk.java.net/jdk/pull/548

From daniel.daugherty at oracle.com Thu Oct 8 14:37:14 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 8 Oct 2020 10:37:14 -0400
Subject: Re: CFV: New HotSpot Group Member: Erik Österlund
In-Reply-To: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com>
References: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com>
Message-ID:

Vote: yes

Dan

On 10/8/20 5:24 AM, Kim Barrett wrote:
> I hereby nominate Erik Österlund to Membership in the HotSpot Group.
>
> Erik has been a JDK Reviewer and member of the Oracle GC team for several
> years, currently working on ZGC, though his reach and influence extend
> significantly beyond that project. He has made many substantial
> contributions [1] including (most recently) JEP 376: ZGC: Concurrent
> Thread-Stack Processing.
>
> Votes are due by Friday, 23-Oct-2020 at 12h00 UTC.
>
> Only current Members of the HotSpot Group [2] are eligible
> to vote on this nomination. Votes must be cast in the open by
> replying to this mailing list.
>
> For Lazy Consensus voting instructions, see [3].
>
> Kim Barrett
>
> [1] https://github.com/search?q=author-name%3A%22Erik+%C3%96sterlund%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits
> [2] https://openjdk.java.net/census
> [3] https://openjdk.java.net/groups/#member-vote
>

From tobias.hartmann at oracle.com Thu Oct 8 14:51:56 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 8 Oct 2020 16:51:56 +0200
Subject: Re: CFV: New HotSpot Group Member: Erik Österlund
In-Reply-To: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com>
References: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com>
Message-ID: <11c116e4-5233-d23e-8019-4623ca3b2c0f@oracle.com>

Vote: yes

Best regards,
Tobias

On 08.10.20 11:24, Kim Barrett wrote:
> I hereby nominate Erik Österlund to Membership in the HotSpot Group.
>
> Erik has been a JDK Reviewer and member of the Oracle GC team for several
> years, currently working on ZGC, though his reach and influence extend
> significantly beyond that project. He has made many substantial
> contributions [1] including (most recently) JEP 376: ZGC: Concurrent
> Thread-Stack Processing.
>
> Votes are due by Friday, 23-Oct-2020 at 12h00 UTC.
>
> Only current Members of the HotSpot Group [2] are eligible
> to vote on this nomination. Votes must be cast in the open by
> replying to this mailing list.
>
> For Lazy Consensus voting instructions, see [3].
>
> Kim Barrett
>
> [1] https://github.com/search?q=author-name%3A%22Erik+%C3%96sterlund%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits
> [2] https://openjdk.java.net/census
> [3] https://openjdk.java.net/groups/#member-vote
>

From volker.simonis at gmail.com Thu Oct 8 15:05:29 2020
From: volker.simonis at gmail.com (Volker Simonis)
Date: Thu, 8 Oct 2020 17:05:29 +0200
Subject: Re: CFV: New HotSpot Group Member: Erik Österlund
In-Reply-To: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com>
References: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com>
Message-ID:

Vote: yes

Kim Barrett wrote on Thu, 8 Oct 2020, 11:24:

> I hereby nominate Erik Österlund to Membership in the HotSpot Group.
>
> Erik has been a JDK Reviewer and member of the Oracle GC team for several
> years, currently working on ZGC, though his reach and influence extend
> significantly beyond that project. He has made many substantial
> contributions [1] including (most recently) JEP 376: ZGC: Concurrent
> Thread-Stack Processing.
>
> Votes are due by Friday, 23-Oct-2020 at 12h00 UTC.
>
> Only current Members of the HotSpot Group [2] are eligible
> to vote on this nomination. Votes must be cast in the open by
> replying to this mailing list.
>
> For Lazy Consensus voting instructions, see [3].
> > Kim Barrett > > [1] > https://github.com/search?q=author-name%3A%22Erik+%C3%96sterlund%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > > From erikj at openjdk.java.net Thu Oct 8 15:31:04 2020 From: erikj at openjdk.java.net (Erik Joelsson) Date: Thu, 8 Oct 2020 15:31:04 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v3] In-Reply-To: References: Message-ID: On Thu, 8 Oct 2020 13:59:20 GMT, Maurizio Cimadamore wrote: > Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: > > Fix indent in GensrcScopedMemoryAccess.gmk Build changes look ok. ------------- Marked as reviewed by erikj (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/548 From gnu.andrew at redhat.com Thu Oct 8 16:03:48 2020 From: gnu.andrew at redhat.com (Andrew Hughes) Date: Thu, 8 Oct 2020 17:03:48 +0100 Subject: [8u] RFR: 8244225: stringop-overflow warning on strncpy call from compile_the_world_in In-Reply-To: References: Message-ID: <20201008160348.GB707812@stopbrexit> On 14:23 Tue 01 Sep , Severin Gehwolf wrote: > Hi, > > Could I please get a review of this 8u backport? The JDK 11u patch > doesn't apply cleanly since string_ends_with() function isn't in 8u. > Therefore, the context is different enough for the patch to not apply > cleanly. This is in a NOT_PRODUCT() path of the VM and should be low > risk.
I'm proposing this for review for JDK 8u parity. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8244225 > webrev: https://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8244225/01/webrev/ > > Testing: Manual on a fastdebug build of the JVM. Compiling classLoader.o > with -Wstringop-overflow. Warning present before the fix and it's > gone after. > > Thoughts? I'm not sure if this is worth the churn of doing the > backport, though. > > Thanks, > Severin > This looks fine to me. I think it's worth fixing what looks like a potential buffer overflow, even if it is only in debug code. Thanks, -- Andrew :) Senior Free Java Software Engineer OpenJDK Package Owner Red Hat, Inc. (http://www.redhat.com) PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 From rrich at openjdk.java.net Thu Oct 8 16:55:31 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Thu, 8 Oct 2020 16:55:31 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v9] In-Reply-To: References: Message-ID: <5O9n8cKBJyjhp2cNOVD2PcpKQiXqEs5BJjkW1lH-5EM=.044a510e-6517-4564-a3db-00c7951f0b22@github.com> > Hi, > > this is the continuation of the review of the implementation for: > > https://bugs.openjdk.java.net/browse/JDK-8227745 > https://bugs.openjdk.java.net/browse/JDK-8233915 > > It allows for JIT optimizations based on escape analysis even if JVMTI agents acquire capabilities to access references > to objects that are subject to such optimizations, e.g. scalar replacement. The implementation reverts such > optimizations just before access very much as when switching from JIT compiled execution to the interpreter, aka > "deoptimization". Webrev.8 was the last one before the transition to Git/GitHub: > > http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ > > Thanks, Richard.
Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 19 commits: - Merge branch 'master' into JDK-8227745 - Merge branch 'master' into JDK-8227745 - Factorized fragment out of EscapeBarrier::deoptimize_objects_internal into new method in compiledVFrame. - More smaller changes proposed by Serguei. - jvmtiDeferredUpdates.hpp: remove forward declarations. - jvmtiDeferredLocalVariable: move member variables to the beginning of the class definition. - jvmtiDeferredUpdates.hpp: add/remove empty lines and improve indentation. - Merge branch 'master' into JDK-8227745 - Merge branch 'master' into JDK-8227745 - Make parameter current_thread of JvmtiEnvBase::check_top_frame() a JavaThread* again. With Asynchronous handshakes the type was changed from JavaThread* to Thread* but this is not necessary as check_top_frame() is not executed during a handshake / safepoint (robehn confirmed). - ... and 9 more: https://git.openjdk.java.net/jdk/compare/d036dca0...d463b4f3 ------------- Changes: https://git.openjdk.java.net/jdk/pull/119/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=119&range=08 Stats: 5814 lines in 52 files changed: 5595 ins; 116 del; 103 mod Patch: https://git.openjdk.java.net/jdk/pull/119.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/119/head:pull/119 PR: https://git.openjdk.java.net/jdk/pull/119 From github.com+51754783+coreyashford at openjdk.java.net Thu Oct 8 17:19:28 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Thu, 8 Oct 2020 17:19:28 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v3] In-Reply-To: References: Message-ID: On Thu, 8 Oct 2020 10:41:32 GMT, Martin Doerr wrote: >> CoreyAshford has updated the pull request incrementally with seven additional commits since the last revision: >> >> - stubGenerator_ppc.cpp: Fix multiple issues as per Martin Doerr's v2 review >> >> * Remove 
extraneous comma from SAP copyright notice >> * Move align(32) to the head of the loop rather than the beginning of the unwound code >> * Simplified looping condition to use a loop counter instead of a final >> address. This eliminated the need for the "end" variable, and >> essentially replaced it with CTR, which is computed using a simple >> bitwise shift of the size. >> * Re-ran benchmarks against loop_unrolls values: 1, 2, 4, 8, 16 to find >> optimal value, now 4. >> * Corrected a typo in the word "elements" >> - vm_version_ppc.cpp: per Martin Doerr's review of v2: fix copy/paste error >> - vmIntrinsics.cpp: Per Martin Doerr's v2 review: rearrange order of case statement to be consistent with others. >> - runtime.cpp: per Martin Doerr's review of v2, correct comment as per current semantics of decodeBlock() >> >> * The reference to "ofs" seems to be a copy/paste error. >> * -1 is no longer returned from decodeBlock() in the event of a >> non-base64 character being encountered; only a count of bytes written >> to dst. >> - TestBase64.java: Change comment as per Martin Doerr's v2 review >> - Base64.java: Make changes as per Roger Riggs and Martin Doerr's v2 Review >> >> * Make comment about the sl parameter more precise >> * Fix comparison to avoid possible integer overflow of sp >> - library_call.cpp: Fix rebase merge error > > test/hotspot/jtreg/compiler/intrinsics/base64/TestBase64.java line 90: > >> 88: >> 89: // This should be enough to get both encodeBlock() and >> 90: // decodeBlock() compiled on the highest tier. > > It's actually encode() and decode() which should get compiled. You should see them when testing > with -XX:+PrintCompilation. And you should see usage of the intrinsics by -XX:+PrintInlining. Ah, useful tips! Thanks! I will make this comment change. 
-------------

PR: https://git.openjdk.java.net/jdk/pull/293

From jbhateja at openjdk.java.net Thu Oct 8 17:32:19 2020
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Thu, 8 Oct 2020 17:32:19 GMT
Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions
In-Reply-To:
References: <_0-zfIDPieC0Xnc17GaSSsS7Sz9EEUrfjRyqWDtphfU=.298bacde-f330-486a-8bea-03ff1523d00c@github.com>
Message-ID:

On Wed, 23 Sep 2020 15:27:48 GMT, Jatin Bhateja wrote:

>> Can you explain why 32 bytes are such a distinct performance cliff?
>>
>> Is there any performance difference between doing a single 64 bytes masked copy or two 32 bytes?
>
>> Can you explain why 32 bytes are such a distinct performance cliff?
>>
>> Is there any performance difference between doing a single 64 bytes masked copy or two 32 bytes?
>
> Hi Nils,
> A copy of size <= 32 bytes can be done using one YMM register, since the AVX-512 vector length extension allows
> masked instructions to operate on YMM and XMM registers. Using the newly added flag
> -XX:ArrayCopyPartialInlineSize=64 one can perform inlining up to 64 bytes, but since this uses a ZMM register the
> CPU will operate at a lower frequency; it could still give better performance depending on the application. A
> single 64-byte masked copy may incur a performance hit if the CPU operates at its highest frequency for the
> majority of the application's runtime. There is a switchover penalty from a higher frequency level to a lower
> frequency level, along with some hysteresis which forces subsequent instructions to operate at a lower frequency
> for some cycles. The current implementation has been kept simple to avoid emitting too many instructions at the
> call site, considering arraycopy is a very high frequency operation.

Hi @neliasso , @vnkozlov , kindly let me know your review comments.
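To make the size thresholds in the explanation above concrete, here is a small Java model of the decision: which register width a partially inlined copy would use, and what the per-byte lane mask for a masked copy looks like. This is purely illustrative and is not the C2 implementation; the class `PartialInlineSketch`, both method names, and the returned strings are hypothetical, and the model assumes byte-granular masking with a `partialInlineSize` parameter playing the role of `-XX:ArrayCopyPartialInlineSize`.

```java
// Hypothetical model of the partial-inlining decision; not taken from the patch.
class PartialInlineSketch {

    // Builds a mask with one bit per byte lane: bit i set means byte i is copied.
    static long byteLaneMask(int lengthInBytes) {
        if (lengthInBytes < 0 || lengthInBytes > 64) {
            throw new IllegalArgumentException("length must be in [0, 64]");
        }
        // 1L << 64 wraps around in Java, so the full-width case is handled separately.
        return lengthInBytes == 64 ? -1L : (1L << lengthInBytes) - 1;
    }

    // Chooses how a copy of the given length would be handled in this model.
    static String copyStrategy(int lengthInBytes, int partialInlineSize) {
        if (lengthInBytes > partialInlineSize) {
            return "call arraycopy stub";
        }
        if (lengthInBytes <= 32) {
            return "inline masked copy, YMM (256-bit)";
        }
        return "inline masked copy, ZMM (512-bit)";
    }

    public static void main(String[] args) {
        System.out.println(Long.toBinaryString(byteLaneMask(5)));
        System.out.println(copyStrategy(24, 32));
        System.out.println(copyStrategy(48, 32));
        System.out.println(copyStrategy(48, 64));
    }
}
```

With a limit of 32 the model never selects a ZMM register, which corresponds to the frequency concern raised in the reply; raising the limit to 64 trades that concern for fewer stub calls.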
------------- PR: https://git.openjdk.java.net/jdk/pull/302 From luhenry at openjdk.java.net Thu Oct 8 18:10:20 2020 From: luhenry at openjdk.java.net (Ludovic Henry) Date: Thu, 8 Oct 2020 18:10:20 GMT Subject: RFR: 8253757: Add LLVM-based backend for hsdis In-Reply-To: References: <91erxiMDb4ftvSomuJYHPi9SX-v8Z2VLD2qEwCbz5tk=.b9ed01b5-f0e0-4ed7-9c1a-b06bc0e64640@github.com> <8Eqswd7tsVaGEXHdKDncXqKpW2tBsSeuY0PV6aTB9_c=.a6cf4957-9d31-4e89-bf44-e7b7852205d5@github.com> Message-ID: On Thu, 8 Oct 2020 12:30:13 GMT, Bernhard Urban-Forster wrote: >> IMHO, it's great to have an alternative disassembler. I personally had better experience using llvm MC when I decoded >> aarch64 and AVX instructions than BFD. Another argument is that LLVM toolchain is supposed to provide the premium >> experience on non-gnu platforms such as FreeBSD. @luhenry I tried to build it with LLVM10.0.1 >> on my x86_64, ubuntu, I ran into a small problem. here is how I build. >> `$make ARCH=amd64 CC=/opt/llvm/bin/clang CXX=/opt/llvm/bin/clang++ LLVM=/opt/llvm/` >> >> I can't meet this condition because Makefile defines LIBOS_linux. >> #elif defined(LIBOS_Linux) && defined(LIBARCH_amd64) >> return "x86_64-pc-linux-gnu"; >> >> Actually, Makefile assigns OS to windows/linux/aix/macosx (all lower case)and then >> `CPPFLAGS += -DLIBOS_$(OS) -DLIBOS="$(OS)" -DLIBARCH_$(LIBARCH) -DLIBARCH="$(LIBARCH)" -DLIB_EXT="$(LIB_EXT)"` >> >> In hsdis.cpp, `native_target_triple` needs to match whatever Makefile defined. With that fix, I generate llvm version >> hsdis-amd64.so and it works flawlessly > >> 1 question: binutils seems to support Windows AArch64. Did you try recently binutils? If we can use binutils on Windows >> AArch64, you can fix makefile only. >> https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=binutils/dlltool.c;h=ed016b97dc38cdb1b85d2f6df676b9c9750f0d41;hb=HEAD#l248 > > This is armv7, I don't see any support for armv8/AArch64 in `dlltool.c`. @magicus > This is an interesting suggestion. 
There is a similar attempt at replacing binutils with capstone in > https://bugs.openjdk.java.net/browse/JDK-8188073, which unfortunately has not seen much progress due to lack of > resources; I don't know if you are aware of that? There is also a (extremely low priority) effort to rewrite the hsdis > makefile to be part of the normal build system, see e.g. https://bugs.openjdk.java.net/browse/JDK-8208495. Neither of > these should be any blocker for your change, but I think it might be good if you know about them. I was not aware of the effort to use capstone to replace/complement binutils in hsdis. I wonder how easy it is to port capstone to platforms in case it doesn't support them. > I have couple of concerns with your patch. One is the method in which LLVM is selected instead of binutils; afaict this > depends on having the LLVM variable set when executing the makefile. At the very least, this should be documented in > the README. I don't think any more complicated configuration is really necessary at this point. With full integration > with the build system, a more user-friendly way of selecting hsdis backend should be implemented, though. I'll add documentation to the Makefile. And I agree, I would prefer not to have to go through the whole build integration to integrate the support for LLVM. > Second, and I don't know if this is an artifact of git/github/the new skara tooling, but if you renamed hsdis.c to > hsdis.cpp, this relationship does not show up, not even in the generated webrevs. Instead they are considered a new + a > deleted file. This makes it hard to see what code changes you have done in that file. That is Git not detecting enough similarities between the two files. I could probably hack my way around and find a way to reduce the code diff if that's something you want. > And third; have you tested that your changes (both changing the main file from C to C++, and any code changes in it) > does not break the old binutils functionality? 
Afaic there are no test suites for exercising hsdis :-( so manual ad-hoc > testing is likely needed. I've tested on Linux-x86_64 and Linux-AArch64 on top of Windows-AArch64 and macOS-AArch64, and checked that both the binutils builds and works as previously and that the LLVM-based hsdis has an equivalent output. ------------- PR: https://git.openjdk.java.net/jdk/pull/392 From luhenry at openjdk.java.net Thu Oct 8 18:18:25 2020 From: luhenry at openjdk.java.net (Ludovic Henry) Date: Thu, 8 Oct 2020 18:18:25 GMT Subject: RFR: 8253757: Add LLVM-based backend for hsdis In-Reply-To: References: <91erxiMDb4ftvSomuJYHPi9SX-v8Z2VLD2qEwCbz5tk=.b9ed01b5-f0e0-4ed7-9c1a-b06bc0e64640@github.com> <8Eqswd7tsVaGEXHdKDncXqKpW2tBsSeuY0PV6aTB9_c=.a6cf4957-9d31-4e89-bf44-e7b7852205d5@github.com> Message-ID: On Thu, 8 Oct 2020 18:07:59 GMT, Ludovic Henry wrote: >>> 1 question: binutils seems to support Windows AArch64. Did you try recently binutils? If we can use binutils on Windows >>> AArch64, you can fix makefile only. >>> https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=binutils/dlltool.c;h=ed016b97dc38cdb1b85d2f6df676b9c9750f0d41;hb=HEAD#l248 >> >> This is armv7, I don't see any support for armv8/AArch64 in `dlltool.c`. > > @magicus > >> This is an interesting suggestion. There is a similar attempt at replacing binutils with capstone in >> https://bugs.openjdk.java.net/browse/JDK-8188073, which unfortunately has not seen much progress due to lack of >> resources; I don't know if you are aware of that? There is also a (extremely low priority) effort to rewrite the hsdis >> makefile to be part of the normal build system, see e.g. https://bugs.openjdk.java.net/browse/JDK-8208495. Neither of >> these should be any blocker for your change, but I think it might be good if you know about them. > > I was not aware of the effort to use capstone to replace/complement binutils in hsdis. I wonder how easy it is to port > capstone to platforms in case it doesn't support them. 
>> I have couple of concerns with your patch. One is the method in which LLVM is selected instead of binutils; afaict this >> depends on having the LLVM variable set when executing the makefile. At the very least, this should be documented in >> the README. I don't think any more complicated configuration is really necessary at this point. With full integration >> with the build system, a more user-friendly way of selecting hsdis backend should be implemented, though. > > I'll add documentation to the Makefile. And I agree, I would prefer not to have to go through the whole build > integration to integrate the support for LLVM. >> Second, and I don't know if this is an artifact of git/github/the new skara tooling, but if you renamed hsdis.c to >> hsdis.cpp, this relationship does not show up, not even in the generated webrevs. Instead they are considered a new + a >> deleted file. This makes it hard to see what code changes you have done in that file. > > That is Git not detecting enough similarities between the two files. I could probably hack my way around and find a way > to reduce the code diff if that's something you want. >> And third; have you tested that your changes (both changing the main file from C to C++, and any code changes in it) >> does not break the old binutils functionality? Afaic there are no test suites for exercising hsdis :-( so manual ad-hoc >> testing is likely needed. > > I've tested on Linux-x86_64 and Linux-AArch64 on top of Windows-AArch64 and macOS-AArch64, and checked that both the > binutils builds and works as previously and that the LLVM-based hsdis has an equivalent output. @navyxliu > @luhenry I tried to build it with LLVM10.0.1 > on my x86_64, ubuntu, I ran into a small problem. here is how I build. > $make ARCH=amd64 CC=/opt/llvm/bin/clang CXX=/opt/llvm/bin/clang++ LLVM=/opt/llvm/ > > I can't meet this condition because Makefile defines LIBOS_linux. 
> > #elif defined(LIBOS_Linux) && defined(LIBARCH_amd64) > return "x86_64-pc-linux-gnu"; > > Actually, Makefile assigns OS to windows/linux/aix/macosx (all lower case)and then > CPPFLAGS += -DLIBOS_$(OS) -DLIBOS="$(OS)" -DLIBARCH_$(LIBARCH) -DLIBARCH="$(LIBARCH)" -DLIB_EXT="$(LIB_EXT)" Interestingly, I did it this way because on my machine `LIBOS_Linux` would get defined instead of `LIBOS_linux`. I tried on WSL which might explain the difference. Could you please share more details on what environment you are using? > In hsdis.cpp, native_target_triple needs to match whatever Makefile defined. With that fix, I generate llvm version > hsdis-amd64.so and it works flawlessly I'm not sure I understand what you mean. Are you saying we should define the native target triple based on the variables in the Makefile? A difficulty I ran into is that there is not always a 1-to-1 mapping between the autoconf/gcc target triple and the LLVM one. For example. you pass `x86_64-gnu-linux` to the OpenJDK's `configure` script, but the equivalent target triple for LLVM is `x86_64-pc-linux-gnu`. Since my plan isn't to use LLVM as the default for all platforms, and because there aren't that many combinations of target OS/ARCH, I am taking the approach of hardcoding the combinations we care about in `hsdis.cpp`. ------------- PR: https://git.openjdk.java.net/jdk/pull/392 From iignatyev at openjdk.java.net Thu Oct 8 18:41:23 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Thu, 8 Oct 2020 18:41:23 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port [v2] In-Reply-To: References: <6jqlCPXe69fPRvYFrytJsECkaa9tJ1hYWISNgyPP4Eg=.40944ef5-93b0-4db4-948b-80bb7898e9e8@github.com> Message-ID: On Thu, 8 Oct 2020 11:00:41 GMT, Aleksei Voitylov wrote: > @iignatev I resolved the conflict in whitebox.cpp and fixed a minor style nit on the way. Could you take a look? 
LGTM ------------- PR: https://git.openjdk.java.net/jdk/pull/49 From mdoerr at openjdk.java.net Thu Oct 8 19:36:27 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Thu, 8 Oct 2020 19:36:27 GMT Subject: RFR: 8254265: s390 and linux 32 bit builds broken Message-ID: JDK-8253717 missed 2 occurrences of JavaThread::stack_guard_zone_size(), which was moved to class StackOverflow. I just verified the s390 build. Can anybody check linux 32 bit? ------------- Commit messages: - 8254265: s390 and linux 32 bit builds broken Changes: https://git.openjdk.java.net/jdk/pull/568/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=568&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254265 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/568.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/568/head:pull/568 PR: https://git.openjdk.java.net/jdk/pull/568 From iklam at openjdk.java.net Thu Oct 8 20:07:24 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 8 Oct 2020 20:07:24 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v12] In-Reply-To: References: <9emWKl6fr-GA5LN0uHhuEd5D123QcoCiHQR1M9bAbag=.cc4b6129-8b33-47e4-a421-9e6b4817933b@github.com> <-y8lEorT3v4i2G_yBbrmd6faeU584CXFnPEiM8GcPec=.d9e8a152-c319-43dc-a6d1-f745b7eccb20@github.com> Message-ID: <-6SMAKzxQbKz43iXA6iLgLkDWEYfkwu-DWQbo7FAX8Q=.f3184cf8-756c-4da7-bf8e-2b2dca998922@github.com> On Wed, 7 Oct 2020 21:49:24 GMT, Yumin Qi wrote: >> src/java.base/share/classes/jdk/internal/misc/CDS.java line 144: >> >>> 142: String line = s.trim(); >>> 143: if (!line.startsWith("[LF_RESOLVE]") && !line.startsWith("[SPECIES_RESOLVE]")) { >>> 144: System.out.println("Wrong prefix: " + line); >> >> Should this throw an exception instead? > > This part only checks the format; throwing exceptions would generate more objects, which should not be archived > in the shared heap.
Since this is only called from the VM, we decided not to throw an exception here. The exception object will not be automatically added to the shared heap, so it's OK to throw exceptions here. ------------- PR: https://git.openjdk.java.net/jdk/pull/193 From mikael.vidstedt at oracle.com Thu Oct 8 20:15:11 2020 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Thu, 8 Oct 2020 13:15:11 -0700 Subject: Re: CFV: New HotSpot Group Member: Erik Österlund In-Reply-To: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> References: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> Message-ID: <9B522FBE-55EF-408E-967A-EA125C743B20@oracle.com> Vote: yes Cheers, Mikael > On Oct 8, 2020, at 2:24 AM, Kim Barrett wrote: > > I hereby nominate Erik Österlund to Membership in the HotSpot Group. > > Erik has been a JDK Reviewer and member of the Oracle GC team for several > years, currently working on ZGC, though his reach and influence extend > significantly beyond that project. He has made many substantial > contributions [1] including (most recently) JEP 376: ZGC: Concurrent > Thread-Stack Processing. > > Votes are due by Friday, 23-Oct-2020 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list. > > For Lazy Consensus voting instructions, see [3].
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Erik+%C3%96sterlund%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From burban at openjdk.java.net Thu Oct 8 20:28:33 2020 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Thu, 8 Oct 2020 20:28:33 GMT Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build [v2] In-Reply-To: References: Message-ID: > I organized this PR so that each commit contains the warning emitted by MSVC as commit message and its relevant fix. > > Verified on > * Linux+ARM64: `{hotspot,jdk,langtools}:tier1`, no failures. > * Windows+ARM64: `{hotspot,jdk,langtools}:tier1`, no (new) failures. > * internal macOS+ARM64 port: build without `--disable-warnings-as-errors` still works. Just mentioning this here, because > it's yet another toolchain (Xcode / clang) that needs to be kept happy [going > forward](https://openjdk.java.net/jeps/391). Bernhard Urban-Forster has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 18 commits: - Merge remote-tracking branch 'upstream/master' into 8254072-fix-windows-arm64-warnings - ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1446): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1654): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data - Revert changes for "warning C4146: unary minus operator applied to unsigned type, result still unsigned" - msvc: disable unary minus warning for unsigned types - ./src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp(1123): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data ./src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp(1123): warning C4267: 'initializing': conversion from 'size_t' to 'const int', possible loss of data - ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1312): warning C4267: 'argument': conversion from 'size_t' to 'unsigned int', possible loss of data ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1370): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): warning C4146: unary minus operator applied to unsigned type, result still unsigned ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data - ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp(2472): warning C4312: 'type cast': conversion from 'unsigned int' to 'address' of greater size - ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp(1527): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data - ./src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp(2901): 
warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data ./src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp(2901): warning C4267: 'initializing': conversion from 'size_t' to 'const int', possible loss of data - ./src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp(2756): warning C4146: unary minus operator applied to unsigned type, result still unsigned - ... and 8 more: https://git.openjdk.java.net/jdk/compare/5351ba6c...a081dfb4 ------------- Changes: https://git.openjdk.java.net/jdk/pull/530/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=530&range=01 Stats: 22 lines in 8 files changed: 2 ins; 0 del; 20 mod Patch: https://git.openjdk.java.net/jdk/pull/530.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/530/head:pull/530 PR: https://git.openjdk.java.net/jdk/pull/530 From burban at openjdk.java.net Thu Oct 8 20:28:34 2020 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Thu, 8 Oct 2020 20:28:34 GMT Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 18:09:05 GMT, Bernhard Urban-Forster wrote: > I organized this PR so that each commit contains the warning emitted by MSVC as commit message and its relevant fix. > > Verified on > * Linux+ARM64: `{hotspot,jdk,langtools}:tier1`, no failures. > * Windows+ARM64: `{hotspot,jdk,langtools}:tier1`, no (new) failures. > * internal macOS+ARM64 port: build without `--disable-warnings-as-errors` still works. Just mentioning this here, because > it's yet another toolchain (Xcode / clang) that needs to be kept happy [going > forward](https://openjdk.java.net/jeps/391). Thank you Andrew for your comments! > _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ > IMO this warning: > > warning C4146: unary minus operator applied to unsigned type, result still unsigned > > should not be used. 
Okay, added to the Makefile and reverted those changes. > // Generate stack overflow check > if (UseStackBanging) { > - __ bang_stack_with_offset(JavaThread::stack_shadow_zone_size()); > + __ bang_stack_with_offset((int)JavaThread::stack_shadow_zone_size()); > } else { > Unimplemented(); > > Could this one be fixed by changing stack_shadow_zone_size() or > bang_stack_with_offset() ? I would have thought that whatever type > stack_shadow_zone_size() returns should be compatible with > bang_stack_with_offset(). The x86_64 backend and others do the same: https://github.com/openjdk/jdk/blob/5351ba6cfa8078f503f1cf0c375b692905c607ff/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp#L2176-L2178 So should we (1) do the same, (2) diverge or (3) fix all of them? For the remaining comments, I've updated the PR; please have another look. ------------- PR: https://git.openjdk.java.net/jdk/pull/530 From github.com+51754783+coreyashford at openjdk.java.net Thu Oct 8 20:31:47 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Thu, 8 Oct 2020 20:31:47 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4] In-Reply-To: References: Message-ID: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> > This patch set encompasses the following commits: > > - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for > the intrinsic. The API is similar to the existing encodeBlock intrinsic. > - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic > implementation > - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. > - Adds a JMH microbenchmark for both Base64 encoding and decoding. > - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding.
CoreyAshford has updated the pull request incrementally with two additional commits since the last revision: - TestBase64.java: fix comment to correctly reflect actual intrinsic names. The intrinsic names that are visible with -XX:+PrintCompilation are encode and decode, rather than encodeBlock and decodeBlock. - stubGenerator_ppc.cpp: fix regression caused by change to using loop counter My original fix didn't account for the case where sl < block_size. In the event sl < block_size, the shifted sl will become zero, so it should jump to the code that computes how much data was processed - 0 - and return. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/293/files - new: https://git.openjdk.java.net/jdk/pull/293/files/8932c233..164fa2a9 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=293&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=293&range=02-03 Stats: 13 lines in 2 files changed: 7 ins; 4 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/293.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/293/head:pull/293 PR: https://git.openjdk.java.net/jdk/pull/293 From xliu at openjdk.java.net Thu Oct 8 20:43:26 2020 From: xliu at openjdk.java.net (Xin Liu) Date: Thu, 8 Oct 2020 20:43:26 GMT Subject: RFR: 8253757: Add LLVM-based backend for hsdis In-Reply-To: References: <91erxiMDb4ftvSomuJYHPi9SX-v8Z2VLD2qEwCbz5tk=.b9ed01b5-f0e0-4ed7-9c1a-b06bc0e64640@github.com> <8Eqswd7tsVaGEXHdKDncXqKpW2tBsSeuY0PV6aTB9_c=.a6cf4957-9d31-4e89-bf44-e7b7852205d5@github.com> Message-ID: <2S00ucaPGiAQLeLOejt1kfXeYEc7ctEPeRCIcq1N0N8=.dbf1ea7a-8de4-48a5-8759-03495e3e3c08@github.com> On Thu, 8 Oct 2020 18:15:10 GMT, Ludovic Henry wrote: > @navyxliu > > > @luhenry I tried to build it with LLVM10.0.1 > > on my x86_64, ubuntu, I ran into a small problem. here is how I build. > > $make ARCH=amd64 CC=/opt/llvm/bin/clang CXX=/opt/llvm/bin/clang++ LLVM=/opt/llvm/ > > I can't meet this condition because Makefile defines LIBOS_linux. 
> > #elif defined(LIBOS_Linux) && defined(LIBARCH_amd64) > > return "x86_64-pc-linux-gnu"; > > Actually, Makefile assigns OS to windows/linux/aix/macosx (all lower case)and then > > CPPFLAGS += -DLIBOS_$(OS) -DLIBOS="$(OS)" -DLIBARCH_$(LIBARCH) -DLIBARCH="$(LIBARCH)" -DLIB_EXT="$(LIB_EXT)" > > Interestingly, I did it this way because on my machine `LIBOS_Linux` would get defined instead of `LIBOS_linux`. I > tried on WSL which might explain the difference. Could you please share more details on what environment you are using? I am using ubuntu 18.04. `OS = $(shell uname)` does initialize OS=Linux in the first place, but later OS is set to "linux" at line 88 of https://openjdk.github.io/cr/?repo=jdk&pr=392&range=05#new-0 At line 186, -DLIBOS_linux -DLIBOS="linux" ... It doesn't match line 564 of https://openjdk.github.io/cr/?repo=jdk&pr=392&range=05#new-2 in my understanding, C/C++ macros are all case sensitive. I got #error "unknown platform" because of Linux/linux discrepancy. > > In hsdis.cpp, native_target_triple needs to match whatever Makefile defined. With that fix, I generate llvm version > > hsdis-amd64.so and it works flawlessly > > I'm not sure I understand what you mean. Are you saying we should define the native target triple based on the > variables in the Makefile? > A difficulty I ran into is that there is not always a 1-to-1 mapping between the autoconf/gcc target triple and the > LLVM one. For example. you pass `x86_64-gnu-linux` to the OpenJDK's `configure` script, but the equivalent target > triple for LLVM is `x86_64-pc-linux-gnu`. Since my plan isn't to use LLVM as the default for all platforms, and > because there aren't that many combinations of target OS/ARCH, I am taking the approach of hardcoding the combinations > we care about in `hsdis.cpp`. 
------------- PR: https://git.openjdk.java.net/jdk/pull/392 From mdoerr at openjdk.java.net Thu Oct 8 20:47:23 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Thu, 8 Oct 2020 20:47:23 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4] In-Reply-To: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> References: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> Message-ID: On Thu, 8 Oct 2020 20:31:47 GMT, CoreyAshford wrote: >> This patch set encompasses the following commits: >> >> - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for >> the intrinsic. The API is similar to the existing encodeBlock intrinsic. >> - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic >> implementation >> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. >> - Adds a JMH microbenchmark for both Base64 encoding and decoding. >> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. > > CoreyAshford has updated the pull request incrementally with two additional commits since the last revision: > > - TestBase64.java: fix comment to correctly reflect actual intrinsic names. > > The intrinsic names that are visible with -XX:+PrintCompilation are encode > and decode, rather than encodeBlock and decodeBlock. > - stubGenerator_ppc.cpp: fix regression caused by change to using loop counter > > My original fix didn't account for the case where sl < block_size. In the > event sl < block_size, the shifted sl will become zero, so it should > jump to the code that computes how much data was processed - 0 - and return. Marked as reviewed by mdoerr (Reviewer).
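The `sl < block_size` regression described in the quoted commit message can be modelled in plain Java. This is only an illustration of the arithmetic, not the PPC stub: the class and method names are hypothetical, and the block size (passed as a power-of-two exponent) is an assumed parameter, since the real value is not stated here.

```java
public class BlockLoopCount {
    // Returns how many source bytes a counter-driven block loop would process.
    // sl is the source length; log2BlockSize is the block size as a power of two.
    public static int processedBytes(int sl, int log2BlockSize) {
        int ctr = sl >>> log2BlockSize; // loop counter derived by shifting the size
        if (ctr == 0) {
            // sl < block size: skip the loop entirely and report 0 bytes processed.
            return 0;
        }
        int processed = 0;
        // Models a decrement-and-branch loop; entering it with ctr == 0 would be wrong.
        do {
            processed += 1 << log2BlockSize;
        } while (--ctr != 0);
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(processedBytes(15, 4));
        System.out.println(processedBytes(40, 4));
    }
}
```

Without the `ctr == 0` guard, the `do`/`while` form would run the body at least once even when no full block is available, which is the kind of failure the fix avoids.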
------------- PR: https://git.openjdk.java.net/jdk/pull/293 From coleenp at openjdk.java.net Thu Oct 8 20:53:29 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 8 Oct 2020 20:53:29 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v3] In-Reply-To: References: Message-ID: <04N7NDBB9K0OIRpJy4FaS7jPns7J65f5vpIPAKdLdIQ=.d72b8f73-6062-4bbf-a9c3-64d41de9165d@github.com> On Thu, 8 Oct 2020 13:59:20 GMT, Maurizio Cimadamore wrote: >> This patch contains the changes associated with the third incubation round of the foreign memory access API incubation >> (see JEP 393 [1]). This iteration focus on improving the usability of the API in 3 main ways: >> * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from >> multiple threads >> * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee >> that the memory will be deallocated, eventually >> * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class >> has been added, which defines several useful dereference routines; these are really just thin wrappers around memory >> access var handles, but they make the barrier of entry for using this API somewhat lower. >> >> A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not >> the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link >> to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit >> of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which >> wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as >> dereferencing memory is concerned. 
You cannot dereference memory if you don't have a segment. This improves usability >> in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; >> secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can >> use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done >> by calling `MemoryAddress::asSegmentRestricted`). A list of the API, implementation and test changes is provided >> below. If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be >> happy to point at existing discussions, and/or to provide the feedback required. A big thank to Erik Osterlund, >> Vladimir Ivanov and David Holmes, without whom the work on shared memory segment would not have been possible; also I'd >> like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. Thanks Maurizio >> Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html >> Specdiff: >> >> http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html >> >> CSR: >> >> https://bugs.openjdk.java.net/browse/JDK-8254163 >> >> >> >> ### API Changes >> >> * `MemorySegment` >> * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) >> * added a no-arg factory for a native restricted segment representing entire native heap >> * rename `withOwnerThread` to `handoff` >> * add new `share` method, to create shared segments >> * add new `registerCleaner` method, to register a segment against a cleaner >> * add more helpers to create arrays from a segment e.g. 
`toIntArray` >> * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) >> * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) >> * `MemoryAddress` >> * drop `segment` accessor >> * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative >> to a given segment >> * `MemoryAccess` >> * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a >> carrier is one of the usual suspect (a Java primitive, minus `boolean`); the access mode can be simple (e.g. access >> base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte >> offset,or a high level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. >> `getByteAtOffset` vs `getByteAtIndex`). >> * `MemoryHandles` >> * drop `withOffset` combinator >> * drop `withStride` combinator >> * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which >> it is easy to derive all the other handles using plain var handle combinators. >> * `Addressable` >> * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both >> `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients >> can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. >> * `MemoryLayouts` >> * A new layout, for machine addresses, has been added to the mix. >> >> >> >> ### Implementation changes >> >> There are two main things to discuss here: support for shared segments, and the general simplification of the memory >> access var handle support. 
>> #### Shared segments >> >> The support for shared segments cuts in pretty deep in the VM. Support for shared segments is notoriously hard to >> achieve, at least in a way that guarantees optimal access performances. This is caused by the fact that, if a segment >> is shared, it would be possible for a thread to close it while another is accessing it. After considering several >> options (see [3]), we zeroed onto an approach which is inspired by an happy idea that Andrew Haley had (and that he >> reminded me of at this year OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world >> (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently to a >> close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and >> the access itself (otherwise it would be possible for the accessing thread to stop just right before an unsafe call). >> It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. Sadly, none of >> these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, >> we noted that, if we could mark so called *scoped* method with an annotation, it would be very simply to check as to >> whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of >> stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). The question is, then, >> once we detect that a thread is accessing the very segment we're about to close, what should happen? We first >> experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it >> fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the >> machinery for async exceptions is a bit fragile (e.g. 
not all the code in hotspot checks for async exceptions); to >> minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread >> is accessing the segment being closed. As written in the javadoc, this doesn't mean that clients should just catch and >> try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should >> be treated as such. In terms of gritty implementation, we needed to centralize memory access routines in a single >> place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in >> addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` >> annotation, which tells the VM that something important is going on. To achieve this, we created a new (autogenerated) >> class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, >> like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is >> tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during >> access (which is important when registering segments against cleaners). Of course, to make memory access safe, memory >> access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead >> of unsafe, so that a liveness check can be triggered (in case a scope is present). `ScopedMemoryAccess` has a >> `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed >> successfully. 
The implementation of `MemoryScope` (now significantly simplified from what we had before), has two >> implementations, one for confined segments and one for shared segments; the main difference between the two is what >> happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared >> segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or >> `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` >> state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. #### >> Memory access var handles overhaul The key realization here was that if all memory access var handles took a >> coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle >> form. This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var >> handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that >> e.g. additional offset is injected into a base memory access var handle. This also helped in simplifying the >> implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level >> access on the innards of the memory access var handle. All that code is now gone. #### Test changes Not much to see >> here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, >> since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` >> functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the >> microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared >> segment case. 
>> [1] - https://openjdk.java.net/jeps/393
>> [2] - https://openjdk.java.net/jeps/389
>> [3] - https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html
>> [4] - https://openjdk.java.net/jeps/312
>
> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix indent in GensrcScopedMemoryAccess.gmk

just a drive-by comment.

src/hotspot/share/classfile/vmSymbols.hpp line 290:

> 288: template(jdk_internal_vm_annotation_ForceInline_signature, "Ljdk/internal/vm/annotation/ForceInline;") \
> 289: template(jdk_internal_vm_annotation_Hidden_signature, "Ljdk/internal/vm/annotation/Hidden;") \
> 290: template(jdk_internal_misc_Scoped_signature, "Ljdk/internal/misc/ScopedMemoryAccess$Scoped;") \

Can you line this up?

-------------

PR: https://git.openjdk.java.net/jdk/pull/548

From coleenp at openjdk.java.net Thu Oct 8 21:03:22 2020
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Thu, 8 Oct 2020 21:03:22 GMT
Subject: RFR: 8254265: s390 and linux 32 bit builds broken
In-Reply-To:
References:
Message-ID:

On Thu, 8 Oct 2020 19:29:49 GMT, Martin Doerr wrote:

> JDK-8253717 missed 2 occurrences of JavaThread::stack_guard_zone_size() which was moved to class StackOverflow.
>
> I just verified s390 build. Can anybody check linux 32 bit?

Marked as reviewed by coleenp (Reviewer).

I'm sorry, I really thought I'd built all the platforms. Trying to figure out what happened.
This change looks good. I'm trying to verify linux-32 bit now.
-------------

PR: https://git.openjdk.java.net/jdk/pull/568

From igor.ignatyev at oracle.com Thu Oct 8 21:41:13 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Thu, 8 Oct 2020 14:41:13 -0700
Subject: Re: CFV: New HotSpot Group Member: Erik Österlund
In-Reply-To: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com>
References: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com>
Message-ID: <763C7884-E097-4AFF-8142-A45C11F7152A@oracle.com>

> On Oct 8, 2020, at 2:24 AM, Kim Barrett wrote:
>
> I hereby nominate Erik Österlund to Membership in the HotSpot Group.

Vote: yes

-- Igor

From dholmes at openjdk.java.net Thu Oct 8 22:22:20 2020
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 8 Oct 2020 22:22:20 GMT
Subject: RFR: 8254265: s390 and linux 32 bit builds broken
In-Reply-To:
References:
Message-ID:

On Thu, 8 Oct 2020 19:29:49 GMT, Martin Doerr wrote:

> JDK-8253717 missed 2 occurrences of JavaThread::stack_guard_zone_size() which was moved to class StackOverflow.
>
> I just verified s390 build. Can anybody check linux 32 bit?

Marked as reviewed by dholmes (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/568

From coleenp at openjdk.java.net Thu Oct 8 22:52:25 2020
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Thu, 8 Oct 2020 22:52:25 GMT
Subject: RFR: 8254265: s390 and linux 32 bit builds broken
In-Reply-To:
References:
Message-ID:

On Thu, 8 Oct 2020 19:29:49 GMT, Martin Doerr wrote:

> JDK-8253717 missed 2 occurrences of JavaThread::stack_guard_zone_size() which was moved to class StackOverflow.
>
> I just verified s390 build. Can anybody check linux 32 bit?

Changes requested by coleenp (Reviewer).

src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp line 863:

> 861: */
> 862: char* hint = (char*)(Linux::initial_thread_stack_bottom() -
> 863: (StackOverflow::stack_guard_zone_size() + page_size));

Lines 841 and 842 also need to be fixed. Still testing.
-------------

PR: https://git.openjdk.java.net/jdk/pull/568

From coleenp at openjdk.java.net Thu Oct 8 23:03:19 2020
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Thu, 8 Oct 2020 23:03:19 GMT
Subject: RFR: 8254265: s390 and linux 32 bit builds broken
In-Reply-To:
References:
Message-ID: <7EhV3aWU-gCnPawQHwmnQ98NaSlpaebgwUFdYbLO07E=.1f67e112-4a09-4a87-89d0-60a0d9eb23b0@github.com>

On Thu, 8 Oct 2020 22:49:54 GMT, Coleen Phillimore wrote:

>> JDK-8253717 missed 2 occurrences of JavaThread::stack_guard_zone_size() which was moved to class StackOverflow.
>>
>> I just verified s390 build. Can anybody check linux 32 bit?
>
> Changes requested by coleenp (Reviewer).

Also could you change this too?

`diff --git a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp
index 85560cc..cf20034 100644
--- a/src/hotspot/os/linux/os_linux.cpp
+++ b/src/hotspot/os/linux/os_linux.cpp
@@ -1935,7 +1935,7 @@ void * os::Linux::dll_load_in_vmthread(const char *filename, char *ebuf,
 StackOverflow* overflow_state = jt->stack_overflow_state();
 if (!overflow_state->stack_guard_zone_unused() && // Stack not yet fully initialized
 overflow_state->stack_guards_enabled()) { // No pending stack overflow exceptions
- if (!os::guard_memory((char *)jt->stack_end(), overflow_state->stack_guard_zone_size())) {
+ if (!os::guard_memory((char *)jt->stack_end(), StackOverflow::stack_guard_zone_size())) {
 warning("Attempt to reguard stack yellow zone failed.");
 }
 }`

-------------

PR: https://git.openjdk.java.net/jdk/pull/568

From jesper.wilhelmsson at oracle.com Thu Oct 8 23:16:11 2020
From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson)
Date: Fri, 9 Oct 2020 01:16:11 +0200
Subject: Re: CFV: New HotSpot Group Member: Erik Österlund
In-Reply-To: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com>
References: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com>
Message-ID:

Vote: Yes
/Jesper

> On 8 Oct 2020, at 11:24, Kim Barrett wrote:
>
> I
hereby nominate Erik Österlund to Membership in the HotSpot Group.
>
> Erik has been a JDK Reviewer and member of the Oracle GC team for several
> years, currently working on ZGC, though his reach and influence extend
> significantly beyond that project. He has made many substantial
> contributions [1] including (most recently) JEP 376: ZGC: Concurrent
> Thread-Stack Processing.
>
> Votes are due by Friday, 23-Oct-2020 at 12h00 UTC.
>
> Only current Members of the HotSpot Group [2] are eligible
> to vote on this nomination. Votes must be cast in the open by
> replying to this mailing list.
>
> For Lazy Consensus voting instructions, see [3].
>
> Kim Barrett
>
> [1] https://github.com/search?q=author-name%3A%22Erik+%C3%96sterlund%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits
> [2] https://openjdk.java.net/census
> [3] https://openjdk.java.net/groups/#member-vote
>

From psandoz at openjdk.java.net Thu Oct 8 23:20:22 2020
From: psandoz at openjdk.java.net (Paul Sandoz)
Date: Thu, 8 Oct 2020 23:20:22 GMT
Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v3]
In-Reply-To:
References:
Message-ID:

On Thu, 8 Oct 2020 13:59:20 GMT, Maurizio Cimadamore wrote:

>> This patch contains the changes associated with the third incubation round of the foreign memory access API incubation
>> (see JEP 393 [1]).
>> This iteration focuses on improving the usability of the API in 3 main ways:
>> * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from
>> multiple threads
>> * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee
>> that the memory will be deallocated, eventually
>> * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class
>> has been added, which defines several useful dereference routines; these are really just thin wrappers around memory
>> access var handles, but they make the barrier to entry for using this API somewhat lower.
>>
>> A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not
>> the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link
>> to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit
>> of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which
>> wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as
>> dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability
>> in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`;
>> secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can
>> use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done
>> by calling `MemoryAddress::asSegmentRestricted`). A list of the API, implementation and test changes is provided
>> below.
If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be >> happy to point at existing discussions, and/or to provide the feedback required. A big thank to Erik Osterlund, >> Vladimir Ivanov and David Holmes, without whom the work on shared memory segment would not have been possible; also I'd >> like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. Thanks Maurizio >> Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html >> Specdiff: >> >> http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html >> >> CSR: >> >> https://bugs.openjdk.java.net/browse/JDK-8254163 >> >> >> >> ### API Changes >> >> * `MemorySegment` >> * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) >> * added a no-arg factory for a native restricted segment representing entire native heap >> * rename `withOwnerThread` to `handoff` >> * add new `share` method, to create shared segments >> * add new `registerCleaner` method, to register a segment against a cleaner >> * add more helpers to create arrays from a segment e.g. `toIntArray` >> * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) >> * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) >> * `MemoryAddress` >> * drop `segment` accessor >> * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative >> to a given segment >> * `MemoryAccess` >> * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a >> carrier is one of the usual suspect (a Java primitive, minus `boolean`); the access mode can be simple (e.g. 
access >> base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte >> offset,or a high level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. >> `getByteAtOffset` vs `getByteAtIndex`). >> * `MemoryHandles` >> * drop `withOffset` combinator >> * drop `withStride` combinator >> * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which >> it is easy to derive all the other handles using plain var handle combinators. >> * `Addressable` >> * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both >> `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients >> can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. >> * `MemoryLayouts` >> * A new layout, for machine addresses, has been added to the mix. >> >> >> >> ### Implementation changes >> >> There are two main things to discuss here: support for shared segments, and the general simplification of the memory >> access var handle support. >> #### Shared segments >> >> The support for shared segments cuts in pretty deep in the VM. Support for shared segments is notoriously hard to >> achieve, at least in a way that guarantees optimal access performances. This is caused by the fact that, if a segment >> is shared, it would be possible for a thread to close it while another is accessing it. After considering several >> options (see [3]), we zeroed onto an approach which is inspired by an happy idea that Andrew Haley had (and that he >> reminded me of at this year OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world >> (e.g. 
with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently to a >> close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and >> the access itself (otherwise it would be possible for the accessing thread to stop just right before an unsafe call). >> It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. Sadly, none of >> these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, >> we noted that, if we could mark so called *scoped* method with an annotation, it would be very simply to check as to >> whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of >> stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). The question is, then, >> once we detect that a thread is accessing the very segment we're about to close, what should happen? We first >> experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it >> fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the >> machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to >> minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread >> is accessing the segment being closed. As written in the javadoc, this doesn't mean that clients should just catch and >> try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should >> be treated as such. 
In terms of gritty implementation, we needed to centralize memory access routines in a single >> place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in >> addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` >> annotation, which tells the VM that something important is going on. To achieve this, we created a new (autogenerated) >> class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, >> like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is >> tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during >> access (which is important when registering segments against cleaners). Of course, to make memory access safe, memory >> access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead >> of unsafe, so that a liveness check can be triggered (in case a scope is present). `ScopedMemoryAccess` has a >> `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed >> successfully. The implementation of `MemoryScope` (now significantly simplified from what we had before), has two >> implementations, one for confined segments and one for shared segments; the main difference between the two is what >> happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared >> segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or >> `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` >> state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. 
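The shared-scope state transitions described above (`ALIVE` to `CLOSING`, then either `CLOSED` or back to `ALIVE`) can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not the JDK implementation: the class name `SharedScopeSketch`, the method `checkValidState`, and the `BooleanSupplier` standing in for the thread-local handshake are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

final class SharedScopeSketch {
    static final int ALIVE = 0, CLOSING = 1, CLOSED = 2;

    private final AtomicInteger state = new AtomicInteger(ALIVE);

    // Hypothetical stand-in for the thread-local handshake: returns true
    // when no other thread was found accessing the segment.
    private final BooleanSupplier handshake;

    SharedScopeSketch(BooleanSupplier handshake) {
        this.handshake = handshake;
    }

    // Liveness check performed before a memory access: only ALIVE passes,
    // so accesses already fail while the scope is CLOSING.
    void checkValidState() {
        if (state.get() != ALIVE) {
            throw new IllegalStateException("Already closed");
        }
    }

    // Mirrors the described isAlive behaviour: still true while CLOSING.
    boolean isAlive() {
        return state.get() != CLOSED;
    }

    void close() {
        if (!state.compareAndSet(ALIVE, CLOSING)) {
            throw new IllegalStateException("Already closed or closing");
        }
        boolean success = handshake.getAsBoolean();
        // Either commit the close or roll back to ALIVE.
        state.set(success ? CLOSED : ALIVE);
        if (!success) {
            throw new IllegalStateException("Cannot close: segment is being accessed by another thread");
        }
    }
}
```

In this sketch a failed handshake surfaces as an exception from `close`, matching the description that a failing `close` indicates a synchronization bug in user code rather than a condition to retry.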
#### >> Memory access var handles overhaul The key realization here was that if all memory access var handles took a >> coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle >> form. This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var >> handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that >> e.g. additional offset is injected into a base memory access var handle. This also helped in simplifying the >> implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level >> access on the innards of the memory access var handle. All that code is now gone. #### Test changes Not much to see >> here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, >> since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` >> functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the >> microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared >> segment case. [1] - https://openjdk.java.net/jeps/393 [2] - https://openjdk.java.net/jeps/389 [3] - >> https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html [4] - https://openjdk.java.net/jeps/312 > > Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: > > Fix indent in GensrcScopedMemoryAccess.gmk Reviewed this when updated in [panama-foreign](https://github.com/openjdk/panama-foreign/tree/foreign-memaccess), hence the lack of substantial comments for this PR. src/java.base/share/classes/jdk/internal/misc/X-ScopedMemoryAccess.java.template line 50: > 48: * a memory region while another thread is releasing it. > 49: *

> 50: * This class provides tools to manages races when multiple threads are accessing and/or releasing the same memory

s/manages/manage

test/jdk/java/foreign/TestCleaner.java line 183:

> 181: for (int cleaner = 0 ; cleaner < cleaners.length ; cleaner++) {
> 182: for (int segmentFunction = 0 ; segmentFunction < segmentFunctions.length ; segmentFunction++) {
> 183: data[kind + kinds.length * cleaner + (cleaners.length * kinds.length * segmentFunction)] =

Using an `ArrayList` with `list.toArray(Object[][]::new)` would make this easier to read:

    List<Object[]> data = new ArrayList<>();
    for (RegisterKind kind : RegisterKind.values()) {
        for (Object cleaner : cleaners) {
            for (SegmentFunction segmentFunction : SegmentFunction.values()) {
                data.add(new Object[] {kind, cleaner, segmentFunction});
            }
        }
    }
    return data.toArray(Object[][]::new);

-------------

Marked as reviewed by psandoz (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/548

From dholmes at openjdk.java.net Fri Oct 9 06:04:26 2020
From: dholmes at openjdk.java.net (David Holmes)
Date: Fri, 9 Oct 2020 06:04:26 GMT
Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port [v6]
In-Reply-To:
References:
Message-ID:

On Thu, 8 Oct 2020 10:52:53 GMT, Aleksei Voitylov wrote:

>> continuing the review thread from here https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/068546.html
>>
>>> The downside of using JNI in these tests is that it complicates the
>>> setup a bit for those that run jtreg directly and/or just build the JDK
>>> and not the test libraries. You could reduce this burden a bit by
>>> limiting the load library/isMusl check to Linux only, meaning isMusl
>>> would not be called on other platforms.
>>>
>>> The alternative you suggest above might indeed be better.
I assume you >>> don't mean splitting the tests but rather just adding a second @test >>> description so that the vm.musl case runs the test with a system >>> property that allows the test know the expected load library path behavior. >> >> I have updated the PR to split the two tests in multiple @test s. >> >>> The updated comment in java_md.c in this looks good. A minor comment on >>> Platform.isBusybox is Files.isSymbolicLink returning true implies that >>> the link exists so no need to check for exists too. Also the >>> if-then-else style for the new class in ProcessBuilder/Basic.java is >>> inconsistent with the rest of the test so it stands out. >> >> Thank you, these changes are done in the updated PR. >> >>> Given the repo transition this weekend then I assume you'll create a PR >>> for the final review at least. Also I see JEP 386 hasn't been targeted >>> yet but I assume Boris, as owner, will propose-to-target and wait for it >>> to be targeted before it is integrated. >> >> Yes. How can this be best accomplished with the new git workflow? >> - we can continue the review process till the end and I will request the integration to happen only after the JEP is >> targeted. I guess this step is now done by typing "slash integrate" in a comment. >> - we can pause the review process now until the JEP is targeted. >> >> In the first case I'm kindly asking the Reviewers who already chimed in on that to re-confirm the review here. > > Aleksei Voitylov has updated the pull request with a new target base due to a merge or a rebase. The pull request now > contains three commits: > - Merge branch 'master' into JDK-8247589 > - JDK-8247589: Implementation of Alpine Linux/x64 Port > - JDK-8247589: Implementation of Alpine Linux/x64 Port Marked as reviewed by dholmes (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/49 From stuefe at openjdk.java.net Fri Oct 9 06:08:20 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 9 Oct 2020 06:08:20 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v3] In-Reply-To: References: <6iVRP-20baz0_46SouR-dj9SyspR5QvaL9iJMdeipDE=.92688b4e-ebd3-4681-8e63-a4aee752c407@github.com> <_XaA5cQEInPMn5Q5gj2y7AFCRprFQiYfI6BeUN49FhA=.9f17ae05-b37e-4f40-a83f-fd34aa812575@github.com> Message-ID: On Wed, 7 Oct 2020 18:42:34 GMT, Thomas Stuefe wrote: >> Sorry, I had not highlighted that was a proof-of-concept patch to show API changes. I've pushed another PoC with >> bookkeeping and no API changes at all. But I don't like the new one either. In the new patch, there is a list of >> (potentially) executable regions that is updated on commit, when the actual desired (non)exec mode become known. If we >> support mixed exec/non-exec commits in a mapping, then after non-exec commit a part of the mapping cannot be reversed >> to a potentially executable one (as we've lost MAP_JIT). Then it can produce some unexpected results under _some_ >> conditions in runtime, while API users can be unconscious about potential issues. Good API should not allow that. >>> specifying exec ... on per-mapping level it should be enough. >> >> With this, it is possible to simplify the implementation without API changes. But it will still be 1) reserve and be >> prepare for the first exec or non-exec commit 2) on commit, finish reserve and turn the mapping to the exec or >> non-exec. All this instead of taking direct parameter "this is a executable mapping" on reserve. The current "commit >> only knows about exec" is just a leak of implementation details, as before it was only required to know executable >> mode. Providing exec parameter to reserve will just bring consistency to the interface. 
Or, a separate interface for >> exec (code) mappings will serve the same and will be better, as it will simplify the general non-code reserve/commit >> interface. >>> I also see at least three separate cases where we establish a mapping and later need mapping-specific information >>> somewhere until the next interaction - be it commit/uncommit or release: [ AIX SystemV or mmap, Linux THP, macOS >>> MAP_JIT for code ] >> >> Could you explain how the choice between SysV and mmap is made on AIX? It looks like >> >> develop(uintx, Use64KPagesThreshold, 0, \ >> "4K/64K page allocation threshold.") \ >> ... >> if (os::vm_page_size() == 4*K) { >> return reserve_mmaped_memory(bytes, NULL /* requested_addr */); >> } else { >> if (bytes >= Use64KPagesThreshold) { >> return reserve_shmated_memory(bytes, NULL /* requested_addr */); >> } else { >> return reserve_mmaped_memory(bytes, NULL /* requested_addr */); >> } >> } >> (there only two calls to reserve_shmated_memory and both of them are like above. Is SysV SHM used in product builds?) >> For now, the AIX case looks a bit different. The choice is made by the platform and the shared code cannot control >> this. So yes, I cannot see how to avoid handle_t or similar. In contrast, THP and MAP_JIT are the way to implement a >> request from the shared code. Even for THP, shared code seems to know why it should "realign" (not sure why commit has >> an alignment_hint parameter, while it is possible to realign after a regular commit). I assume there is enough context >> in the shared code that can be provided for platform functions, without a handle_t. And the same context should anyway >> be provided to reserve function, so handle_t can be filled with all necessary information. > >> >> Could you explain how the choice between SysV and mmap is made on AIX? It looks like >> >> ``` >> develop(uintx, Use64KPagesThreshold, 0, \ >> "4K/64K page allocation threshold.") \ >> ... 
>> if (os::vm_page_size() == 4*K) {
>>   return reserve_mmaped_memory(bytes, NULL /* requested_addr */);
>> } else {
>>   if (bytes >= Use64KPagesThreshold) {
>>     return reserve_shmated_memory(bytes, NULL /* requested_addr */);
>>   } else {
>>     return reserve_mmaped_memory(bytes, NULL /* requested_addr */);
>>   }
>> }
>> ```
>>
>> (there are only two calls to reserve_shmated_memory and both of them are like above. Is SysV SHM used in product builds?)
>> For now, the AIX case looks a bit different. The choice is made by the platform and the shared code cannot control
>> this. So yes, I cannot see how to avoid handle_t or similar.
>
> On AIX we have 4K and 64K pages (actually more, but those are the interesting ones). 64K pages are desirable for larger areas
> like heap. 64K pages can only be allocated with SystemV shared memory. mmap'ed memory is always 4K paged. But SystemV
> shared mem has a number of disadvantages, like inability to protect the memory, and a large attach alignment (256M). So
> it is cumbersome. os::vm_page_size() on AIX is a fake. The hotspot code assumes that the underlying Operating System
> has some sort of "base page size" (usually what is returned by sysconf(_SC_PAGESIZE)), and then optionally some sort of
> huge page size which follows different rules (e.g. pinned). On AIX things are more fluid. When investigating 64K page
> support on AIX I decided eventually to fool hotspot into thinking that the base page size is 64K. Long story, this was
> way before the OpenJDK existed and this was a proprietary code base with no possibility of changing things upstream.
> Therefore os::vm_page_size returns 64K ("64K fake mode"). This can be disabled. So the above code fragment uses mmap'ed
> memory if 64K fake mode is disabled, and if it is enabled, it uses mmap for smaller regions and shmget for larger ones.
>>
Even for THP, shared code seems >> to know why it should "realign" (not sure why commit has an alignment_hint parameter, while it is possible to realign >> after a regular commit). I assume there is enough context in the shared code that can be provided for platform >> functions, without a handle_t. And the same context should anyway be provided to reserve function, so handle_t can be >> filled with all necessary information. > > I believe the alignment hint and the TPH code had their roots in Solaris code. So its current form (I guess) is heavily > warped by history. A new implementation would maybe just have a "os::set_tph(start, size)" function and leave it at > that. And yes, I do not think it is necessary for os::commit to do this. In fact, Linux could probably set TPH > unconditionally always when UseTransparentHugePages is active. That would alleviate the need for the alignment_hint > parameter and the realign function. I opened https://bugs.openjdk.java.net/browse/JDK-8253890 to follow up on this. (more comments) > Sorry, I had not highlighted that was a proof-of-concept patch to show API changes. I've pushed another PoC with > bookkeeping and no API changes at all. But I don't like the new one either. Interesting idea, but IMHO too heavvy weight for a platform only change. Also GrowableArray maybe not the best choice here since e.g. it requires you to search twice on add. A better solution may be a specialized BST. If there are other uses for such a solution (managing memory regions, melting them together, splitting them maybe on remove) this would be worth a generic class. I believe NMT does something similar when managing virtual memory regions, see VirtualMemoryTracker and friends. > In the new patch, there is a list of (potentially) executable regions that is updated on commit, when the actual > desired (non)exec mode become known. 
If we support mixed exec/non-exec commits in a mapping, then after non-exec commit > a part of the mapping cannot be reversed to a potentially executable one (as we've lost MAP_JIT). So once you cleared MAP_JIT from a region you cannot re-apply it? Then this is another reason we should not support setting and clearing exec on commit but only on a per-mapping base. > Then it can produce some unexpected results under _some_ conditions in runtime, while API users can be unconscious > about potential issues. Good API should not allow that. ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From stuefe at openjdk.java.net Fri Oct 9 06:19:22 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 9 Oct 2020 06:19:22 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v3] In-Reply-To: References: <6iVRP-20baz0_46SouR-dj9SyspR5QvaL9iJMdeipDE=.92688b4e-ebd3-4681-8e63-a4aee752c407@github.com> <_XaA5cQEInPMn5Q5gj2y7AFCRprFQiYfI6BeUN49FhA=.9f17ae05-b37e-4f40-a83f-fd34aa812575@github.com> Message-ID: On Fri, 9 Oct 2020 06:05:56 GMT, Thomas Stuefe wrote: >>> >>> Could you explain how the choice between SysV and mmap is made on AIX? It looks like >>> >>> ``` >>> develop(uintx, Use64KPagesThreshold, 0, \ >>> "4K/64K page allocation threshold.") \ >>> ... >>> if (os::vm_page_size() == 4*K) { >>> return reserve_mmaped_memory(bytes, NULL /* requested_addr */); >>> } else { >>> if (bytes >= Use64KPagesThreshold) { >>> return reserve_shmated_memory(bytes, NULL /* requested_addr */); >>> } else { >>> return reserve_mmaped_memory(bytes, NULL /* requested_addr */); >>> } >>> } >>> ``` >>> >>> (there only two calls to reserve_shmated_memory and both of them are like above. Is SysV SHM used in product builds?) >>> For now, the AIX case looks a bit different. The choice is made by the platform and the shared code cannot control >>> this. So yes, I cannot see how to avoid handle_t or similar. 
>> >> On AIX we have 4K and 64K pages (actually more but those are interesting). 64K pages are desireable for larger areas >> like heap. 64K pages can only be allocated with SystemV shared memory. mmap'ed memory is always 4K paged. But SystemV >> shared mem has a number of disadvantages, like inability to protect the memory, and a large attach alignment (256M). So >> it is cumbersome. os::vm_page_size() on AIX is a fake. The hotspot code assumes that the underlying Operating System >> has some sort of "base page size" (usually what is returned by sysconf(_SC_PAGESIZE)), and then optionally some sort of >> huge page size which follows different rules (e.g. pinned). On Aix things are more fluid. When investigating 64K page >> support on AIX I decided eventually to fool hotspot into thinking that the base page size is 64k. Long story, this was >> way before the OpenJDK existed and this was a propietary code base with no possibilty of changing things upstream. >> Therefore os::vm_page_size returns 64K ("64K fake mode"). This can be disabled. So above code fragment uses mmaped >> memory if 64K fake mode is disabled, and if it is enabeld, it uses mmap for smaller regions and shmget for larger ones. >>> >>> In contrast, THP and MAP_JIT are the way to implement a request from the shared code. Even for THP, shared code seems >>> to know why it should "realign" (not sure why commit has an alignment_hint parameter, while it is possible to realign >>> after a regular commit). I assume there is enough context in the shared code that can be provided for platform >>> functions, without a handle_t. And the same context should anyway be provided to reserve function, so handle_t can be >>> filled with all necessary information. >> >> I believe the alignment hint and the TPH code had their roots in Solaris code. So its current form (I guess) is heavily >> warped by history. A new implementation would maybe just have a "os::set_tph(start, size)" function and leave it at >> that. 
And yes, I do not think it is necessary for os::commit to do this. In fact, Linux could probably set TPH >> unconditionally always when UseTransparentHugePages is active. That would alleviate the need for the alignment_hint >> parameter and the realign function. I opened https://bugs.openjdk.java.net/browse/JDK-8253890 to follow up on this. > > (more comments) > >> Sorry, I had not highlighted that was a proof-of-concept patch to show API changes. I've pushed another PoC with >> bookkeeping and no API changes at all. But I don't like the new one either. > > Interesting idea, but IMHO too heavvy weight for a platform only change. Also GrowableArray maybe not the best choice > here since e.g. it requires you to search twice on add. A better solution may be a specialized BST. If there are other > uses for such a solution (managing memory regions, melting them together, splitting them maybe on remove) this would be > worth a generic class. I believe NMT does something similar when managing virtual memory regions, see > VirtualMemoryTracker and friends. >> In the new patch, there is a list of (potentially) executable regions that is updated on commit, when the actual >> desired (non)exec mode become known. If we support mixed exec/non-exec commits in a mapping, then after non-exec commit >> a part of the mapping cannot be reversed to a potentially executable one (as we've lost MAP_JIT). > > So once you cleared MAP_JIT from a region you cannot re-apply it? Then this is another reason we should not support > setting and clearing exec on commit but only on a per-mapping base. >> Then it can produce some unexpected results under _some_ conditions in runtime, while API users can be unconscious >> about potential issues. Good API should not allow that. Interestingly, if you look at https://github.com/openjdk/jdk/pull/49 (The new Alpine Linux port) it introduces a function called check_pax(). 
PaX seems to be a switchable restriction of the Linux kernel which disallows setting a memory section executable after it has been established. That is another reason to make "exec" a property of the mapping itself, establish it at creation time, and then never touch it again. ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From shade at openjdk.java.net Fri Oct 9 06:51:20 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 9 Oct 2020 06:51:20 GMT Subject: RFR: 8254175: Build no-pch configuration in debug mode for submit checks [v2] In-Reply-To: References: Message-ID: On Thu, 8 Oct 2020 19:10:13 GMT, Erik Joelsson wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Drop "debug" from the names > > Marked as reviewed by erikj (Reviewer). This affects testing for everyone, so I would like someone from hotspot to chime in. ------------- PR: https://git.openjdk.java.net/jdk/pull/547 From lucy at openjdk.java.net Fri Oct 9 07:41:25 2020 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Fri, 9 Oct 2020 07:41:25 GMT Subject: RFR: 8254265: s390 and linux 32 bit builds broken In-Reply-To: References: Message-ID: On Thu, 8 Oct 2020 19:29:49 GMT, Martin Doerr wrote: > JDK-8253717 missed 2 occurrences of JavaThread::stack_guard_zone_size() which was moved to class StackOverflow. > > I just verified s390 build. Can anybody check linux 32 bit? Change looks good, but is complete only when you adapt the two lines mentioned by coleenp as well. src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp: limit += JavaThread::stack_red_zone_size() + JavaThread::stack_yellow_zone_size(); ------------- Changes requested by lucy (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/568 From eosterlund at openjdk.java.net Fri Oct 9 07:56:39 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Fri, 9 Oct 2020 07:56:39 GMT Subject: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v13] In-Reply-To: References: Message-ID: > This PR contains the implementation of "JEP 376: ZGC: Concurrent Thread-Stack Processing" (cf. > https://openjdk.java.net/jeps/376). > Basically, this patch modifies the epilog safepoint when returning from a frame (supporting interpreter frames, c1, c2, > and native wrapper frames), to compare the stack pointer against a thread-local value. This turns return polls into > more of a Swiss Army knife that can be used to poll for safepoints and handshakes, but also for returns into frames that are not yet safe to > expose, denoted by a "stack watermark". ZGC will leave frames (and other thread oops) in a state of a mess in > the GC checkpoint safepoints, rather than processing all threads and their stacks. Processing is initialized > automagically when threads wake up for a safepoint, or get poked by a handshake or safepoint. Said initialization > processes a few (3) frames and other thread oops. The rest, the bulk of the frame processing, is deferred until it is > actually needed. It is needed when a frame is exposed to either 1) execution (returns or unwinding due to exception > handling), or 2) stack walker APIs. A hook is then run to go and finish the lazy processing of frames. Mutator and GC > threads can compete for processing. The processing is therefore performed under a per-thread lock. Note that disarming > of the poll word (that the returns are comparing against) is only performed by the thread itself. So sliding the > watermark up will require one runtime call for a thread to note that nothing needs to be done, and then update the poll > word accordingly.
Downgrading the poll word concurrently by other threads was simply not worth the complexity it > brought (and is only possible on TSO machines). So I left that one out. Erik Österlund has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 19 commits: - Merge branch 'master' into 8253180_conc_stack_scanning - Review: Andrew CR 1 - Review: David CR 1 - Review: Deal with new assert from mainline - Merge branch 'master' into 8253180_conc_stack_scanning - Review: StackWalker hook - Review: Kim CR 1 and exception handling fix - Review: Move barrier detach - Review: Remove assert that has outstayed its welcome - Merge branch 'master' into 8253180_conc_stack_scanning - ... and 9 more: https://git.openjdk.java.net/jdk/compare/a2f65190...66070372 ------------- Changes: https://git.openjdk.java.net/jdk/pull/296/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=296&range=12 Stats: 2740 lines in 131 files changed: 2167 ins; 311 del; 262 mod Patch: https://git.openjdk.java.net/jdk/pull/296.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/296/head:pull/296 PR: https://git.openjdk.java.net/jdk/pull/296 From eosterlund at openjdk.java.net Fri Oct 9 08:43:25 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Fri, 9 Oct 2020 08:43:25 GMT Subject: Integrated: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 11:43:41 GMT, Erik Österlund wrote: > This PR contains the implementation of "JEP 376: ZGC: Concurrent Thread-Stack Processing" (cf. > https://openjdk.java.net/jeps/376). > Basically, this patch modifies the epilog safepoint when returning from a frame (supporting interpreter frames, c1, c2, > and native wrapper frames), to compare the stack pointer against a thread-local value.
This turns return polls into > more of a swiss army knife that can be used to poll for safepoints, handshakes, but also returns into not yet safe to > expose frames, denoted by a "stack watermark". ZGC will leave frames (and other thread oops) in a state of a mess in > the GC checkpoint safepoints, rather than processing all threads and their stacks. Processing is initialized > automagically when threads wake up for a safepoint, or get poked by a handshake or safepoint. Said initialization > processes a few (3) frames and other thread oops. The rest - the bulk of the frame processing, is deferred until it is > actually needed. It is needed when a frame is exposed to either 1) execution (returns or unwinding due to exception > handling), or 2) stack walker APIs. A hook is then run to go and finish the lazy processing of frames. Mutator and GC > threads can compete for processing. The processing is therefore performed under a per-thread lock. Note that disarming > of the poll word (that the returns are comparing against) is only performed by the thread itself. So sliding the > watermark up will require one runtime call for a thread to note that nothing needs to be done, and then update the poll > word accordingly. Downgrading the poll word concurrently by other threads was simply not worth the complexity it > brought (and is only possible on TSO machines). So left that one out. This pull request has now been integrated. 
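The return poll described in the JEP 376 message can be pictured with a toy model. This is a sketch under stated assumptions, not the HotSpot implementation: it assumes stacks grow toward lower addresses, and the poll-word encodings (`kDisarmed`, `kArmed`) and the direction of the comparison are illustrative choices; the message itself only says that the stack pointer is compared against a thread-local value. `ThreadModel` and `return_poll_takes_slow_path` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: assumes stacks grow toward lower addresses, so a caller's
// frame sits at a higher address than its callee's.
struct ThreadModel {
  std::uintptr_t poll_word;  // thread-local value compared on every return
};

// Hypothetical encodings of the poll word.
constexpr std::uintptr_t kDisarmed = UINTPTR_MAX;  // no stack pointer compares above this
constexpr std::uintptr_t kArmed    = 0;            // every non-zero stack pointer compares above this

// Returns true when the return has to divert into the runtime slow path:
// either a safepoint/handshake is pending (armed), or the frame being
// returned into lies above the watermark and has not been processed yet.
inline bool return_poll_takes_slow_path(const ThreadModel& t,
                                        std::uintptr_t sp_after_return) {
  return sp_after_return > t.poll_word;
}
```

In this model a single comparison covers all three uses named in the message: a disarmed word never triggers, an armed word always triggers, and a watermark value triggers only for returns into frames above it.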
Changeset: b9873e18 Author: Erik Österlund URL: https://git.openjdk.java.net/jdk/commit/b9873e18 Stats: 2740 lines in 131 files changed: 2167 ins; 311 del; 262 mod 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing Reviewed-by: stefank, pliden, rehn, neliasso, coleenp, smonteith ------------- PR: https://git.openjdk.java.net/jdk/pull/296 From mdoerr at openjdk.java.net Fri Oct 9 09:25:29 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Fri, 9 Oct 2020 09:25:29 GMT Subject: RFR: 8254265: s390 and linux 32 bit builds broken [v2] In-Reply-To: References: Message-ID: <9KCIqsJHTmlilquLMCz6CktRshbmxETugEUYBWFEcH4=.a3cee42f-fbb7-4976-9954-5ac54a7737e7@github.com> > JDK-8253717 missed occurrences of JavaThread::stack_guard_zone_size() which was moved to class StackOverflow. > > I just verified s390 build. Can anybody check linux 32 bit? Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: more 32 bit fixes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/568/files - new: https://git.openjdk.java.net/jdk/pull/568/files/6ec8881e..6ebd826e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=568&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=568&range=00-01 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/568.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/568/head:pull/568 PR: https://git.openjdk.java.net/jdk/pull/568 From rkennke at redhat.com Fri Oct 9 09:26:54 2020 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 09 Oct 2020 11:26:54 +0200 Subject: CFV: New HotSpot Group Member: Erik Österlund In-Reply-To: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> References: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> Message-ID: Vote: yes (why was Erik not OpenJDK member before?!) > I hereby nominate Erik Österlund to Membership in the HotSpot Group.
> > Erik has been a JDK Reviewer and member of the Oracle GC team for > several > years, currently working on ZGC, though his reach and influence > extends > significantly beyond that project. He has made many substantial > contributions [1] including (most recently) JEP 376: ZGC: Concurrent > Thread-Stack Processing. > > Votes are due by Friday, 23-Oct-2020 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. > > Kim Barrett > > [1] > https://github.com/search?q=author-name%3A%22Erik+%C3%96sterlund%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From dholmes at openjdk.java.net Fri Oct 9 09:41:18 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 9 Oct 2020 09:41:18 GMT Subject: RFR: 8254175: Build no-pch configuration in debug mode for submit checks [v2] In-Reply-To: References: Message-ID: On Thu, 8 Oct 2020 07:21:56 GMT, Aleksey Shipilev wrote: >> no-pch configuration is supposed to expose missing include dependencies. But currently it runs with default (release) >> bits, which misses symbols hidden in debug code. We should consider building it in debug mode. >> Attention @rwestberg. >> >> Testing: >> - [x] GH workflow still works, see the builds in the [latest run](https://github.com/shipilev/jdk/actions/runs/293802758) > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Drop "debug" from the names Seems reasonable. ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/547 From shade at openjdk.java.net Fri Oct 9 09:45:20 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 9 Oct 2020 09:45:20 GMT Subject: Integrated: 8254175: Build no-pch configuration in debug mode for submit checks In-Reply-To: References: Message-ID: On Wed, 7 Oct 2020 17:07:07 GMT, Aleksey Shipilev wrote: > no-pch configuration is supposed to expose missing include dependencies. But currently it runs with default (release) > bits, which misses symbols hidden in debug code. We should consider building it in debug mode. > Attention @rwestberg. > > Testing: > - [x] GH workflow still works, see the builds in the [latest run](https://github.com/shipilev/jdk/actions/runs/293802758) This pull request has now been integrated. Changeset: 02307811 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/02307811 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8254175: Build no-pch configuration in debug mode for submit checks Reviewed-by: rwestberg, erikj, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/547 From aph at redhat.com Fri Oct 9 09:52:30 2020 From: aph at redhat.com (Andrew Haley) Date: Fri, 9 Oct 2020 10:52:30 +0100 Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build In-Reply-To: References: Message-ID: <9460e31d-94da-1d0b-a33c-5aeda206d154@redhat.com> On 08/10/2020 21:28, Bernhard Urban-Forster wrote: > On Tue, 6 Oct 2020 18:09:05 GMT, Bernhard Urban-Forster wrote: > >> I organized this PR so that each commit contains the warning emitted by MSVC as commit message and its relevant fix. >> >> Verified on >> * Linux+ARM64: `{hotspot,jdk,langtools}:tier1`, no failures. >> * Windows+ARM64: `{hotspot,jdk,langtools}:tier1`, no (new) failures. >> * internal macOS+ARM64 port: build without `--disable-warnings-as-errors` still works. 
Just mentioning this here, because >> it's yet another toolchain (Xcode / clang) that needs to be kept happy [going >> forward](https://openjdk.java.net/jeps/391). > > Thank you Andrew for your comments! > >> _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ >> IMO this warning: >> >> warning C4146: unary minus operator applied to unsigned type, result still unsigned >> >> should not be used. > > Okay, added to the Makefile and reverted those changes. > >> // Generate stack overflow check >> if (UseStackBanging) { >> - __ bang_stack_with_offset(JavaThread::stack_shadow_zone_size()); >> + __ bang_stack_with_offset((int)JavaThread::stack_shadow_zone_size()); >> } else { >> Unimplemented(); >> >> Could this one be fixed by changing stack_shadow_zone_size() or >> bang_stack_with_offset() ? I would have thought that whatever type >> stack_shadow_zone_size() returns should be compatible with >> bang_stack_with_offset(). > > The x86_64 backend and others do the same: > https://github.com/openjdk/jdk/blob/5351ba6cfa8078f503f1cf0c375b692905c607ff/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp#L2176-L2178 > > So should we (1) do the same, (2) diverge or (3) fix all of them? I hate changing code just to silence compiler warnings. Occasionally, these warnings find real bugs, but there have been several important programs broken by silencing compiler warnings. (http://taint.org/2008/05/13/153959a.html is the most famous.) The problem with "fixing all of them" is that it's real work, because inevitably some of the instructions in the various back ends will take int arguments, so there will be several things to fix. Whenever making changes to "shut the compiler up", as is the case here, we have to consider what the real problem is rather than just throwing in casts. In this case, we "know" that stack_shadow_zone_size() will fit into an int, so there is not a problem. 
But stack_shadow_zone_size() returns a size_t, and all of the logic used to calculate it is very careful to maintain this. There's a check in bang_stack_with_offset() to make sure offset is positive, which is rather pointless. Maybe the right thing to do is change our bang_stack_with_offset() to take a size_t and fix (or remove) the sanity check. Bear in mind that if you keep a sanity check, pages can be up to a megabyte in size, so you have to consider what the assertion is for. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From mdoerr at openjdk.java.net Fri Oct 9 09:55:22 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Fri, 9 Oct 2020 09:55:22 GMT Subject: RFR: 8254265: s390 and linux 32 bit builds broken [v2] In-Reply-To: References: Message-ID: On Fri, 9 Oct 2020 07:39:02 GMT, Lutz Schmidt wrote: >> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: >> >> more 32 bit fixes > > Change looks good, but is complete only when you adapt the two lines mentioned by coleenp as well. > src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp: > limit += JavaThread::stack_red_zone_size() + > JavaThread::stack_yellow_zone_size(); Thanks for reviewing and checking linux 32 bit. x86_32 is not among the supported dev kit targets (https://openjdk.java.net/groups/build/doc/building.html) and I don't have a machine which is configured for that build, but maybe devkit does work. 
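Andrew Haley's suggestion above, to let `bang_stack_with_offset()` accept a `size_t` instead of casting at each call site, might look roughly like the following. This is a minimal sketch, not the AArch64 or x86 code: `bang_displacement` is a hypothetical free function that only computes the displacement, and `kMaxBangOffset` is an assumed limit chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed upper bound for an encodable displacement; a real back end would
// use whatever its addressing modes actually allow.
constexpr std::size_t kMaxBangOffset = INT32_MAX;

// Hypothetical size_t-taking variant: returns the (negative) displacement
// from the stack pointer that a stack bang would touch.
inline std::int64_t bang_displacement(std::size_t offset) {
  // A "must be positive" check says little about an unsigned argument, so
  // this sketch checks the range instead.
  assert(offset <= kMaxBangOffset && "offset must fit the assumed displacement range");
  return -static_cast<std::int64_t>(offset);
}
```

A caller could then pass a `size_t` value such as the result of `stack_shadow_zone_size()` directly, without the `(int)` cast shown in the quoted diff.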
------------- PR: https://git.openjdk.java.net/jdk/pull/568 From mcimadamore at openjdk.java.net Fri Oct 9 10:39:56 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Fri, 9 Oct 2020 10:39:56 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v4] In-Reply-To: References: Message-ID: <4h3RYe5xrN6HpjRM1kkVgfo9yDiPoxoS312x9nYDpwA=.3260dd59-63e3-4b55-8a86-d512b1a4ee24@github.com> > This patch contains the changes associated with the third incubation round of the foreign memory access API > (see JEP 393 [1]). This iteration focuses on improving the usability of the API in 3 main ways: > * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from > multiple threads > * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee > that the memory will be deallocated, eventually > * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class > has been added, which defines several useful dereference routines; these are really just thin wrappers around memory > access var handles, but they make the barrier to entry for using this API somewhat lower. > > A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not > the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link > to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit > of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which > wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as > dereferencing memory is concerned. You cannot dereference memory if you don't have a segment.
This improves usability > in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; > secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can > use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done > by calling `MemoryAddress::asSegmentRestricted`). A list of the API, implementation and test changes is provided > below. If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be > happy to point at existing discussions, and/or to provide the feedback required. A big thank to Erik Osterlund, > Vladimir Ivanov and David Holmes, without whom the work on shared memory segment would not have been possible; also I'd > like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. Thanks Maurizio > Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html > Specdiff: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254163 > > > > ### API Changes > > * `MemorySegment` > * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) > * added a no-arg factory for a native restricted segment representing entire native heap > * rename `withOwnerThread` to `handoff` > * add new `share` method, to create shared segments > * add new `registerCleaner` method, to register a segment against a cleaner > * add more helpers to create arrays from a segment e.g. 
`toIntArray` > * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) > * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) > * `MemoryAddress` > * drop `segment` accessor > * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative > to a given segment > * `MemoryAccess` > * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a > carrier is one of the usual suspect (a Java primitive, minus `boolean`); the access mode can be simple (e.g. access > base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte > offset,or a high level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. > `getByteAtOffset` vs `getByteAtIndex`). > * `MemoryHandles` > * drop `withOffset` combinator > * drop `withStride` combinator > * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which > it is easy to derive all the other handles using plain var handle combinators. > * `Addressable` > * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both > `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients > can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. > * `MemoryLayouts` > * A new layout, for machine addresses, has been added to the mix. > > > > ### Implementation changes > > There are two main things to discuss here: support for shared segments, and the general simplification of the memory > access var handle support. > #### Shared segments > > The support for shared segments cuts in pretty deep in the VM. 
Support for shared segments is notoriously hard to > achieve, at least in a way that guarantees optimal access performance. This is caused by the fact that, if a segment > is shared, it would be possible for a thread to close it while another is accessing it. After considering several > options (see [3]), we zeroed in on an approach which is inspired by a happy idea that Andrew Haley had (and that he > reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world > (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently to a > close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and > the access itself (otherwise it would be possible for the accessing thread to stop right before an unsafe call). > It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. Sadly, none of > these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, > we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check > whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of > stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). The question is, then, > once we detect that a thread is accessing the very segment we're about to close, what should happen? We first > experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it > fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the > machinery for async exceptions is a bit fragile (e.g.
not all the code in hotspot checks for async exceptions); to > minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread > is accessing the segment being closed. As written in the javadoc, this doesn't mean that clients should just catch and > try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should > be treated as such. In terms of gritty implementation, we needed to centralize memory access routines in a single > place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in > addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` > annotation, which tells the VM that something important is going on. To achieve this, we created a new (autogenerated) > class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, > like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is > tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during > access (which is important when registering segments against cleaners). Of course, to make memory access safe, memory > access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead > of unsafe, so that a liveness check can be triggered (in case a scope is present). `ScopedMemoryAccess` has a > `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed > successfully. 
The implementation of `MemoryScope` (now significantly simplified from what we had before), has two > implementations, one for confined segments and one for shared segments; the main difference between the two is what > happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared > segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or > `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` > state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. #### > Memory access var handles overhaul The key realization here was that if all memory access var handles took a > coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle > form. This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var > handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that > e.g. additional offset is injected into a base memory access var handle. This also helped in simplifying the > implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level > access on the innards of the memory access var handle. All that code is now gone. #### Test changes Not much to see > here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, > since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` > functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the > microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared > segment case. 
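The shared-scope life cycle described above (`ALIVE`, then `CLOSING` while the handshake runs, then either `CLOSED` or back to `ALIVE`) can be modeled in a few lines. This is a language-neutral sketch written in C++ to match the HotSpot snippets elsewhere in this digest; it is not the Java `MemoryScope` code. `SharedScopeModel` and its method names are hypothetical, the handshake is a caller-supplied stand-in for `ScopedMemoryAccess::closeScope`, and a failed close is reported as `false` here where the real API is described as failing with an exception.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

enum class State { Alive, Closing, Closed };

// Hypothetical model of a shared scope; names do not come from the JDK.
class SharedScopeModel {
 public:
  // handshake: returns true when no thread was caught inside a scoped
  // access to this scope, i.e. closing is safe.
  bool close(const std::function<bool()>& handshake) {
    State expected = State::Alive;
    if (!state_.compare_exchange_strong(expected, State::Closing)) {
      return false;  // someone else is already closing, or it is closed
    }
    const bool ok = handshake();
    // Success moves on to Closed; failure rolls back to Alive.
    state_.store(ok ? State::Closed : State::Alive);
    return ok;
  }

  // Mirrors the described behavior: still reported alive while Closing.
  bool is_alive() const { return state_.load() != State::Closed; }

  // The liveness check performed on memory access already fails in Closing.
  bool check_valid_for_access() const { return state_.load() == State::Alive; }

 private:
  std::atomic<State> state_{State::Alive};
};
```

The asymmetry between `is_alive()` and `check_valid_for_access()` during `Closing` is the point of the intermediate state: new accesses are rejected while the handshake decides whether the close can complete.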
[1] - https://openjdk.java.net/jeps/393 [2] - https://openjdk.java.net/jeps/389 [3] - > https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html [4] - https://openjdk.java.net/jeps/312 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Address review comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/548/files - new: https://git.openjdk.java.net/jdk/pull/548/files/b941c4a2..d96c32ac Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=02-03 Stats: 16 lines in 3 files changed: 2 ins; 6 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/548.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548 PR: https://git.openjdk.java.net/jdk/pull/548 From coleenp at openjdk.java.net Fri Oct 9 11:14:22 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 9 Oct 2020 11:14:22 GMT Subject: RFR: 8254265: s390 and linux 32 bit builds broken [v2] In-Reply-To: <9KCIqsJHTmlilquLMCz6CktRshbmxETugEUYBWFEcH4=.a3cee42f-fbb7-4976-9954-5ac54a7737e7@github.com> References: <9KCIqsJHTmlilquLMCz6CktRshbmxETugEUYBWFEcH4=.a3cee42f-fbb7-4976-9954-5ac54a7737e7@github.com> Message-ID: On Fri, 9 Oct 2020 09:25:29 GMT, Martin Doerr wrote: >> JDK-8253717 missed ocurrances of JavaThread::stack_guard_zone_size() which was moved to class StackOverflow. >> >> I just verified s390 build. Can anybody check linux 32 bit? > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > more 32 bit fixes Thank you for fixing this! ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/568 From coleenp at openjdk.java.net Fri Oct 9 11:14:22 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 9 Oct 2020 11:14:22 GMT Subject: RFR: 8254265: s390 and linux 32 bit builds broken [v2] In-Reply-To: References: <9KCIqsJHTmlilquLMCz6CktRshbmxETugEUYBWFEcH4=.a3cee42f-fbb7-4976-9954-5ac54a7737e7@github.com> Message-ID: <1LoBj-prKaKOaO5XTGHIMC0d5qCR1yJfbCLxzM2W4XE=.ef5361f9-7f75-4a85-8dff-1b2111dad579@github.com> On Fri, 9 Oct 2020 11:08:50 GMT, Coleen Phillimore wrote: >> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: >> >> more 32 bit fixes > > Thank you for fixing this! Actually the linux-x86 build is broken because of something in the libdwp native code right now but this code compiles. https://bugs.openjdk.java.net/browse/JDK-8254270 ------------- PR: https://git.openjdk.java.net/jdk/pull/568 From coleenp at openjdk.java.net Fri Oct 9 11:31:27 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 9 Oct 2020 11:31:27 GMT Subject: RFR: 8233214: Remove runtime code not needed with CMS removed Message-ID: This change removes CMS code left over for ClassLoaderData walking. Tested with Oracle platforms tier1 and built shenandoah with no errors. 
------------- Commit messages: - 8233214: Remove runtime code not needed with CMS removed Changes: https://git.openjdk.java.net/jdk/pull/574/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=574&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8233214 Stats: 72 lines in 6 files changed: 0 ins; 66 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/574.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/574/head:pull/574 PR: https://git.openjdk.java.net/jdk/pull/574 From mcimadamore at openjdk.java.net Fri Oct 9 11:34:56 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Fri, 9 Oct 2020 11:34:56 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v5] In-Reply-To: References: Message-ID: > This patch contains the changes associated with the third incubation round of the foreign memory access API > (see JEP 393 [1]). This iteration focuses on improving the usability of the API in 3 main ways: > * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from > multiple threads > * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee > that the memory will be deallocated, eventually > * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class > has been added, which defines several useful dereference routines; these are really just thin wrappers around memory > access var handles, but they make the barrier to entry for using this API somewhat lower.
> > A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not > the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link > to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit > of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which > wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as > dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability > in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; > secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can > use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done > by calling `MemoryAddress::asSegmentRestricted`). A list of the API, implementation and test changes is provided > below. If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be > happy to point at existing discussions, and/or to provide the feedback required. A big thanks to Erik Osterlund, > Vladimir Ivanov and David Holmes, without whom the work on shared memory segments would not have been possible; also I'd > like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey.
Thanks Maurizio > Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html > Specdiff: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254163 > > > > ### API Changes > > * `MemorySegment` > * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) > * added a no-arg factory for a native restricted segment representing the entire native heap > * rename `withOwnerThread` to `handoff` > * add new `share` method, to create shared segments > * add new `registerCleaner` method, to register a segment against a cleaner > * add more helpers to create arrays from a segment e.g. `toIntArray` > * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) > * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) > * `MemoryAddress` > * drop `segment` accessor > * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative > to a given segment > * `MemoryAccess` > * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a > carrier is one of the usual suspects (a Java primitive, minus `boolean`); the access mode can be simple (e.g. access > base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte > offset, or a high-level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. > `getByteAtOffset` vs `getByteAtIndex`).
> * `MemoryHandles` > * drop `withOffset` combinator > * drop `withStride` combinator > * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which > it is easy to derive all the other handles using plain var handle combinators. > * `Addressable` > * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both > `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients > can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. > * `MemoryLayouts` > * A new layout, for machine addresses, has been added to the mix. > > > > ### Implementation changes > > There are two main things to discuss here: support for shared segments, and the general simplification of the memory > access var handle support. > #### Shared segments > > The support for shared segments cuts in pretty deep in the VM. Support for shared segments is notoriously hard to > achieve, at least in a way that guarantees optimal access performance. This is caused by the fact that, if a segment > is shared, it would be possible for a thread to close it while another is accessing it. After considering several > options (see [3]), we zeroed in on an approach which is inspired by a happy idea that Andrew Haley had (and that he > reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world > (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently to a > close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and > the access itself (otherwise it would be possible for the accessing thread to stop just right before an unsafe call).
> It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. Sadly, none of > these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, > we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check > whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of > stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). The question is, then, > once we detect that a thread is accessing the very segment we're about to close, what should happen? We first > experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it > fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the > machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to > minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread > is accessing the segment being closed. As written in the javadoc, this doesn't mean that clients should just catch and > try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should > be treated as such. In terms of gritty implementation, we needed to centralize memory access routines in a single > place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in > addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` > annotation, which tells the VM that something important is going on. To achieve this, we created a new (autogenerated) > class, called `ScopedMemoryAccess`.
This class contains all the main memory access primitives (including bulk access, > like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is > tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during > access (which is important when registering segments against cleaners). Of course, to make memory access safe, memory > access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead > of unsafe, so that a liveness check can be triggered (in case a scope is present). `ScopedMemoryAccess` has a > `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed > successfully. The implementation of `MemoryScope` (now significantly simplified from what we had before), has two > implementations, one for confined segments and one for shared segments; the main difference between the two is what > happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared > segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or > `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` > state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. #### > Memory access var handles overhaul The key realization here was that if all memory access var handles took a > coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle > form. This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var > handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that > e.g. 
additional offset is injected into a base memory access var handle. This also helped in simplifying the > implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level > access on the innards of the memory access var handle. All that code is now gone. #### Test changes Not much to see > here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, > since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` > functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the > microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared > segment case. [1] - https://openjdk.java.net/jeps/393 [2] - https://openjdk.java.net/jeps/389 [3] - > https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html [4] - https://openjdk.java.net/jeps/312 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Fix performance issue with "small" segment mismatch ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/548/files - new: https://git.openjdk.java.net/jdk/pull/548/files/d96c32ac..9b3fc227 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=03-04 Stats: 7 lines in 1 file changed: 0 ins; 5 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/548.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548 PR: https://git.openjdk.java.net/jdk/pull/548 From mcimadamore at openjdk.java.net Fri Oct 9 11:37:19 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Fri, 9 Oct 2020 11:37:19 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v3] In-Reply-To: References: Message-ID: On 
Thu, 8 Oct 2020 23:17:33 GMT, Paul Sandoz wrote: >> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix indent in GensrcScopedMemoryAccess.gmk > > Reviewed this when updated in [panama-foreign](https://github.com/openjdk/panama-foreign/tree/foreign-memaccess), hence > the lack of substantial comments for this PR. When re-running all benchmarks, I noted an issue with the `BulkOps` microbenchmark: calling `MemorySegment::mismatch` on a small segment (< 8 bytes) was 10x slower than with ByteBuffers. After some investigation, I realized that the issue is caused by the fact that the `AbstractMemorySegmentImpl::mismatch` method contains inexact var handle calls, where the segment coordinate has type `AbstractMemorySegmentImpl` instead of the expected `MemorySegment`, so we take the slow path. A simple solution is to avoid using var handles directly here, and use the helper functions in `MemoryAccess`, which widen the type accordingly and produce an exact var handle call. With this change, performance of mismatch on small segments is on par with ByteBuffer. ------------- PR: https://git.openjdk.java.net/jdk/pull/548 From fyang at openjdk.java.net Fri Oct 9 11:53:31 2020 From: fyang at openjdk.java.net (Fei Yang) Date: Fri, 9 Oct 2020 11:53:31 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v5] In-Reply-To: References: Message-ID: > Contributed-by: ard.biesheuvel at linaro.org, dongbo4 at huawei.com > > This added an intrinsic for SHA3 using aarch64 v8.2 SHA3 Crypto Extensions. > Reference implementation for core SHA-3 transform using ARMv8.2 Crypto Extensions: > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/crypto/sha3-ce-core.S?h=v5.4.52 > > Trivial adaptation in SHA3. implCompress is needed for the purpose of adding the intrinsic. > For SHA3, we need to pass one extra parameter "digestLength" to the stub for the calculation of block size.
> "digestLength" is also used in for the EOR loop before keccak to differentiate different SHA3 variants. > > We added jtreg tests for SHA3 and used QEMU system emulator which supports SHA3 instructions to test the functionality. > Patch passed jtreg tier1-3 tests with QEMU system emulator. > Also verified with jtreg tier1-3 tests without SHA3 instructions on aarch64-linux-gnu and x86_64-linux-gnu, to make > sure that there's no regression. > We used one existing JMH test for performance test: test/micro/org/openjdk/bench/java/security/MessageDigests.java > We measured the performance benefit with an aarch64 cycle-accurate simulator. > Patch delivers 20% - 40% performance improvement depending on specific SHA3 digest length and size of the message. > > For now, this feature will not be enabled automatically for aarch64. We can auto-enable this when it is fully tested on > real hardware. But for the above testing purposes, this is auto-enabled when the corresponding hardware feature is > detected. Fei Yang has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains six commits: - Merge master - Add sha3 instructions to cpu/aarch64/aarch64-asmtest.py and regenerate the test in assembler_aarch64.cpp:asm_check - Rebase - Merge master - Fix trailing whitespace issue - 8252204: AArch64: Implement SHA3 accelerator/intrinsic Contributed-by: ard.biesheuvel at linaro.org, dongbo4 at huawei.com ------------- Changes: https://git.openjdk.java.net/jdk/pull/207/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=207&range=04 Stats: 1512 lines in 35 files changed: 1025 ins; 22 del; 465 mod Patch: https://git.openjdk.java.net/jdk/pull/207.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/207/head:pull/207 PR: https://git.openjdk.java.net/jdk/pull/207 From mdoerr at openjdk.java.net Fri Oct 9 11:54:20 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Fri, 9 Oct 2020 11:54:20 GMT Subject: Integrated: 8254265: s390 and linux 32 bit builds broken In-Reply-To: References: Message-ID: On Thu, 8 Oct 2020 19:29:49 GMT, Martin Doerr wrote: > JDK-8253717 missed occurrences of JavaThread::stack_guard_zone_size() which was moved to class StackOverflow. > > I just verified s390 build. Can anybody check linux 32 bit? This pull request has now been integrated. Changeset: 2bc8bc57 Author: Martin Doerr URL: https://git.openjdk.java.net/jdk/commit/2bc8bc57 Stats: 5 lines in 3 files changed: 0 ins; 0 del; 5 mod 8254265: s390 and linux 32 bit builds broken Reviewed-by: coleenp, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/568 From aph at openjdk.java.net Fri Oct 9 12:22:15 2020 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 9 Oct 2020 12:22:15 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v5] In-Reply-To: References: Message-ID: On Fri, 9 Oct 2020 11:53:31 GMT, Fei Yang wrote: >> Contributed-by: ard.biesheuvel at linaro.org, dongbo4 at huawei.com >> >> This added an intrinsic for SHA3 using aarch64 v8.2 SHA3 Crypto Extensions.
>> Reference implementation for core SHA-3 transform using ARMv8.2 Crypto Extensions: >> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/crypto/sha3-ce-core.S?h=v5.4.52 >> >> Trivial adaptation in SHA3. implCompress is needed for the purpose of adding the intrinsic. >> For SHA3, we need to pass one extra parameter "digestLength" to the stub for the calculation of block size. >> "digestLength" is also used in for the EOR loop before keccak to differentiate different SHA3 variants. >> >> We added jtreg tests for SHA3 and used QEMU system emulator which supports SHA3 instructions to test the functionality. >> Patch passed jtreg tier1-3 tests with QEMU system emulator. >> Also verified with jtreg tier1-3 tests without SHA3 instructions on aarch64-linux-gnu and x86_64-linux-gnu, to make >> sure that there's no regression. >> We used one existing JMH test for performance test: test/micro/org/openjdk/bench/java/security/MessageDigests.java >> We measured the performance benefit with an aarch64 cycle-accurate simulator. >> Patch delivers 20% - 40% performance improvement depending on specific SHA3 digest length and size of the message. >> >> For now, this feature will not be enabled automatically for aarch64. We can auto-enable this when it is fully tested on >> real hardware. But for the above testing purposes, this is auto-enabled when the corresponding hardware feature is >> detected. > > Fei Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains > six commits: > - Merge master > - Add sha3 instructions to cpu/aarch64/aarch64-asmtest.py and regenerate the test in assembler_aarch64.cpp:asm_check > - Rebase > - Merge master > - Fix trailing whitespace issue > - 8252204: AArch64: Implement SHA3 accelerator/intrinsic > Contributed-by: ard.biesheuvel at linaro.org, dongbo4 at huawei.com Marked as reviewed by aph (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/207 From shade at openjdk.java.net Fri Oct 9 12:37:14 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 9 Oct 2020 12:37:14 GMT Subject: RFR: 8233214: Remove runtime code not needed with CMS removed In-Reply-To: References: Message-ID: <_Vrfj3fyf4iVrZ1_73e2JqFs89NY8rmwhcB1ewpRZms=.5e8dc44a-7115-4b9f-b8eb-f5f343f0d1ae@github.com> On Fri, 9 Oct 2020 11:27:00 GMT, Coleen Phillimore wrote: > This change removes CMS code left over for ClassLoaderData walking. > Tested with Oracle platforms tier1 and built shenandoah with no errors. This looks good to me. I meant to remove it myself this week :) ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/574 From stefank at openjdk.java.net Fri Oct 9 12:57:15 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Fri, 9 Oct 2020 12:57:15 GMT Subject: RFR: 8233214: Remove runtime code not needed with CMS removed In-Reply-To: References: Message-ID: On Fri, 9 Oct 2020 11:27:00 GMT, Coleen Phillimore wrote: > This change removes CMS code left over for ClassLoaderData walking. > Tested with Oracle platforms tier1 and built shenandoah with no errors. Marked as reviewed by stefank (Reviewer). src/hotspot/share/classfile/classLoaderData.hpp line 127: > 125: // Remembered sets support for the oops in the class loader data. > 126: bool _modified_oops; // Card Table Equivalent (YC/CMS support) > 127: bool _accumulated_modified_oops; // Mod Union Equivalent (CMS support) Maybe remove the 'CMS support' comment on line 126? 
------------- PR: https://git.openjdk.java.net/jdk/pull/574 From github.com+670087+jrziviani at openjdk.java.net Fri Oct 9 13:00:12 2020 From: github.com+670087+jrziviani at openjdk.java.net (Ziviani) Date: Fri, 9 Oct 2020 13:00:12 GMT Subject: Integrated: 8253900: SA: wrong size computation when JVM was built without AOT In-Reply-To: References: Message-ID: <7y7E1QZytgQzEdRi3LU__E97QBcaOUvB-DNrJiEa_cQ=.79187b75-a950-4929-a0f9-59450ef678ca@github.com> On Fri, 25 Sep 2020 13:13:44 GMT, Ziviani wrote: > TestInstanceKlassSize was failing because, for PowerPC, the following code (instanceKlass.cpp) always compiles to > `return false;` bool InstanceKlass::has_stored_fingerprint() const { > #if INCLUDE_AOT > return should_store_fingerprint() || is_shared(); > #else > return false; > #endif > } > However, in `hasStoredFingerprint()@InstanceKlass.java` the condition `shouldStoreFingerprint() || isShared();` is > always evaluated and may return true (_AFAIK isShared() returns true_). Such condition adds 8 bytes in the > `getSize()@InstanceKlass.java` causing the failure in TestInstanceKlassSize: public long getSize() { // in number of > bytes > ... > if (hasStoredFingerprint()) { > size += 8; // uint64_t > } > return alignSize(size); > } > Considering these tests are failing for PowerPC only (_based on ProblemList.txt_), my solution checks if > `hasStoredFingerprint()` is running on a PowerPC platform. I decided to go this way because there is no existing flag > informing whether AOT is included or not and creating a new one just to handle the PowerPC case seems too much. This > patch is an attempt to fix https://bugs.openjdk.java.net/browse/JDK-8230664 This pull request has now been integrated. 
Changeset: b1448da1 Author: Jose Ricardo Ziviani Committer: Martin Doerr URL: https://git.openjdk.java.net/jdk/commit/b1448da1 Stats: 27 lines in 6 files changed: 24 ins; 2 del; 1 mod 8253900: SA: wrong size computation when JVM was built without AOT Reviewed-by: cjplummer, sspitsyn ------------- PR: https://git.openjdk.java.net/jdk/pull/358 From tschatzl at openjdk.java.net Fri Oct 9 13:10:13 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 9 Oct 2020 13:10:13 GMT Subject: RFR: 8233214: Remove runtime code not needed with CMS removed In-Reply-To: References: Message-ID: On Fri, 9 Oct 2020 11:27:00 GMT, Coleen Phillimore wrote: > This change removes CMS code left over for ClassLoaderData walking. > Tested with Oracle platforms tier1 and built shenandoah with no errors. Good sans the "YC/CMS" comment ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/574 From lucy at openjdk.java.net Fri Oct 9 13:13:12 2020 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Fri, 9 Oct 2020 13:13:12 GMT Subject: RFR: 8253740: [PPC64] Minor interpreter cleanup [v2] In-Reply-To: References: <9ENQcPBWCaewPezO3ObXyRtvAW2464wWrYmn6eBExYs=.714ecad9-fca8-40a3-9f41-72875f70e54f@github.com> Message-ID: On Wed, 30 Sep 2020 10:19:42 GMT, Martin Doerr wrote: >> Template interpreter on PPC64 can be cleaned up after JDK-8253540: >> unlock_object has a parameter check_for_exceptions which is now unused. >> >> In addition to that, restore_interpreter_state is redundant after >> call_VM(R4_ARG2, CAST_FROM_FN_PTR(address, InterpreterRuntime::member_name_arg_or_null) because it's already included >> in call_VM. >> https://bugs.openjdk.java.net/browse/JDK-8253740 > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > call_VM usages: Remove some explicit /*check_exceptions=*/true parameters which match default value. The change looks good to me. Thank you for the cleanup work. 
------------- Marked as reviewed by lucy (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/385 From mdoerr at openjdk.java.net Fri Oct 9 13:40:16 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Fri, 9 Oct 2020 13:40:16 GMT Subject: Integrated: 8253740: [PPC64] Minor interpreter cleanup In-Reply-To: <9ENQcPBWCaewPezO3ObXyRtvAW2464wWrYmn6eBExYs=.714ecad9-fca8-40a3-9f41-72875f70e54f@github.com> References: <9ENQcPBWCaewPezO3ObXyRtvAW2464wWrYmn6eBExYs=.714ecad9-fca8-40a3-9f41-72875f70e54f@github.com> Message-ID: On Mon, 28 Sep 2020 20:11:05 GMT, Martin Doerr wrote: > Template interpreter on PPC64 can be cleaned up after JDK-8253540: > unlock_object has a parameter check_for_exceptions which is now unused. > > In addition to that, restore_interpreter_state is redundant after > call_VM(R4_ARG2, CAST_FROM_FN_PTR(address, InterpreterRuntime::member_name_arg_or_null) because it's already included > in call_VM. > https://bugs.openjdk.java.net/browse/JDK-8253740 This pull request has now been integrated. 
Changeset: e9c1905b Author: Martin Doerr URL: https://git.openjdk.java.net/jdk/commit/e9c1905b Stats: 17 lines in 3 files changed: 0 ins; 8 del; 9 mod 8253740: [PPC64] Minor interpreter cleanup Reviewed-by: lucy ------------- PR: https://git.openjdk.java.net/jdk/pull/385 From rkennke at openjdk.java.net Fri Oct 9 14:10:19 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 9 Oct 2020 14:10:19 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing Message-ID: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> 8254315: Shenandoah: Concurrent weak reference processing ------------- Commit messages: - Merge branch 'master' into shenandoah-concurrent-weakrefs - Add Oracle copyright header to shenandoahReferenceProcessor.[hc]pp due to its structural origins from its ZGC couterparts - Relax assert in reference processor to account no LRB in passive mode - Aarch64 support for concurrent weak references/extended native barriers - Perform reference-processing during full-GC and degenerated-GC - Relax during-evacuation verification to account for Reference referents that have not yet been cleared - Apply LRB when draining ref-proc discovered lists - Install softref policy at init-mark pause, not at conc-mark - Use native-LRBs for Reference.get() intrinsics - Implement correct strong and final marking; Fix liveness counting - ... 
and 18 more: https://git.openjdk.java.net/jdk/compare/c9d0407e...610cd75a Changes: https://git.openjdk.java.net/jdk/pull/505/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254315 Stats: 2278 lines in 52 files changed: 1534 ins; 565 del; 179 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From pliden at openjdk.java.net Fri Oct 9 14:10:19 2020 From: pliden at openjdk.java.net (Per Liden) Date: Fri, 9 Oct 2020 14:10:19 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: <5SHfZCqDGmlAe76fFCuH_lgi3_vKh2Sp7_x3w7XLRDw=.75108d48-0443-4e7d-8e02-d165a852c55a@github.com> On Mon, 5 Oct 2020 13:42:02 GMT, Roman Kennke wrote: > 8254315: Shenandoah: Concurrent weak reference processing Hi @rkennke! It looks like `shenandoahReferenceProcessor.[ch]pp` is heavily based on `zReferenceProcessor.[ch]pp`. When copying non-trivial amounts of code from ZGC (or any other part of HotSpot), please retain the copyright notice of the original work. Thanks!
------------- PR: https://git.openjdk.java.net/jdk/pull/505 From rkennke at openjdk.java.net Fri Oct 9 14:10:20 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 9 Oct 2020 14:10:20 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing In-Reply-To: <5SHfZCqDGmlAe76fFCuH_lgi3_vKh2Sp7_x3w7XLRDw=.75108d48-0443-4e7d-8e02-d165a852c55a@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> <5SHfZCqDGmlAe76fFCuH_lgi3_vKh2Sp7_x3w7XLRDw=.75108d48-0443-4e7d-8e02-d165a852c55a@github.com> Message-ID: On Mon, 5 Oct 2020 19:01:04 GMT, Per Liden wrote: > Hi @rkennke! > > It looks like `shenandoahReferenceProcess.[ch]pp` is heavily based on `zReferenceProcessor.[ch]hpp`. When copying > non-trivial amounts of code from ZGC (or any other part of HotSpot), please retain the copyright notice of the original > work. Thanks! Ok thanks for the notice. I will change it. ------------- PR: https://git.openjdk.java.net/jdk/pull/505 From rkennke at openjdk.java.net Fri Oct 9 14:29:23 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 9 Oct 2020 14:29:23 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v2] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: <2KYMyFiOAaqO6xL8GgIph5Sj6h-BBRni4-bHHcfSP7s=.9ec669f1-08e0-4043-a2f7-1e7f88fa3cff@github.com> > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference > and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity) have been processed during a GC pause. Workloads > that make heavy use of such weak references will therefore potentially cause significant GC pauses. 
There are 3 main > items that contribute to pause time linear to number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through subgraph of objects that are reachable only through FinalReference. Notice that this is > theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly > reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires to extend the marking > bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever > marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all > objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it > will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter > of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for > FinalReference, marking stops there, and does not mark through the referent. 2. Concurrent processing is performed > after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and > depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list > (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no > referents become resurrected by accessing Reference.get() on it. 
In order to achieve this, we employ special barriers > in Reference.get() intrinsics that return NULL when the referent is not reachable. Roman Kennke has updated the pull request incrementally with three additional commits since the last revision: - Add precompiled header to shenandoahMarkBitMap.cpp - Fix null-check after C2 native-LRB - Reinstate check for ShenandoahSelfFixing that got lost during the merge ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/505/files - new: https://git.openjdk.java.net/jdk/pull/505/files/610cd75a..ccdc9633 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=00-01 Stats: 8 lines in 3 files changed: 5 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From jiefu at openjdk.java.net Fri Oct 9 14:33:17 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 9 Oct 2020 14:33:17 GMT Subject: RFR: 8254297: Zero and Minimal VMs are broken with undeclared identifier 'DerivedPointerTable' after JDK-8253180 Message-ID: The change fixes Zero and Minimal builds broken after JDK-8253180. Two build errors were fixed: 1 ./src/hotspot/share/runtime/frame.cpp:1047:38: error: use of undeclared identifier 'DerivedPointerTable' oops_do_internal(f, cf, map, true, DerivedPointerTable::is_active() ? 2. 
./src/hotspot/share/utilities/vmError.cpp: In static member function 'static void VMError::print_stack_trace(outputStream*, JavaThread*, char*, int, bool)': ./src/hotspot/share/utilities/vmError.cpp:214:28: error: no matching function for call to 'StackFrameStream::StackFrameStream(JavaThread*&)' StackFrameStream sfs(jt); ^ ------------- Commit messages: - Zero and Minimal VMs are broken with undeclared identifier 'DerivedPointerTable' after JDK-8253180 Changes: https://git.openjdk.java.net/jdk/pull/578/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=578&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254297 Stats: 7 lines in 3 files changed: 6 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/578.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/578/head:pull/578 PR: https://git.openjdk.java.net/jdk/pull/578 From rkennke at openjdk.java.net Fri Oct 9 14:51:16 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 9 Oct 2020 14:51:16 GMT Subject: RFR: 8254319: Shenandoah: Interpreter native-LRB needs to activate during HAS_FORWARDED Message-ID: We currently only activate native-LRB when EVACUATING, however we need it to activate during all of HAS_FORWARDED because it may have to resolve the target. 
------------- Commit messages: - 8254319: Shenandoah: Interpreter native-LRB needs to activate during HAS_FORWARDED Changes: https://git.openjdk.java.net/jdk/pull/579/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=579&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254319 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/579.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/579/head:pull/579 PR: https://git.openjdk.java.net/jdk/pull/579 From shade at openjdk.java.net Fri Oct 9 14:56:26 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 9 Oct 2020 14:56:26 GMT Subject: RFR: 8254297: Zero and Minimal VMs are broken with undeclared identifier 'DerivedPointerTable' after JDK-8253180 In-Reply-To: References: Message-ID: On Fri, 9 Oct 2020 14:22:06 GMT, Jie Fu wrote: > The change fixes Zero and Minimal builds broken after JDK-8253180. > > Two build errors were fixed: > 1 ./src/hotspot/share/runtime/frame.cpp:1047:38: error: use of undeclared identifier 'DerivedPointerTable' > oops_do_internal(f, cf, map, true, DerivedPointerTable::is_active() ? > > 2. ./src/hotspot/share/utilities/vmError.cpp: In static member function 'static void > VMError::print_stack_trace(outputStream*, JavaThread*, char*, int, bool)': > ./src/hotspot/share/utilities/vmError.cpp:214:28: error: no matching function for call to > 'StackFrameStream::StackFrameStream(JavaThread*&)' > StackFrameStream sfs(jt); > ^ I think this is fine. @fisk might need to ack this. src/hotspot/share/compiler/oopMap.cpp line 196: > 194: #if COMPILER2_OR_JVMCI > 195: DerivedPointerTable::add(derived, base); > 196: #endif // COMPILER2_OR_JVMCI This looks correct and actually reverses the JDK-8253180 change. It is correct because `DerivedPointerTable` is protected by the same `#if`: #if COMPILER2_OR_JVMCI class DerivedPointerTable : public AllStatic { ... 
src/hotspot/share/runtime/frame.cpp line 1053: > 1051: #else > 1052: oops_do_internal(f, cf, map, true, DerivedPointerIterationMode::_ignore); > 1053: #endif We could have used `COMPILER2_OR_JVMCI_PRESENT` inline macro, but I think that would be messier. src/hotspot/share/utilities/vmError.cpp line 214: > 212: > 213: // Print the frames > 214: StackFrameStream sfs(jt, true /* update */, true /* process_frames */); This looks correct, also because it does the same thing as L227 below does. ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/578 From shade at openjdk.java.net Fri Oct 9 14:56:20 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 9 Oct 2020 14:56:20 GMT Subject: RFR: 8254319: Shenandoah: Interpreter native-LRB needs to activate during HAS_FORWARDED In-Reply-To: References: Message-ID: On Fri, 9 Oct 2020 14:44:40 GMT, Roman Kennke wrote: > We currently only activate native-LRB when EVACUATING, however we need it to activate during all of HAS_FORWARDED > because it may have to resolve the target. Agreed. Looks good. ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/579 From eosterlund at openjdk.java.net Fri Oct 9 15:00:18 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Fri, 9 Oct 2020 15:00:18 GMT Subject: RFR: 8254297: Zero and Minimal VMs are broken with undeclared identifier 'DerivedPointerTable' after JDK-8253180 In-Reply-To: References: Message-ID: On Fri, 9 Oct 2020 14:22:06 GMT, Jie Fu wrote: > The change fixes Zero and Minimal builds broken after JDK-8253180. > > Two build errors were fixed: > 1 ./src/hotspot/share/runtime/frame.cpp:1047:38: error: use of undeclared identifier 'DerivedPointerTable' > oops_do_internal(f, cf, map, true, DerivedPointerTable::is_active() ? > > 2. 
./src/hotspot/share/utilities/vmError.cpp: In static member function 'static void > VMError::print_stack_trace(outputStream*, JavaThread*, char*, int, bool)': > ./src/hotspot/share/utilities/vmError.cpp:214:28: error: no matching function for call to > 'StackFrameStream::StackFrameStream(JavaThread*&)' > StackFrameStream sfs(jt); > ^ Looks good. Thanks for fixing. ------------- Marked as reviewed by eosterlund (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/578 From shade at openjdk.java.net Fri Oct 9 15:05:11 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 9 Oct 2020 15:05:11 GMT Subject: RFR: 8254297: Zero and Minimal VMs are broken with undeclared identifier 'DerivedPointerTable' after JDK-8253180 In-Reply-To: References: Message-ID: On Fri, 9 Oct 2020 14:57:10 GMT, Erik Österlund wrote: >> The change fixes Zero and Minimal builds broken after JDK-8253180. >> >> Two build errors were fixed: >> 1 ./src/hotspot/share/runtime/frame.cpp:1047:38: error: use of undeclared identifier 'DerivedPointerTable' >> oops_do_internal(f, cf, map, true, DerivedPointerTable::is_active() ? >> >> 2. ./src/hotspot/share/utilities/vmError.cpp: In static member function 'static void >> VMError::print_stack_trace(outputStream*, JavaThread*, char*, int, bool)': >> ./src/hotspot/share/utilities/vmError.cpp:214:28: error: no matching function for call to >> 'StackFrameStream::StackFrameStream(JavaThread*&)' >> StackFrameStream sfs(jt); >> ^ > > Looks good. Thanks for fixing. Since mainline contains both #546 (merged yesterday) and #296 (merged today), most testing would now fail on build steps. @DamonFool, please integrate as soon as possible! 
------------- PR: https://git.openjdk.java.net/jdk/pull/578 From coleenp at openjdk.java.net Fri Oct 9 15:08:21 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 9 Oct 2020 15:08:21 GMT Subject: RFR: 8233214: Remove runtime code not needed with CMS removed [v2] In-Reply-To: References: Message-ID: > This change removes CMS code left over for ClassLoaderData walking. > Tested with Oracle platforms tier1 and built shenandoah with no errors. Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: - fix comment - fix comment ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/574/files - new: https://git.openjdk.java.net/jdk/pull/574/files/d3d781c1..b8b174f4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=574&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=574&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/574.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/574/head:pull/574 PR: https://git.openjdk.java.net/jdk/pull/574 From coleenp at openjdk.java.net Fri Oct 9 15:08:22 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 9 Oct 2020 15:08:22 GMT Subject: RFR: 8233214: Remove runtime code not needed with CMS removed [v2] In-Reply-To: References: Message-ID: On Fri, 9 Oct 2020 12:50:18 GMT, Stefan Karlsson wrote: >> Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: >> >> - fix comment >> - fix comment > > src/hotspot/share/classfile/classLoaderData.hpp line 127: > >> 125: // Remembered sets support for the oops in the class loader data. >> 126: bool _modified_oops; // Card Table Equivalent (YC/CMS support) >> 127: bool _accumulated_modified_oops; // Mod Union Equivalent (CMS support) > > Maybe remove the 'CMS support' comment on line 126? Ok, can remove that too. 
------------- PR: https://git.openjdk.java.net/jdk/pull/574 From coleenp at openjdk.java.net Fri Oct 9 15:08:23 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 9 Oct 2020 15:08:23 GMT Subject: RFR: 8233214: Remove runtime code not needed with CMS removed [v2] In-Reply-To: References: Message-ID: On Fri, 9 Oct 2020 14:59:16 GMT, Coleen Phillimore wrote: >> src/hotspot/share/classfile/classLoaderData.hpp line 127: >> >>> 125: // Remembered sets support for the oops in the class loader data. >>> 126: bool _modified_oops; // Card Table Equivalent (YC/CMS support) >>> 127: bool _accumulated_modified_oops; // Mod Union Equivalent (CMS support) >> >> Maybe remove the 'CMS support' comment on line 126? > > Ok, can remove that too. How about?: bool _modified_oops; // Card Table Equivalent I lined up the // also. ------------- PR: https://git.openjdk.java.net/jdk/pull/574 From jiefu at openjdk.java.net Fri Oct 9 15:21:12 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 9 Oct 2020 15:21:12 GMT Subject: RFR: 8254297: Zero and Minimal VMs are broken with undeclared identifier 'DerivedPointerTable' after JDK-8253180 In-Reply-To: References: Message-ID: On Fri, 9 Oct 2020 15:02:19 GMT, Aleksey Shipilev wrote: >> Looks good. Thanks for fixing. > > Since mainline contains both #546 (merged yesterday) and #296 (merged today), most testing would now fail on builds > steps. @DamonFool, please integrate as soon as possible! Thanks @shipilev and @fisk for your review. ------------- PR: https://git.openjdk.java.net/jdk/pull/578 From jiefu at openjdk.java.net Fri Oct 9 15:21:12 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 9 Oct 2020 15:21:12 GMT Subject: Integrated: 8254297: Zero and Minimal VMs are broken with undeclared identifier 'DerivedPointerTable' after JDK-8253180 In-Reply-To: References: Message-ID: On Fri, 9 Oct 2020 14:22:06 GMT, Jie Fu wrote: > The change fixes Zero and Minimal builds broken after JDK-8253180. 
> > Two build errors were fixed: > 1 ./src/hotspot/share/runtime/frame.cpp:1047:38: error: use of undeclared identifier 'DerivedPointerTable' > oops_do_internal(f, cf, map, true, DerivedPointerTable::is_active() ? > > 2. ./src/hotspot/share/utilities/vmError.cpp: In static member function 'static void > VMError::print_stack_trace(outputStream*, JavaThread*, char*, int, bool)': > ./src/hotspot/share/utilities/vmError.cpp:214:28: error: no matching function for call to > 'StackFrameStream::StackFrameStream(JavaThread*&)' > StackFrameStream sfs(jt); > ^ This pull request has now been integrated. Changeset: aaa0a2a0 Author: Jie Fu URL: https://git.openjdk.java.net/jdk/commit/aaa0a2a0 Stats: 7 lines in 3 files changed: 6 ins; 0 del; 1 mod 8254297: Zero and Minimal VMs are broken with undeclared identifier 'DerivedPointerTable' after JDK-8253180 Reviewed-by: shade, eosterlund ------------- PR: https://git.openjdk.java.net/jdk/pull/578 From psandoz at openjdk.java.net Fri Oct 9 15:26:11 2020 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Fri, 9 Oct 2020 15:26:11 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v5] In-Reply-To: References: Message-ID: On Fri, 9 Oct 2020 11:34:56 GMT, Maurizio Cimadamore wrote: >> This patch contains the changes associated with the third incubation round of the foreign memory access API incubation >> (see JEP 393 [1]). 
This iteration focuses on improving the usability of the API in 3 main ways: >> * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from >> multiple threads >> * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee >> that the memory will be deallocated, eventually >> * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class >> has been added, which defines several useful dereference routines; these are really just thin wrappers around memory >> access var handles, but they make the barrier to entry for using this API somewhat lower. >> >> A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not >> the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link >> to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit >> of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which >> wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as >> dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability >> in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; >> secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can >> use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done >> by calling `MemoryAddress::asSegmentRestricted`). A list of the API, implementation and test changes is provided >> below. 
If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be >> happy to point at existing discussions, and/or to provide the feedback required. A big thanks to Erik Osterlund, >> Vladimir Ivanov and David Holmes, without whom the work on shared memory segments would not have been possible; also I'd >> like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. Thanks, Maurizio >> Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html >> Specdiff: >> >> http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html >> >> CSR: >> >> https://bugs.openjdk.java.net/browse/JDK-8254163 >> >> >> >> ### API Changes >> >> * `MemorySegment` >> * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) >> * added a no-arg factory for a native restricted segment representing entire native heap >> * rename `withOwnerThread` to `handoff` >> * add new `share` method, to create shared segments >> * add new `registerCleaner` method, to register a segment against a cleaner >> * add more helpers to create arrays from a segment e.g. `toIntArray` >> * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) >> * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) >> * `MemoryAddress` >> * drop `segment` accessor >> * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative >> to a given segment >> * `MemoryAccess` >> * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a >> carrier is one of the usual suspects (a Java primitive, minus `boolean`); the access mode can be simple (e.g. 
access >> base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte >> offset, or a high-level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. >> `getByteAtOffset` vs `getByteAtIndex`). >> * `MemoryHandles` >> * drop `withOffset` combinator >> * drop `withStride` combinator >> * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which >> it is easy to derive all the other handles using plain var handle combinators. >> * `Addressable` >> * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both >> `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients >> can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. >> * `MemoryLayouts` >> * A new layout, for machine addresses, has been added to the mix. >> >> >> >> ### Implementation changes >> >> There are two main things to discuss here: support for shared segments, and the general simplification of the memory >> access var handle support. >> #### Shared segments >> >> The support for shared segments cuts in pretty deep in the VM. Support for shared segments is notoriously hard to >> achieve, at least in a way that guarantees optimal access performance. This is caused by the fact that, if a segment >> is shared, it would be possible for a thread to close it while another is accessing it. After considering several >> options (see [3]), we zeroed in on an approach which is inspired by a happy idea that Andrew Haley had (and that he >> reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world >> (e.g. 
with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently with a >> close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and >> the access itself (otherwise it would be possible for the accessing thread to stop right before an unsafe call). >> It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. Sadly, none of >> these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, >> we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check >> whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of >> stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). The question is, then, >> once we detect that a thread is accessing the very segment we're about to close, what should happen? We first >> experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it >> fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the >> machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to >> minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread >> is accessing the segment being closed. As written in the javadoc, this doesn't mean that clients should just catch and >> try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should >> be treated as such. 
In terms of gritty implementation, we needed to centralize memory access routines in a single >> place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in >> addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` >> annotation, which tells the VM that something important is going on. To achieve this, we created a new (autogenerated) >> class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, >> like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is >> tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during >> access (which is important when registering segments against cleaners). Of course, to make memory access safe, memory >> access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead >> of unsafe, so that a liveness check can be triggered (in case a scope is present). `ScopedMemoryAccess` has a >> `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed >> successfully. The implementation of `MemoryScope` (now significantly simplified from what we had before), has two >> implementations, one for confined segments and one for shared segments; the main difference between the two is what >> happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared >> segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or >> `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` >> state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. 
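The `ALIVE` / `CLOSING` / `CLOSED` life cycle of a shared scope described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `SharedScopeSketch`, `beginAccess`, `endAccess` and the `inFlight` counter are hypothetical names invented for this illustration, and the counter merely stands in for the thread-local handshake, which the real implementation performs inside the VM rather than with a shared counter.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical model of the shared-scope close protocol; not the JDK's MemoryScope. */
final class SharedScopeSketch {
    static final int ALIVE = 0, CLOSING = 1, CLOSED = 2;

    private final AtomicInteger state = new AtomicInteger(ALIVE);
    // Assumption: stands in for the handshake by counting threads inside an access.
    private final AtomicInteger inFlight = new AtomicInteger();

    /** Still reports true while CLOSING, as the description notes for isAlive. */
    boolean isAlive() {
        return state.get() != CLOSED;
    }

    /** Liveness check on access: fails unless the scope is ALIVE. */
    void beginAccess() {
        inFlight.incrementAndGet();
        if (state.get() != ALIVE) {
            inFlight.decrementAndGet();
            throw new IllegalStateException("Already closed");
        }
    }

    void endAccess() {
        inFlight.decrementAndGet();
    }

    /** Goes to CLOSING, runs the simulated handshake, then settles on CLOSED or ALIVE. */
    void close() {
        if (!state.compareAndSet(ALIVE, CLOSING)) {
            throw new IllegalStateException("Already closed");
        }
        boolean quiescent = inFlight.get() == 0; // simulated handshake result
        state.set(quiescent ? CLOSED : ALIVE);
        if (!quiescent) {
            throw new IllegalStateException("Cannot close: segment is being accessed");
        }
    }

    public static void main(String[] args) {
        SharedScopeSketch scope = new SharedScopeSketch();
        scope.beginAccess();
        try {
            scope.close(); // fails: an access is in flight
        } catch (IllegalStateException e) {
            System.out.println("close failed: " + e.getMessage());
        }
        scope.endAccess();
        scope.close(); // succeeds now
        System.out.println("alive after close: " + scope.isAlive());
    }
}
```

In this model a failed `close` rolls the state back to `ALIVE`, which matches the described behaviour that `close` may fail rather than interrupt the accessing thread.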
#### >> Memory access var handles overhaul The key realization here was that if all memory access var handles took a >> coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle >> form. This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var >> handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that >> e.g. additional offset is injected into a base memory access var handle. This also helped in simplifying the >> implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level >> access on the innards of the memory access var handle. All that code is now gone. #### Test changes Not much to see >> here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, >> since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` >> functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the >> microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared >> segment case. [1] - https://openjdk.java.net/jeps/393 [2] - https://openjdk.java.net/jeps/389 [3] - >> https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html [4] - https://openjdk.java.net/jeps/312 > > Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: > > Fix performance issue with "small" segment mismatch src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/Utils.java line 76: > 74: // This adaptation is required, otherwise the memory access var handle will have type MemoryAddressProxy, > 75: // and not MemoryAddress (which the user expects), which causes performance issues with asType() > adaptations. 
76: return MemoryHandles.filterCoordinates(handle, 0, ADDRESS_FILTER); The above comment needs updating to refer to `MemorySegmentProxy` and `MemorySegment`. ------------- PR: https://git.openjdk.java.net/jdk/pull/548 From akozlov at openjdk.java.net Fri Oct 9 16:00:27 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 9 Oct 2020 16:00:27 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v4] In-Reply-To: References: Message-ID: > Please review an updated RFR from https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041463.html > > On macOS, MAP_JIT cannot be used with MAP_FIXED[1]. So pd_reserve_memory have to provide MAP_JIT for mmap(NULL, > PROT_NONE), the function was made aware of exec permissions. > For executable and data regions, pd_commit_memory only unlocks the memory with mprotect, this should make no difference > compared with old code. > For data regions, pd_uncommit_memory still uses a new overlapping anonymous mmap which returns pages to the OS and > immediately reflects this in diagnostic tools like ps. For executable regions it would require MAP_FIXED|MAP_JIT, so > instead madvise(MADV_FREE)+mprotect(PROT_NONE) are used. They should also allow OS to reclaim pages, but apparently > this does not happen immediately. In practice, it should not be a problem for executable regions, as codecache does not > shrink (if I haven't missed anything, by the implementation and in principle). Tested: > * local tier1 > * jdk-submit > * codesign[2] with hardened runtime and allow-jit but without > allow-unsigned-executable-memory entitlements[3] produce a working bundle. 
> > (adding GC group as suggested by @dholmes-ora) > > > [1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_mman.c#L227 > [2] > > codesign \ > --sign - \ > --options runtime \ > --entitlements ents.plist \ > --timestamp \ > $J/bin/* $J/lib/server/*.dylib $J/lib/*.dylib > [3] > > > > > com.apple.security.cs.allow-jit > > com.apple.security.cs.disable-library-validation > > com.apple.security.cs.allow-dyld-environment-variables > > > Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: os::reserve to take exec parameter ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/294/files - new: https://git.openjdk.java.net/jdk/pull/294/files/0016bc4a..0899d0ba Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=294&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=294&range=02-03 Stats: 83 lines in 6 files changed: 25 ins; 27 del; 31 mod Patch: https://git.openjdk.java.net/jdk/pull/294.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/294/head:pull/294 PR: https://git.openjdk.java.net/jdk/pull/294 From rkennke at openjdk.java.net Fri Oct 9 17:03:24 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 9 Oct 2020 17:03:24 GMT Subject: RFR: 8254319: Shenandoah: Interpreter native-LRB needs to activate during HAS_FORWARDED [v2] In-Reply-To: References: Message-ID: <0VDBIsRpXLo0bbd3mTbpZOFyXOmgT53kM1wXQjmkDLg=.0ba83815-28a4-4258-8ba8-2c867b14dac9@github.com> > We currently only activate native-LRB when EVACUATING, however we need it to activate during all of HAS_FORWARDED > because it may have to resolve the target. Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains two additional commits since the last revision: - Merge remote-tracking branch 'upstream/master' into 8254319 - 8254319: Shenandoah: Interpreter native-LRB needs to activate during HAS_FORWARDED ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/579/files - new: https://git.openjdk.java.net/jdk/pull/579/files/f597a4fe..677ea4b8 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=579&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=579&range=00-01 Stats: 1323 lines in 15 files changed: 1278 ins; 4 del; 41 mod Patch: https://git.openjdk.java.net/jdk/pull/579.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/579/head:pull/579 PR: https://git.openjdk.java.net/jdk/pull/579 From kbarrett at openjdk.java.net Fri Oct 9 17:13:10 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 9 Oct 2020 17:13:10 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v2] In-Reply-To: <2KYMyFiOAaqO6xL8GgIph5Sj6h-BBRni4-bHHcfSP7s=.9ec669f1-08e0-4043-a2f7-1e7f88fa3cff@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> <2KYMyFiOAaqO6xL8GgIph5Sj6h-BBRni4-bHHcfSP7s=.9ec669f1-08e0-4043-a2f7-1e7f88fa3cff@github.com> Message-ID: On Fri, 9 Oct 2020 14:29:23 GMT, Roman Kennke wrote: >> Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference >> and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). Workloads >> that make heavy use of such weak references will therefore potentially cause significant GC pauses. There are 3 main >> items that contribute to pause time linear to number of references, or worse: >> - We need to scan and consider each reference on the various 'discovered' lists. >> - We need to mark through subgraph of objects that are reachable only through FinalReference. 
Notice that this is >> theoretically only bounded by the live data set size. >> - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' >> >> The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly >> reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. >> The solution to this is two-fold: >> 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires to extend the marking >> bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever >> marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all >> objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it >> will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter >> of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for >> FinalReference, marking stops there, and does not mark through the referent. 2. Concurrent processing is performed >> after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and >> depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list >> (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no >> referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers >> in Reference.get() intrinsics that return NULL when the referent is not reachable. 
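The two kinds of reachability described above can be sketched in a few lines. This is a minimal single-threaded illustration, not the Shenandoah implementation: the names (`Reach`, `Obj`, `mark_heap`) are hypothetical, objects are plain indices into a vector, and the concurrency, the packed marking bitmap and the Reference.get() barriers are left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical model: ordered so that Strong compares greater than Finalizable.
enum class Reach : std::uint8_t { Unmarked = 0, Finalizable = 1, Strong = 2 };

const std::size_t kNone = static_cast<std::size_t>(-1);

struct Obj {
  std::vector<std::size_t> fields;  // ordinary strong fields
  std::size_t referent = kNone;     // set only for Reference-like objects
  bool is_final_reference = false;  // models FinalReference
};

struct MarkResult {
  std::vector<Reach> marks;             // final reachability per object
  std::vector<std::size_t> discovered;  // every Reference object encountered
};

MarkResult mark_heap(const std::vector<Obj>& heap,
                     const std::vector<std::size_t>& roots) {
  MarkResult r{std::vector<Reach>(heap.size(), Reach::Unmarked), {}};
  std::vector<std::pair<std::size_t, Reach>> stack;

  auto visit = [&](std::size_t obj, Reach level) {
    Reach old = r.marks[obj];
    if (old >= level) return;  // already marked at least this strongly
    r.marks[obj] = level;      // first visit, or upgrade finalizable -> strong
    if (old == Reach::Unmarked && heap[obj].referent != kNone) {
      r.discovered.push_back(obj);  // discover each Reference once
    }
    stack.emplace_back(obj, level);
  };

  for (std::size_t root : roots) visit(root, Reach::Strong);

  while (!stack.empty()) {
    std::pair<std::size_t, Reach> top = stack.back();
    stack.pop_back();
    const Obj& o = heap[top.first];
    for (std::size_t f : o.fields) visit(f, top.second);
    // Only FinalReference is marked through, and only finalizably;
    // for every other Reference marking stops at the Reference itself.
    if (o.is_final_reference) visit(o.referent, Reach::Finalizable);
  }
  return r;
}
```

In this model an object first reached through a FinalReference referent is marked finalizable, and is upgraded and rescanned if a strong path to it is found later. A referent that stays unmarked corresponds to the case where the Reference would later be enqueued on the pending list.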
> > Roman Kennke has updated the pull request incrementally with three additional commits since the last revision: > > - Add precompiled header to shenandoahMarkBitMap.cpp > - Fix null-check after C2 native-LRB > - Reinstate check for ShenandoahSelfFixing that got lost during the merge I only looked at the non-shenandoah changes. src/hotspot/share/oops/instanceRefKlass.hpp line 61: > 59: InstanceRefKlass() { assert(DumpSharedSpaces || UseSharedSpaces, "only for CDS"); } > 60: > 61: virtual bool is_instance_ref_klass() const { return true; } Don't make this change. There already exists an idiom for testing for reference klasses. k->is_instance_klass() && (InstanceKlass::cast(k)->reference_type() != REF_NONE) ------------- Changes requested by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/505 From rkennke at openjdk.java.net Fri Oct 9 17:13:11 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 9 Oct 2020 17:13:11 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v2] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> <2KYMyFiOAaqO6xL8GgIph5Sj6h-BBRni4-bHHcfSP7s=.9ec669f1-08e0-4043-a2f7-1e7f88fa3cff@github.com> Message-ID: On Fri, 9 Oct 2020 17:07:04 GMT, Kim Barrett wrote: >> Roman Kennke has updated the pull request incrementally with three additional commits since the last revision: >> >> - Add precompiled header to shenandoahMarkBitMap.cpp >> - Fix null-check after C2 native-LRB >> - Reinstate check for ShenandoahSelfFixing that got lost during the merge > > src/hotspot/share/oops/instanceRefKlass.hpp line 61: > >> 59: InstanceRefKlass() { assert(DumpSharedSpaces || UseSharedSpaces, "only for CDS"); } >> 60: >> 61: virtual bool is_instance_ref_klass() const { return true; } > > Don't make this change. There already exists an idiom for testing for reference klasses. 
> > k->is_instance_klass() && (InstanceKlass::cast(k)->reference_type() != REF_NONE) Thanks, Kim! I didn't actually intend to open this up for review, and have reverted it back to 'draft' status. Thank you anyway for the suggestion: it is one of the things that I wanted to ask you guys about. I will make that change, thank you! ------------- PR: https://git.openjdk.java.net/jdk/pull/505 From akozlov at openjdk.java.net Fri Oct 9 17:14:15 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 9 Oct 2020 17:14:15 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v4] In-Reply-To: References: <6iVRP-20baz0_46SouR-dj9SyspR5QvaL9iJMdeipDE=.92688b4e-ebd3-4681-8e63-a4aee752c407@github.com> <_XaA5cQEInPMn5Q5gj2y7AFCRprFQiYfI6BeUN49FhA=.9f17ae05-b37e-4f40-a83f-fd34aa812575@github.com> Message-ID: On Fri, 9 Oct 2020 06:05:56 GMT, Thomas Stuefe wrote: > GrowableArray maybe not the best choice here since e.g. it requires you to search twice on add. A better solution may > be a specialized BST. I assume the number of executable mappings to be small. Depending on whether the exec parameter is available at reserve, it is either only a single one for the CodeCache (see below) or that plus several more for mappings with unknown mode (that were not committed yet). > IMHO too heavyweight for a platform only change. > If there are other uses for such a solution (managing memory regions, melting them together, splitting them maybe on > remove) we should not support setting and clearing exec on commit but only on a per-mapping base. It is simpler when the whole mapping is executable or not. We don't need to split/merge on commit/uncommit then. But we need to do something when os::release_memory is called on a submapping of a mapping with unknown status. Like on AIX, uncommit is made https://github.com/openjdk/jdk/blob/master/src/hotspot/os/aix/os_aix.cpp#L2096. But here for macOS, I'm trying to avoid any change of behavior for non-exec mappings. 
If the exec parameter is provided for reserve (as it eventually would be), then we don't need splitting/merging at all. This is what the latest patch is about. I haven't tested that thoroughly yet, but eventually it would be possible to deduce correct exec values for os::reserve based on subsequent os::commit. If we take a step back, we have the exec parameter known for reserve and commit, and I'm also pretty sure that it is possible to deduce that for any uncommit (which was one of the initial concerns). Can we agree on some plan for how to attack the problem? I would like to distinguish the work toward MAP_JIT from improving the interface. Not sure what should come first. Are you still opposed to having an exec parameter in os::reserve/commit/uncommit and obligating callers to provide consistent exec values for each, at least at this phase? I mean, eventually we will have a platform-dependent `handle_t` for mapping or equivalent. Like if we provide the size of the whole mapping (the context) for each commit_memory on AIX, we won't need to do the bookkeeping. What if os::commit were to take a ReservedSpace and do something conservative when that is not provided? ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From rkennke at openjdk.java.net Fri Oct 9 17:33:19 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 9 Oct 2020 17:33:19 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v3] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference > and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). Workloads > that make heavy use of such weak references will therefore potentially cause significant GC pauses. 
There are 3 main > items that contribute to pause time linear to number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through subgraph of objects that are reachable only through FinalReference. Notice that this is > theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly > reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires to extend the marking > bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever > marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all > objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it > will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter > of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for > FinalReference, marking stops there, and does not mark through the referent. 2. Concurrent processing is performed > after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and > depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list > (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no > referents become resurrected by accessing Reference.get() on it. 
In order to achieve this, we employ special barriers > in Reference.get() intrinsics that return NULL when the referent is not reachable. Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 33 commits: - Merge remote-tracking branch 'upstream/master' into shenandoah-concurrent-weakrefs - Use existing idiom for checking for instanceRefKlass, instead of introducing new code - Add precompiled header to shenandoahMarkBitMap.cpp - Fix null-check after C2 native-LRB - Reinstate check for ShenandoahSelfFixing that got lost during the merge - Merge branch 'master' into shenandoah-concurrent-weakrefs - Add Oracle copyright header to shenandoahReferenceProcessor.[hc]pp due to its structural origins from its ZGC counterparts - Relax assert in reference processor to account for no LRB in passive mode - Aarch64 support for concurrent weak references/extended native barriers - Perform reference-processing during full-GC and degenerated-GC - ... and 23 more: https://git.openjdk.java.net/jdk/compare/52e45a36...aaf8717f ------------- Changes: https://git.openjdk.java.net/jdk/pull/505/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=02 Stats: 2285 lines in 50 files changed: 1539 ins; 565 del; 181 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From aph at openjdk.java.net Fri Oct 9 17:38:16 2020 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 9 Oct 2020 17:38:16 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v5] In-Reply-To: References: Message-ID: <4NM17B6l4GvNgCbmmQTUcnfZTA6G-IEc85O8jH_q-xA=.63b10da7-bab7-44bc-a4c8-0a675aca45c0@github.com> On Fri, 9 Oct 2020 12:18:58 GMT, Andrew Haley wrote: >> Fei Yang has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains >> six commits: >> - Merge master >> - Add sha3 instructions to cpu/aarch64/aarch64-asmtest.py and regenerate the test in assembler_aarch64.cpp:asm_check >> - Rebase >> - Merge master >> - Fix trailing whitespace issue >> - 8252204: AArch64: Implement SHA3 accelerator/intrinsic >> Contributed-by: ard.biesheuvel at linaro.org, dongbo4 at huawei.com > > Marked as reviewed by aph (Reviewer). I see Linux x64 failed. However, I don't seem to be able to withdraw my patch approval. However, please consider it withdrawn. ------------- PR: https://git.openjdk.java.net/jdk/pull/207 From zgu at openjdk.java.net Fri Oct 9 17:45:13 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Fri, 9 Oct 2020 17:45:13 GMT Subject: RFR: 8254319: Shenandoah: Interpreter native-LRB needs to activate during HAS_FORWARDED [v2] In-Reply-To: <0VDBIsRpXLo0bbd3mTbpZOFyXOmgT53kM1wXQjmkDLg=.0ba83815-28a4-4258-8ba8-2c867b14dac9@github.com> References: <0VDBIsRpXLo0bbd3mTbpZOFyXOmgT53kM1wXQjmkDLg=.0ba83815-28a4-4258-8ba8-2c867b14dac9@github.com> Message-ID: On Fri, 9 Oct 2020 17:03:24 GMT, Roman Kennke wrote: >> We currently only activate native-LRB when EVACUATING, however we need it to activate during all of HAS_FORWARDED >> because it may have to resolve the target. > > Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev > excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since > the last revision: > - Merge remote-tracking branch 'upstream/master' into 8254319 > - 8254319: Shenandoah: Interpreter native-LRB needs to activate during HAS_FORWARDED Marked as reviewed by zgu (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/579 From kvn at openjdk.java.net Fri Oct 9 17:59:12 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 9 Oct 2020 17:59:12 GMT Subject: RFR: 8252847: Optimize primitive arrayCopy stubs using AVX-512 masked instructions [v6] In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 12:21:01 GMT, Jatin Bhateja wrote: >> Summary: >> >> 1) New AVX3 optimized stubs for both conjoint and disjoint arraycopy. >> 2) Special instruction sequence blocks for copy sizes between 32 and 192 bytes. >> 3) Block copy operation above 192 bytes is performed using destination address aligned PRE-MAIN-POST loop. Main loop >> copies 192 bytes in one iteration and the tail part falls over to the special instruction sequence blocks. 4) Both small copy block >> and aligned loop use 32 byte vector register to prevent any frequency penalty for copy sizes less than AVX3Threshold. >> 5) For block size above AVX3Threshold both special blocks and loop operate using 64 byte register. 6) In case user >> sets the maximum vector size to 32 bytes, forward copy (disjoint) operations are done using efficient REP MOVS for copy >> sizes above 4096 bytes. JMH Results: >> System : CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz >> Micros : test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java >> Baseline : [http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_Baseline.txt]() >> WithOpt : [http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_WithOpts.txt]() > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > 8252847 : Review comments resolution Yes, this looks better. Reviewed. Before pushing let me test it. I will let you know the results. ------------- Marked as reviewed by kvn (Reviewer). 
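The PRE-MAIN-POST structure from item 3 of the quoted summary can be illustrated in scalar C++. This is a sketch under stated assumptions, not the generated stub: the function name `copy_disjoint` and the 32-byte alignment target are hypothetical, `memcpy` stands in for the masked vector instructions, and the separate small-copy blocks for sizes up to 192 bytes are not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed values for illustration: 32-byte destination alignment, and
// 192 bytes per main-loop iteration as stated in the summary.
const std::size_t kAlign = 32;
const std::size_t kMain = 192;

// Forward copy of non-overlapping ranges, split into three phases.
void copy_disjoint(std::uint8_t* dst, const std::uint8_t* src, std::size_t n) {
  // PRE: copy the bytes needed to bring the destination to an aligned address.
  std::size_t mis = reinterpret_cast<std::uintptr_t>(dst) % kAlign;
  std::size_t pre = (mis == 0) ? 0 : kAlign - mis;
  if (pre > n) pre = n;
  std::memcpy(dst, src, pre);
  dst += pre;
  src += pre;
  n -= pre;

  // MAIN: fixed-size blocks with an aligned destination.
  while (n >= kMain) {
    std::memcpy(dst, src, kMain);
    dst += kMain;
    src += kMain;
    n -= kMain;
  }

  // POST: the tail, shorter than one main block.
  std::memcpy(dst, src, n);
}
```

Only the destination is aligned in this sketch, matching the "destination address aligned" wording; the source may remain misaligned throughout.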
PR: https://git.openjdk.java.net/jdk/pull/61 From kvn at openjdk.java.net Fri Oct 9 17:59:13 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 9 Oct 2020 17:59:13 GMT Subject: RFR: 8252847: Optimize primitive arrayCopy stubs using AVX-512 masked instructions [v5] In-Reply-To: References: Message-ID: On Tue, 6 Oct 2020 18:51:52 GMT, Nils Eliasson wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 1264: >> >>> 1262: } >>> 1263: >>> 1264: #ifndef PRODUCT >> >> macroAssembler_x86.hpp has become big. Maybe we should start thinking about splitting the arraycopy stubs into a separate file. > > But let's do that in another change. It is good that the AVX3 case is separated out in this change - makes it easy to > follow. agree ------------- PR: https://git.openjdk.java.net/jdk/pull/61 From vladimir.kozlov at oracle.com Fri Oct 9 19:01:54 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 9 Oct 2020 12:01:54 -0700 Subject: Re: CFV: New HotSpot Group Member: Erik Österlund In-Reply-To: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> References: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> Message-ID: <5f23f40b-213a-6b44-a036-cb9e55fc3684@oracle.com> Vote: yes On 10/8/20 2:24 AM, Kim Barrett wrote: > I hereby nominate Erik Österlund to Membership in the HotSpot Group. > > Erik has been a JDK Reviewer and member of the Oracle GC team for several > years, currently working on ZGC, though his reach and influence extend > significantly beyond that project. He has made many substantial > contributions [1] including (most recently) JEP 376: ZGC: Concurrent > Thread-Stack Processing. > > Votes are due by Friday, 23-Oct-2020 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Erik+%C3%96sterlund%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From minqi at openjdk.java.net Fri Oct 9 19:08:27 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Fri, 9 Oct 2020 19:08:27 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v15] In-Reply-To: References: Message-ID: <4DdscR_ZGWXKUdu7k3df9USNXScFRqNEJhKq_8mSM3E=.5f8957ff-c209-42a0-945b-1fc30c5604ac@github.com> > This patch is reorganized after 8252725, which is separated from this patch to refactor jlink plugin code. The previous > webrev with hg can be found at: http://cr.openjdk.java.net/~minqi/2020/8247536/webrev-05. With 8252725 integrated, the > regeneration of holder classes is simply a call to the newly added GenerateJLIClassesHelper.cdsGenerateHolderClasses > function. Tests: tier1-4 Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: Changed isValidInputLines to validateInputLines and throw IAE upon wrong inputs. Fixed missed archive_mirror issue which could lead to a crash for archived heap iteration. After calling generateLambdaFormHolderClasses, should check exception first. Added more comments on why the exception message and stacktrace are printed out in CDS. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/193/files - new: https://git.openjdk.java.net/jdk/pull/193/files/f163fe4c..16362e15 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=193&range=14 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=193&range=13-14 Stats: 37 lines in 4 files changed: 7 ins; 15 del; 15 mod Patch: https://git.openjdk.java.net/jdk/pull/193.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/193/head:pull/193 PR: https://git.openjdk.java.net/jdk/pull/193 From rkennke at openjdk.java.net Fri Oct 9 19:10:10 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 9 Oct 2020 19:10:10 GMT Subject: Integrated: 8254319: Shenandoah: Interpreter native-LRB needs to activate during HAS_FORWARDED In-Reply-To: References: Message-ID: On Fri, 9 Oct 2020 14:44:40 GMT, Roman Kennke wrote: > We currently only activate native-LRB when EVACUATING, however we need it to activate during all of HAS_FORWARDED > because it may have to resolve the target. This pull request has now been integrated. Changeset: 536b35b5 Author: Roman Kennke URL: https://git.openjdk.java.net/jdk/commit/536b35b5 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod 8254319: Shenandoah: Interpreter native-LRB needs to activate during HAS_FORWARDED Reviewed-by: shade, zgu ------------- PR: https://git.openjdk.java.net/jdk/pull/579 From minqi at openjdk.java.net Fri Oct 9 19:19:14 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Fri, 9 Oct 2020 19:19:14 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v12] In-Reply-To: References: <9emWKl6fr-GA5LN0uHhuEd5D123QcoCiHQR1M9bAbag=.cc4b6129-8b33-47e4-a421-9e6b4817933b@github.com> Message-ID: On Wed, 7 Oct 2020 17:53:30 GMT, Ioi Lam wrote: >> Yumin Qi has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains >> 23 commits: >> - Added new separate function to CDS for logging species and modified the existing function to log lambda form invokers. >> Changed isDumpLoadedClassList to a reasonable name isDumpingClassList as read only in CDS. >> - Merge branch 'master' of https://github.com/openjdk/jdk into jdk-8247536 >> - Removed unused imports. >> - Fixed comments with correct class and method name in CDS, removed unused variables after last change. >> - Moved and renamed cdsGenerateHolderClasses from GenerateJLIClassesHelp to CDS as generateLambdaFormHolderClasses. Added >> input verification function in CDS before class generation. Added more test scenarios. Removed trailing unused ending >> words for output of lambda form trace line in case of DumpLoadedClassList. >> - Move the check work to java, restore code in VM. Modified test code according to the changes. The invoke name >> verification is not implemented since not all the holder classes are processed, and not all the functions of processed >> holder classes are added. For holder class with DirectMethodHandle in its name, only the name in the >> DMH_METHOD_TYPE_MAP keyset is added, the line with other names just gets skipped silently. This makes the verification >> on invoke names difficult, a name not in the keyset should not fail the test. Also add a boolean to >> cdsGenerateHolderClasses to indicate call path. >> - Remove trailing word of line which is not used in holder class regeneration. There is a trailing LF (Line Feed) so trim >> white spaces from both front and end of the line or it will fail method type validation. >> - In case an exception happens during reloading class, CHECK will return without freeing the allocated buffer for class >> bytes so moved the buffer allocation and freeing to caller. Also removed test 6 since there is no guarantee that we >> can give a signature which will always fail. Additional changes to GenerateJLIClassesHelper according to review >> suggestion. 
>> - Merge branch 'master' of https://github.com/openjdk/jdk into jdk-8247536 >> - Merge branch 'master' of https://git.openjdk.java.net/jdk into jdk-8247536 >> - ... and 13 more: https://git.openjdk.java.net/jdk/compare/82fe023b...f5584dcf > > Marked as reviewed by iklam (Reviewer). passed mach5 tier1-4 ------------- PR: https://git.openjdk.java.net/jdk/pull/193 From akozlov at openjdk.java.net Fri Oct 9 19:49:26 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 9 Oct 2020 19:49:26 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v5] In-Reply-To: References: Message-ID: > Please review an updated RFR from https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041463.html > > On macOS, MAP_JIT cannot be used with MAP_FIXED[1]. So pd_reserve_memory have to provide MAP_JIT for mmap(NULL, > PROT_NONE), the function was made aware of exec permissions. > For executable and data regions, pd_commit_memory only unlocks the memory with mprotect, this should make no difference > compared with old code. > For data regions, pd_uncommit_memory still uses a new overlapping anonymous mmap which returns pages to the OS and > immediately reflects this in diagnostic tools like ps. For executable regions it would require MAP_FIXED|MAP_JIT, so > instead madvise(MADV_FREE)+mprotect(PROT_NONE) are used. They should also allow OS to reclaim pages, but apparently > this does not happen immediately. In practice, it should not be a problem for executable regions, as codecache does not > shrink (if I haven't missed anything, by the implementation and in principle). Tested: > * local tier1 > * jdk-submit > * codesign[2] with hardened runtime and allow-jit but without > allow-unsigned-executable-memory entitlements[3] produce a working bundle. 
> > (adding GC group as suggested by @dholmes-ora) > > > [1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_mman.c#L227 > [2] > > codesign \ > --sign - \ > --options runtime \ > --entitlements ents.plist \ > --timestamp \ > $J/bin/* $J/lib/server/*.dylib $J/lib/*.dylib > [3] > > > > > com.apple.security.cs.allow-jit > > com.apple.security.cs.disable-library-validation > > com.apple.security.cs.allow-dyld-environment-variables > > > Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: Fix test builds (nothing except macOS still buildable) ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/294/files - new: https://git.openjdk.java.net/jdk/pull/294/files/0899d0ba..71968597 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=294&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=294&range=03-04 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/294.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/294/head:pull/294 PR: https://git.openjdk.java.net/jdk/pull/294 From kvn at openjdk.java.net Fri Oct 9 20:33:16 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 9 Oct 2020 20:33:16 GMT Subject: RFR: 8252847: Optimize primitive arrayCopy stubs using AVX-512 masked instructions [v6] In-Reply-To: References: Message-ID: On Fri, 9 Oct 2020 17:56:51 GMT, Vladimir Kozlov wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> 8252847 : Review comments resolution > > Yes, this looks better. Reviewed. Before pushing let me test it. I will let you know results. hs-tier1-3 testing passed on x86 (all OSs). 
------------- PR: https://git.openjdk.java.net/jdk/pull/61 From coleenp at openjdk.java.net Fri Oct 9 20:49:12 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 9 Oct 2020 20:49:12 GMT Subject: RFR: 8233214: Remove runtime code not needed with CMS removed [v2] In-Reply-To: References: Message-ID: <9GWw4_NtcZhJMuWkUY8lQIrQL5oMJb2nM2d5WXGv6MQ=.7393e41e-24ea-4811-a49e-c7ebf513e2e0@github.com> On Fri, 9 Oct 2020 13:07:35 GMT, Thomas Schatzl wrote: >> Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: >> >> - fix comment >> - fix comment > > Good sans the "YC/CMS" comment Thanks, I think this is trivial enough to integrate. ------------- PR: https://git.openjdk.java.net/jdk/pull/574 From coleenp at openjdk.java.net Fri Oct 9 20:49:12 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 9 Oct 2020 20:49:12 GMT Subject: Integrated: 8233214: Remove runtime code not needed with CMS removed In-Reply-To: References: Message-ID: <4UL1gHRWIgbbkhDYzMxTZFFzR6vNSPe_TlGBd4ZyFEA=.5a3f8ddc-0f82-4ee0-82ab-8569a4ead7b8@github.com> On Fri, 9 Oct 2020 11:27:00 GMT, Coleen Phillimore wrote: > This change removes CMS code left over for ClassLoaderData walking. > Tested with Oracle platforms tier1 and built shenandoah with no errors. This pull request has now been integrated. 
Changeset: 7ec9c8ea Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/7ec9c8ea Stats: 73 lines in 6 files changed: 0 ins; 66 del; 7 mod 8233214: Remove runtime code not needed with CMS removed Reviewed-by: shade, stefank, tschatzl ------------- PR: https://git.openjdk.java.net/jdk/pull/574 From rkennke at openjdk.java.net Fri Oct 9 21:32:24 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 9 Oct 2020 21:32:24 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v4] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference > and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). Workloads > that make heavy use of such weak references will therefore potentially cause significant GC pauses. There are 3 main > items that contribute to pause time linear to number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through subgraph of objects that are reachable only through FinalReference. Notice that this is > theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly > reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. 
This requires to extend the marking > bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever > marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all > objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it > will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter > of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for > FinalReference, marking stops there, and does not mark through the referent. 2. Concurrent processing is performed > after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and > depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list > (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no > referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers > in Reference.get() intrinsics that return NULL when the referent is not reachable. Roman Kennke has updated the pull request incrementally with five additional commits since the last revision: - Implement reference-processing statistics - Implement abandoning partial discovery. Remove unused methods. 
- Remove leftovers of precleaning - Remove unused is_access_on_jlr_reference() helper method - Invert weak/native condition in interpreter native-LRB for clarity ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/505/files - new: https://git.openjdk.java.net/jdk/pull/505/files/aaf8717f..179002db Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=02-03 Stats: 159 lines in 17 files changed: 86 ins; 52 del; 21 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From kvn at openjdk.java.net Sat Oct 10 00:04:15 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Sat, 10 Oct 2020 00:04:15 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v9] In-Reply-To: <5O9n8cKBJyjhp2cNOVD2PcpKQiXqEs5BJjkW1lH-5EM=.044a510e-6517-4564-a3db-00c7951f0b22@github.com> References: <5O9n8cKBJyjhp2cNOVD2PcpKQiXqEs5BJjkW1lH-5EM=.044a510e-6517-4564-a3db-00c7951f0b22@github.com> Message-ID: On Thu, 8 Oct 2020 16:55:31 GMT, Richard Reingruber wrote: >> Hi, >> >> this is the continuation of the review of the implementation for: >> >> https://bugs.openjdk.java.net/browse/JDK-8227745 >> https://bugs.openjdk.java.net/browse/JDK-8233915 >> >> It allows for JIT optimizations based on escape analysis even if JVMTI agents acquire capabilities to access references >> to objects that are subject to such optimizations, e.g. scalar replacement. The implementation reverts such >> optimizations just before access very much as when switching from JIT compiled execution to the interpreter, aka >> "deoptimization". Webrev.8 was the last one before before the transition to Git/Github: >> >> http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ >> >> Thanks, Richard. 
> > Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The pull request now > contains 19 commits: > - Merge branch 'master' into JDK-8227745 > - Merge branch 'master' into JDK-8227745 > - Factorized fragment out of EscapeBarrier::deoptimize_objects_internal into new method in compiledVFrame. > - More smaller changes proposed by Serguei. > - jvmtiDeferredUpdates.hpp: remove forward declarations. > - jvmtiDeferredLocalVariable: move member variables to the beginning of the class definition. > - jvmtiDeferredUpdates.hpp: add/remove empty lines and improve indentation. > - Merge branch 'master' into JDK-8227745 > - Merge branch 'master' into JDK-8227745 > - Make parameter current_thread of JvmtiEnvBase::check_top_frame() a JavaThread* again. > > With Asynchronous handshakes the type was changed from JavaThread* to Thread* > but this is not necessary as check_top_frame() is not executed during a handshake > / safepoint (robehn confirmed). > - ... and 9 more: https://git.openjdk.java.net/jdk/compare/d036dca0...d463b4f3 Compiler changes seem fine. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/119 From minqi at openjdk.java.net Sat Oct 10 00:08:26 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Sat, 10 Oct 2020 00:08:26 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v16] In-Reply-To: References: Message-ID: > This patch is reorganized after 8252725, which was separated from this patch to refactor jlink plugin code. The previous > webrev with hg can be found at: http://cr.openjdk.java.net/~minqi/2020/8247536/webrev-05. With 8252725 integrated, the > regeneration of holder classes simply calls the newly added GenerateJLIClassesHelper.cdsGenerateHolderClasses > function.
Tests: tier1-4 Yumin Qi has updated the pull request incrementally with three additional commits since the last revision: - Correct typo for check message - Removed try/catch from java side and moved output to vm. - Make change to original indent ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/193/files - new: https://git.openjdk.java.net/jdk/pull/193/files/16362e15..2184725f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=193&range=15 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=193&range=14-15 Stats: 29 lines in 4 files changed: 1 ins; 10 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/193.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/193/head:pull/193 PR: https://git.openjdk.java.net/jdk/pull/193 From kvn at openjdk.java.net Sat Oct 10 00:15:12 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Sat, 10 Oct 2020 00:15:12 GMT Subject: RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v3] In-Reply-To: References: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com> Message-ID: <0_4KDtvn6WhHRCxqupbUQjLauR5DMQWJluogsJ7m_KA=.697e6e15-5c91-4339-abab-1dff51871e8d@github.com> On Mon, 21 Sep 2020 12:45:55 GMT, Jason Tatton wrote: >> This is an implementation of the indexOf(char) intrinsic for StringLatin1 (1 byte encoded Strings). It is provided for >> x86 and ARM64. The implementation is greatly inspired by the indexOf(char) intrinsic for StringUTF16. To incorporate it >> I had to make a small change to StringLatin1.java (refactor of functionality to intrinsified private method) as well as >> code for C2. Submitted to: hotspot-compiler-dev and core-libs-dev as this patch contains a change to hotspot and >> java/lang/StringLatin1.java https://bugs.openjdk.java.net/browse/JDK-8173585 >> >> Details of testing: >> ============ >> I have created a jtreg test "compiler/intrinsics/string/TestStringLatin1IndexOfChar" to cover this new intrinsic.
Note that, particularly for the x86 implementation of the intrinsic, the code path taken is dependent upon the length of the input String. Hence the test has been designed to cover all these cases. In summary they are:
>> - A "short" String of < 16 characters.
>> - A SIMD String of 16-31 characters.
>> - An AVX2 SIMD String of 32+ characters.
>>
>> Hardware used for testing:
>> -----------------------------
>>
>> - Intel Xeon CPU E5-2680 (JVM did not recognize this as having AVX2 support).
>> - Intel i7 processor (with AVX2 support).
>> - AWS Graviton 2 (ARM 64 processor).
>>
>> I also ran "run-test-tier1" and "run-test-tier2" for x86_64 and aarch64.
>>
>> Possible future enhancements:
>> ====================
>> For the x86 implementation there may be two further improvements we can make in order to improve performance of both the StringUTF16 and StringLatin1 indexOf(char) intrinsics:
>> 1. Make use of AVX-512 instructions.
>> 2. For "short" Strings (see below), I think it may be possible to modify the existing algorithm to still use SSE SIMD instructions instead of a loop.
>>
>> Benchmark results:
>> ============
>> **Without** the new StringLatin1 indexOf(char) intrinsic:
>>
>> | Benchmark | Mode | Cnt | Score | Error | Units |
>> | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
>> | IndexOfBenchmark.latin1_mixed_char | avgt | 5 | **26,389.129** | ± 182.581 | ns/op |
>> | IndexOfBenchmark.utf16_mixed_char | avgt | 5 | 17,885.383 | ± 435.933 | ns/op |
>>
>> **With** the new StringLatin1 indexOf(char) intrinsic:
>>
>> | Benchmark | Mode | Cnt | Score | Error | Units |
>> | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
>> | IndexOfBenchmark.latin1_mixed_char | avgt | 5 | **17,875.185** | ± 407.716 | ns/op |
>> | IndexOfBenchmark.utf16_mixed_char | avgt | 5 | 18,292.802 | ± 167.306 | ns/op |
>>
>> The objective of the patch is to bring the performance of StringLatin1 indexOf(char) in line with StringUTF16 indexOf(char) for x86 and ARM64. We can see above that this has been achieved. Similar results were obtained when running on ARM.
>
> Jason Tatton has updated the pull request incrementally with one additional commit since the last revision:
>
> Add missing newline to end of vmSymbols.cpp

Changes seem fine, but you are missing the Copyright + GPL header in new files.

test/hotspot/jtreg/compiler/intrinsics/string/TestStringLatin1IndexOfChar.java line 1:
> 1: /*
Missing copyright+GPL header in new test. See other tests for example.

test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java line 1:
> 1: package org.openjdk.bench.java.lang;
Again missing Copyright.

------------- Changes requested by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/71
From mchung at openjdk.java.net Sat Oct 10 00:17:13 2020 From: mchung at openjdk.java.net (Mandy Chung) Date: Sat, 10 Oct 2020 00:17:13 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v16] In-Reply-To: References: Message-ID: On Sat, 10 Oct 2020 00:08:26 GMT, Yumin Qi wrote: >> This patch is reorganized after 8252725, which was separated from this patch to refactor jlink plugin code. The previous >> webrev with hg can be found at: http://cr.openjdk.java.net/~minqi/2020/8247536/webrev-05. With 8252725 integrated, the >> regeneration of holder classes simply calls the newly added GenerateJLIClassesHelper.cdsGenerateHolderClasses >> function. Tests: tier1-4 > > Yumin Qi has updated the pull request incrementally with three additional commits since the last revision: > > - Correct typo for check message > - Removed try/catch from java side and moved output to vm. > - Make change to original indent I reviewed the files under `src/java.base`. ------------- Marked as reviewed by mchung (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/193 From kvn at openjdk.java.net Sat Oct 10 00:28:13 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Sat, 10 Oct 2020 00:28:13 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v9] In-Reply-To: <5O9n8cKBJyjhp2cNOVD2PcpKQiXqEs5BJjkW1lH-5EM=.044a510e-6517-4564-a3db-00c7951f0b22@github.com> References: <5O9n8cKBJyjhp2cNOVD2PcpKQiXqEs5BJjkW1lH-5EM=.044a510e-6517-4564-a3db-00c7951f0b22@github.com> Message-ID: On Thu, 8 Oct 2020 16:55:31 GMT, Richard Reingruber wrote: >> Hi, >> >> this is the continuation of the review of the implementation for: >> >> https://bugs.openjdk.java.net/browse/JDK-8227745 >> https://bugs.openjdk.java.net/browse/JDK-8233915 >> >> It allows for JIT optimizations based on escape analysis even if JVMTI agents acquire capabilities to access references >> to objects that are subject to such optimizations, e.g. scalar replacement. The implementation reverts such >> optimizations just before access very much as when switching from JIT compiled execution to the interpreter, aka >> "deoptimization". Webrev.8 was the last one before before the transition to Git/Github: >> >> http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ >> >> Thanks, Richard. > > Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The pull request now > contains 19 commits: > - Merge branch 'master' into JDK-8227745 > - Merge branch 'master' into JDK-8227745 > - Factorized fragment out of EscapeBarrier::deoptimize_objects_internal into new method in compiledVFrame. > - More smaller changes proposed by Serguei. > - jvmtiDeferredUpdates.hpp: remove forward declarations. > - jvmtiDeferredLocalVariable: move member variables to the beginning of the class definition. > - jvmtiDeferredUpdates.hpp: add/remove empty lines and improve indentation. 
> - Merge branch 'master' into JDK-8227745 > - Merge branch 'master' into JDK-8227745 > - Make parameter current_thread of JvmtiEnvBase::check_top_frame() a JavaThread* again. > > With Asynchronous handshakes the type was changed from JavaThread* to Thread* > but this is not necessary as check_top_frame() is not executed during a handshake > / safepoint (robehn confirmed). > - ... and 9 more: https://git.openjdk.java.net/jdk/compare/d036dca0...d463b4f3 I tried to run testing with latest changes and latest JDK and build failed: src/hotspot/share/runtime/escapeBarrier.cpp:310:35: error: no matching function for call to 'StackFrameStream::StackFrameStream(JavaThread*&)' 310 | StackFrameStream fst(deoptee); ------------- Changes requested by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/119 From iklam at openjdk.java.net Sat Oct 10 00:34:15 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Sat, 10 Oct 2020 00:34:15 GMT Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive [v16] In-Reply-To: References: Message-ID: On Sat, 10 Oct 2020 00:08:26 GMT, Yumin Qi wrote: >> This patch is reorganized after 8252725, which is separated from this patch to refactor jlink glugin code. The previous >> webrev with hg can be found at: http://cr.openjdk.java.net/~minqi/2020/8247536/webrev-05. With 8252725 integrated, the >> regeneration of holder classes is simply to call the new added GenerateJLIClassesHelper.cdsGenerateHolderClasses >> function. Tests: tier1-4 > > Yumin Qi has updated the pull request incrementally with three additional commits since the last revision: > > - Correct typo for check message > - Removed try/catch from java side and moved output to vm. > - Make change to original indent Latest version LGTM. ------------- Marked as reviewed by iklam (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/193 From minqi at openjdk.java.net Sat Oct 10 02:11:12 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Sat, 10 Oct 2020 02:11:12 GMT Subject: Integrated: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive In-Reply-To: References: Message-ID: On Tue, 15 Sep 2020 18:57:55 GMT, Yumin Qi wrote: > This patch is reorganized after 8252725, which is separated from this patch to refactor jlink glugin code. The previous > webrev with hg can be found at: http://cr.openjdk.java.net/~minqi/2020/8247536/webrev-05. With 8252725 integrated, the > regeneration of holder classes is simply to call the new added GenerateJLIClassesHelper.cdsGenerateHolderClasses > function. Tests: tier1-4 This pull request has now been integrated. Changeset: e4469d2c Author: Yumin Qi URL: https://git.openjdk.java.net/jdk/commit/e4469d2c Stats: 551 lines in 22 files changed: 531 ins; 14 del; 6 mod 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive Reviewed-by: iklam, mchung ------------- PR: https://git.openjdk.java.net/jdk/pull/193 From fyang at openjdk.java.net Sat Oct 10 02:56:17 2020 From: fyang at openjdk.java.net (Fei Yang) Date: Sat, 10 Oct 2020 02:56:17 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v6] In-Reply-To: References: Message-ID: > Contributed-by: ard.biesheuvel at linaro.org, dongbo4 at huawei.com > > This added an intrinsic for SHA3 using aarch64 v8.2 SHA3 Crypto Extensions. > Reference implementation for core SHA-3 transform using ARMv8.2 Crypto Extensions: > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/crypto/sha3-ce-core.S?h=v5.4.52 > > Trivial adaptation in SHA3. implCompress is needed for the purpose of adding the intrinsic. > For SHA3, we need to pass one extra parameter "digestLength" to the stub for the calculation of block size. 
> "digestLength" is also used in for the EOR loop before keccak to differentiate different SHA3 variants. > > We added jtreg tests for SHA3 and used QEMU system emulator which supports SHA3 instructions to test the functionality. > Patch passed jtreg tier1-3 tests with QEMU system emulator. > Also verified with jtreg tier1-3 tests without SHA3 instructions on aarch64-linux-gnu and x86_64-linux-gnu, to make > sure that there's no regression. > We used one existing JMH test for performance test: test/micro/org/openjdk/bench/java/security/MessageDigests.java > We measured the performance benefit with an aarch64 cycle-accurate simulator. > Patch delivers 20% - 40% performance improvement depending on specific SHA3 digest length and size of the message. > > For now, this feature will not be enabled automatically for aarch64. We can auto-enable this when it is fully tested on > real hardware. But for the above testing purposes, this is auto-enabled when the corresponding hardware feature is > detected. Fei Yang has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains seven commits: - Merge master - Merge master - Add sha3 instructions to cpu/aarch64/aarch64-asmtest.py and regenerate the test in assembler_aarch64.cpp:asm_check - Rebase - Merge master - Fix trailing whitespace issue - 8252204: AArch64: Implement SHA3 accelerator/intrinsic Contributed-by: ard.biesheuvel at linaro.org, dongbo4 at huawei.com ------------- Changes: https://git.openjdk.java.net/jdk/pull/207/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=207&range=05 Stats: 1512 lines in 35 files changed: 1025 ins; 22 del; 465 mod Patch: https://git.openjdk.java.net/jdk/pull/207.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/207/head:pull/207 PR: https://git.openjdk.java.net/jdk/pull/207 From fyang at openjdk.java.net Sat Oct 10 06:16:17 2020 From: fyang at openjdk.java.net (Fei Yang) Date: Sat, 10 Oct 2020 06:16:17 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v7] In-Reply-To: References: Message-ID: > Contributed-by: ard.biesheuvel at linaro.org, dongbo4 at huawei.com > > This added an intrinsic for SHA3 using aarch64 v8.2 SHA3 Crypto Extensions. > Reference implementation for core SHA-3 transform using ARMv8.2 Crypto Extensions: > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/crypto/sha3-ce-core.S?h=v5.4.52 > > Trivial adaptation in SHA3. implCompress is needed for the purpose of adding the intrinsic. > For SHA3, we need to pass one extra parameter "digestLength" to the stub for the calculation of block size. > "digestLength" is also used in for the EOR loop before keccak to differentiate different SHA3 variants. > > We added jtreg tests for SHA3 and used QEMU system emulator which supports SHA3 instructions to test the functionality. > Patch passed jtreg tier1-3 tests with QEMU system emulator. > Also verified with jtreg tier1-3 tests without SHA3 instructions on aarch64-linux-gnu and x86_64-linux-gnu, to make > sure that there's no regression. 
> We used one existing JMH test for performance test: test/micro/org/openjdk/bench/java/security/MessageDigests.java > We measured the performance benefit with an aarch64 cycle-accurate simulator. > Patch delivers 20% - 40% performance improvement depending on specific SHA3 digest length and size of the message. > > For now, this feature will not be enabled automatically for aarch64. We can auto-enable this when it is fully tested on > real hardware. But for the above testing purposes, this is auto-enabled when the corresponding hardware feature is > detected. Fei Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: - Merge master - Merge master - Merge master - Add sha3 instructions to cpu/aarch64/aarch64-asmtest.py and regenerate the test in assembler_aarch64.cpp:asm_check - Rebase - Merge master - Fix trailing whitespace issue - 8252204: AArch64: Implement SHA3 accelerator/intrinsic Contributed-by: ard.biesheuvel at linaro.org, dongbo4 at huawei.com ------------- Changes: https://git.openjdk.java.net/jdk/pull/207/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=207&range=06 Stats: 1512 lines in 35 files changed: 1025 ins; 22 del; 465 mod Patch: https://git.openjdk.java.net/jdk/pull/207.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/207/head:pull/207 PR: https://git.openjdk.java.net/jdk/pull/207 From jbhateja at openjdk.java.net Sat Oct 10 06:32:12 2020 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Sat, 10 Oct 2020 06:32:12 GMT Subject: Integrated: 8252847: Optimize primitive arrayCopy stubs using AVX-512 masked instructions In-Reply-To: References: Message-ID: <_mGWUOZH852Ad58rrDEF6citxojOAY1Q-NU0BOWgVME=.2fcb9893-8f93-4f3b-a21b-f2f98069a971@github.com> On Mon, 7 Sep 2020 14:28:18 GMT, Jatin Bhateja wrote: > Summary: > > 1) New AVX3 optimized stubs for both conjoint and disjoint arraycopy. 
> 2) Special instruction sequence blocks for copy sizes between 32 and 192 bytes. > 3) Block copy operation above 192 bytes is performed using a destination-address-aligned PRE-MAIN-POST loop. The main loop > copies 192 bytes in one iteration and the tail part falls back to the special instruction sequence blocks. 4) Both the small copy blocks > and the aligned loop use 32 byte vector registers to prevent any frequency penalty for copy sizes less than AVX3Threshold. > 5) For block sizes above AVX3Threshold both the special blocks and the loop operate using 64 byte registers. 6) In case the user > sets the maximum vector size to 32 bytes, forward copy (disjoint) operations are done using efficient REP MOVS for copy > sizes above 4096 bytes. JMH Results: > System : CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz > Micros : test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java > Baseline : http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_Baseline.txt > WithOpt : http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_WithOpts.txt This pull request has now been integrated. Changeset: 4b5ac3ab Author: Jatin Bhateja URL: https://git.openjdk.java.net/jdk/commit/4b5ac3ab Stats: 1517 lines in 11 files changed: 1419 ins; 69 del; 29 mod 8252847: Optimize primitive arrayCopy stubs using AVX-512 masked instructions Reviewed-by: neliasso, kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/61 From iklam at openjdk.java.net Sat Oct 10 07:12:23 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Sat, 10 Oct 2020 07:12:23 GMT Subject: RFR: 8253402: Convert vmSymbols::SID to enum class [v3] In-Reply-To: References: Message-ID: > Convert `vmSymbols::SID` to an `enum class` to provide better type safety. > > - The original enum type `vmSymbols::SID` cannot be forward-declared. I moved it out of the `vmSymbols` class and > renamed it, so now it can be forward-declared as `enum class vmSymbolID : int;`, without including the large vmSymbols.hpp > file.
> - This also breaks the mutual dependency between the `vmSymbols` and `vmIntrinsics` classes. Now the declaration of > `vmIntrinsics` can be moved from vmSymbols.hpp to vmIntrinsics.hpp, where it naturally belongs. > - Type-safe enumeration (contributed by Kim Barrett) > for (vmSymbolID index : vmSymbolsIterator()) { > vm_symbol_index[as_int(index)] = index; > } > - I moved `vmSymbols::_symbols[]` to `Symbol::_vm_symbols[]`, and made it accessible via `Symbol::vm_symbol_at()`. This > way, header files (e.g. fieldInfo.hpp) that need to convert from `vmSymbolID` to `Symbol*` don't need to include the > large vmSymbols.hpp file. > - I changed the `VM_SYMBOL_ENUM_NAME` macro so that the users don't need to explicitly add the `vmSymbolID::` scope. > - I removed many unnecessary casts between `int` and `vmSymbolID`. > - The remaining casts are done via `vmSymbol::as_int()` and `vmSymbols::as_SID()` with range checks. > > ----- > If this is successful, I will do the same for `vmIntrinsics::ID`. Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains eight commits: - Use 2-style EnumIterator - Merge master into 8253402-convert-vmsymbols-sid-to-enum-class - more vmEnums.hpp fixes; fixed minimal VM build - Merge branch 'master' into 8253402-convert-vmsymbols-sid-to-enum-class - Moved forward declaration of vmSymbolID to vmEnums.hpp - clean up whitespaces and removed useless comment - removed unnecessary include - 8253402: Convert vmSymbols::SID to enum class ------------- Changes: https://git.openjdk.java.net/jdk/pull/276/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=276&range=02 Stats: 791 lines in 29 files changed: 470 ins; 144 del; 177 mod Patch: https://git.openjdk.java.net/jdk/pull/276.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/276/head:pull/276 PR: https://git.openjdk.java.net/jdk/pull/276 From iklam at openjdk.java.net Sat Oct 10 07:17:12 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Sat, 10 Oct 2020 07:17:12 GMT Subject: RFR: 8253402: Convert vmSymbols::SID to enum class [v2] In-Reply-To: References: <2OwQBbIcnjHfvfRcyRo0evb8tFPgNSP_n91VPhJwASc=.84300d55-aa7b-4dec-acbb-94a828adce58@github.com> Message-ID: <5hDY8l3n9oNDKQQEvZAr1f5hfiqq23cj4G8GvLaPQtw=.0cc04d8f-ff0e-4362-8bc0-0b65691528bf@github.com> On Sun, 27 Sep 2020 11:09:30 GMT, Kim Barrett wrote: >> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes >> the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last >> revision: >> - more vmEnums.hpp fixes; fixed minimal VM build >> - Merge branch 'master' into 8253402-convert-vmsymbols-sid-to-enum-class >> - Moved forward declaration of vmSymbolID to vmEnums.hpp >> - clean up whitespaces and removed useless comment >> - removed unnecessary include >> - 8253402: Convert vmSymbols::SID to enum class > > src/hotspot/share/utilities/enumIterator.hpp line 1: > >> 1: /* > > I think there are problems with this EnumIterator class. It was based in part on a prototype of mine, which I've been > revisiting and revising, because I think it has problems. That prototype, in turn, was based on the existing > WeakProcessorPhases::Iterator, which I think has some of the same failings. And I think this version expands on some of > those problems. I don't yet have a full review, but below are some observations and issues. I'm working on an > alternative. A fundamental question is what style of "iterator" do we want. (1) One style is self-contained; you > create a single iterator which knows both the current position and the iteration limit, and step until a predicate > (is_end() in the current code) is true. (2) Another style is to have a pair of iterators, one designating the current > position and the other designating the iteration limit. This is the style used by the C++ Standard Library. > Both my earlier prototype and the EnumIterator in this PR are 1-style but attempt (not necessarily very well, in my > opinion) to also provide 2-style behavior. The point of that currently seems to be to support the new "range-based for" > feature. (Said feature is currently not in the permitted list according to the Style Guide. I intentionally left it out > because I think its utility is pretty strongly dependent on adopting 2-style iterators, which is not very well > motivated without using the Standard Library.) 
One requirement for an enum iterator (for me) is that it doesn't > require a "fake" enumerator that designates the exclusive end of the range. The current proposal fails that test. A > problem with all of the variants is that they are trying to be both 1-style (providing is_end) and 2-style (providing > begin/end), with the result that they do neither well. This is especially true for the variant in the PR. I think part > of the problem is that the begin/end functions don't belong to the iterator class; they should be part of a separate > range class. Aside: I think it is possible to provide iteration that doesn't assume sequential enumerators if one is > willing to have some code duplication or has an enum that is x-macro based. While possibly an interesting exercise, I > doubt that's worth pursuing. Just mentioning it in case anyone thinks this would actually be useful. I'm not certain how > to proceed. Maybe this should be moved elsewhere as not yet ready to be a widely used "utility"? Or maybe go ahead with > it with the intention of improving it?

Hi Kim,

I have integrated the 2-style enumerator that you sent me off-line. Usage info (see enumIterator.hpp for details).

// Example (see vmSymbols.hpp/cpp)
//
// ENUMERATOR_RANGE(vmSymbolID, vmSymbolID::FIRST_SID, vmSymbolID::LAST_SID)
// constexpr EnumRange<vmSymbolID> vmSymbolsRange;
// using vmSymbolsIterator = EnumIterator<vmSymbolID>;
//
// /* Without range-based for, allowed */
// for (vmSymbolsIterator it = vmSymbolsRange.begin(); it != vmSymbolsRange.end(); ++it) {
//   vmSymbolID index = *it; ....
// }
//
// /* With range-based for, not allowed by HotSpot coding style yet */
// for (vmSymbolID index : vmSymbolsRange) {
//   ....
// }

I have rewritten all the iteration loops using the "Without range-based for" style. We can change to the "With range-based for" style when the HotSpot Coding Style Guide allows it.
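
For readers of the archive, the 2-style idea discussed above (a separate range object that owns begin/end, and an iterator that only knows its current position) can be sketched roughly as follows. This is a minimal illustration only and is not the code in enumIterator.hpp: `Color`, `ColorIterator`, `ColorRange`, `as_int` and `collect_all` are hypothetical names, and the sketch assumes sequential enumerators with inclusive `FIRST`/`LAST` aliases so that no "fake" one-past-the-end enumerator is needed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical enum standing in for something like vmSymbolID.
// FIRST and LAST are inclusive aliases, so there is no "fake" end enumerator.
enum class Color : int { Red, Green, Blue, FIRST = Red, LAST = Blue };

// 2-style iterator: it only knows the current position, not the limit.
class ColorIterator {
  int _value;
public:
  constexpr explicit ColorIterator(int value) : _value(value) {}
  constexpr Color operator*() const { return static_cast<Color>(_value); }
  ColorIterator& operator++() { ++_value; return *this; }
  constexpr bool operator==(const ColorIterator& other) const { return _value == other._value; }
  constexpr bool operator!=(const ColorIterator& other) const { return _value != other._value; }
};

// The range owns begin()/end(). end() is "LAST + 1" as an integer position;
// it is never dereferenced, so it never has to name a real enumerator.
struct ColorRange {
  constexpr ColorIterator begin() const { return ColorIterator(static_cast<int>(Color::FIRST)); }
  constexpr ColorIterator end() const { return ColorIterator(static_cast<int>(Color::LAST) + 1); }
};

// Range-checked conversion from the enum to int (assumption: debug-only check).
inline int as_int(Color c) {
  int v = static_cast<int>(c);
  assert(v >= static_cast<int>(Color::FIRST) && v <= static_cast<int>(Color::LAST));
  return v;
}

// Iterate without range-based for, mirroring the loop style shown above.
inline std::vector<int> collect_all() {
  std::vector<int> out;
  ColorRange range;
  for (ColorIterator it = range.begin(); it != range.end(); ++it) {
    out.push_back(as_int(*it));
  }
  return out;
}
```

Because `ColorRange` provides `begin()` and `end()`, the same type would also work with a range-based for loop once a code base permits it.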
------------- PR: https://git.openjdk.java.net/jdk/pull/276 From rrich at openjdk.java.net Sat Oct 10 08:34:23 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Sat, 10 Oct 2020 08:34:23 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v10] In-Reply-To: References: Message-ID: > Hi, > > this is the continuation of the review of the implementation for: > > https://bugs.openjdk.java.net/browse/JDK-8227745 > https://bugs.openjdk.java.net/browse/JDK-8233915 > > It allows for JIT optimizations based on escape analysis even if JVMTI agents acquire capabilities to access references > to objects that are subject to such optimizations, e.g. scalar replacement. The implementation reverts such > optimizations just before access very much as when switching from JIT compiled execution to the interpreter, aka > "deoptimization". Webrev.8 was the last one before before the transition to Git/Github: > > http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ > > Thanks, Richard. Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits: - The constructor of StackFrameStream takes more parameters after JDK-8253180 - Merge branch 'master' into JDK-8227745 - Merge branch 'master' into JDK-8227745 - Merge branch 'master' into JDK-8227745 - Factorized fragment out of EscapeBarrier::deoptimize_objects_internal into new method in compiledVFrame. - More smaller changes proposed by Serguei. - jvmtiDeferredUpdates.hpp: remove forward declarations. - jvmtiDeferredLocalVariable: move member variables to the beginning of the class definition. - jvmtiDeferredUpdates.hpp: add/remove empty lines and improve indentation. - Merge branch 'master' into JDK-8227745 - ... 
and 11 more: https://git.openjdk.java.net/jdk/compare/aaa0a2a0...06b139a9 ------------- Changes: https://git.openjdk.java.net/jdk/pull/119/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=119&range=09 Stats: 5814 lines in 52 files changed: 5595 ins; 116 del; 103 mod Patch: https://git.openjdk.java.net/jdk/pull/119.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/119/head:pull/119 PR: https://git.openjdk.java.net/jdk/pull/119 From rrich at openjdk.java.net Sat Oct 10 08:37:12 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Sat, 10 Oct 2020 08:37:12 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v9] In-Reply-To: References: <5O9n8cKBJyjhp2cNOVD2PcpKQiXqEs5BJjkW1lH-5EM=.044a510e-6517-4564-a3db-00c7951f0b22@github.com> Message-ID: On Sat, 10 Oct 2020 00:01:49 GMT, Vladimir Kozlov wrote: > > > Compiler changes seems fine. Thank you for looking again at this. ------------- PR: https://git.openjdk.java.net/jdk/pull/119 From rrich at openjdk.java.net Sat Oct 10 09:05:11 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Sat, 10 Oct 2020 09:05:11 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v9] In-Reply-To: References: <5O9n8cKBJyjhp2cNOVD2PcpKQiXqEs5BJjkW1lH-5EM=.044a510e-6517-4564-a3db-00c7951f0b22@github.com> Message-ID: On Sat, 10 Oct 2020 00:25:52 GMT, Vladimir Kozlov wrote: > > > I tried to run testing with latest changes and latest JDK and build failed: > src/hotspot/share/runtime/escapeBarrier.cpp:310:35: error: no matching function for call to > 'StackFrameStream::StackFrameStream(JavaThread*&)' 310 | StackFrameStream fst(deoptee); I noticed this too. I wanted to test with ZGC before pushing the small fix. 
Unfortunately I get # Internal Error (/priv/d038402/git/reinrich/jdk_ea_new/src/hotspot/share/runtime/stackWatermark.inline.hpp:67), pid=90890, tid=90912 # assert(processing_started()) failed: Processing should already have started [...] Current thread (0x00007f749c25b1c0): JavaThread "JDWP Transport Listener: dt_socket" daemon [_thread_in_vm, id=90912, stack(0x00007f7474c9f000,0x00007f7474da0000)] _threads_hazard_ptr=0x00007f749c2b00c0, _nested_threads_hazard_ptr_cnt=0 Stack: [0x00007f7474c9f000,0x00007f7474da0000], sp=0x00007f7474d9c240, free space=1012k Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x15b3255] StackWatermarkSet::on_iteration(JavaThread*, frame const&)+0xa5 V [libjvm.so+0xa1024f] frame::sender(RegisterMap*) const+0x13f V [libjvm.so+0xa048f8] frame::real_sender(RegisterMap*) const+0x18 V [libjvm.so+0x176261b] vframe::sender() const+0xeb V [libjvm.so+0x16cd56b] JavaThread::last_java_vframe(RegisterMap*)+0x5b V [libjvm.so+0xfa7a56] JvmtiEnvBase::vframeFor(JavaThread*, int)+0x46 V [libjvm.so+0xfab8e5] JvmtiEnvBase::check_top_frame(JavaThread*, JavaThread*, jvalue, TosState, Handle*)+0x1f5 V [libjvm.so+0xfac13e] JvmtiEnvBase::force_early_return(JavaThread*, jvalue, TosState)+0x15e V [libjvm.so+0xf36fa8] jvmti_ForceEarlyReturnLong+0x258 C [libjdwp.so+0xa8b3] forceEarlyReturn+0x293 C [libjdwp.so+0x12945] debugLoop_run+0x1f5 C [libjdwp.so+0x25bb3] attachThread+0x33 V [libjvm.so+0xfcf524] JvmtiAgentThread::call_start_function()+0x1d4 V [libjvm.so+0x16cc8f7] JavaThread::thread_main_inner()+0x247 V [libjvm.so+0x16d1ce8] Thread::call_run()+0xf8 V [libjvm.so+0x12dd75e] thread_native_entry(Thread*)+0x10e In the test case `EAForceEarlyReturnOfInlinedMethodWithScalarReplacedObjectsReallocFailure` of the new test `jdk/com/sun/jdi/EATests.java` So far I do not have an indication that the failure is caused by this change but when I run the test with -XX:-DoEscapeAnalysis then the test succeeds. 
I need to look more into it. Wish I was a ZGC expert :) Anyway I pushed the build fix. Tests succeed with default GC. ------------- PR: https://git.openjdk.java.net/jdk/pull/119 From fyang at openjdk.java.net Sat Oct 10 13:09:11 2020 From: fyang at openjdk.java.net (Fei Yang) Date: Sat, 10 Oct 2020 13:09:11 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v5] In-Reply-To: <4NM17B6l4GvNgCbmmQTUcnfZTA6G-IEc85O8jH_q-xA=.63b10da7-bab7-44bc-a4c8-0a675aca45c0@github.com> References: <4NM17B6l4GvNgCbmmQTUcnfZTA6G-IEc85O8jH_q-xA=.63b10da7-bab7-44bc-a4c8-0a675aca45c0@github.com> Message-ID: <7CXYOoHPTvfS6YvwFjdlO27rQKRbDu3_QSGP7vDuyDs=.41789630-e05f-4f8a-8562-ad8bb74e12aa@github.com> On Fri, 9 Oct 2020 17:35:22 GMT, Andrew Haley wrote: > I see Linux x64 failed. However, I don't seem to be able to withdraw my patch approval. > However, please consider it withdrawn. Thanks for approving this patch. I checked the error messages and I think the failures were not caused by this patch. The failures have been fixed by the following two commits: commit ec41046c5ce7077eebf4a3c265f79c7fba33d916 8254348: Build fails when cds is disabled after JDK-8247536 commit aaa0a2a04792d7c84150e9d972790978ffcc6890 8254297: Zero and Minimal VMs are broken with undeclared identifier 'DerivedPointerTable' after JDK-8253180 The testing was triggered again automatically after I merged master and I see it passes now. Do you have any comments on the discussion here? https://github.com/openjdk/jdk/pull/207#issuecomment-701243662 Valerie Peng has checked the java security changes, i.e. src/java.base/share/classes/sun/security/provider/SHA3.java. Do you think we need another reviewer for this patch?
------------- PR: https://git.openjdk.java.net/jdk/pull/207 From kcr at openjdk.java.net Sat Oct 10 13:19:09 2020 From: kcr at openjdk.java.net (Kevin Rushforth) Date: Sat, 10 Oct 2020 13:19:09 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v5] In-Reply-To: <4NM17B6l4GvNgCbmmQTUcnfZTA6G-IEc85O8jH_q-xA=.63b10da7-bab7-44bc-a4c8-0a675aca45c0@github.com> References: <4NM17B6l4GvNgCbmmQTUcnfZTA6G-IEc85O8jH_q-xA=.63b10da7-bab7-44bc-a4c8-0a675aca45c0@github.com> Message-ID: On Fri, 9 Oct 2020 17:35:22 GMT, Andrew Haley wrote: >> Marked as reviewed by aph (Reviewer). > > I see Linux x64 failed. However, I don't seem to be able to withdraw my patch approval. > However, please consider it withdrawn. @theRealAph if you still need to, you can withdraw your approval by reviewing it again and selecting "Request changes". ------------- PR: https://git.openjdk.java.net/jdk/pull/207 From redestad at openjdk.java.net Sat Oct 10 15:07:17 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Sat, 10 Oct 2020 15:07:17 GMT Subject: RFR: 8254353: Remove unused non-product flags Message-ID: <_GO_6KaOEco1LKOP8DYAyXIp81VU90dbQYVcKHKJ4yA=.e6e6c9e4-7801-4d19-8731-9784960a41f5@github.com> Found a few non-product flags that has no implementation and could be removed: StressDerivedPointers PrintVMMessages TraceProfileInterpreter VerifyCompiledCode ProfilerNodeSize I suggest removing them. 
------------- Commit messages: - Merge branch 'master' into remove_unused_flags - Merge branch 'master' into remove_unused_flags - Remove various unused non-product flags Changes: https://git.openjdk.java.net/jdk/pull/590/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=590&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254353 Stats: 25 lines in 3 files changed: 0 ins; 22 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/590.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/590/head:pull/590 PR: https://git.openjdk.java.net/jdk/pull/590 From iignatyev at openjdk.java.net Sat Oct 10 16:06:07 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Sat, 10 Oct 2020 16:06:07 GMT Subject: RFR: 8254353: Remove unused non-product flags In-Reply-To: <_GO_6KaOEco1LKOP8DYAyXIp81VU90dbQYVcKHKJ4yA=.e6e6c9e4-7801-4d19-8731-9784960a41f5@github.com> References: <_GO_6KaOEco1LKOP8DYAyXIp81VU90dbQYVcKHKJ4yA=.e6e6c9e4-7801-4d19-8731-9784960a41f5@github.com> Message-ID: On Sat, 10 Oct 2020 15:01:39 GMT, Claes Redestad wrote: > Found a few non-product flags that has no implementation and could be removed: > > StressDerivedPointers > PrintVMMessages > TraceProfileInterpreter > VerifyCompiledCode > ProfilerNodeSize > > I suggest removing them. LGTM ------------- Marked as reviewed by iignatyev (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/590 From rkennke at openjdk.java.net Sat Oct 10 21:27:19 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Sat, 10 Oct 2020 21:27:19 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v5] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference > and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity) have been > processed during the final-mark pause. Workloads > that make heavy use of such weak references will therefore potentially cause significant GC pauses. There are 3 main > items that contribute to pause time linear to number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through subgraph of objects that are reachable only through FinalReference. Notice that this is > theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly > reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires extending the marking > bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever > marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all > objects starting from the referent.
When marking encounters finalizably reachable objects while marking strongly, it > will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter > of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for > FinalReference, marking stops there, and does not mark through the referent. 2. Concurrent processing is performed > after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and > depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list > (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no > referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers > in Reference.get() intrinsics that return NULL when the referent is not reachable. Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 41 commits: - Merge remote-tracking branch 'upstream/master' into shenandoah-concurrent-weakrefs - Prevent double-discovery of references - Also abandon pending list when abandoning discovered lists - Implement reference-processing statistics - Implement abandoning partial discovery. Remove unused methods. - Remove leftovers of precleaning - Remove unused is_access_on_jlr_reference() helper method - Invert weak/native condition in interpreter native-LRB for clarity - Merge remote-tracking branch 'upstream/master' into shenandoah-concurrent-weakrefs - Use existing idiom for checking for intanceRefKlass, instead of introducing new code - ... 
and 31 more: https://git.openjdk.java.net/jdk/compare/cc52358c...da242318 ------------- Changes: https://git.openjdk.java.net/jdk/pull/505/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=04 Stats: 2429 lines in 54 files changed: 1654 ins; 602 del; 173 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From rrich at openjdk.java.net Sun Oct 11 07:23:15 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Sun, 11 Oct 2020 07:23:15 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v9] In-Reply-To: References: <5O9n8cKBJyjhp2cNOVD2PcpKQiXqEs5BJjkW1lH-5EM=.044a510e-6517-4564-a3db-00c7951f0b22@github.com> Message-ID: On Sat, 10 Oct 2020 09:01:22 GMT, Richard Reingruber wrote: >> I tried to run testing with latest changes and latest JDK and build failed: >> src/hotspot/share/runtime/escapeBarrier.cpp:310:35: error: no matching function for call to >> 'StackFrameStream::StackFrameStream(JavaThread*&)' >> 310 | StackFrameStream fst(deoptee); > >> >> >> I tried to run testing with latest changes and latest JDK and build failed: >> src/hotspot/share/runtime/escapeBarrier.cpp:310:35: error: no matching function for call to >> 'StackFrameStream::StackFrameStream(JavaThread*&)' 310 | StackFrameStream fst(deoptee); > > I noticed this too. I wanted to test with ZGC before pushing the small > fix. Unfortunately I get > > # Internal Error (/priv/d038402/git/reinrich/jdk_ea_new/src/hotspot/share/runtime/stackWatermark.inline.hpp:67), > pid=90890, tid=90912 # assert(processing_started()) failed: Processing should already have started > > [...] 
> > Current thread (0x00007f749c25b1c0): JavaThread "JDWP Transport Listener: dt_socket" daemon [_thread_in_vm, id=90912, > stack(0x00007f7474c9f000,0x00007f7474da0000)] _threads_hazard_ptr=0x00007f749c2b00c0, _nested_threads_hazard_ptr_cnt=0 > Stack: [0x00007f7474c9f000,0x00007f7474da0000], sp=0x00007f7474d9c240, free space=1012k > Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x15b3255] StackWatermarkSet::on_iteration(JavaThread*, frame const&)+0xa5 > V [libjvm.so+0xa1024f] frame::sender(RegisterMap*) const+0x13f > V [libjvm.so+0xa048f8] frame::real_sender(RegisterMap*) const+0x18 > V [libjvm.so+0x176261b] vframe::sender() const+0xeb > V [libjvm.so+0x16cd56b] JavaThread::last_java_vframe(RegisterMap*)+0x5b > V [libjvm.so+0xfa7a56] JvmtiEnvBase::vframeFor(JavaThread*, int)+0x46 > V [libjvm.so+0xfab8e5] JvmtiEnvBase::check_top_frame(JavaThread*, JavaThread*, jvalue, TosState, Handle*)+0x1f5 > V [libjvm.so+0xfac13e] JvmtiEnvBase::force_early_return(JavaThread*, jvalue, TosState)+0x15e > V [libjvm.so+0xf36fa8] jvmti_ForceEarlyReturnLong+0x258 > C [libjdwp.so+0xa8b3] forceEarlyReturn+0x293 > C [libjdwp.so+0x12945] debugLoop_run+0x1f5 > C [libjdwp.so+0x25bb3] attachThread+0x33 > V [libjvm.so+0xfcf524] JvmtiAgentThread::call_start_function()+0x1d4 > V [libjvm.so+0x16cc8f7] JavaThread::thread_main_inner()+0x247 > V [libjvm.so+0x16d1ce8] Thread::call_run()+0xf8 > V [libjvm.so+0x12dd75e] thread_native_entry(Thread*)+0x10e > > In the test case > `EAForceEarlyReturnOfInlinedMethodWithScalarReplacedObjectsReallocFailure` of the > new test `jdk/com/sun/jdi/EATests.java` > > So far I do not have an indication that the failure is caused by this change but > when I run the test with -XX:-DoEscapeAnalysis then the test succeeds. > > I need to look more into it. Wish I was a ZGC expert :) > > Anyway I pushed the build fix. Tests succeed with default GC. 
The crash described above happens after JDK-8253180 (https://github.com/openjdk/jdk/commit/b9873e18330b7e43ca47bc1c0655e7ab20828f7a) when executing `EATests.java` with ZGC: make run-test TEST=test/jdk/com/sun/jdi/EATests.java JTREG=VM_OPTIONS=-XX:+UseZGC My understanding of JDK-8253180 (and ZGC) is rather vague. To me it looks as if stackwalks outside of a safepoint/handshake on suspended threads are currently not supported. It would be my understanding that `StackWatermarkSet::start_processing()` needs to be called before walking the stack of a thread. Currently this is only done in preparation of a safepoint or handshake. `JvmtiEnvBase::check_top_frame()` walks the stack of a suspended thread without safepoint/handshake. This triggers the crash in my opinion. When `StackWatermarkSet::start_processing()` is called beforehand, the test succeeds. I will ask Erik Österlund about this. ------------- PR: https://git.openjdk.java.net/jdk/pull/119 From kvn at openjdk.java.net Sun Oct 11 19:45:12 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Sun, 11 Oct 2020 19:45:12 GMT Subject: RFR: 8254353: Remove unused non-product flags In-Reply-To: <_GO_6KaOEco1LKOP8DYAyXIp81VU90dbQYVcKHKJ4yA=.e6e6c9e4-7801-4d19-8731-9784960a41f5@github.com> References: <_GO_6KaOEco1LKOP8DYAyXIp81VU90dbQYVcKHKJ4yA=.e6e6c9e4-7801-4d19-8731-9784960a41f5@github.com> Message-ID: On Sat, 10 Oct 2020 15:01:39 GMT, Claes Redestad wrote: > Found a few non-product flags that has no implementation and could be removed: > > StressDerivedPointers > PrintVMMessages > TraceProfileInterpreter > VerifyCompiledCode > ProfilerNodeSize > > I suggest removing them. Looks good. Please run hs-tier1 before integration. ------------- Marked as reviewed by kvn (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/590 From redestad at openjdk.java.net Sun Oct 11 20:15:10 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Sun, 11 Oct 2020 20:15:10 GMT Subject: RFR: 8254353: Remove unused non-product flags In-Reply-To: References: <_GO_6KaOEco1LKOP8DYAyXIp81VU90dbQYVcKHKJ4yA=.e6e6c9e4-7801-4d19-8731-9784960a41f5@github.com> Message-ID: On Sun, 11 Oct 2020 19:42:45 GMT, Vladimir Kozlov wrote: > Looks good. Please run hs-tier1 before integration. Thanks! Does hs-tier1 cover anything the automatic pre-integration does not? I got some build failures on linux due the MaxVectorSize issue fixed by https://github.com/openjdk/jdk/pull/588/files but otherwise tier1 testing looks good ------------- PR: https://git.openjdk.java.net/jdk/pull/590 From kvn at openjdk.java.net Sun Oct 11 20:27:09 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Sun, 11 Oct 2020 20:27:09 GMT Subject: RFR: 8254353: Remove unused non-product flags In-Reply-To: References: <_GO_6KaOEco1LKOP8DYAyXIp81VU90dbQYVcKHKJ4yA=.e6e6c9e4-7801-4d19-8731-9784960a41f5@github.com> Message-ID: On Sun, 11 Oct 2020 20:12:10 GMT, Claes Redestad wrote: > > Looks good. Please run hs-tier1 before integration. > > Thanks! > > Does hs-tier1 cover anything the automatic pre-integration does not? I got some build failures on linux due the > MaxVectorSize issue fixed by https://github.com/openjdk/jdk/pull/588/files but otherwise tier1 testing looks good Aarch64 build and testing. And I am not sure Git testing matches exact what we run in mach5. 
------------- PR: https://git.openjdk.java.net/jdk/pull/590 From redestad at openjdk.java.net Sun Oct 11 22:02:09 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Sun, 11 Oct 2020 22:02:09 GMT Subject: RFR: 8254353: Remove unused non-product flags In-Reply-To: References: <_GO_6KaOEco1LKOP8DYAyXIp81VU90dbQYVcKHKJ4yA=.e6e6c9e4-7801-4d19-8731-9784960a41f5@github.com> Message-ID: <8jammsBuuKCeyA_9Se4U_TWWenjZ6-ZCndPaYmkcWsY=.41a46602-c544-4724-8a6b-460a071ca0ba@github.com> On Sun, 11 Oct 2020 20:24:16 GMT, Vladimir Kozlov wrote: >>> Looks good. Please run hs-tier1 before integration. >> >> Thanks! >> >> Does hs-tier1 cover anything the automatic pre-integration does not? I got some build failures on linux due the >> MaxVectorSize issue fixed by https://github.com/openjdk/jdk/pull/588/files but otherwise tier1 testing looks good > >> > Looks good. Please run hs-tier1 before integration. >> >> Thanks! >> >> Does hs-tier1 cover anything the automatic pre-integration does not? I got some build failures on linux due the >> MaxVectorSize issue fixed by https://github.com/openjdk/jdk/pull/588/files but otherwise tier1 testing looks good > > Aarch64 build and testing. And I am not sure Git testing matches exact what we run in mach5. hs-tier1 pass, also ran a few other build-only jobs that should cover various other builds. 
------------- PR: https://git.openjdk.java.net/jdk/pull/590 From redestad at openjdk.java.net Sun Oct 11 22:02:10 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Sun, 11 Oct 2020 22:02:10 GMT Subject: Integrated: 8254353: Remove unused non-product flags In-Reply-To: <_GO_6KaOEco1LKOP8DYAyXIp81VU90dbQYVcKHKJ4yA=.e6e6c9e4-7801-4d19-8731-9784960a41f5@github.com> References: <_GO_6KaOEco1LKOP8DYAyXIp81VU90dbQYVcKHKJ4yA=.e6e6c9e4-7801-4d19-8731-9784960a41f5@github.com> Message-ID: On Sat, 10 Oct 2020 15:01:39 GMT, Claes Redestad wrote: > Found a few non-product flags that has no implementation and could be removed: > > StressDerivedPointers > PrintVMMessages > TraceProfileInterpreter > VerifyCompiledCode > ProfilerNodeSize > > I suggest removing them. This pull request has now been integrated. Changeset: 77c77627 Author: Claes Redestad URL: https://git.openjdk.java.net/jdk/commit/77c77627 Stats: 25 lines in 3 files changed: 0 ins; 22 del; 3 mod 8254353: Remove unused non-product flags Reviewed-by: iignatyev, kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/590 From iklam at openjdk.java.net Sun Oct 11 22:54:21 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Sun, 11 Oct 2020 22:54:21 GMT Subject: RFR: 8253402: Convert vmSymbols::SID to enum class [v4] In-Reply-To: References: Message-ID: <9m7uFY5ij94oj3SQ9pTHNq-tsw0NnPHDVqHhznmAuOo=.bc75f19e-0527-4962-8b19-b178cfa8e572@github.com> > Convert `vmSymbols::SID` to an `enum class` to provide better type safety. > > - The original enum type `vmSymbols::SID` cannot be forward-declared. I moved it out of the `vmSymbols` class and > renamed, so now it can be forward-declared as `enum class vmSymbolID : int;`, without including the large vmSymbols.hpp > file. > - This also breaks the mutual dependency between the `vmSymbols` and `vmIntrinsics` classes. Now the declaration of > `vmIntrinsics` can be moved from vmSymbols.hpp to vmIntrinsics.hpp, where it naturally belongs. 
> - Type-safe enumeration (contributed by Kim Barrett) > for (vmSymbolID index : vmSymbolsIterator()) { > vm_symbol_index[as_int(index)] = index; > } > - I moved `vmSymbols::_symbols[]` to `Symbol::_vm_symbols[]`, and made it accessible via `Symbol::vm_symbol_at()`. This > way, header files (e.g. fieldInfo.hpp) that need to convert from `vmSymbolID` to `Symbol*` don't need to include the > large vmSymbols.hpp file. > - I changed the `VM_SYMBOL_ENUM_NAME` macro so that the users don't need to explicitly add the `vmSymbolID::` scope. > - I removed many unnecessary casts between `int` and `vmSymbolID`. > - The remaining casts are done via `vmSymbol::as_int()` and `vmSymbols::as_SID()` with range checks. > > ----- > If this is successful, I will do the same for `vmIntrinsics::ID`. Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: added missing #include from enumIterator.hpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/276/files - new: https://git.openjdk.java.net/jdk/pull/276/files/9deb6811..9ddca08f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=276&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=276&range=02-03 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/276.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/276/head:pull/276 PR: https://git.openjdk.java.net/jdk/pull/276 From darcy at openjdk.java.net Mon Oct 12 03:00:14 2020 From: darcy at openjdk.java.net (Joe Darcy) Date: Mon, 12 Oct 2020 03:00:14 GMT Subject: RFR: 8223347: Integration of Vector API (Incubator) In-Reply-To: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> References: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> Message-ID: On Fri, 25 Sep 2020 20:14:29 GMT, Paul Sandoz wrote: > This pull request is for integration of the Vector API. 
It was previously reviewed under conditions when mercurial was > used for the source code control system. Review threads can be found here (searching for issue number 8223347 in the > title): https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-April/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-May/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-July/thread.html > > If mercurial was still being used the code would be pushed directly, once the CSR is approved. However, in this case a > pull request is required and needs explicit reviewer approval. Between the final review and this pull request no code > has changed, except for that related to merging. Marked as reviewed by darcy (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/367 From fyang at openjdk.java.net Mon Oct 12 07:05:16 2020 From: fyang at openjdk.java.net (Fei Yang) Date: Mon, 12 Oct 2020 07:05:16 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v5] In-Reply-To: References: <4NM17B6l4GvNgCbmmQTUcnfZTA6G-IEc85O8jH_q-xA=.63b10da7-bab7-44bc-a4c8-0a675aca45c0@github.com> Message-ID: On Sat, 10 Oct 2020 13:15:51 GMT, Kevin Rushforth wrote: >> I see Linux x64 failed. However, I don't seem to be able to withdraw my patch approval. >> However, please consider it withdrawn. > > @theRealAph if you still need to, you can withdraw your approval by reviewing it again and selecting "Request changes". > I have looked at the java security changes, i.e. src/java.base/share/classes/sun/security/provider/SHA3.java. It looks > fine. @valeriepeng : I see you are not listed under "Reviewers" commit message part, could you please press the magic button(s)(approve?) so you get the credit? Thanks. 
------------- PR: https://git.openjdk.java.net/jdk/pull/207 From ihse at openjdk.java.net Mon Oct 12 10:32:13 2020 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Mon, 12 Oct 2020 10:32:13 GMT Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build [v2] In-Reply-To: References: Message-ID: On Thu, 8 Oct 2020 20:28:33 GMT, Bernhard Urban-Forster wrote: >> I organized this PR so that each commit contains the warning emitted by MSVC as commit message and its relevant fix. >> >> Verified on >> * Linux+ARM64: `{hotspot,jdk,langtools}:tier1`, no failures. >> * Windows+ARM64: `{hotspot,jdk,langtools}:tier1`, no (new) failures. >> * internal macOS+ARM64 port: build without `--disable-warnings-as-errors` still works. Just mentioning this here, because >> it's yet another toolchain (Xcode / clang) that needs to be kept happy [going >> forward](https://openjdk.java.net/jeps/391). > > Bernhard Urban-Forster has updated the pull request with a new target base due to a merge or a rebase. 
The pull request > now contains 18 commits: > - Merge remote-tracking branch 'upstream/master' into 8254072-fix-windows-arm64-warnings > - ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): warning C4267: 'argument': conversion from 'size_t' to > 'int', possible loss of data > ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1446): warning C4267: 'argument': conversion from 'size_t' to > 'int', possible loss of data ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1654): warning C4267: 'argument': > conversion from 'size_t' to 'int', possible loss of data > - Revert changes for "warning C4146: unary minus operator applied to unsigned type, result still unsigned" > - msvc: disable unary minus warning for unsigned types > - ./src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp(1123): warning C4267: 'initializing': conversion > from 'size_t' to 'int', possible loss of data > ./src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp(1123): warning C4267: 'initializing': conversion > from 'size_t' to 'const int', possible loss of data > - ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1312): warning C4267: 'argument': conversion from 'size_t' to > 'unsigned int', possible loss of data > ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1370): warning C4267: 'argument': conversion from 'size_t' to > 'int', possible loss of data ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): warning C4146: unary minus > operator applied to unsigned type, result still unsigned ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): > warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data > - ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp(2472): warning C4312: 'type cast': conversion from 'unsigned int' > to 'address' of greater size > - ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp(1527): warning C4267: 'argument': conversion from 'size_t' to > 'int', possible loss of data > - 
./src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp(2901): warning C4267: 'initializing': conversion from 'size_t' to > 'int', possible loss of data > ./src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp(2901): warning C4267: 'initializing': conversion from 'size_t' to > 'const int', possible loss of data > - ./src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp(2756): warning C4146: unary minus operator applied to unsigned > type, result still unsigned > - ... and 8 more: https://git.openjdk.java.net/jdk/compare/5351ba6c...a081dfb4 Changes requested by ihse (Reviewer). make/autoconf/flags-cflags.m4 line 137: > 135: WARNINGS_ENABLE_ALL="-W3" > 136: DISABLED_WARNINGS="4800" > 137: DISABLED_WARNINGS+=" 4146" # unary minus operator applied to unsigned type, result still unsigned This change will affect *all* JDK code. I'm not sure this was intended? If it was intended, I think you need to motivate this more explicitly. If you only wanted to disable the warning for hotspot, the proper solution would be to add it to DISABLED_WARNINGS_microsoft in make/hotspot/lib/CompileJvm.gmk. ------------- PR: https://git.openjdk.java.net/jdk/pull/530 From mcimadamore at openjdk.java.net Mon Oct 12 10:50:48 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Mon, 12 Oct 2020 10:50:48 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v6] In-Reply-To: References: Message-ID: > This patch contains the changes associated with the third incubation round of the foreign memory access API incubation > (see JEP 393 [1]). 
This iteration focuses on improving the usability of the API in 3 main ways: > * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from > multiple threads > * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee > that the memory will be deallocated, eventually > * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class > has been added, which defines several useful dereference routines; these are really just thin wrappers around memory > access var handles, but they make the barrier of entry for using this API somewhat lower. > > A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not > the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link > to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit > of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which > wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as > dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability > in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; > secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can > use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done > by calling `MemoryAddress::asSegmentRestricted`). A list of the API, implementation and test changes is provided > below.
If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be > happy to point at existing discussions, and/or to provide the feedback required. A big thanks to Erik Osterlund, > Vladimir Ivanov and David Holmes, without whom the work on shared memory segment would not have been possible; also I'd > like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. Thanks Maurizio > Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html > Specdiff: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254163 > > > > ### API Changes > > * `MemorySegment` > * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) > * added a no-arg factory for a native restricted segment representing entire native heap > * rename `withOwnerThread` to `handoff` > * add new `share` method, to create shared segments > * add new `registerCleaner` method, to register a segment against a cleaner > * add more helpers to create arrays from a segment e.g. `toIntArray` > * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) > * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) > * `MemoryAddress` > * drop `segment` accessor > * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative > to a given segment > * `MemoryAccess` > * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a > carrier is one of the usual suspects (a Java primitive, minus `boolean`); the access mode can be simple (e.g.
access > base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte > offset, or a high-level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. > `getByteAtOffset` vs. `getByteAtIndex`). > * `MemoryHandles` > * drop `withOffset` combinator > * drop `withStride` combinator > * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which > it is easy to derive all the other handles using plain var handle combinators. > * `Addressable` > * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both > `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2], to add more implementations. Clients > can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. > * `MemoryLayouts` > * A new layout, for machine addresses, has been added to the mix. > > > > ### Implementation changes > > There are two main things to discuss here: support for shared segments, and the general simplification of the memory > access var handle support. > #### Shared segments > > The support for shared segments cuts in pretty deep in the VM. Support for shared segments is notoriously hard to > achieve, at least in a way that guarantees optimal access performance. This is caused by the fact that, if a segment > is shared, it would be possible for a thread to close it while another is accessing it. After considering several > options (see [3]), we zeroed in on an approach which is inspired by a happy idea that Andrew Haley had (and that he > reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world > (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently with a > close operation.
For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and > the access itself (otherwise it would be possible for the accessing thread to stop right before an unsafe call). > It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. Sadly, none of > these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, > we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check > whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of > stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). The question is, then, > once we detect that a thread is accessing the very segment we're about to close, what should happen? We first > experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it > fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the > machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to > minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread > is accessing the segment being closed. As written in the javadoc, this doesn't mean that clients should just catch and > try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should > be treated as such. In terms of gritty implementation, we needed to centralize memory access routines in a single > place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in > addition, also provided a liveness check.
This way we could mark all these routines with the special `@Scoped` > annotation, which tells the VM that something important is going on. To achieve this, we created a new (autogenerated) > class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, > like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is > tested before access. A reachability fence is also thrown into the mix to make sure that the scope is kept alive during > access (which is important when registering segments against cleaners). Of course, to make memory access safe, memory > access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead > of unsafe, so that a liveness check can be triggered (in case a scope is present). `ScopedMemoryAccess` has a > `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed > successfully. `MemoryScope` (now significantly simplified from what we had before) has two > implementations, one for confined segments and one for shared segments; the main difference between the two is what > happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared > segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or > `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` > state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. #### > Memory access var handles overhaul The key realization here was that if all memory access var handles took a > coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle > form.
This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var > handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that > e.g. an additional offset is injected into a base memory access var handle. This also helped in simplifying the > implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level > access to the innards of the memory access var handle. All that code is now gone. #### Test changes Not much to see > here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, > since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` > functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the > microbenchmarks also needed some tweaks - and some of them were updated to also test performance in the shared > segment case.
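To make the shared-segment close protocol described above more concrete, here is a minimal sketch of the `ALIVE` / `CLOSING` / `CLOSED` state machine. It is only a model under stated assumptions: `SharedScopeSketch`, `beginAccess` and `endAccess` are hypothetical names, and the `activeAccessors` counter is a stand-in for the thread-local handshake that the VM performs; this is not the actual `MemoryScope` implementation.

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical model of the shared-segment close protocol.
 * Not the JDK's MemoryScope: the counter below stands in for the
 * thread-local handshake that the VM performs.
 */
final class SharedScopeSketch {
    static final int ALIVE = 0, CLOSING = 1, CLOSED = 2;

    private final AtomicInteger state = new AtomicInteger(ALIVE);
    // Assumption: models "threads currently inside a scoped method".
    private final AtomicInteger activeAccessors = new AtomicInteger();

    /** Still reports true while a close is in flight (CLOSING). */
    boolean isAlive() {
        return state.get() != CLOSED;
    }

    /** Liveness check before an access; fails in CLOSING and CLOSED. */
    void beginAccess() {
        activeAccessors.incrementAndGet();
        if (state.get() != ALIVE) {
            activeAccessors.decrementAndGet();
            throw new IllegalStateException("Segment is not alive");
        }
    }

    void endAccess() {
        activeAccessors.decrementAndGet();
    }

    /** Close fails, and the scope goes back to ALIVE, if an access is in flight. */
    void close() {
        if (!state.compareAndSet(ALIVE, CLOSING)) {
            throw new IllegalStateException("Already closed or closing");
        }
        boolean handshakeOk = activeAccessors.get() == 0;
        state.set(handshakeOk ? CLOSED : ALIVE);
        if (!handshakeOk) {
            throw new IllegalStateException("Segment is being accessed by another thread");
        }
    }

    public static void main(String[] args) {
        SharedScopeSketch scope = new SharedScopeSketch();
        scope.beginAccess();
        try {
            scope.close();
        } catch (IllegalStateException e) {
            System.out.println("close failed: " + e.getMessage());
        }
        scope.endAccess();
        scope.close();
        System.out.println("alive after close: " + scope.isAlive());
    }
}
```

In this model a failed `close` leaves the scope usable, which matches the point made above that an exception on `close` signals missing synchronization in user code rather than something to catch and retry.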
[1] - https://openjdk.java.net/jeps/393 [2] - https://openjdk.java.net/jeps/389 [3] - > https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html [4] - https://openjdk.java.net/jeps/312 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Tweak referenced to MemoryAddressProxy in Utils.java ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/548/files - new: https://git.openjdk.java.net/jdk/pull/548/files/9b3fc227..770b1e9c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=04-05 Stats: 6 lines in 1 file changed: 0 ins; 1 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/548.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548 PR: https://git.openjdk.java.net/jdk/pull/548 From mdoerr at openjdk.java.net Mon Oct 12 11:09:15 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Mon, 12 Oct 2020 11:09:15 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4] In-Reply-To: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> References: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> Message-ID: On Thu, 8 Oct 2020 20:31:47 GMT, CoreyAshford wrote: >> This patch set encompasses the following commits: >> >> - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for >> the intrinsic. The API is similar to the existing encodeBlock intrinsic. >> - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic >> implementation >> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. >> - Adds a JMH microbenchmark for both Base64 encoding and decoding.
>> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. > > CoreyAshford has updated the pull request incrementally with two additional commits since the last revision: > > - TestBase64.java: fix comment to correctly reflect actual intrinsic names. > > The intrinsic names that are visible with -XX:+PrintCompilation are encode > and decode, rather than encodeBlock and decodeBlock. > - stubGenerator_ppc.cpp: fix regression caused by change to using loop counter > > My original fix didn't account for the case where sl < block_size. In the > event sl < block_size, the shifted sl will become zero, so it should > jump to the code that computes how much data was processed - 0 - and return. Test java/util/Base64/TestBase64.java failed on Power9: Seed from RandomFactory = -8714459054005749075L ------------- Changes requested by mdoerr (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/293 From github.com+70893615+jasontatton-aws at openjdk.java.net Mon Oct 12 11:17:25 2020 From: github.com+70893615+jasontatton-aws at openjdk.java.net (Jason Tatton) Date: Mon, 12 Oct 2020 11:17:25 GMT Subject: RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v6] In-Reply-To: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com> References: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com> Message-ID: > This is an implementation of the indexOf(char) intrinsic for StringLatin1 (1 byte encoded Strings). It is provided for > x86 and ARM64. The implementation is greatly inspired by the indexOf(char) intrinsic for StringUTF16. To incorporate it > I had to make a small change to StringLatin1.java (refactor of functionality to an intrinsified private method) as well as > code for C2.
Submitted to: hotspot-compiler-dev and core-libs-dev as this patch contains a change to hotspot and > java/lang/StringLatin1.java https://bugs.openjdk.java.net/browse/JDK-8173585 > > Details of testing: > ============ > I have created a jtreg test "compiler/intrinsics/string/TestStringLatin1IndexOfChar" to cover this new intrinsic. Note > that, particularly for the x86 implementation of the intrinsic, the code path taken is dependent upon the length of the > input String. Hence the test has been designed to cover all these cases. In summary they are: > - A "short" string of < 16 characters. > - A SIMD String of 16 - 31 characters. > - An AVX2 SIMD String of 32 characters+. > > Hardware used for testing: > ----------------------------- > > - Intel Xeon CPU E5-2680 (JVM did not recognize this as having AVX2 support) > - Intel i7 processor (with AVX2 support). > - AWS Graviton 2 (ARM 64 processor). > > I also ran "run-test-tier1" and "run-test-tier2" for x86_64 and aarch64. > > Possible future enhancements: > ==================== > For the x86 implementation there may be two further improvements we can make in order to improve performance of both > the StringUTF16 and StringLatin1 indexOf(char) intrinsics: > 1. Make use of AVX-512 instructions. > 2. For "short" Strings (see below), I think it may be possible to modify the existing algorithm to still use SSE SIMD > instructions instead of a loop. > Benchmark results: > ============ > **Without** the new StringLatin1 indexOf(char) intrinsic: > > | Benchmark | Mode | Cnt | Score | Error | Units | > | ------------- | ------------- |------------- |------------- |------------- |------------- | > | IndexOfBenchmark.latin1_mixed_char | avgt | 5 | **26,389.129** | ± 182.581 | ns/op | > | IndexOfBenchmark.utf16_mixed_char | avgt | 5 | 17,885.383 | ± 
435.933 | ns/op | > > > **With** the new StringLatin1 indexOf(char) intrinsic: > > | Benchmark | Mode | Cnt | Score | Error | Units | > | ------------- | ------------- |------------- |------------- |------------- |------------- | > | IndexOfBenchmark.latin1_mixed_char | avgt | 5 | **17,875.185** | ± 407.716 | ns/op | > | IndexOfBenchmark.utf16_mixed_char | avgt | 5 | 18,292.802 | ± 167.306 | ns/op | > > > The objective of the patch is to bring the performance of StringLatin1 indexOf(char) in line with StringUTF16 > indexOf(char) for x86 and ARM64. We can see above that this has been achieved. Similar results were obtained when > running on ARM. Jason Tatton has updated the pull request incrementally with one additional commit since the last revision: Added missing copyright notices ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/71/files - new: https://git.openjdk.java.net/jdk/pull/71/files/8ead02ab..3ae1d92d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=71&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=71&range=04-05 Stats: 45 lines in 2 files changed: 45 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/71.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/71/head:pull/71 PR: https://git.openjdk.java.net/jdk/pull/71 From github.com+70893615+jasontatton-aws at openjdk.java.net Mon Oct 12 11:17:27 2020 From: github.com+70893615+jasontatton-aws at openjdk.java.net (Jason Tatton) Date: Mon, 12 Oct 2020 11:17:27 GMT Subject: RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v3] In-Reply-To: <0_4KDtvn6WhHRCxqupbUQjLauR5DMQWJluogsJ7m_KA=.697e6e15-5c91-4339-abab-1dff51871e8d@github.com> References: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com> <0_4KDtvn6WhHRCxqupbUQjLauR5DMQWJluogsJ7m_KA=.697e6e15-5c91-4339-abab-1dff51871e8d@github.com> Message-ID: <5sQCV-tDrsi1Ivud3DaxJq5vfJMQDJoZ4tPfKj_JI60=.82ae2d6d-c9b1-46e3-ac2b-284d6ea8878e@github.com> On
Sat, 10 Oct 2020 00:10:54 GMT, Vladimir Kozlov wrote: >> Jason Tatton has updated the pull request incrementally with one additional commit since the last revision: >> >> Add missing newline to end of vmSymbols.cpp > > test/hotspot/jtreg/compiler/intrinsics/string/TestStringLatin1IndexOfChar.java line 1: > >> 1: /* > > Missing copyright+GPL header in new test. See other tests fro example. Thanks I have added this now > test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java line 1: > >> 1: package org.openjdk.bench.java.lang; > > Again missing Copyright. Thanks I have added this now ------------- PR: https://git.openjdk.java.net/jdk/pull/71 From ihse at openjdk.java.net Mon Oct 12 11:45:15 2020 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Mon, 12 Oct 2020 11:45:15 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v6] In-Reply-To: References: Message-ID: On Mon, 12 Oct 2020 10:50:48 GMT, Maurizio Cimadamore wrote: > Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: > > Tweak referenced to MemoryAddressProxy in Utils.java Build changes look good. ------------- Marked as reviewed by ihse (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/548 From vkempik at openjdk.java.net Mon Oct 12 12:07:23 2020 From: vkempik at openjdk.java.net (Vladimir Kempik) Date: Mon, 12 Oct 2020 12:07:23 GMT Subject: RFR: 8253899: Make IsClassUnloadingEnabled signature match specification [v2] In-Reply-To: References: Message-ID: > Please review this change for hotspot and one test. > There is few JVMTI callback/event functions in jdk which signature doesn't match specification. > for example: > static jvmtiError JNICALL IsClassUnloadingEnabled(const jvmtiEnv* env, jboolean* enabled, ...) > but according to jvmti specs it should be: > static jvmtiError JNICALL IsClassUnloadingEnabled(const jvmtiEnv* env, ...)
> The same applies to ClassUnload(jvmtiEnv* jvmti_env, JNIEnv* jni_env, const char* name, ...) in tests. > For many years that didn't matter, but with the upcoming JEP-391 it becomes important to match the spec: > https://developer.apple.com/documentation/apple_silicon/addressing_architectural_differences_in_your_macos_code > This commit makes the above-mentioned functions have signatures matching the JVMTI specification. Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision: Add impl of CSR JDK-8254014 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/466/files - new: https://git.openjdk.java.net/jdk/pull/466/files/1ef832d2..cd7c4d53 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=466&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=466&range=00-01 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/466.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/466/head:pull/466 PR: https://git.openjdk.java.net/jdk/pull/466 From rkennke at openjdk.java.net Mon Oct 12 12:21:21 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 12 Oct 2020 12:21:21 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v6] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference > and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity) have been processed in a GC pause. Workloads > that make heavy use of such weak references will therefore potentially cause significant GC pauses. There are 3 main
There are 3 main > items that contribute to pause time linear to number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through subgraph of objects that are reachable only through FinalReference. Notice that this is > theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly > reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires to extend the marking > bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever > marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all > objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it > will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter > of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for > FinalReference, marking stops there, and does not mark through the referent. 2. Concurrent processing is performed > after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and > depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list > (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no > referents become resurrected by accessing Reference.get() on it. 
In order to achieve this, we employ special barriers > in Reference.get() intrinsics that return NULL when the referent is not reachable. Testing: hotspot_gc_shenandoah > (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 44 commits: - Merge branch 'master' into shenandoah-concurrent-weakrefs - Add documentation to ShenandoahReferenceProcessor - Merge branch 'master' into shenandoah-concurrent-weakrefs - Merge remote-tracking branch 'upstream/master' into shenandoah-concurrent-weakrefs - Prevent double-discovery of references - Also abandon pending list when abandoning discovered lists - Implement reference-processing statistics - Implement abandoning partial discovery. Remove unused methods. - Remove leftovers of precleaning - Remove unused is_access_on_jlr_reference() helper method - ... and 34 more: https://git.openjdk.java.net/jdk/compare/05459df0...070fd836 ------------- Changes: https://git.openjdk.java.net/jdk/pull/505/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=05 Stats: 2400 lines in 54 files changed: 1661 ins; 567 del; 172 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From vkempik at openjdk.java.net Mon Oct 12 12:28:11 2020 From: vkempik at openjdk.java.net (Vladimir Kempik) Date: Mon, 12 Oct 2020 12:28:11 GMT Subject: RFR: 8253899: Make IsClassUnloadingEnabled signature match specification [v2] In-Reply-To: References: <5UPvhVjwK3TOKrd2rSzeCz-xT0tXsCUkMT9R8tgFR8I=.f1723d22-6883-4ad3-af55-a7437ef905de@github.com> Message-ID: On Mon, 5 Oct 2020 12:44:34 GMT, Vladimir Kempik wrote: >>> _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on >>> [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ >>> Hi Vladimir, >>> >>> On 2/10/2020
5:37 pm, Vladimir Kempik wrote:
>>>
>>> > On Fri, 2 Oct 2020 07:27:17 GMT, David Holmes wrote:
>>> > > Okay but look at the example that documentation gives:
>>> > > > For example, if the jvmtiParamInfo returned by GetExtensionEvents indicates that there is a jint parameter, the event
>>> > > > handler should be declared: ```
>>> > > > void JNICALL myHandler(jvmtiEnv* jvmti_env, jint myInt, ...)
>>> > > > ```
>>> > >
>>> > >
>>> > > The myInt is explicit, just as our "jboolean* enabled" is explicit. I think the key point is that the signature must
>>> > > end with "..." which it does.
>>> > > I don't see anything here that needs to be fixed.
>>> >
>>> >
>>> > Hello David. On the majority of platforms this would be fine.
>>> > But on some platforms, variadic and non-variadic arguments are passed differently (for example, on
>>> > macos-aarch64 variadic args are always passed on the stack, non-variadic args in registers (and on the stack for the 9th+ arg)), and that
>>> > causes issues.
>>>
>>> Okay - I see the potential for a problem here but ...
>>>
>>> > If you still see no issues here we can delay and make this changeset part of JEP-391.
>>> > But since this changeset isn't much macos-aarch64 specific, I thought it would be good to integrate it separately from
>>> > jep-391.
>>>
>>> ... this change actually goes against the example in the spec, so if you
>>> make this change it indicates the spec needs to be updated too.
>>>
>>> Cheers,
>>> David
>>> -----
>>
>> Hello David
>>
>> I really believe the problem is in the documentation here (in the examples).
>> First, the doc clearly specifies the type
>>
>> typedef jvmtiError (JNICALL *jvmtiExtensionFunction)
>> (jvmtiEnv* jvmti_env,
>> ...);
>>
>> then in the examples it declares a function not matching this spec.
>>
>> Is it a good idea to update the docs in a separate bug?
>>
>> Thanks, Vladimir
>
> Hello David
> I have created CSR draft
> https://bugs.openjdk.java.net/browse/JDK-8254014
>
> Regards, Vladimir

Hello
I have updated the PR with changes from spec of CSR
Regards, Vladimir

-------------

PR: https://git.openjdk.java.net/jdk/pull/466

From rkennke at openjdk.java.net Mon Oct 12 12:34:25 2020
From: rkennke at openjdk.java.net (Roman Kennke)
Date: Mon, 12 Oct 2020 12:34:25 GMT
Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v7]
In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com>
References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com>
Message-ID:

> Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference
> and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity) have been processed in a GC pause. Workloads
> that make heavy use of such weak references will therefore potentially cause significant GC pauses. There are 3 main
> items that contribute to pause time linear in the number of references, or worse:
> - We need to scan and consider each reference on the various 'discovered' lists.
> - We need to mark through the subgraph of objects that are reachable only through FinalReference. Notice that this is
> theoretically only bounded by the live data set size.
> - Finally, all no-longer-reachable references need to be enqueued in the 'pending list'
>
> The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly
> reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid.
> The solution to this is two-fold:
> 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires extending the marking
> bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable.
Whenever > marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all > objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it > will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter > of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for > FinalReference, marking stops there, and does not mark through the referent. 2. Concurrent processing is performed > after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and > depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list > (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no > referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers > in Reference.get() intrinsics that return NULL when the referent is not reachable. 
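To make the two-level marking described above easier to follow, here is a minimal, single-threaded Java sketch of the idea. It is not Shenandoah's implementation (which lives in HotSpot C++, runs concurrently, and uses a marking bitmap); the class names `ReachabilityMarkSketch`, `Node`, `RefNode`, `Task` and the `Mark` enum are hypothetical and exist only for illustration. The sketch shows three points from the description: a FinalReference's referent is marked through as 'finalizable', an object first seen as finalizable is upgraded when it is later reached strongly, and every Reference is put on a 'discovered' list exactly once.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative sketch only: single-threaded, toy object graph, hypothetical names.
final class ReachabilityMarkSketch {

    /** Mark states; a larger ordinal means "more strongly reachable". */
    enum Mark { UNMARKED, FINALIZABLE, STRONG }

    /** A toy heap object with ordinary (strong) outgoing fields. */
    static class Node {
        final String name;
        final List<Node> fields = new ArrayList<>();
        Mark mark = Mark.UNMARKED;

        Node(String name) { this.name = name; }
    }

    /** Stand-in for a java.lang.ref.Reference; the referent is not an ordinary field. */
    static class RefNode extends Node {
        final boolean isFinalReference;
        final Node referent;

        RefNode(String name, boolean isFinalReference, Node referent) {
            super(name);
            this.isFinalReference = isFinalReference;
            this.referent = referent;
        }
    }

    /** One unit of marking work: visit a node at a given reachability level. */
    static final class Task {
        final Node node;
        final Mark level;

        Task(Node node, Mark level) { this.node = node; this.level = level; }
    }

    /** Marks everything reachable from the roots and returns the discovered references. */
    static List<RefNode> mark(List<Node> roots) {
        List<RefNode> discovered = new ArrayList<>();
        Deque<Task> stack = new ArrayDeque<>();
        for (Node root : roots) {
            stack.push(new Task(root, Mark.STRONG));
        }
        while (!stack.isEmpty()) {
            Task task = stack.pop();
            Node node = task.node;
            if (node.mark.compareTo(task.level) >= 0) {
                continue; // already marked at this level or stronger
            }
            boolean firstVisit = node.mark == Mark.UNMARKED;
            node.mark = task.level; // first mark, or upgrade FINALIZABLE -> STRONG
            for (Node field : node.fields) {
                stack.push(new Task(field, task.level));
            }
            if (node instanceof RefNode) {
                RefNode ref = (RefNode) node;
                if (firstVisit) {
                    discovered.add(ref); // discover each reference only once
                }
                // Only a FinalReference is marked through, and only finalizably.
                if (ref.isFinalReference && ref.referent != null) {
                    stack.push(new Task(ref.referent, Mark.FINALIZABLE));
                }
            }
        }
        return discovered;
    }

    public static void main(String[] args) {
        Node shared = new Node("shared");
        Node onlyFinalizable = new Node("onlyFinalizable");
        onlyFinalizable.fields.add(shared);
        Node strongHolder = new Node("strongHolder");
        strongHolder.fields.add(shared);
        Node weakTarget = new Node("weakTarget");

        Node root = new Node("root");
        root.fields.add(strongHolder);
        root.fields.add(new RefNode("weakRef", false, weakTarget));
        root.fields.add(new RefNode("finalRef", true, onlyFinalizable));

        List<RefNode> discovered = mark(List.of(root));
        for (Node n : List.of(root, strongHolder, shared, onlyFinalizable, weakTarget)) {
            System.out.println(n.name + " -> " + n.mark);
        }
        System.out.println("discovered references: " + discovered.size());
    }
}
```

In the demo graph, `shared` is reachable both through the FinalReference's referent and through an ordinary field, so it ends up strongly marked; `onlyFinalizable` stays finalizable, and the weak reference's referent is never marked. A later processing step could then decide per discovered reference whether to drop it or enqueue it, based on the referent's mark.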
Testing: hotspot_gc_shenandoah
> (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Don't mark through a Reference that's already been discovered

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/505/files
  - new: https://git.openjdk.java.net/jdk/pull/505/files/070fd836..34ca4991

Webrevs:
  - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=06
  - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=05-06

Stats: 15 lines in 1 file changed: 7 ins; 7 del; 1 mod
Patch: https://git.openjdk.java.net/jdk/pull/505.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505

PR: https://git.openjdk.java.net/jdk/pull/505

From mdoerr at openjdk.java.net Mon Oct 12 12:39:19 2020
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Mon, 12 Oct 2020 12:39:19 GMT
Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4]
In-Reply-To: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com>
References: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com>
Message-ID:

On Thu, 8 Oct 2020 20:31:47 GMT, CoreyAshford wrote:

>> This patch set encompasses the following commits:
>>
>> - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for
>> the intrinsic. The API is similar to the existing encodeBlock intrinsic.
>> - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic
>> implementation
>> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic.
>> - Adds a JMH microbenchmark for both Base64 encoding and decoding.
>> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding.
>
> CoreyAshford has updated the pull request incrementally with two additional commits since the last revision:
>
> - TestBase64.java: fix comment to correctly reflect actual intrinsic names.
>
> The intrinsic names that are visible with -XX:+PrintCompilation are encode
> and decode, rather than encodeBlock and decodeBlock.
> - stubGenerator_ppc.cpp: fix regression caused by change to using loop counter
>
> My original fix didn't account for the case where sl < block_size. In the
> event sl < block_size, the shifted sl will become zero, so it should
> jump to the code that computes how much data was processed - 0 - and return.

src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3803:

> 3801: // Base64 class will be used to process the last 12 characters.
> 3802: __ sub(sl, sl, sp);
> 3803: __ subi(sl, sl, 12);

I think we should subtract 4, now. srawi will round it down below. We have no guarantee that we can subtract more than 4 without getting a negative value.

-------------

PR: https://git.openjdk.java.net/jdk/pull/293

From github.com+4146708+a74nh at openjdk.java.net Mon Oct 12 12:45:19 2020
From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward)
Date: Mon, 12 Oct 2020 12:45:19 GMT
Subject: RFR: 8221554: aarch64 cross-modifying code [v3]
In-Reply-To:
References:
Message-ID:

> The AArch64 port uses maybe_isb in places where an ISB might be required
> because the code may have safepointed. These maybe_isbs are very conservative
> and in many places are used when a safepoint has not happened.
>
> cross_modify_fence was added in common code to place a barrier in all the
> places after a safepoint has occurred. All the uses of it are in common code,
> yet it remains unimplemented on AArch64.
>
> This set of patches implements cross_modify_fence for AArch64 and reconsiders
> every use of maybe_isb, discarding many of them. In addition, it introduces
> a new diagnostic option, which when enabled on AArch64 tests the correct
> usage of the barriers.
> > Advantage of this patch is threefold: > * Reducing the number of ISBs - giving a theoretical performance improvement. > * Use of common code instead of backend specific code. > * Additional test diagnostic options > > Patch 1: Split cross_modify_fence > ================================= > This is simply refactoring work split out to simplify the other two patches. > > instruction_fence() is provided by each target and simply places > a fence for the instruction stream. > > cross_modify_fence() is now a member of JavaThread and just calls > instruction_fence. This function will be extended in Patch 3. > > Patch 2: Use cross_modify_fence instead of maybe_isb > ==================================================== > > The [n] References refer to the comments for cross_modify_fence in > thread.hpp. > > This is all the existing uses of maybe_isb in the AArch64 target: > > 1) Instances of Java code calling a VM function > * This encapsulates the changes to: > ** MacroAssembler::call_VM_leaf_base() > ** generate_fast_get_int_field0() > ** stubGenerator_aarch64 generate_throw_exception() > ** sharedRuntime_aarch64 generate_handler_blob() > ** SharedRuntime::generate_resolve_blob() > ** C1 LIR_Assembler::rt_call > ** C1 StubAssembler::call_RT(): used by Used by generate_exception_throw, > generate_handle_exception, generate_code_for. > ** OptoRuntime::generate_exception_blob() > * Any changes will be caught due to calls to [2] or [3] by the VM function. > * Any calls that do not call [2] or [3] do not require an ISB. > * This patch is more optimal for these cases. > > 2) Instances of Java code calling a JNI function > * This encapsulates the changes to: > ** SharedRuntime::generate_native_wrapper() > ** TemplateInterpreterGenerator::generate_native_entry() > * A safepoint still in progress after the call with be caught by [4]. > * An ISB is still required for the case where there was a safepoint > but it completed during the call. 
This happens if the code doesn't > branch on safepoint_in_progress > * In the SharedRuntime version, the two possible calls to > reguard_yellow_pages and complete_monitor_unlocking_C are after the thread > goes back into it's original state, so are covered by [2] and [3], the > same as a normal VM call. > * This patch is only more optimal for the two post-JNI calls. > > 3) Patching functions > * This encapsulates the changes to: > ** patch_callers_callsite() (called by gen_c2i_adapter()) > * This results in code being patched, but does not safepoint > * Therefore an ISB is required. > * This patch introduces no change here. > > 4) C1 MacroAssembler::emit_static_call_stub() > * Calls ISB (not maybe_isb) > * By design, the patching doesn't require that the up-to-date > destination is required for proper functioning. > * However, the ISB makes it most likely that the new destination will > be picked up. > * This patch introduces no change here. > > Patch 3: Add cross modify fence verification > ============================================ > > The VerifyCrossModifyFence diagnostic flag enables confirmation to the correct > usage of instruction barriers. It can safely be enabled on any Java run. > > Enabling it will cause the following: > > * Once all threads have been brought to a safepoint, each thread will be > marked. > > * On a cross_modify_fence and safepoint_fence the mark for that thread > will be cleared. > > * On entry to a method and in a safepoint poll, then the thread is checked. > If it is marked, then the code will error. Alan Hayward has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains three commits: - AArch64: Add cross modify fence verification - AArch64: Use cross_modify_fence instead of maybe_isb - Split cross_modify_fence ------------- Changes: https://git.openjdk.java.net/jdk/pull/428/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=428&range=02 Stats: 172 lines in 25 files changed: 125 ins; 8 del; 39 mod Patch: https://git.openjdk.java.net/jdk/pull/428.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/428/head:pull/428 PR: https://git.openjdk.java.net/jdk/pull/428 From github.com+4146708+a74nh at openjdk.java.net Mon Oct 12 12:50:14 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Mon, 12 Oct 2020 12:50:14 GMT Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: <35eLsMpWmcCUoiEWhnYdSpZNmvLy4ra56Qtd6eRW574=.4e7c9278-3e0d-457d-9c15-eef45bae9755@github.com> Message-ID: On Mon, 5 Oct 2020 16:05:43 GMT, Alan Hayward wrote: >>> _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ >>> >>> On 02/10/2020 09:08, Alan Hayward wrote: >>> >>> > I have no strong feeling on the names of these functions. >>> > The reason for moving it was that in the third part (the verification testing) it needs JavaThread. >>> >>> Right, but you can get the JavaThread efficiently any time you want, >>> so you don't need to pass it to cross_modify_fence(). >>> >> >> Oh, ok, didn't spot that. >> This would result in code in OrderAccess.cpp calling a function in JavaThread. >> It feels that OrderAccess should be much lower level than JavaThread. But, that might be ok. > > Patch updated. > > * cross_modify_fence now calls cross_modify_fence_impl as suggested. > > * ISBs in the JNI calls have been removed. This means that it is currently unsafe to merge until > https://github.com/openjdk/jdk/pull/296 has been merged. https://github.com/openjdk/jdk/pull/296 has been merged to head. 
Rebased this request and re-tested.
All review comments have been addressed.
Note that with VerifyCrossModifyFence set, testing took longer by a factor of 2.5.

-------------

PR: https://git.openjdk.java.net/jdk/pull/428

From dholmes at openjdk.java.net Mon Oct 12 12:56:26 2020
From: dholmes at openjdk.java.net (David Holmes)
Date: Mon, 12 Oct 2020 12:56:26 GMT
Subject: RFR: 8221554: aarch64 cross-modifying code
In-Reply-To:
References: <35eLsMpWmcCUoiEWhnYdSpZNmvLy4ra56Qtd6eRW574=.4e7c9278-3e0d-457d-9c15-eef45bae9755@github.com>
Message-ID:

On Mon, 12 Oct 2020 12:47:46 GMT, Alan Hayward wrote:

>> Patch updated.
>>
>> * cross_modify_fence now calls cross_modify_fence_impl as suggested.
>>
>> * ISBs in the JNI calls have been removed. This means that it is currently unsafe to merge until
>> https://github.com/openjdk/jdk/pull/296 has been merged.
>
> https://github.com/openjdk/jdk/pull/296 has been merged to head.
>
> Rebased this request and re-tested.
> All review comments have been addressed.
> Note that with VerifyCrossModifyFence set, testing took longer by a factor of 2.5.

@a74nh Please do not force-push commits on an open PR as it breaks the commit history and prevents reviewers from seeing what has changed since they last reviewed things. If you need to "rebase" you can just merge your branch with an updated master branch and push the merge commit to your personal fork. The skara tooling will flatten the commits into a single clean commit when integration happens. Thanks.
------------- PR: https://git.openjdk.java.net/jdk/pull/428 From erikj at openjdk.java.net Mon Oct 12 12:59:23 2020 From: erikj at openjdk.java.net (Erik Joelsson) Date: Mon, 12 Oct 2020 12:59:23 GMT Subject: RFR: 8223347: Integration of Vector API (Incubator) In-Reply-To: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> References: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> Message-ID: On Fri, 25 Sep 2020 20:14:29 GMT, Paul Sandoz wrote: > This pull request is for integration of the Vector API. It was previously reviewed under conditions when mercurial was > used for the source code control system. Review threads can be found here (searching for issue number 8223347 in the > title): https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-April/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-May/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-July/thread.html > > If mercurial was still being used the code would be pushed directly, once the CSR is approved. However, in this case a > pull request is required and needs explicit reviewer approval. Between the final review and this pull request no code > has changed, except for that related to merging. Build changes look good. ------------- Marked as reviewed by erikj (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/367 From mdoerr at openjdk.java.net Mon Oct 12 13:00:25 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Mon, 12 Oct 2020 13:00:25 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4] In-Reply-To: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> References: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> Message-ID: <2zhMnWcr1cuE5sxUyCvU9bN5HP_ph_1xQEV3wdx_7dg=.3d509c17-db83-432c-a983-79137d12a827@github.com> On Thu, 8 Oct 2020 20:31:47 GMT, CoreyAshford wrote: >> This patch set encompasses the following commits: >> >> - Adds a new HotSpot intrinsic candidate to the java.lang.Base64 class - decodeBlock(), and provides a flexible API for >> the intrinsic. The API is similar to the existing encodeBlock intrinsic. >> - Adds the code in HotSpot to check and martial the new intrinsic's arguments to the arch-specific intrinsic >> implementation >> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. >> - Adds a JMH microbenchmark for both Base64 encoding and encoding. >> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. > > CoreyAshford has updated the pull request incrementally with two additional commits since the last revision: > > - TestBase64.java: fix comment to correctly reflect actual intrinsic names. > > The intrinsic names that are visible with -XX:+PrintCompilation are encode > and decode, rather than encodeBlock and decodeBlock. > - stubGenerator_ppc.cpp: fix regression caused by change to using loop counter > > My original fix didn't account for the case where sl < block_size. In the > event sl < block_size, the shifted sl will become zero, so it should > jump to the code that computes how much data was processed - 0 - and return. 
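The corner case in the quoted commit message (input shorter than one block, so the shifted length becomes zero) can be illustrated with plain integer arithmetic. The following Java sketch is not the PPC stub; it only mimics the length computation under stated assumptions: a block size of 16 bytes (shift of 4) is assumed, the 12-byte reserve mirrors the `subi(sl, sl, 12)` quoted elsewhere in this thread, and the class and method names are hypothetical. Java's `>>` is an arithmetic right shift, which is used here as a stand-in for the rounding-down behaviour attributed to `srawi`.

```java
// Illustrative sketch only; the block size and all names are assumptions.
final class UnrolledLoopCountSketch {
    static final int BLOCK_SIZE_SHIFT = 4; // assumption: 16 input bytes per loop iteration
    static final int TAIL_RESERVE = 12;    // mirrors the quoted subi(sl, sl, 12)

    /** Signed iteration count as computed before any range check. */
    static int rawIterations(int sp, int sl) {
        int remaining = sl - sp - TAIL_RESERVE; // may be negative for short inputs
        return remaining >> BLOCK_SIZE_SHIFT;   // arithmetic shift: rounds toward negative infinity
    }

    /** True when the unrolled loop must be skipped and 0 bytes reported as processed. */
    static boolean skipUnrolledLoop(int sp, int sl) {
        return rawIterations(sp, sl) <= 0;
    }

    public static void main(String[] args) {
        int[] lengths = {8, 20, 28, 44};
        for (int sl : lengths) {
            System.out.println("sl=" + sl
                + " rawIterations=" + rawIterations(0, sl)
                + " skip=" + skipUnrolledLoop(0, sl));
        }
    }
}
```

Under these assumptions, a length below the reserve yields a negative count and a length less than one block above the reserve yields zero; both need to take the early exit, which is why a check for zero alone would not be sufficient once a constant is subtracted first.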
src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3745: > 3743: __ clrldi(isURL, isURL, 32); > 3744: > 3745: // Load constant vec registers that need to be loaded from memory With larger unroll factor we run through this code more often without making any progress, because only the Java part does all the work for the remaining bytes. Would be nice to move unnecessary parts for that between mtctr and align. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From dholmes at openjdk.java.net Mon Oct 12 13:03:20 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 12 Oct 2020 13:03:20 GMT Subject: RFR: 8253899: Make IsClassUnloadingEnabled signature match specification [v2] In-Reply-To: References: Message-ID: <2BRqQiAy2eXizRcLeuj1-R_F-Rxv-GcGKMkwFSNMaMU=.9837528f-f42e-48c5-a5e1-dfa3ff4289ef@github.com> On Mon, 12 Oct 2020 12:07:23 GMT, Vladimir Kempik wrote: >> Please review this change for hotspot and one test. >> There is few JVMTI callback/event functions in jdk which signature doesn't match specification. >> for example: >> static jvmtiError JNICALL IsClassUnloadingEnabled(const jvmtiEnv* env, jboolean* enabled, ...) >> but according to jvmti specs it should be: >> static jvmtiError JNICALL IsClassUnloadingEnabled(const jvmtiEnv* env, ...) >> same with ClassUnload(jvmtiEnv* jvmti_env, JNIEnv* jni_env, const char* name, ...) in tests >> for many years that didn't matter but with coming JEP-391 it becomes important to make it match the spec >> https://developer.apple.com/documentation/apple_silicon/addressing_architectural_differences_in_your_macos_code >> This commit makes the above mentioned functions to have signature matching jvmti specification > > Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision: > > Add impl of CSR JDK-8254014 Looks good. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/466 From vkempik at openjdk.java.net Mon Oct 12 13:20:14 2020 From: vkempik at openjdk.java.net (Vladimir Kempik) Date: Mon, 12 Oct 2020 13:20:14 GMT Subject: Integrated: 8253899: Make IsClassUnloadingEnabled signature match specification In-Reply-To: References: Message-ID: On Thu, 1 Oct 2020 15:02:01 GMT, Vladimir Kempik wrote: > Please review this change for hotspot and one test. > There is few JVMTI callback/event functions in jdk which signature doesn't match specification. > for example: > static jvmtiError JNICALL IsClassUnloadingEnabled(const jvmtiEnv* env, jboolean* enabled, ...) > but according to jvmti specs it should be: > static jvmtiError JNICALL IsClassUnloadingEnabled(const jvmtiEnv* env, ...) > same with ClassUnload(jvmtiEnv* jvmti_env, JNIEnv* jni_env, const char* name, ...) in tests > for many years that didn't matter but with coming JEP-391 it becomes important to make it match the spec > https://developer.apple.com/documentation/apple_silicon/addressing_architectural_differences_in_your_macos_code > This commit makes the above mentioned functions to have signature matching jvmti specification This pull request has now been integrated. 
Changeset: c7f00640 Author: Vladimir Kempik URL: https://git.openjdk.java.net/jdk/commit/c7f00640 Stats: 20 lines in 3 files changed: 17 ins; 0 del; 3 mod 8253899: Make IsClassUnloadingEnabled signature match specification Reviewed-by: sspitsyn, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/466 From github.com+4146708+a74nh at openjdk.java.net Mon Oct 12 13:53:13 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Mon, 12 Oct 2020 13:53:13 GMT Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: <35eLsMpWmcCUoiEWhnYdSpZNmvLy4ra56Qtd6eRW574=.4e7c9278-3e0d-457d-9c15-eef45bae9755@github.com> Message-ID: On Mon, 12 Oct 2020 12:53:29 GMT, David Holmes wrote: > @a74nh Please do not force-push commits on an open PR as it breaks the commit history and prevents reviewers from > seeing what has changed since they last reviewed things. If you need to "rebase" you can just merge your branch with an > updated master branch and push the merge commit to your personal fork. The skara tooling will flatten the commits into > a single clean commit when integration happens. Thanks. Not a fan of working with merge commits and I feel it gets muddled when you have history on top of a patch series (as opposed to a single patch). However, understood - I'll make sure to merge instead of force pushing next time. ------------- PR: https://git.openjdk.java.net/jdk/pull/428 From fyang at openjdk.java.net Mon Oct 12 14:47:32 2020 From: fyang at openjdk.java.net (Fei Yang) Date: Mon, 12 Oct 2020 14:47:32 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v8] In-Reply-To: References: Message-ID: <76qazRT5aX06rurPVGQmtfH2af9_l7DEdy_mGyF7BQQ=.4dd45e3f-b7a4-44c9-9824-4daab9b7dc3b@github.com> > Contributed-by: ard.biesheuvel at linaro.org, dongbo4 at huawei.com > > This added an intrinsic for SHA3 using aarch64 v8.2 SHA3 Crypto Extensions. 
> Reference implementation for core SHA-3 transform using ARMv8.2 Crypto Extensions: > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/crypto/sha3-ce-core.S?h=v5.4.52 > > Trivial adaptation in SHA3. implCompress is needed for the purpose of adding the intrinsic. > For SHA3, we need to pass one extra parameter "digestLength" to the stub for the calculation of block size. > "digestLength" is also used in for the EOR loop before keccak to differentiate different SHA3 variants. > > We added jtreg tests for SHA3 and used QEMU system emulator which supports SHA3 instructions to test the functionality. > Patch passed jtreg tier1-3 tests with QEMU system emulator. > Also verified with jtreg tier1-3 tests without SHA3 instructions on aarch64-linux-gnu and x86_64-linux-gnu, to make > sure that there's no regression. > We used one existing JMH test for performance test: test/micro/org/openjdk/bench/java/security/MessageDigests.java > We measured the performance benefit with an aarch64 cycle-accurate simulator. > Patch delivers 20% - 40% performance improvement depending on specific SHA3 digest length and size of the message. > > For now, this feature will not be enabled automatically for aarch64. We can auto-enable this when it is fully tested on > real hardware. But for the above testing purposes, this is auto-enabled when the corresponding hardware feature is > detected. Fei Yang has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 10 commits: - Remove unnecessary code changes in vm_version_aarch64.cpp - Merge master - Merge master - Merge master - Merge master - Add sha3 instructions to cpu/aarch64/aarch64-asmtest.py and regenerate the test in assembler_aarch64.cpp:asm_check - Rebase - Merge master - Fix trailing whitespace issue - 8252204: AArch64: Implement SHA3 accelerator/intrinsic Contributed-by: ard.biesheuvel at linaro.org, dongbo4 at huawei.com ------------- Changes: https://git.openjdk.java.net/jdk/pull/207/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=207&range=07 Stats: 1498 lines in 35 files changed: 1011 ins; 22 del; 465 mod Patch: https://git.openjdk.java.net/jdk/pull/207.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/207/head:pull/207 PR: https://git.openjdk.java.net/jdk/pull/207 From aph at openjdk.java.net Mon Oct 12 14:52:22 2020 From: aph at openjdk.java.net (Andrew Haley) Date: Mon, 12 Oct 2020 14:52:22 GMT Subject: RFR: 8221554: aarch64 cross-modifying code [v3] In-Reply-To: References: Message-ID: On Mon, 12 Oct 2020 12:45:19 GMT, Alan Hayward wrote: >> The AArch64 port uses maybe_isb in places where an ISB might be required >> because the code may have safepointed. These maybe_isbs are very conservative >> and are used in many places are used when a safepoint has not happened. >> >> cross_modify_fence was added in common code to place a barrier in all the >> places after a safepoint has occurred. All the uses of it are in common code, >> yet it remains unimplemented on AArch64. >> >> This set of patches implements cross_modify_fence for AArch64 and reconsiders >> every uses of maybe_isb, discarding many of them. In addition, it introduces >> a new diagnostic option, which when enabled on AArch64 tests the correct >> usage of the barriers. >> >> Advantage of this patch is threefold: >> * Reducing the number of ISBs - giving a theoretical performance improvement. 
>> * Use of common code instead of backend specific code. >> * Additional test diagnostic options >> >> Patch 1: Split cross_modify_fence >> ================================= >> This is simply refactoring work split out to simplify the other two patches. >> >> instruction_fence() is provided by each target and simply places >> a fence for the instruction stream. >> >> cross_modify_fence() is now a member of JavaThread and just calls >> instruction_fence. This function will be extended in Patch 3. >> >> Patch 2: Use cross_modify_fence instead of maybe_isb >> ==================================================== >> >> The [n] References refer to the comments for cross_modify_fence in >> thread.hpp. >> >> This is all the existing uses of maybe_isb in the AArch64 target: >> >> 1) Instances of Java code calling a VM function >> * This encapsulates the changes to: >> ** MacroAssembler::call_VM_leaf_base() >> ** generate_fast_get_int_field0() >> ** stubGenerator_aarch64 generate_throw_exception() >> ** sharedRuntime_aarch64 generate_handler_blob() >> ** SharedRuntime::generate_resolve_blob() >> ** C1 LIR_Assembler::rt_call >> ** C1 StubAssembler::call_RT(): used by Used by generate_exception_throw, >> generate_handle_exception, generate_code_for. >> ** OptoRuntime::generate_exception_blob() >> * Any changes will be caught due to calls to [2] or [3] by the VM function. >> * Any calls that do not call [2] or [3] do not require an ISB. >> * This patch is more optimal for these cases. >> >> 2) Instances of Java code calling a JNI function >> * This encapsulates the changes to: >> ** SharedRuntime::generate_native_wrapper() >> ** TemplateInterpreterGenerator::generate_native_entry() >> * A safepoint still in progress after the call with be caught by [4]. >> * An ISB is still required for the case where there was a safepoint >> but it completed during the call. 
This happens if the code doesn't >> branch on safepoint_in_progress >> * In the SharedRuntime version, the two possible calls to >> reguard_yellow_pages and complete_monitor_unlocking_C are after the thread >> goes back into it's original state, so are covered by [2] and [3], the >> same as a normal VM call. >> * This patch is only more optimal for the two post-JNI calls. >> >> 3) Patching functions >> * This encapsulates the changes to: >> ** patch_callers_callsite() (called by gen_c2i_adapter()) >> * This results in code being patched, but does not safepoint >> * Therefore an ISB is required. >> * This patch introduces no change here. >> >> 4) C1 MacroAssembler::emit_static_call_stub() >> * Calls ISB (not maybe_isb) >> * By design, the patching doesn't require that the up-to-date >> destination is required for proper functioning. >> * However, the ISB makes it most likely that the new destination will >> be picked up. >> * This patch introduces no change here. >> >> Patch 3: Add cross modify fence verification >> ============================================ >> >> The VerifyCrossModifyFence diagnostic flag enables confirmation to the correct >> usage of instruction barriers. It can safely be enabled on any Java run. >> >> Enabling it will cause the following: >> >> * Once all threads have been brought to a safepoint, each thread will be >> marked. >> >> * On a cross_modify_fence and safepoint_fence the mark for that thread >> will be cleared. >> >> * On entry to a method and in a safepoint poll, then the thread is checked. >> If it is marked, then the code will error. > > Alan Hayward has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now > contains three commits: > - AArch64: Add cross modify fence verification > - AArch64: Use cross_modify_fence instead of maybe_isb > - Split cross_modify_fence src/hotspot/os_cpu/linux_aarch64/orderAccess_linux_aarch64.hpp line 35: > 33: #define inlasm_isb() asm volatile("isb" : : : "memory") > 34: > 35: // Implementation of class OrderAccess. This #define of inlasm_isb() looks wrong. Surely it should be in the body of OrderAccess::cross_modify_fence_impl() given that it's not used anywhere else. All the extra indirection does is confuse the reader. ------------- PR: https://git.openjdk.java.net/jdk/pull/428 From github.com+4146708+a74nh at openjdk.java.net Mon Oct 12 15:07:14 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Mon, 12 Oct 2020 15:07:14 GMT Subject: RFR: 8221554: aarch64 cross-modifying code [v3] In-Reply-To: References: Message-ID: On Mon, 12 Oct 2020 14:49:01 GMT, Andrew Haley wrote: >> Alan Hayward has updated the pull request with a new target base due to a merge or a rebase. The pull request now >> contains three commits: >> - AArch64: Add cross modify fence verification >> - AArch64: Use cross_modify_fence instead of maybe_isb >> - Split cross_modify_fence > > src/hotspot/os_cpu/linux_aarch64/orderAccess_linux_aarch64.hpp line 35: > >> 33: #define inlasm_isb() asm volatile("isb" : : : "memory") >> 34: >> 35: // Implementation of class OrderAccess. > > This #define of inlasm_isb() looks wrong. Surely it should be in the body of OrderAccess::cross_modify_fence_impl() > given that it's not used anywhere else. All the extra indirection does is confuse the reader. Agreed. It was designed to fit with my patch which did the same for the dmb's - but I've closed that patch. Will fix. 
------------- PR: https://git.openjdk.java.net/jdk/pull/428 From aph at redhat.com Mon Oct 12 16:14:49 2020 From: aph at redhat.com (Andrew Haley) Date: Mon, 12 Oct 2020 17:14:49 +0100 Subject: RFR: 8221554: aarch64 cross-modifying code [v3] In-Reply-To: References: Message-ID: <130c9589-273f-d6c8-dc74-483ba9d53e1f@redhat.com> So, the good news and the bad news: Moving to cross_modify_fence reduces the number of ISBs from 3,840,210 maybe_isb()s to 74,538 cross_modify_fence()s on my poster child application, which is recompiling all of java.base. However, this is a program that runs for 187,501,798,979 insns, so we've reduced the proportion of ISBs from 0.002% to 0.00004%. I guess that's worth having, but I doubt that the improvement would ever have been above the noise level. On the good side, this at least makes AArch64 more like other targets. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From github.com+51754783+coreyashford at openjdk.java.net Mon Oct 12 17:55:14 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Mon, 12 Oct 2020 17:55:14 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4] In-Reply-To: References: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> Message-ID: On Mon, 12 Oct 2020 12:36:19 GMT, Martin Doerr wrote: >> CoreyAshford has updated the pull request incrementally with two additional commits since the last revision: >> >> - TestBase64.java: fix comment to correctly reflect actual intrinsic names. >> >> The intrinsic names that are visible with -XX:+PrintCompilation are encode >> and decode, rather than encodeBlock and decodeBlock. >> - stubGenerator_ppc.cpp: fix regression caused by change to using loop counter >> >> My original fix didn't account for the case where sl < block_size. 
In the >> event sl < block_size, the shifted sl will become zero, so it should >> jump to the code that computes how much data was processed - 0 - and return. > > src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3803: > >> 3801: // Base64 class will be used to process the last 12 characters. >> 3802: __ sub(sl, sl, sp); >> 3803: __ subi(sl, sl, 12); > > I think we should subtract 4, now. srawi will round it down below. We have no guarantee that we can subtract more than 4 > without getting a negative value. In the original paper this code is based upon, they subtract 12 because of the overwrite issue. This is discussed in the preceding code comment as well. So I think that needs to be retained, but I do need to check for a negative value after the subtract. According to the `srawi.` specification: > CA and CA32 are set to 1 if the low-order 32 bits of (RS) contain a negative number and any 1-bits are shifted out of > position 63; otherwise CA and CA32 are set to 0. Because the `sub` instruction is a 64-bit subtract, all of the upper bits should be 1's if sl is negative after the subtract, so I think the `srawi.` should catch the negative case if I also check CA after the srawi, via: __ srawi_(sl, sl, block_size_shift); // if XER CA is set, sl was less than zero. __ mcrxrx(CCR2); // moves XER's OV, OV32, CA, CA32 to CCR2's LT, GT, EQ, SO bits, respectively. __ beq_predict_not_taken(CCR2, unrolled_loop_exit); __ beq_predict_not_taken(CCR0, unrolled_loop_exit); ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From mcimadamore at openjdk.java.net Mon Oct 12 17:58:54 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Mon, 12 Oct 2020 17:58:54 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v7] In-Reply-To: References: Message-ID: > This patch contains the changes associated with the third incubation round of the foreign memory access API > (see JEP 393 [1]).
This iteration focuses on improving the usability of the API in 3 main ways: > * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from > multiple threads > * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee > that the memory will be deallocated, eventually > * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class > has been added, which defines several useful dereference routines; these are really just thin wrappers around memory > access var handles, but they make the barrier to entry for using this API somewhat lower. > > A big conceptual shift that comes with this API refresh is that the roles of `MemorySegment` and `MemoryAddress` are not > the same as they used to be; it used to be the case that a memory address could (sometimes, not always) have a back link > to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit > of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which > wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as > dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability > in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; > second, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can > use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done > by calling `MemoryAddress::asSegmentRestricted`). A list of the API, implementation and test changes is provided > below.
If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be > happy to point at existing discussions, and/or to provide the feedback required. A big thank you to Erik Osterlund, > Vladimir Ivanov and David Holmes, without whom the work on shared memory segments would not have been possible; also I'd > like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. Thanks Maurizio > Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html > Specdiff: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254163 > > > > ### API Changes > > * `MemorySegment` > * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) > * add a no-arg factory for a native restricted segment representing the entire native heap > * rename `withOwnerThread` to `handoff` > * add new `share` method, to create shared segments > * add new `registerCleaner` method, to register a segment against a cleaner > * add more helpers to create arrays from a segment e.g. `toIntArray` > * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) > * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) > * `MemoryAddress` > * drop `segment` accessor > * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative > to a given segment > * `MemoryAccess` > * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a > carrier is one of the usual suspects (a Java primitive, minus `boolean`); the access mode can be simple (e.g.
access the > base address of a given segment), or indexed, in which case the accessor takes a segment and either a low-level byte > offset, or a high-level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. > `getByteAtOffset` vs `getByteAtIndex`). > * `MemoryHandles` > * drop `withOffset` combinator > * drop `withStride` combinator > * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which > it is easy to derive all the other handles using plain var handle combinators. > * `Addressable` > * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both > `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2], to add more implementations. Clients > can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. > * `MemoryLayouts` > * A new layout, for machine addresses, has been added to the mix. > > > > ### Implementation changes > > There are two main things to discuss here: support for shared segments, and the general simplification of the memory > access var handle support. > #### Shared segments > > The support for shared segments cuts pretty deep into the VM. Support for shared segments is notoriously hard to > achieve, at least in a way that guarantees optimal access performance. This is caused by the fact that, if a segment > is shared, it would be possible for a thread to close it while another is accessing it. After considering several > options (see [3]), we zeroed in on an approach which is inspired by a happy idea that Andrew Haley had (and that he > reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world > (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently with a > close operation.
For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and > the access itself (otherwise it would be possible for the accessing thread to stop right before an unsafe call). > It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. Sadly, neither of > these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, > we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check > whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of > stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). The question is, then, > once we detect that a thread is accessing the very segment we're about to close, what should happen? We first > experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it > fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately, the > machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to > minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread > is accessing the segment being closed. As written in the javadoc, this doesn't mean that clients should just catch and > try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should > be treated as such. In terms of gritty implementation, we needed to centralize memory access routines in a single > place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in > addition, provided a liveness check.
This way we could mark all these routines with the special `@Scoped` > annotation, which tells the VM that something important is going on. To achieve this, we created a new (autogenerated) > class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, > like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is > tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during > access (which is important when registering segments against cleaners). Of course, to make memory access safe, memory > access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead > of unsafe, so that a liveness check can be triggered (in case a scope is present). `ScopedMemoryAccess` has a > `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed > successfully. The implementation of `MemoryScope` (now significantly simplified from what we had before), has two > implementations, one for confined segments and one for shared segments; the main difference between the two is what > happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared > segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or > `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` > state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. #### > Memory access var handles overhaul The key realization here was that if all memory access var handles took a > coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle > form. 
This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var > handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that > e.g. additional offset is injected into a base memory access var handle. This also helped in simplifying the > implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level > access on the innards of the memory access var handle. All that code is now gone. #### Test changes Not much to see > here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, > since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` > functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the > microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared > segment case. 
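The close protocol described under "Shared segments" can be pictured with a small toy model. This is only a sketch under stated assumptions, not the JDK's `MemoryScope` code: the class name `ToySharedScope`, its method names, and the `BooleanSupplier` standing in for the thread-local handshake are all hypothetical. It shows the `ALIVE` -> `CLOSING` -> `CLOSED` (or back to `ALIVE`) transitions, and that `isAlive` stays `true` during `CLOSING` while the access-time check already fails.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Toy model only: names and structure are illustrative assumptions,
// not the actual jdk.incubator.foreign implementation.
final class ToySharedScope {
    static final int ALIVE = 0, CLOSING = 1, CLOSED = 2;

    private final AtomicInteger state = new AtomicInteger(ALIVE);

    // Hypothetical stand-in for the VM handshake: returns true when no
    // other thread was found in the middle of accessing the segment.
    private final BooleanSupplier handshake;

    ToySharedScope(BooleanSupplier handshake) {
        this.handshake = handshake;
    }

    // As described in the text: still true while the scope is CLOSING.
    boolean isAlive() {
        return state.get() != CLOSED;
    }

    // Liveness check performed before a memory access: only ALIVE passes.
    void checkValidState() {
        if (state.get() != ALIVE) {
            throw new IllegalStateException("Scope is closing or closed");
        }
    }

    // close() may fail: if the handshake reports a concurrent access,
    // the state is rolled back to ALIVE and an exception is thrown.
    void close() {
        if (!state.compareAndSet(ALIVE, CLOSING)) {
            throw new IllegalStateException("Scope is closing or closed");
        }
        boolean success = handshake.getAsBoolean();
        state.set(success ? CLOSED : ALIVE);
        if (!success) {
            throw new IllegalStateException("Segment is being accessed by another thread");
        }
    }

    public static void main(String[] args) {
        ToySharedScope scope = new ToySharedScope(() -> true);
        scope.close();
        System.out.println("alive after successful close: " + scope.isAlive());
    }
}
```

In this model a failed `close` leaves the scope usable, which matches the statement that an exception on `close` signals a synchronization bug in client code and is not something to retry in a loop.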
[1] - https://openjdk.java.net/jeps/393 [2] - https://openjdk.java.net/jeps/389 [3] - > https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html [4] - https://openjdk.java.net/jeps/312 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Tweak support for mapped memory segments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/548/files - new: https://git.openjdk.java.net/jdk/pull/548/files/770b1e9c..75e406c0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=05-06 Stats: 543 lines in 13 files changed: 336 ins; 151 del; 56 mod Patch: https://git.openjdk.java.net/jdk/pull/548.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548 PR: https://git.openjdk.java.net/jdk/pull/548 From mcimadamore at openjdk.java.net Mon Oct 12 18:10:18 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Mon, 12 Oct 2020 18:10:18 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v6] In-Reply-To: References: Message-ID: On Mon, 12 Oct 2020 11:42:04 GMT, Magnus Ihse Bursie wrote: >> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: >> >> Tweak referenced to MemoryAddressProxy in Utils.java > > Build changes look good. I've just uploaded a biggie update to the foreign memory access support. While doing performance evaluation, we realized that mixing a multi-level hierarchy (`MappedMemorySegment extends MemorySegment`) with exact invoke semantics of `VarHandle` and `MethodHandle` is not a good match and can lead to great performance degradation for seemingly "correct" code.
While some of this can be attributed to the `VarHandle` API, or to the fact that the so-called "generic" invocation path should not be that slow in cases where the parameters are clearly related, it seems smelly that a primitive API such as `MemorySegment` should give rise to such issues. We have therefore decided to drop the `MappedMemorySegment` - this means that there's only one memory segment type users can deal with: `MemorySegment` - and no chance for mistakes. Of course `MappedMemorySegment` was primarily introduced to allow for operations which were previously possible on `MappedByteBuffer` such as `force`. To support these use cases, a separate class has been introduced, namely `MappedMemorySegments` (note the trailing `S`). This class contains a bunch of static methods which can be used to achieve the desired effects, without polluting the `MemorySegment` API. A new method has been added on `MemorySegment` which returns an optional file descriptor; this might be useful for clients which want to guess whether a segment is in fact a mapped segment, or if they need (e.g. on Windows) the file descriptor to do some other kind of low-level operation. I think this approach is more true to the goals and spirit of the Foreign Memory Access API, and it also offers some ways to improve over the existing API: for instance, the only reason why the `MemorySegment::spliterator` method was a static method was that we needed inference, so that we could return either a `Spliterator<MemorySegment>` or a `Spliterator<MappedMemorySegment>`. All of that is gone now, so the method can go back to being what it has morally always been: an instance method on `MemorySegment`.
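To make the type-matching issue concrete, here is a minimal sketch using only standard `java.lang.invoke` calls. It is an analogy, not the foreign-memory API itself: `CharSequence`/`String` stand in for a base type and its subtype, and the class name `ExactInvokeDemo` is made up for illustration. With `MethodHandle.invokeExact`, passing a statically subtyped argument is rejected outright; `VarHandle` access modes do not throw in that situation but leave the exact-match fast path, which is the "generic" invocation path the message refers to.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

// Illustrative analogy only; class and method names are assumptions.
final class ExactInvokeDemo {
    // Handle of type (CharSequence)int, i.e. typed against the base interface.
    static final MethodHandle LENGTH = lookupLength();

    private static MethodHandle lookupLength() {
        try {
            return MethodHandles.lookup().findVirtual(
                    CharSequence.class, "length", MethodType.methodType(int.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // The call site type is (String)int, which is not exactly (CharSequence)int.
    static boolean exactCallWithSubtypeIsRejected() {
        String s = "abc";
        try {
            int unused = (int) LENGTH.invokeExact(s);
            return false;
        } catch (WrongMethodTypeException e) {
            return true;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Casting to the declared parameter type restores the exact match.
    static int exactCallWithCast() {
        String s = "abc";
        try {
            return (int) LENGTH.invokeExact((CharSequence) s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // invoke() adapts the call site type, at the cost of the generic path.
    static int genericCall() {
        String s = "abc";
        try {
            return (int) LENGTH.invoke(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println("exact call with subtype rejected: " + exactCallWithSubtypeIsRejected());
        System.out.println("exact call with cast: " + exactCallWithCast());
        System.out.println("generic call: " + genericCall());
    }
}
```

With a single segment type, the static type at the call site always matches the handle's coordinate type, so client code cannot drift off the exact path by holding a reference typed as a subtype.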
Updated javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v2/javadoc/jdk/incubator/foreign/package-summary.html Updated specdiff: http://cr.openjdk.java.net/~mcimadamore/8254162_v2/specdiff/overview-summary.html ------------- PR: https://git.openjdk.java.net/jdk/pull/548 From hoffmann at mountainminds.com Mon Oct 12 18:12:50 2020 From: hoffmann at mountainminds.com (Marc Hoffmann) Date: Mon, 12 Oct 2020 20:12:50 +0200 Subject: arm32 builds continue to fail for me after 8253540 and 8253901 Message-ID: Hi, since JDK-8253540 has been applied to master (77a0f3999afa322b64643afd4a161164440af975) JDK builds on arm32 fail for me, even after the corresponding fix JDK-8253901. The newly built JVM crashes right after the start with SIGSEGV. Please find the build log and the hs_err file for commit fd0cb98ed03c6214c02ccd3503c1e6d77065a428 attached. Is there any additional information I can provide to help getting these builds fixed again? Thanks and best regards, -marc === build.log === Configuration summary: * Debug level: release * HS debug level: product * JVM variants: server * JVM features: server: 'cds compiler1 compiler2 epsilongc g1gc jfr jni-check jvmti management nmt parallelgc serialgc services vm-structs' * OpenJDK target: OS: linux, CPU architecture: arm, address length: 32 * Version string: 16-internal+0-adhoc..workspace (16-internal) Tools summary: * Boot JDK: openjdk version "15" 2020-09-15 OpenJDK Runtime Environment AdoptOpenJDK (build 15+36) OpenJDK Server VM AdoptOpenJDK (build 15+36, mixed mode) (at /opt/java/openjdk) * Toolchain: gcc (GNU Compiler Collection) * C Compiler: Version 7.5.0 (at /usr/bin/gcc) * C++ Compiler: Version 7.5.0 (at /usr/bin/g++) Build performance summary: * Cores to use: 3 * Memory limit: 3827 MB Building target 'images' in configuration 'linux-arm-server-release' Compiling 8 files for BUILD_TOOLS_LANGTOOLS Warning: No SCM configuration present and no .src-rev Parsing 2 properties into enum-like class for jdk.compiler 
Compiling 13 properties into resource bundles for jdk.javadoc Compiling 12 properties into resource bundles for jdk.jdeps Compiling 7 properties into resource bundles for jdk.jshell Compiling 16 properties into resource bundles for jdk.compiler Compiling 127 files for BUILD_java.compiler.interim Compiling 398 files for BUILD_jdk.compiler.interim Compiling 226 files for BUILD_jdk.javadoc.interim Compiling 1 files for BUILD_TOOLS_HOTSPOT Compiling 185 files for BUILD_TOOLS_JDK Compiling 31 files for BUILD_JRTFS Creating hotspot/variant-server/tools/adlc/adlc from 13 file(s) Compiling 2 files for BUILD_JVMTI_TOOLS Creating support/modules_libs/java.base/jrt-fs.jar Compiling 2 files for COMPILE_DEPEND Compiling 2 files for BUILD_BREAKITERATOR_BASE Compiling 2 files for BUILD_BREAKITERATOR_LD Compiling 11 properties into resource bundles for java.logging Compiling 11 properties into resource bundles for java.base Compiling 6 properties into resource bundles for java.base Compiling 5 properties into resource bundles for jdk.jlink Compiling 3 properties into resource bundles for jdk.jlink Compiling 1 properties into resource bundles for jdk.jlink Compiling 11 properties into resource bundles for jdk.jartool Compiling 71 files for COMPILE_CREATE_SYMBOLS Creating javadoc element list Compiling 11 properties into resource bundles for jdk.management.agent Compiling 3 properties into resource bundles for jdk.jdi Compiling 224 properties into resource bundles for jdk.localedata Compiling 3050 files for java.base Creating support/modules_libs/java.base/server/libjvm.so from 856 file(s) Compiling 89 properties into resource bundles for java.desktop Creating ct.sym classes Updating support/src.zip Compiling 127 files for java.compiler Compiling 18 files for java.datatransfer Compiling 1845 files for java.xml Compiling 10 files for java.instrument Compiling 35 files for java.logging Compiling 330 files for java.management Compiling 30 files for java.security.sasl Compiling 131 
files for java.rmi Compiling 141 files for java.net.http Compiling 15 files for java.scripting Compiling 5 files for java.transaction.xa Compiling 275 files for java.xml.crypto Compiling 22 files for java.smartcardio Compiling 61 files for jdk.internal.jvmstat Compiling 120 files for jdk.charsets Compiling 402 files for jdk.compiler Compiling 35 files for jdk.crypto.ec Compiling 68 files for jdk.dynalink Compiling 3 files for jdk.internal.ed Compiling 44 files for jdk.httpserver Compiling 21 files for jdk.incubator.foreign Compiling 51 files for jdk.internal.opt Compiling 100 files for jdk.internal.le Compiling 31 files for jdk.jartool Compiling 226 files for jdk.javadoc Compiling 24 files for jdk.management Compiling 1 files for jdk.jdwp.agent Compiling 194 files for jdk.jfr Compiling 4 files for jdk.jsobject Compiling 11 files for jdk.jstatd Compiling 1797 files for jdk.localedata Compiling 14 files for jdk.management.jfr Compiling 8 files for jdk.net Compiling 2 files for jdk.nio.mapmode Compiling 33 files for jdk.sctp Compiling 9 files for jdk.unsupported Compiling 94 files for jdk.xml.dom Compiling 14 files for jdk.zipfs Compiling 15 files for java.prefs Compiling 198 files for java.naming Compiling 77 files for java.sql Compiling 15 files for jdk.attach Compiling 74 files for jdk.crypto.cryptoki Compiling 136 files for jdk.jdeps Compiling 40 files for jdk.jcmd Compiling 251 files for jdk.jdi Compiling 16 files for jdk.naming.dns Compiling 8 files for jdk.naming.rmi Compiling 16 files for java.management.rmi Compiling 220 files for java.security.jgss Compiling 2781 files for java.desktop Compiling 56 files for java.sql.rowset Compiling 84 files for jdk.jlink Compiling 31 files for jdk.management.agent Compiling 95 files for jdk.jshell Compiling 30 files for jdk.security.auth Compiling 16 files for jdk.security.jgss Compiling 1 files for java.se Compiling 18 files for jdk.accessibility Compiling 3 files for jdk.editpad Compiling 948 files for jdk.hotspot.agent 
Compiling 47 files for jdk.incubator.jpackage Compiling 64 files for jdk.jconsole Compiling 8 files for jdk.unsupported.desktop Creating support/modules_libs/java.base/libverify.so from 1 file(s) Creating support/modules_libs/java.base/libjava.so from 59 file(s) Creating support/native/java.base/libfdlibm.a from 57 file(s) Creating support/modules_libs/java.base/libzip.so from 5 file(s) Creating support/modules_libs/java.base/libjimage.so from 6 file(s) Creating support/modules_libs/java.base/libjli.so from 8 file(s) Creating support/modules_libs/java.base/libnet.so from 21 file(s) Creating support/modules_libs/java.base/libnio.so from 20 file(s) Creating support/modules_libs/java.base/libjsig.so from 1 file(s) Creating support/modules_libs/java.prefs/libprefs.so from 1 file(s) Creating support/modules_cmds/java.base/java from 1 file(s) Creating support/modules_cmds/java.base/keytool from 1 file(s) Creating support/modules_libs/java.base/jexec from 1 file(s) Creating support/modules_libs/java.base/jspawnhelper from 1 file(s) Creating support/modules_libs/java.instrument/libinstrument.so from 12 file(s) Creating support/modules_libs/java.desktop/libmlib_image.so from 50 file(s) Creating support/modules_libs/java.desktop/libawt.so from 72 file(s) Creating support/modules_libs/java.desktop/libawt_xawt.so from 51 file(s) Creating support/modules_libs/java.desktop/liblcms.so from 27 file(s) Creating support/modules_libs/java.desktop/libjavajpeg.so from 46 file(s) Creating support/modules_libs/java.desktop/libawt_headless.so from 26 file(s) Creating support/modules_libs/java.desktop/libharfbuzz.so from 53 file(s) Creating support/modules_libs/java.desktop/libfontmanager.so from 8 file(s) Creating support/modules_libs/java.desktop/libjawt.so from 1 file(s) Creating support/modules_libs/java.desktop/libsplashscreen.so from 67 file(s) Creating support/modules_libs/java.desktop/libjsound.so from 18 file(s) Creating support/modules_libs/java.management/libmanagement.so from 9 
file(s) Creating support/modules_libs/java.rmi/librmi.so from 1 file(s) Creating support/modules_cmds/java.rmi/rmid from 1 file(s) Creating support/modules_cmds/java.rmi/rmiregistry from 1 file(s) Creating support/modules_cmds/java.scripting/jrunscript from 1 file(s) Creating support/modules_libs/java.security.jgss/libj2gss.so from 3 file(s) Creating support/modules_libs/java.smartcardio/libj2pcsc.so from 2 file(s) Creating support/modules_libs/jdk.attach/libattach.so from 1 file(s) Creating support/modules_cmds/jdk.compiler/javac from 1 file(s) Creating support/modules_cmds/jdk.compiler/serialver from 1 file(s) Creating support/modules_libs/jdk.crypto.cryptoki/libj2pkcs11.so from 14 file(s) Creating support/modules_libs/jdk.hotspot.agent/libsaproc.so from 10 file(s) Creating support/modules_cmds/jdk.hotspot.agent/jhsdb from 1 file(s) Creating support/modules_cmds/jdk.jdeps/javap from 1 file(s) Creating support/modules_cmds/jdk.jdeps/jdeps from 1 file(s) Creating support/modules_cmds/jdk.jdeps/jdeprscan from 1 file(s) Creating support/modules_cmds/jdk.jlink/jimage from 1 file(s) Creating support/modules_cmds/jdk.jlink/jlink from 1 file(s) Creating support/modules_cmds/jdk.jlink/jmod from 1 file(s) Creating jdk/modules/jdk.incubator.jpackage/jdk/incubator/jpackage/internal/resources/jpackageapplauncher from 15 file(s) Creating support/modules_cmds/jdk.incubator.jpackage/jpackage from 1 file(s) Creating support/modules_cmds/jdk.jartool/jar from 1 file(s) Creating support/modules_cmds/jdk.jartool/jarsigner from 1 file(s) Creating support/modules_cmds/jdk.javadoc/javadoc from 1 file(s) Creating support/modules_cmds/jdk.jcmd/jinfo from 1 file(s) Creating support/modules_cmds/jdk.jcmd/jmap from 1 file(s) Creating support/modules_cmds/jdk.jcmd/jps from 1 file(s) Creating support/modules_cmds/jdk.jcmd/jstack from 1 file(s) Creating support/modules_cmds/jdk.jcmd/jstat from 1 file(s) Creating support/modules_cmds/jdk.jcmd/jcmd from 1 file(s) Creating 
support/modules_libs/jdk.management/libmanagement_ext.so from 8 file(s) Creating support/modules_libs/jdk.management.agent/libmanagement_agent.so from 1 file(s) Creating support/modules_cmds/jdk.jconsole/jconsole from 1 file(s) Creating support/modules_libs/jdk.jdwp.agent/libdt_socket.so from 2 file(s) Creating support/modules_libs/jdk.jdwp.agent/libjdwp.so from 43 file(s) Creating support/modules_cmds/jdk.jdi/jdb from 1 file(s) Creating support/modules_cmds/jdk.jfr/jfr from 1 file(s) Creating support/modules_cmds/jdk.jshell/jshell from 1 file(s) Creating support/modules_cmds/jdk.jstatd/jstatd from 1 file(s) Creating support/modules_libs/jdk.net/libextnet.so from 1 file(s) Creating support/modules_libs/jdk.sctp/libsctp.so from 2 file(s) Creating support/modules_libs/jdk.security.auth/libjaas.so from 1 file(s) Updating images/sec-bin.zip Compiling 4 files for BUILD_JIGSAW_TOOLS Optimizing the exploded image Creating java.datatransfer.jmod Creating java.compiler.jmod Creating java.desktop.jmod Creating java.instrument.jmod Creating java.logging.jmod Creating java.management.jmod Creating java.management.rmi.jmod Creating java.naming.jmod Creating java.net.http.jmod Creating java.prefs.jmod Creating java.rmi.jmod Creating java.scripting.jmod Creating java.se.jmod Creating java.security.jgss.jmod Creating java.security.sasl.jmod Creating java.smartcardio.jmod Creating java.sql.jmod Creating java.sql.rowset.jmod Creating java.transaction.xa.jmod Creating java.xml.jmod Creating java.xml.crypto.jmod Creating jdk.accessibility.jmod Creating jdk.attach.jmod Creating jdk.charsets.jmod Creating jdk.compiler.jmod Creating jdk.crypto.cryptoki.jmod Creating jdk.crypto.ec.jmod Creating jdk.dynalink.jmod Creating jdk.editpad.jmod Creating jdk.httpserver.jmod Creating jdk.hotspot.agent.jmod Creating jdk.incubator.foreign.jmod Creating jdk.incubator.jpackage.jmod Creating jdk.internal.ed.jmod Creating jdk.internal.jvmstat.jmod Creating jdk.internal.le.jmod Creating 
jdk.internal.opt.jmod Creating jdk.jartool.jmod Creating jdk.javadoc.jmod Creating jdk.jcmd.jmod Creating jdk.jconsole.jmod Creating jdk.jdeps.jmod Creating jdk.jdi.jmod Creating jdk.jdwp.agent.jmod Creating jdk.jfr.jmod Creating interim java.base.jmod Creating interim java.logging.jmod Creating jdk.jshell.jmod Creating jdk.jsobject.jmod Creating jdk.jstatd.jmod Creating jdk.localedata.jmod Creating jdk.management.jmod Creating jdk.management.agent.jmod Creating jdk.management.jfr.jmod Creating jdk.naming.dns.jmod Creating jdk.naming.rmi.jmod Creating jdk.net.jmod Creating jdk.nio.mapmode.jmod Creating jdk.security.auth.jmod Creating jdk.sctp.jmod Creating jdk.security.jgss.jmod Creating jdk.unsupported.jmod Creating jdk.unsupported.desktop.jmod Creating jdk.xml.dom.jmod Creating jdk.zipfs.jmod Creating interim jimage Compiling 3 files for BUILD_DEMO_CodePointIM Updating support/demos/image/jfc/CodePointIM/src.zip Compiling 3 files for BUILD_DEMO_FileChooserDemo Updating support/demos/image/jfc/FileChooserDemo/src.zip Compiling 29 files for BUILD_DEMO_SwingSet2 Updating support/demos/image/jfc/SwingSet2/src.zip Compiling 3 files for BUILD_DEMO_Font2DTest Updating support/demos/image/jfc/Font2DTest/src.zip Compiling 64 files for BUILD_DEMO_J2Ddemo Updating support/demos/image/jfc/J2Ddemo/src.zip Compiling 15 files for BUILD_DEMO_Metalworks Compiling 1 files for CLASSLIST_JAR Updating support/demos/image/jfc/Metalworks/src.zip Creating support/classlist.jar Compiling 2 files for BUILD_DEMO_Notepad Updating support/demos/image/jfc/Notepad/src.zip Compiling 5 files for BUILD_DEMO_Stylepad Updating support/demos/image/jfc/Stylepad/src.zip Compiling 5 files for BUILD_DEMO_SampleTree Updating support/demos/image/jfc/SampleTree/src.zip Compiling 8 files for BUILD_DEMO_TableExample Updating support/demos/image/jfc/TableExample/src.zip /bin/bash: line 14: 20847 Aborted (core dumped) /workspace/build/linux-arm-server-release/support/interim-image/bin/java 
-XX:DumpLoadedClassList=/workspace/build/linux-arm-server-release/support/link_opt/classlist.raw.2 -XX:SharedClassListFile=/workspace/build/linux-arm-server-release/support/link_opt/classlist.interim -XX:SharedArchiveFile=/workspace/build/linux-arm-server-release/support/link_opt/classlist.jsa -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true -Duser.language=en -Duser.country=US --module-path /workspace/build/linux-arm-server-release/support/classlist.jar -cp /workspace/build/linux-arm-server-release/support/classlist.jar build.tools.classlist.HelloClasslist 2> /workspace/build/linux-arm-server-release/support/link_opt/stderr > /workspace/build/linux-arm-server-release/support/link_opt/default_jli_trace.txt ERROR: Failed to generate link optimization data. This is likely a problem with the newly built JVM/JDK. make[3]: *** [/workspace/build/linux-arm-server-release/support/link_opt/classlist] Error 134 make[2]: *** [generate-link-opt-data] Error 2 make[2]: *** Waiting for unfinished jobs.... # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x00000000, pid=20847, tid=20848 # # JRE version: (16.0) (build ) # Java VM: OpenJDK Server VM (16-internal+0-adhoc..workspace, mixed mode, sharing, g1 gc, linux-arm) # Problematic frame: # C 0x00000000 # # Core dump will be written. 
Default location: /workspace/make/core # # An error report file with more information is saved as: # /workspace/make/hs_err_pid20847.log # # GenerateLinkOptData.gmk:65: recipe for target '/workspace/build/linux-arm-server-release/support/link_opt/classlist' failed make/Main.gmk:575: recipe for target 'generate-link-opt-data' failed Compiling 1 files for BUILD_DEMO_TransparentRuler Updating support/demos/image/jfc/TransparentRuler/src.zip Creating support/demos/image/jfc/CodePointIM/CodePointIM.jar Creating support/demos/image/jfc/FileChooserDemo/FileChooserDemo.jar Creating support/demos/image/jfc/SwingSet2/SwingSet2.jar Creating support/demos/image/jfc/Font2DTest/Font2DTest.jar Creating support/demos/image/jfc/J2Ddemo/J2Ddemo.jar Creating support/demos/image/jfc/Metalworks/Metalworks.jar Creating support/demos/image/jfc/Notepad/Notepad.jar Creating support/demos/image/jfc/Stylepad/Stylepad.jar Creating support/demos/image/jfc/SampleTree/SampleTree.jar Creating support/demos/image/jfc/TableExample/TableExample.jar Creating support/demos/image/jfc/TransparentRuler/TransparentRuler.jar ERROR: Build failed for target 'images' in configuration 'linux-arm-server-release' (exit code 2) Stopping sjavac server === Make failed targets repeated here === GenerateLinkOptData.gmk:65: recipe for target '/workspace/build/linux-arm-server-release/support/link_opt/classlist' failed make/Main.gmk:575: recipe for target 'generate-link-opt-data' failed === End of repeated output === Hint: Try searching the build log for the name of the first failed target. Hint: See doc/building.html#troubleshooting for assistance. 
/workspace/make/Init.gmk:310: recipe for target 'main' failed make[1]: *** [main] Error 2 /workspace/make/Init.gmk:186: recipe for target 'images' failed make: *** [images] Error 2 === hs_err_pid20847.log === # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x00000000, pid=20847, tid=20848 # # JRE version: (16.0) (build ) # Java VM: OpenJDK Server VM (16-internal+0-adhoc..workspace, mixed mode, sharing, g1 gc, linux-arm) # Problematic frame: # C 0x00000000 # # Core dump will be written. Default location: /workspace/make/core # # --------------- S U M M A R Y ------------ Command Line: -XX:DumpLoadedClassList=/workspace/build/linux-arm-server-release/support/link_opt/classlist.raw.2 -XX:SharedClassListFile=/workspace/build/linux-arm-server-release/support/link_opt/classlist.interim -XX:SharedArchiveFile=/workspace/build/linux-arm-server-release/support/link_opt/classlist.jsa -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true -Duser.language=en -Duser.country=US --module-path=/workspace/build/linux-arm-server-release/support/classlist.jar build.tools.classlist.HelloClasslist Host: rev 3 (v7l), 4 cores, 3G, Ubuntu 18.04.3 LTS Time: Mon Oct 12 16:52:00 2020 UTC elapsed time: 0.158241 seconds (0d 0h 0m 0s) --------------- T H R E A D --------------- Current thread (0xb6314088): JavaThread "Unknown thread" [_thread_in_vm, id=20848, stack(0xb644c000,0xb649c000)] Stack: [0xb644c000,0xb649c000], sp=0xb649aac0, free space=314k siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x00000000 Registers: r0 = 0x74f29720 r1 = 0xb6314088 r2 = 0x00000000 r3 = 0x00000000 r4 = 0xb6314088 r5 = 0x74f29720 r6 = 0x004403e0 r7 = 0xb649aac8 r8 = 0x00000000 r9 = 0x00000000 r10 = 0x00000000 fp = 0x7529ef68 r12 = 0xb6d2a1ac sp = 0xb649aac0 lr = 0xb68e99ef pc = 0x00000000 cpsr = 0x400f0010 Top of Stack: (sp=0xb649aac0) 0xb649aac0: b6d31fb8 b64900e0 b6ceefe0 74f29720 0xb649aad0: 00000000 ad3ab600 00000000 00000000 0xb649aae0: b6d29dd4 
b6314088 b649ab08 b63e6b08 0xb649aaf0: 00000000 0000000c 7529ef68 b69e3e65 0xb649ab00: 00000000 b6314088 00000000 74f29650 0xb649ab10: b6d29dd4 b6d48150 00000000 00000000 0xb649ab20: b6d29dd4 74f29720 b6314088 00000000 0xb649ab30: b649ab48 b63e6b08 7529ef60 0000000c Instructions: (pc=0x00000000) 0xffffff00: --------------- P R O C E S S --------------- Threads class SMR info: _java_thread_list=0xb6d82c70, length=0, elements={ } Java Threads: ( => current thread ) Other Threads: 0xb6376f68 GCTaskThread "GC Thread#0" [stack: 0x75fd2000,0x76052000] [id=20849] 0xb637c228 ConcurrentGCThread "G1 Main Marker" [stack: 0x75f50000,0x75fd0000] [id=20850] 0xb637d128 ConcurrentGCThread "G1 Conc#0" [stack: 0x75d80000,0x75e00000] [id=20851] 0xb63dd150 ConcurrentGCThread "G1 Refine#0" [stack: 0x75a80000,0x75b00000] [id=20852] 0xb63de020 ConcurrentGCThread "G1 Service" [stack: 0x75880000,0x75900000] [id=20853] =>0xb6314088 (exited) JavaThread "Unknown thread" [_thread_in_vm, id=20848, stack(0xb644c000,0xb649c000)] Threads with active compile tasks: VM state: not at safepoint (not fully initialized) VM Mutex/Monitor currently owned by a thread: None GC Precious Log: CPUs: 4 total, 4 available Memory: 3827M Large Page Support: Disabled NUMA Support: Disabled Compressed Oops: Disabled Heap Region Size: 1M Heap Min Capacity: 6M Heap Initial Capacity: 60M Heap Max Capacity: 958M Pre-touch: Disabled Parallel Workers: 4 Concurrent Workers: 1 Concurrent Refinement Workers: 4 Periodic GC: Disabled Heap: garbage-first heap total 61440K, used 0K [0x78200000, 0xb4000000) region size 1024K, 1 young (1024K), 0 survivors (0K) Metaspace used 0K, capacity 0K, committed 0K, reserved 4400K Heap Regions: E=young(eden), S=young(survivor), O=old, HS=humongous(starts), HC=humongous(continues), CS=collection set, F=free, OA=open archive, CA=closed archive, TAMS=top-at-mark-start (previous, next) | 0|0x78200000, 0x78200000, 0x78300000| 0%| F| |TAMS 0x78200000, 0x78200000| Untracked | 1|0x78300000, 
0x78300000, 0x78400000| 0%| F| |TAMS 0x78300000, 0x78300000| Untracked | 2|0x78400000, 0x78400000, 0x78500000| 0%| F| |TAMS 0x78400000, 0x78400000| Untracked | 3|0x78500000, 0x78500000, 0x78600000| 0%| F| |TAMS 0x78500000, 0x78500000| Untracked | 4|0x78600000, 0x78600000, 0x78700000| 0%| F| |TAMS 0x78600000, 0x78600000| Untracked | 5|0x78700000, 0x78700000, 0x78800000| 0%| F| |TAMS 0x78700000, 0x78700000| Untracked | 6|0x78800000, 0x78800000, 0x78900000| 0%| F| |TAMS 0x78800000, 0x78800000| Untracked | 7|0x78900000, 0x78900000, 0x78a00000| 0%| F| |TAMS 0x78900000, 0x78900000| Untracked | 8|0x78a00000, 0x78a00000, 0x78b00000| 0%| F| |TAMS 0x78a00000, 0x78a00000| Untracked | 9|0x78b00000, 0x78b00000, 0x78c00000| 0%| F| |TAMS 0x78b00000, 0x78b00000| Untracked | 10|0x78c00000, 0x78c00000, 0x78d00000| 0%| F| |TAMS 0x78c00000, 0x78c00000| Untracked | 11|0x78d00000, 0x78d00000, 0x78e00000| 0%| F| |TAMS 0x78d00000, 0x78d00000| Untracked | 12|0x78e00000, 0x78e00000, 0x78f00000| 0%| F| |TAMS 0x78e00000, 0x78e00000| Untracked | 13|0x78f00000, 0x78f00000, 0x79000000| 0%| F| |TAMS 0x78f00000, 0x78f00000| Untracked | 14|0x79000000, 0x79000000, 0x79100000| 0%| F| |TAMS 0x79000000, 0x79000000| Untracked | 15|0x79100000, 0x79100000, 0x79200000| 0%| F| |TAMS 0x79100000, 0x79100000| Untracked | 16|0x79200000, 0x79200000, 0x79300000| 0%| F| |TAMS 0x79200000, 0x79200000| Untracked | 17|0x79300000, 0x79300000, 0x79400000| 0%| F| |TAMS 0x79300000, 0x79300000| Untracked | 18|0x79400000, 0x79400000, 0x79500000| 0%| F| |TAMS 0x79400000, 0x79400000| Untracked | 19|0x79500000, 0x79500000, 0x79600000| 0%| F| |TAMS 0x79500000, 0x79500000| Untracked | 20|0x79600000, 0x79600000, 0x79700000| 0%| F| |TAMS 0x79600000, 0x79600000| Untracked | 21|0x79700000, 0x79700000, 0x79800000| 0%| F| |TAMS 0x79700000, 0x79700000| Untracked | 22|0x79800000, 0x79800000, 0x79900000| 0%| F| |TAMS 0x79800000, 0x79800000| Untracked | 23|0x79900000, 0x79900000, 0x79a00000| 0%| F| |TAMS 0x79900000, 0x79900000| Untracked 
| 24|0x79a00000, 0x79a00000, 0x79b00000| 0%| F| |TAMS 0x79a00000, 0x79a00000| Untracked | 25|0x79b00000, 0x79b00000, 0x79c00000| 0%| F| |TAMS 0x79b00000, 0x79b00000| Untracked | 26|0x79c00000, 0x79c00000, 0x79d00000| 0%| F| |TAMS 0x79c00000, 0x79c00000| Untracked | 27|0x79d00000, 0x79d00000, 0x79e00000| 0%| F| |TAMS 0x79d00000, 0x79d00000| Untracked | 28|0x79e00000, 0x79e00000, 0x79f00000| 0%| F| |TAMS 0x79e00000, 0x79e00000| Untracked | 29|0x79f00000, 0x79f00000, 0x7a000000| 0%| F| |TAMS 0x79f00000, 0x79f00000| Untracked | 30|0x7a000000, 0x7a000000, 0x7a100000| 0%| F| |TAMS 0x7a000000, 0x7a000000| Untracked | 31|0x7a100000, 0x7a100000, 0x7a200000| 0%| F| |TAMS 0x7a100000, 0x7a100000| Untracked | 32|0x7a200000, 0x7a200000, 0x7a300000| 0%| F| |TAMS 0x7a200000, 0x7a200000| Untracked | 33|0x7a300000, 0x7a300000, 0x7a400000| 0%| F| |TAMS 0x7a300000, 0x7a300000| Untracked | 34|0x7a400000, 0x7a400000, 0x7a500000| 0%| F| |TAMS 0x7a400000, 0x7a400000| Untracked | 35|0x7a500000, 0x7a500000, 0x7a600000| 0%| F| |TAMS 0x7a500000, 0x7a500000| Untracked | 36|0x7a600000, 0x7a600000, 0x7a700000| 0%| F| |TAMS 0x7a600000, 0x7a600000| Untracked | 37|0x7a700000, 0x7a700000, 0x7a800000| 0%| F| |TAMS 0x7a700000, 0x7a700000| Untracked | 38|0x7a800000, 0x7a800000, 0x7a900000| 0%| F| |TAMS 0x7a800000, 0x7a800000| Untracked | 39|0x7a900000, 0x7a900000, 0x7aa00000| 0%| F| |TAMS 0x7a900000, 0x7a900000| Untracked | 40|0x7aa00000, 0x7aa00000, 0x7ab00000| 0%| F| |TAMS 0x7aa00000, 0x7aa00000| Untracked | 41|0x7ab00000, 0x7ab00000, 0x7ac00000| 0%| F| |TAMS 0x7ab00000, 0x7ab00000| Untracked | 42|0x7ac00000, 0x7ac00000, 0x7ad00000| 0%| F| |TAMS 0x7ac00000, 0x7ac00000| Untracked | 43|0x7ad00000, 0x7ad00000, 0x7ae00000| 0%| F| |TAMS 0x7ad00000, 0x7ad00000| Untracked | 44|0x7ae00000, 0x7ae00000, 0x7af00000| 0%| F| |TAMS 0x7ae00000, 0x7ae00000| Untracked | 45|0x7af00000, 0x7af00000, 0x7b000000| 0%| F| |TAMS 0x7af00000, 0x7af00000| Untracked | 46|0x7b000000, 0x7b000000, 0x7b100000| 0%| F| |TAMS 
0x7b000000, 0x7b000000| Untracked | 47|0x7b100000, 0x7b100000, 0x7b200000| 0%| F| |TAMS 0x7b100000, 0x7b100000| Untracked | 48|0x7b200000, 0x7b200000, 0x7b300000| 0%| F| |TAMS 0x7b200000, 0x7b200000| Untracked | 49|0x7b300000, 0x7b300000, 0x7b400000| 0%| F| |TAMS 0x7b300000, 0x7b300000| Untracked | 50|0x7b400000, 0x7b400000, 0x7b500000| 0%| F| |TAMS 0x7b400000, 0x7b400000| Untracked | 51|0x7b500000, 0x7b500000, 0x7b600000| 0%| F| |TAMS 0x7b500000, 0x7b500000| Untracked | 52|0x7b600000, 0x7b600000, 0x7b700000| 0%| F| |TAMS 0x7b600000, 0x7b600000| Untracked | 53|0x7b700000, 0x7b700000, 0x7b800000| 0%| F| |TAMS 0x7b700000, 0x7b700000| Untracked | 54|0x7b800000, 0x7b800000, 0x7b900000| 0%| F| |TAMS 0x7b800000, 0x7b800000| Untracked | 55|0x7b900000, 0x7b900000, 0x7ba00000| 0%| F| |TAMS 0x7b900000, 0x7b900000| Untracked | 56|0x7ba00000, 0x7ba00000, 0x7bb00000| 0%| F| |TAMS 0x7ba00000, 0x7ba00000| Untracked | 57|0x7bb00000, 0x7bb00000, 0x7bc00000| 0%| F| |TAMS 0x7bb00000, 0x7bb00000| Untracked | 58|0x7bc00000, 0x7bc00000, 0x7bd00000| 0%| F| |TAMS 0x7bc00000, 0x7bc00000| Untracked | 59|0x7bd00000, 0x7bd42908, 0x7be00000| 26%| E| |TAMS 0x7bd00000, 0x7bd00000| Complete Card table byte_map: [0x78021000,0x78200000] _byte_map_base: 0x77c60000 Marking Bits (Prev, Next): (CMBitMap*) 0xb6377de0, (CMBitMap*) 0xb6377e00 Prev Bits: [0x76f4a000, 0x77e42000) Next Bits: [0x76052000, 0x76f4a000) GC Heap History (0 events): No events Deoptimization events (0 events): No events Classes unloaded (0 events): No events Classes redefined (0 events): No events Internal exceptions (0 events): No events Events (2 events): Event: 0.005 Protecting memory [0xb644c000,0xb644f000] with protection modes 0 Event: 0.007 Loaded shared library /workspace/build/linux-arm-server-release/support/interim-image/lib/libjava.so Dynamic libraries: 00454000-00455000 r-xp 00000000 b3:02 1685994 /workspace/build/linux-arm-server-release/support/interim-image/bin/java 00464000-00465000 r--p 00000000 b3:02 1685994 
/workspace/build/linux-arm-server-release/support/interim-image/bin/java 00465000-00466000 rw-p 00001000 b3:02 1685994 /workspace/build/linux-arm-server-release/support/interim-image/bin/java 01293000-012b4000 rw-p 00000000 00:00 0 [heap] 74ad8000-74f24000 ---p 00000000 00:00 0 74f27000-74f29000 rwxp 00001000 b3:02 1810901 /workspace/build/linux-arm-server-release/support/link_opt/classlist.jsa 74f29000-75700000 rw-p 00003000 b3:02 1810901 /workspace/build/linux-arm-server-release/support/link_opt/classlist.jsa 75700000-75721000 rw-p 00000000 00:00 0 75721000-75800000 ---p 00000000 00:00 0 7587e000-7587f000 ---p 00000000 00:00 0 7587f000-75900000 rw-p 00000000 00:00 0 75900000-75921000 rw-p 00000000 00:00 0 75921000-75a00000 ---p 00000000 00:00 0 75a23000-75a7e000 rw-p 00000000 00:00 0 75a7e000-75a7f000 ---p 00000000 00:00 0 75a7f000-75b00000 rw-p 00000000 00:00 0 75b00000-75b21000 rw-p 00000000 00:00 0 75b21000-75c00000 ---p 00000000 00:00 0 75c00000-75c21000 rw-p 00000000 00:00 0 75c21000-75d00000 ---p 00000000 00:00 0 75d7e000-75d7f000 ---p 00000000 00:00 0 75d7f000-75e00000 rw-p 00000000 00:00 0 75e00000-75e21000 rw-p 00000000 00:00 0 75e21000-75f00000 ---p 00000000 00:00 0 75f0d000-75f4e000 rw-p 00000000 00:00 0 75f4e000-75f4f000 ---p 00000000 00:00 0 75f4f000-75fd0000 rw-p 00000000 00:00 0 75fd0000-75fd1000 ---p 00000000 00:00 0 75fd1000-76142000 rw-p 00000000 00:00 0 76142000-76f4a000 ---p 00000000 00:00 0 76f4a000-7703a000 rw-p 00000000 00:00 0 7703a000-77e42000 ---p 00000000 00:00 0 77e42000-77e60000 rw-p 00000000 00:00 0 77e60000-78021000 ---p 00000000 00:00 0 78021000-7803f000 rw-p 00000000 00:00 0 7803f000-78200000 ---p 00000000 00:00 0 78200000-7be00000 rw-p 00000000 00:00 0 7be00000-b4000000 ---p 00000000 00:00 0 b4001000-b402f000 rw-p 00000000 00:00 0 b402f000-b41f0000 ---p 00000000 00:00 0 b41f0000-b41f3000 rw-p 00000000 00:00 0 b41f3000-b4230000 ---p 00000000 00:00 0 b4230000-b43b0000 rwxp 00000000 00:00 0 b43b0000-b6230000 ---p 00000000 00:00 0 
b6230000-b6246000 r-xp 00000000 b3:02 659335 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjava.so b6246000-b6255000 ---p 00016000 b3:02 659335 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjava.so b6255000-b6256000 r--p 00015000 b3:02 659335 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjava.so b6256000-b6257000 rw-p 00016000 b3:02 659335 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjava.so b6257000-b625e000 r-xp 00000000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so b625e000-b626d000 ---p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so b626d000-b626e000 r--p 00006000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so b626e000-b626f000 rw-p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so b626f000-b6275000 rw-p 00000000 00:00 0 b6275000-b6282000 r-xp 00000000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so b6282000-b6291000 ---p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so b6291000-b6292000 r--p 0000c000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so b6292000-b6293000 rw-p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so b6293000-b6295000 rw-p 00000000 00:00 0 b6295000-b629c000 r-xp 00000000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so b629c000-b62ab000 ---p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so b62ab000-b62ac000 r--p 00006000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so b62ac000-b62ad000 rw-p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so b62ad000-b6300000 r--s 00000000 b3:02 659315 /workspace/build/linux-arm-server-release/support/interim-image/lib/modules b6300000-b63f1000 rw-p 00000000 00:00 0 b63f1000-b6400000 ---p 00000000 00:00 0 b640f000-b6417000 rw-s 00000000 b3:02 2475046 /tmp/hsperfdata_root/20847 b6417000-b641c000 r-xp 00000000 b3:02 2708511 
/lib/arm-linux-gnueabihf/libnss_compat-2.27.so b641c000-b642b000 ---p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so b642b000-b642c000 r--p 00004000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so b642c000-b642d000 rw-p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so b642d000-b643b000 r-xp 00000000 b3:02 659333 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjimage.so b643b000-b644a000 ---p 0000e000 b3:02 659333 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjimage.so b644a000-b644b000 r--p 0000d000 b3:02 659333 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjimage.so b644b000-b644c000 rw-p 0000e000 b3:02 659333 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjimage.so b644c000-b644f000 ---p 00000000 00:00 0 b644f000-b649c000 rw-p 00000000 00:00 0 b649c000-b650b000 r-xp 00000000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so b650b000-b651b000 ---p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so b651b000-b651c000 r--p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so b651c000-b651d000 rw-p 00070000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so b651d000-b6cdf000 r-xp 00000000 b3:02 1686001 /workspace/build/linux-arm-server-release/support/interim-image/lib/server/libjvm.so b6cdf000-b6cee000 ---p 007c2000 b3:02 1686001 /workspace/build/linux-arm-server-release/support/interim-image/lib/server/libjvm.so b6cee000-b6d30000 r--p 007c1000 b3:02 1686001 /workspace/build/linux-arm-server-release/support/interim-image/lib/server/libjvm.so b6d30000-b6d47000 rw-p 00803000 b3:02 1686001 /workspace/build/linux-arm-server-release/support/interim-image/lib/server/libjvm.so b6d47000-b6d8a000 rw-p 00000000 00:00 0 b6d8a000-b6d9b000 r-xp 00000000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so b6d9b000-b6dab000 ---p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so 
b6dab000-b6dac000 r--p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so b6dac000-b6dad000 rw-p 00012000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so b6dad000-b6daf000 rw-p 00000000 00:00 0 b6daf000-b6db1000 r-xp 00000000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so b6db1000-b6dc0000 ---p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so b6dc0000-b6dc1000 r--p 00001000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so b6dc1000-b6dc2000 rw-p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so b6dc2000-b6ddb000 r-xp 00000000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 b6ddb000-b6dea000 ---p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 b6dea000-b6deb000 r--p 00018000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 b6deb000-b6dec000 rw-p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 b6dec000-b6ece000 r-xp 00000000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so b6ece000-b6ede000 ---p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so b6ede000-b6ee0000 r--p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so b6ee0000-b6ee1000 rw-p 000e4000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so b6ee1000-b6ee4000 rw-p 00000000 00:00 0 b6ee4000-b6eee000 r-xp 00000000 b3:02 659325 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjli.so b6eee000-b6efe000 ---p 0000a000 b3:02 659325 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjli.so b6efe000-b6eff000 r--p 0000a000 b3:02 659325 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjli.so b6eff000-b6f00000 rw-p 0000b000 b3:02 659325 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjli.so b6f00000-b6f18000 r-xp 00000000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so b6f1f000-b6f21000 rw-p 00000000 00:00 0 b6f23000-b6f24000 ---p 00000000 00:00 0 b6f24000-b6f25000 r--p 00000000 00:00 0 b6f25000-b6f26000 ---p 
00000000 00:00 0 b6f26000-b6f28000 rw-p 00000000 00:00 0 b6f28000-b6f29000 r--p 00018000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so b6f29000-b6f2a000 rw-p 00019000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so beefd000-bef1e000 rw-p 00000000 00:00 0 [stack] befd8000-befd9000 r-xp 00000000 00:00 0 [sigpage] befd9000-befda000 r--p 00000000 00:00 0 [vvar] befda000-befdb000 r-xp 00000000 00:00 0 [vdso] ffff0000-ffff1000 r-xp 00000000 00:00 0 [vectors] VM Arguments: jvm_args: -XX:DumpLoadedClassList=/workspace/build/linux-arm-server-release/support/link_opt/classlist.raw.2 -XX:SharedClassListFile=/workspace/build/linux-arm-server-release/support/link_opt/classlist.interim -XX:SharedArchiveFile=/workspace/build/linux-arm-server-release/support/link_opt/classlist.jsa -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true -Duser.language=en -Duser.country=US --module-path=/workspace/build/linux-arm-server-release/support/classlist.jar java_command: build.tools.classlist.HelloClasslist java_class_path (initial): /workspace/build/linux-arm-server-release/support/classlist.jar Launcher Type: SUN_STANDARD [Global flags] uint ConcGCThreads = 1 {product} {ergonomic} ccstr DumpLoadedClassList = /workspace/build/linux-arm-server-release/support/link_opt/classlist.raw.2 {product} {command line} uint G1ConcRefinementThreads = 4 {product} {ergonomic} size_t G1HeapRegionSize = 1048576 {product} {ergonomic} uintx GCDrainStackTargetSize = 64 {product} {ergonomic} size_t InitialHeapSize = 62914560 {product} {ergonomic} size_t MarkStackSize = 32768 {product} {ergonomic} size_t MaxHeapSize = 1004535808 {product} {ergonomic} size_t MaxNewSize = 601882624 {product} {ergonomic} size_t MinHeapDeltaBytes = 1048576 {product} {ergonomic} size_t MinHeapSize = 6291456 {product} {ergonomic} uintx NonProfiledCodeHeapSize = 0 {pd product} {ergonomic} uintx ProfiledCodeHeapSize = 0 {pd product} {ergonomic} ccstr SharedArchiveFile = 
/workspace/build/linux-arm-server-release/support/link_opt/classlist.jsa {product} {command line} ccstr SharedClassListFile = /workspace/build/linux-arm-server-release/support/link_opt/classlist.interim {product} {command line} size_t SoftMaxHeapSize = 1004535808 {manageable} {ergonomic} bool UseG1GC = true {product} {ergonomic} Logging: Log output configuration: #0: stdout all=warning uptime,level,tags #1: stderr all=off uptime,level,tags Environment Variables: JAVA_HOME=/opt/java/openjdk PATH=/opt/java/openjdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin LC_ALL=C Signal Handlers: SIGSEGV: [libjvm.so+0x71d0e1], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO SIGBUS: [libjvm.so+0x71d0e1], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO SIGFPE: [libjvm.so+0x71d0e1], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO SIGPIPE: [libjvm.so+0x6549d1], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO SIGXFSZ: [libjvm.so+0x6549d1], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO SIGILL: [libjvm.so+0x71d0e1], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO SIGUSR2: [libjvm.so+0x6548c1], sa_mask[0]=00000000000000000000000000000000, sa_flags=SA_RESTART|SA_SIGINFO SIGHUP: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none SIGINT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none SIGTERM: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none SIGQUIT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none --------------- S Y S T E M --------------- OS: DISTRIB_ID=Ubuntu DISTRIB_RELEASE=18.04 DISTRIB_CODENAME=bionic DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS" uname: Linux 5.4.51-v7l+ #1333 SMP Mon Aug 10 16:51:40 BST 2020 armv7l OS uptime: 14 days 4:29 hours libc: glibc 2.27 NPTL 2.27 rlimit (soft/hard): STACK 8192k/infinity , CORE 
infinity/infinity , NPROC infinity/infinity , NOFILE 1048576/1048576 , AS infinity/infinity , CPU infinity/infinity , DATA infinity/infinity , FSIZE infinity/infinity , MEMLOCK 64k/64k load average: 5.03 3.88 3.74 /proc/meminfo: MemTotal: 3919812 kB MemFree: 1065720 kB MemAvailable: 3390780 kB Buffers: 131112 kB Cached: 2181160 kB SwapCached: 0 kB Active: 1343692 kB Inactive: 1274264 kB Active(anon): 248060 kB Inactive(anon): 69168 kB Active(file): 1095632 kB Inactive(file): 1205096 kB Unevictable: 16 kB Mlocked: 16 kB HighTotal: 3264512 kB HighFree: 854016 kB LowTotal: 655300 kB LowFree: 211704 kB SwapTotal: 102396 kB SwapFree: 102396 kB Dirty: 29260 kB Writeback: 0 kB AnonPages: 305752 kB Mapped: 120900 kB Shmem: 16892 kB KReclaimable: 186028 kB Slab: 209376 kB SReclaimable: 186028 kB SUnreclaim: 23348 kB KernelStack: 2608 kB PageTables: 3244 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 2062300 kB Committed_AS: 1266456 kB VmallocTotal: 245760 kB VmallocUsed: 5520 kB VmallocChunk: 0 kB Percpu: 512 kB CmaTotal: 262144 kB CmaFree: 170068 kB /sys/kernel/mm/transparent_hugepage/enabled: /sys/kernel/mm/transparent_hugepage/defrag (defrag/compaction efforts parameter): Process Memory: Virtual Size: 1084744K (peak: 1084996K) Resident Set Size: 15988K (peak: 16088K) (anon: 7936K, file: 8052K, shmem: 0K) Swapped out: 0K C-Heap outstanding allocations: 994K /proc/sys/kernel/threads-max (system-wide limit on the number of threads): 57119 /proc/sys/vm/max_map_count (maximum number of memory map areas a process may have): 65530 /proc/sys/kernel/pid_max (system-wide limit on number of process identifiers): 32768 Steal ticks since vm start: 0 Steal ticks percentage since vm start: 0.000 CPU: total 4 (initial active 4) (ARMv7), vfp, vfp3-32, simd, mp_ext /proc/cpuinfo: processor : 0 model name : ARMv7 Processor rev 3 (v7l) BogoMIPS : 270.00 Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 CPU implementer : 
0x41 CPU architecture: 7 CPU variant : 0x0 CPU part : 0xd08 CPU revision : 3 processor : 1 model name : ARMv7 Processor rev 3 (v7l) BogoMIPS : 270.00 Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 CPU implementer : 0x41 CPU architecture: 7 CPU variant : 0x0 CPU part : 0xd08 CPU revision : 3 processor : 2 model name : ARMv7 Processor rev 3 (v7l) BogoMIPS : 270.00 Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 CPU implementer : 0x41 CPU architecture: 7 CPU variant : 0x0 CPU part : 0xd08 CPU revision : 3 processor : 3 model name : ARMv7 Processor rev 3 (v7l) BogoMIPS : 270.00 Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 CPU implementer : 0x41 CPU architecture: 7 CPU variant : 0x0 CPU part : 0xd08 CPU revision : 3 Hardware : BCM2711 Revision : c03111 Serial : 100000001c47254f Model : Raspberry Pi 4 Model B Rev 1.1 Online cpus: 0-3 Offline cpus: Memory: 4k page, physical 3919812k(1065720k free), swap 102396k(102396k free) vm_info: OpenJDK Server VM (16-internal+0-adhoc..workspace) for linux-arm JRE (16-internal+0-adhoc..workspace), built on Oct 12 2020 16:24:13 by "" with gcc 7.5.0 END. From shade at redhat.com Mon Oct 12 18:24:38 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 12 Oct 2020 20:24:38 +0200 Subject: arm32 builds continue to fail for me after 8253540 and 8253901 In-Reply-To: References: Message-ID: <56ff08d5-a4e5-788a-1c29-02f76e8755d2@redhat.com> Hi, On 10/12/20 8:12 PM, Marc Hoffmann wrote: > Please find the build log and the hs_err file for commit fd0cb98ed03c6214c02ccd3503c1e6d77065a428 attached. Please try to build with fastdebug (./configure --enable-debug), so that the JVM asserts meaningfully somewhere. > Is there any additional information I can provide to help getting these builds fixed again? I am seeing plenty of weird x86_32 crashes since last week. 
Pretty sure some of them would manifest on ARM32 as well. This is why building with fastdebug is the next step: it maps out the bug symptoms. -- Thanks, -Aleksey From github.com+51754783+coreyashford at openjdk.java.net Mon Oct 12 19:15:16 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Mon, 12 Oct 2020 19:15:16 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4] In-Reply-To: <2zhMnWcr1cuE5sxUyCvU9bN5HP_ph_1xQEV3wdx_7dg=.3d509c17-db83-432c-a983-79137d12a827@github.com> References: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> <2zhMnWcr1cuE5sxUyCvU9bN5HP_ph_1xQEV3wdx_7dg=.3d509c17-db83-432c-a983-79137d12a827@github.com> Message-ID: On Mon, 12 Oct 2020 12:55:02 GMT, Martin Doerr wrote: >> CoreyAshford has updated the pull request incrementally with two additional commits since the last revision: >> >> - TestBase64.java: fix comment to correctly reflect actual intrinsic names. >> >> The intrinsic names that are visible with -XX:+PrintCompilation are encode >> and decode, rather than encodeBlock and decodeBlock. >> - stubGenerator_ppc.cpp: fix regression caused by change to using loop counter >> >> My original fix didn't account for the case where sl < block_size. In the >> event sl < block_size, the shifted sl will become zero, so it should >> jump to the code that computes how much data was processed - 0 - and return. > > src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3745: > >> 3743: __ clrldi(isURL, isURL, 32); >> 3744: >> 3745: // Load constant vec registers that need to be loaded from memory > > With larger unroll factor we run through this code more often without making any progress, because only the Java part > does all the work for the remaining bytes. Would be nice to move unnecessary parts for that between mtctr and align. You're right that there's quite a lot of set up before the size check is performed. 
I will fix this and run the regression test. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From boris.ulasevich at bell-sw.com Mon Oct 12 19:54:10 2020 From: boris.ulasevich at bell-sw.com (Boris Ulasevich) Date: Mon, 12 Oct 2020 22:54:10 +0300 Subject: arm32 builds continue to fail for me after 8253540 and 8253901 In-Reply-To: References: Message-ID: Hi, > Problematic frame: > # C 0x00000000 This looks exactly like the issue fixed by JDK-8253901! > JDK builds on arm32 fail for me, even after the corresponding fix JDK-8253901. I can't reproduce the issue after the JDK-8253901 fix. I have tried cross-build and straight arm32 build, fastdebug and release builds. Boris On Mon, Oct 12, 2020 at 9:13 PM Marc Hoffmann wrote: > > Hi, > > since JDK-8253540 has been applied to master (77a0f3999afa322b64643afd4a161164440af975) JDK builds on arm32 fail for me, even after the corresponding fix JDK-8253901. The newly built JVM crashes right after the start with SIGSEGV. > > Please find the build log and the hs_err file for commit fd0cb98ed03c6214c02ccd3503c1e6d77065a428 attached. > > Is there any additional information I can provide to help getting these builds fixed again? 
> > Thanks and best regards, > -marc > > > === build.log === > > Configuration summary: > * Debug level: release > * HS debug level: product > * JVM variants: server > * JVM features: server: 'cds compiler1 compiler2 epsilongc g1gc jfr jni-check jvmti management nmt parallelgc serialgc services vm-structs' > * OpenJDK target: OS: linux, CPU architecture: arm, address length: 32 > * Version string: 16-internal+0-adhoc..workspace (16-internal) > > Tools summary: > * Boot JDK: openjdk version "15" 2020-09-15 OpenJDK Runtime Environment AdoptOpenJDK (build 15+36) OpenJDK Server VM AdoptOpenJDK (build 15+36, mixed mode) (at /opt/java/openjdk) > * Toolchain: gcc (GNU Compiler Collection) > * C Compiler: Version 7.5.0 (at /usr/bin/gcc) > * C++ Compiler: Version 7.5.0 (at /usr/bin/g++) > > Build performance summary: > * Cores to use: 3 > * Memory limit: 3827 MB > > Building target 'images' in configuration 'linux-arm-server-release' > Compiling 8 files for BUILD_TOOLS_LANGTOOLS > Warning: No SCM configuration present and no .src-rev > Parsing 2 properties into enum-like class for jdk.compiler > Compiling 13 properties into resource bundles for jdk.javadoc > Compiling 12 properties into resource bundles for jdk.jdeps > Compiling 7 properties into resource bundles for jdk.jshell > Compiling 16 properties into resource bundles for jdk.compiler > Compiling 127 files for BUILD_java.compiler.interim > Compiling 398 files for BUILD_jdk.compiler.interim > Compiling 226 files for BUILD_jdk.javadoc.interim > Compiling 1 files for BUILD_TOOLS_HOTSPOT > Compiling 185 files for BUILD_TOOLS_JDK > Compiling 31 files for BUILD_JRTFS > Creating hotspot/variant-server/tools/adlc/adlc from 13 file(s) > Compiling 2 files for BUILD_JVMTI_TOOLS > Creating support/modules_libs/java.base/jrt-fs.jar > Compiling 2 files for COMPILE_DEPEND > Compiling 2 files for BUILD_BREAKITERATOR_BASE > Compiling 2 files for BUILD_BREAKITERATOR_LD > Compiling 11 properties into resource bundles for java.logging > 
Compiling 11 properties into resource bundles for java.base > Compiling 6 properties into resource bundles for java.base > Compiling 5 properties into resource bundles for jdk.jlink > Compiling 3 properties into resource bundles for jdk.jlink > Compiling 1 properties into resource bundles for jdk.jlink > Compiling 11 properties into resource bundles for jdk.jartool > Compiling 71 files for COMPILE_CREATE_SYMBOLS > Creating javadoc element list > Compiling 11 properties into resource bundles for jdk.management.agent > Compiling 3 properties into resource bundles for jdk.jdi > Compiling 224 properties into resource bundles for jdk.localedata > Compiling 3050 files for java.base > Creating support/modules_libs/java.base/server/libjvm.so from 856 file(s) > Compiling 89 properties into resource bundles for java.desktop > Creating ct.sym classes > Updating support/src.zip > Compiling 127 files for java.compiler > Compiling 18 files for java.datatransfer > Compiling 1845 files for java.xml > Compiling 10 files for java.instrument > Compiling 35 files for java.logging > Compiling 330 files for java.management > Compiling 30 files for java.security.sasl > Compiling 131 files for java.rmi > Compiling 141 files for java.net.http > Compiling 15 files for java.scripting > Compiling 5 files for java.transaction.xa > Compiling 275 files for java.xml.crypto > Compiling 22 files for java.smartcardio > Compiling 61 files for jdk.internal.jvmstat > Compiling 120 files for jdk.charsets > Compiling 402 files for jdk.compiler > Compiling 35 files for jdk.crypto.ec > Compiling 68 files for jdk.dynalink > Compiling 3 files for jdk.internal.ed > Compiling 44 files for jdk.httpserver > Compiling 21 files for jdk.incubator.foreign > Compiling 51 files for jdk.internal.opt > Compiling 100 files for jdk.internal.le > Compiling 31 files for jdk.jartool > Compiling 226 files for jdk.javadoc > Compiling 24 files for jdk.management > Compiling 1 files for jdk.jdwp.agent > Compiling 194 files for 
jdk.jfr > Compiling 4 files for jdk.jsobject > Compiling 11 files for jdk.jstatd > Compiling 1797 files for jdk.localedata > Compiling 14 files for jdk.management.jfr > Compiling 8 files for jdk.net > Compiling 2 files for jdk.nio.mapmode > Compiling 33 files for jdk.sctp > Compiling 9 files for jdk.unsupported > Compiling 94 files for jdk.xml.dom > Compiling 14 files for jdk.zipfs > Compiling 15 files for java.prefs > Compiling 198 files for java.naming > Compiling 77 files for java.sql > Compiling 15 files for jdk.attach > Compiling 74 files for jdk.crypto.cryptoki > Compiling 136 files for jdk.jdeps > Compiling 40 files for jdk.jcmd > Compiling 251 files for jdk.jdi > Compiling 16 files for jdk.naming.dns > Compiling 8 files for jdk.naming.rmi > Compiling 16 files for java.management.rmi > Compiling 220 files for java.security.jgss > Compiling 2781 files for java.desktop > Compiling 56 files for java.sql.rowset > Compiling 84 files for jdk.jlink > Compiling 31 files for jdk.management.agent > Compiling 95 files for jdk.jshell > Compiling 30 files for jdk.security.auth > Compiling 16 files for jdk.security.jgss > Compiling 1 files for java.se > Compiling 18 files for jdk.accessibility > Compiling 3 files for jdk.editpad > Compiling 948 files for jdk.hotspot.agent > Compiling 47 files for jdk.incubator.jpackage > Compiling 64 files for jdk.jconsole > Compiling 8 files for jdk.unsupported.desktop > Creating support/modules_libs/java.base/libverify.so from 1 file(s) > Creating support/modules_libs/java.base/libjava.so from 59 file(s) > Creating support/native/java.base/libfdlibm.a from 57 file(s) > Creating support/modules_libs/java.base/libzip.so from 5 file(s) > Creating support/modules_libs/java.base/libjimage.so from 6 file(s) > Creating support/modules_libs/java.base/libjli.so from 8 file(s) > Creating support/modules_libs/java.base/libnet.so from 21 file(s) > Creating support/modules_libs/java.base/libnio.so from 20 file(s) > Creating 
support/modules_libs/java.base/libjsig.so from 1 file(s) > Creating support/modules_libs/java.prefs/libprefs.so from 1 file(s) > Creating support/modules_cmds/java.base/java from 1 file(s) > Creating support/modules_cmds/java.base/keytool from 1 file(s) > Creating support/modules_libs/java.base/jexec from 1 file(s) > Creating support/modules_libs/java.base/jspawnhelper from 1 file(s) > Creating support/modules_libs/java.instrument/libinstrument.so from 12 file(s) > Creating support/modules_libs/java.desktop/libmlib_image.so from 50 file(s) > Creating support/modules_libs/java.desktop/libawt.so from 72 file(s) > Creating support/modules_libs/java.desktop/libawt_xawt.so from 51 file(s) > Creating support/modules_libs/java.desktop/liblcms.so from 27 file(s) > Creating support/modules_libs/java.desktop/libjavajpeg.so from 46 file(s) > Creating support/modules_libs/java.desktop/libawt_headless.so from 26 file(s) > Creating support/modules_libs/java.desktop/libharfbuzz.so from 53 file(s) > Creating support/modules_libs/java.desktop/libfontmanager.so from 8 file(s) > Creating support/modules_libs/java.desktop/libjawt.so from 1 file(s) > Creating support/modules_libs/java.desktop/libsplashscreen.so from 67 file(s) > Creating support/modules_libs/java.desktop/libjsound.so from 18 file(s) > Creating support/modules_libs/java.management/libmanagement.so from 9 file(s) > Creating support/modules_libs/java.rmi/librmi.so from 1 file(s) > Creating support/modules_cmds/java.rmi/rmid from 1 file(s) > Creating support/modules_cmds/java.rmi/rmiregistry from 1 file(s) > Creating support/modules_cmds/java.scripting/jrunscript from 1 file(s) > Creating support/modules_libs/java.security.jgss/libj2gss.so from 3 file(s) > Creating support/modules_libs/java.smartcardio/libj2pcsc.so from 2 file(s) > Creating support/modules_libs/jdk.attach/libattach.so from 1 file(s) > Creating support/modules_cmds/jdk.compiler/javac from 1 file(s) > Creating support/modules_cmds/jdk.compiler/serialver from 
1 file(s) > Creating support/modules_libs/jdk.crypto.cryptoki/libj2pkcs11.so from 14 file(s) > Creating support/modules_libs/jdk.hotspot.agent/libsaproc.so from 10 file(s) > Creating support/modules_cmds/jdk.hotspot.agent/jhsdb from 1 file(s) > Creating support/modules_cmds/jdk.jdeps/javap from 1 file(s) > Creating support/modules_cmds/jdk.jdeps/jdeps from 1 file(s) > Creating support/modules_cmds/jdk.jdeps/jdeprscan from 1 file(s) > Creating support/modules_cmds/jdk.jlink/jimage from 1 file(s) > Creating support/modules_cmds/jdk.jlink/jlink from 1 file(s) > Creating support/modules_cmds/jdk.jlink/jmod from 1 file(s) > Creating jdk/modules/jdk.incubator.jpackage/jdk/incubator/jpackage/internal/resources/jpackageapplauncher from 15 file(s) > Creating support/modules_cmds/jdk.incubator.jpackage/jpackage from 1 file(s) > Creating support/modules_cmds/jdk.jartool/jar from 1 file(s) > Creating support/modules_cmds/jdk.jartool/jarsigner from 1 file(s) > Creating support/modules_cmds/jdk.javadoc/javadoc from 1 file(s) > Creating support/modules_cmds/jdk.jcmd/jinfo from 1 file(s) > Creating support/modules_cmds/jdk.jcmd/jmap from 1 file(s) > Creating support/modules_cmds/jdk.jcmd/jps from 1 file(s) > Creating support/modules_cmds/jdk.jcmd/jstack from 1 file(s) > Creating support/modules_cmds/jdk.jcmd/jstat from 1 file(s) > Creating support/modules_cmds/jdk.jcmd/jcmd from 1 file(s) > Creating support/modules_libs/jdk.management/libmanagement_ext.so from 8 file(s) > Creating support/modules_libs/jdk.management.agent/libmanagement_agent.so from 1 file(s) > Creating support/modules_cmds/jdk.jconsole/jconsole from 1 file(s) > Creating support/modules_libs/jdk.jdwp.agent/libdt_socket.so from 2 file(s) > Creating support/modules_libs/jdk.jdwp.agent/libjdwp.so from 43 file(s) > Creating support/modules_cmds/jdk.jdi/jdb from 1 file(s) > Creating support/modules_cmds/jdk.jfr/jfr from 1 file(s) > Creating support/modules_cmds/jdk.jshell/jshell from 1 file(s) > Creating 
support/modules_cmds/jdk.jstatd/jstatd from 1 file(s) > Creating support/modules_libs/jdk.net/libextnet.so from 1 file(s) > Creating support/modules_libs/jdk.sctp/libsctp.so from 2 file(s) > Creating support/modules_libs/jdk.security.auth/libjaas.so from 1 file(s) > Updating images/sec-bin.zip > Compiling 4 files for BUILD_JIGSAW_TOOLS > Optimizing the exploded image > Creating java.datatransfer.jmod > Creating java.compiler.jmod > Creating java.desktop.jmod > Creating java.instrument.jmod > Creating java.logging.jmod > Creating java.management.jmod > Creating java.management.rmi.jmod > Creating java.naming.jmod > Creating java.net.http.jmod > Creating java.prefs.jmod > Creating java.rmi.jmod > Creating java.scripting.jmod > Creating java.se.jmod > Creating java.security.jgss.jmod > Creating java.security.sasl.jmod > Creating java.smartcardio.jmod > Creating java.sql.jmod > Creating java.sql.rowset.jmod > Creating java.transaction.xa.jmod > Creating java.xml.jmod > Creating java.xml.crypto.jmod > Creating jdk.accessibility.jmod > Creating jdk.attach.jmod > Creating jdk.charsets.jmod > Creating jdk.compiler.jmod > Creating jdk.crypto.cryptoki.jmod > Creating jdk.crypto.ec.jmod > Creating jdk.dynalink.jmod > Creating jdk.editpad.jmod > Creating jdk.httpserver.jmod > Creating jdk.hotspot.agent.jmod > Creating jdk.incubator.foreign.jmod > Creating jdk.incubator.jpackage.jmod > Creating jdk.internal.ed.jmod > Creating jdk.internal.jvmstat.jmod > Creating jdk.internal.le.jmod > Creating jdk.internal.opt.jmod > Creating jdk.jartool.jmod > Creating jdk.javadoc.jmod > Creating jdk.jcmd.jmod > Creating jdk.jconsole.jmod > Creating jdk.jdeps.jmod > Creating jdk.jdi.jmod > Creating jdk.jdwp.agent.jmod > Creating jdk.jfr.jmod > Creating interim java.base.jmod > Creating interim java.logging.jmod > Creating jdk.jshell.jmod > Creating jdk.jsobject.jmod > Creating jdk.jstatd.jmod > Creating jdk.localedata.jmod > Creating jdk.management.jmod > Creating jdk.management.agent.jmod > 
Creating jdk.management.jfr.jmod > Creating jdk.naming.dns.jmod > Creating jdk.naming.rmi.jmod > Creating jdk.net.jmod > Creating jdk.nio.mapmode.jmod > Creating jdk.security.auth.jmod > Creating jdk.sctp.jmod > Creating jdk.security.jgss.jmod > Creating jdk.unsupported.jmod > Creating jdk.unsupported.desktop.jmod > Creating jdk.xml.dom.jmod > Creating jdk.zipfs.jmod > Creating interim jimage > Compiling 3 files for BUILD_DEMO_CodePointIM > Updating support/demos/image/jfc/CodePointIM/src.zip > Compiling 3 files for BUILD_DEMO_FileChooserDemo > Updating support/demos/image/jfc/FileChooserDemo/src.zip > Compiling 29 files for BUILD_DEMO_SwingSet2 > Updating support/demos/image/jfc/SwingSet2/src.zip > Compiling 3 files for BUILD_DEMO_Font2DTest > Updating support/demos/image/jfc/Font2DTest/src.zip > Compiling 64 files for BUILD_DEMO_J2Ddemo > Updating support/demos/image/jfc/J2Ddemo/src.zip > Compiling 15 files for BUILD_DEMO_Metalworks > Compiling 1 files for CLASSLIST_JAR > Updating support/demos/image/jfc/Metalworks/src.zip > Creating support/classlist.jar > Compiling 2 files for BUILD_DEMO_Notepad > Updating support/demos/image/jfc/Notepad/src.zip > Compiling 5 files for BUILD_DEMO_Stylepad > Updating support/demos/image/jfc/Stylepad/src.zip > Compiling 5 files for BUILD_DEMO_SampleTree > Updating support/demos/image/jfc/SampleTree/src.zip > Compiling 8 files for BUILD_DEMO_TableExample > Updating support/demos/image/jfc/TableExample/src.zip > /bin/bash: line 14: 20847 Aborted (core dumped) /workspace/build/linux-arm-server-release/support/interim-image/bin/java -XX:DumpLoadedClassList=/workspace/build/linux-arm-server-release/support/link_opt/classlist.raw.2 -XX:SharedClassListFile=/workspace/build/linux-arm-server-release/support/link_opt/classlist.interim -XX:SharedArchiveFile=/workspace/build/linux-arm-server-release/support/link_opt/classlist.jsa -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true -Duser.language=en -Duser.country=US --module-path 
/workspace/build/linux-arm-server-release/support/classlist.jar -cp /workspace/build/linux-arm-server-release/support/classlist.jar build.tools.classlist.HelloClasslist 2> /workspace/build/linux-arm-server-release/support/link_opt/stderr > /workspace/build/linux-arm-server-release/support/link_opt/default_jli_trace.txt > ERROR: Failed to generate link optimization data. This is likely a problem with the newly built JVM/JDK. > make[3]: *** [/workspace/build/linux-arm-server-release/support/link_opt/classlist] Error 134 > make[2]: *** [generate-link-opt-data] Error 2 > make[2]: *** Waiting for unfinished jobs.... > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGSEGV (0xb) at pc=0x00000000, pid=20847, tid=20848 > # > # JRE version: (16.0) (build ) > # Java VM: OpenJDK Server VM (16-internal+0-adhoc..workspace, mixed mode, sharing, g1 gc, linux-arm) > # Problematic frame: > # C 0x00000000 > # > # Core dump will be written. Default location: /workspace/make/core > # > # An error report file with more information is saved as: > # /workspace/make/hs_err_pid20847.log > # > # > GenerateLinkOptData.gmk:65: recipe for target '/workspace/build/linux-arm-server-release/support/link_opt/classlist' failed > make/Main.gmk:575: recipe for target 'generate-link-opt-data' failed > Compiling 1 files for BUILD_DEMO_TransparentRuler > Updating support/demos/image/jfc/TransparentRuler/src.zip > Creating support/demos/image/jfc/CodePointIM/CodePointIM.jar > Creating support/demos/image/jfc/FileChooserDemo/FileChooserDemo.jar > Creating support/demos/image/jfc/SwingSet2/SwingSet2.jar > Creating support/demos/image/jfc/Font2DTest/Font2DTest.jar > Creating support/demos/image/jfc/J2Ddemo/J2Ddemo.jar > Creating support/demos/image/jfc/Metalworks/Metalworks.jar > Creating support/demos/image/jfc/Notepad/Notepad.jar > Creating support/demos/image/jfc/Stylepad/Stylepad.jar > Creating support/demos/image/jfc/SampleTree/SampleTree.jar > Creating 
support/demos/image/jfc/TableExample/TableExample.jar
> Creating support/demos/image/jfc/TransparentRuler/TransparentRuler.jar
>
> ERROR: Build failed for target 'images' in configuration 'linux-arm-server-release' (exit code 2)
> Stopping sjavac server
>
> === Make failed targets repeated here ===
> GenerateLinkOptData.gmk:65: recipe for target '/workspace/build/linux-arm-server-release/support/link_opt/classlist' failed
> make/Main.gmk:575: recipe for target 'generate-link-opt-data' failed
> === End of repeated output ===
>
> Hint: Try searching the build log for the name of the first failed target.
> Hint: See doc/building.html#troubleshooting for assistance.
>
> /workspace/make/Init.gmk:310: recipe for target 'main' failed
> make[1]: *** [main] Error 2
> /workspace/make/Init.gmk:186: recipe for target 'images' failed
> make: *** [images] Error 2
>
>
> === hs_err_pid20847.log ===
>
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # SIGSEGV (0xb) at pc=0x00000000, pid=20847, tid=20848
> #
> # JRE version: (16.0) (build )
> # Java VM: OpenJDK Server VM (16-internal+0-adhoc..workspace, mixed mode, sharing, g1 gc, linux-arm)
> # Problematic frame:
> # C 0x00000000
> #
> # Core dump will be written.
Default location: /workspace/make/core
> #
> #
>
> --------------- S U M M A R Y ------------
>
> Command Line: -XX:DumpLoadedClassList=/workspace/build/linux-arm-server-release/support/link_opt/classlist.raw.2 -XX:SharedClassListFile=/workspace/build/linux-arm-server-release/support/link_opt/classlist.interim -XX:SharedArchiveFile=/workspace/build/linux-arm-server-release/support/link_opt/classlist.jsa -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true -Duser.language=en -Duser.country=US --module-path=/workspace/build/linux-arm-server-release/support/classlist.jar build.tools.classlist.HelloClasslist
>
> Host: rev 3 (v7l), 4 cores, 3G, Ubuntu 18.04.3 LTS
> Time: Mon Oct 12 16:52:00 2020 UTC elapsed time: 0.158241 seconds (0d 0h 0m 0s)
>
> --------------- T H R E A D ---------------
>
> Current thread (0xb6314088): JavaThread "Unknown thread" [_thread_in_vm, id=20848, stack(0xb644c000,0xb649c000)]
>
> Stack: [0xb644c000,0xb649c000], sp=0xb649aac0, free space=314k
>
> siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x00000000
>
> Registers:
> r0 = 0x74f29720
> r1 = 0xb6314088
> r2 = 0x00000000
> r3 = 0x00000000
> r4 = 0xb6314088
> r5 = 0x74f29720
> r6 = 0x004403e0
> r7 = 0xb649aac8
> r8 = 0x00000000
> r9 = 0x00000000
> r10 = 0x00000000
> fp = 0x7529ef68
> r12 = 0xb6d2a1ac
> sp = 0xb649aac0
> lr = 0xb68e99ef
> pc = 0x00000000
> cpsr = 0x400f0010
>
> Top of Stack: (sp=0xb649aac0)
> 0xb649aac0: b6d31fb8 b64900e0 b6ceefe0 74f29720
> 0xb649aad0: 00000000 ad3ab600 00000000 00000000
> 0xb649aae0: b6d29dd4 b6314088 b649ab08 b63e6b08
> 0xb649aaf0: 00000000 0000000c 7529ef68 b69e3e65
> 0xb649ab00: 00000000 b6314088 00000000 74f29650
> 0xb649ab10: b6d29dd4 b6d48150 00000000 00000000
> 0xb649ab20: b6d29dd4 74f29720 b6314088 00000000
> 0xb649ab30: b649ab48 b63e6b08 7529ef60 0000000c
>
> Instructions: (pc=0x00000000)
> 0xffffff00:
>
>
> --------------- P R O C E S S ---------------
>
> Threads class SMR info:
> _java_thread_list=0xb6d82c70, length=0,
elements={ > } > > Java Threads: ( => current thread ) > > Other Threads: > 0xb6376f68 GCTaskThread "GC Thread#0" [stack: 0x75fd2000,0x76052000] [id=20849] > 0xb637c228 ConcurrentGCThread "G1 Main Marker" [stack: 0x75f50000,0x75fd0000] [id=20850] > 0xb637d128 ConcurrentGCThread "G1 Conc#0" [stack: 0x75d80000,0x75e00000] [id=20851] > 0xb63dd150 ConcurrentGCThread "G1 Refine#0" [stack: 0x75a80000,0x75b00000] [id=20852] > 0xb63de020 ConcurrentGCThread "G1 Service" [stack: 0x75880000,0x75900000] [id=20853] > > =>0xb6314088 (exited) JavaThread "Unknown thread" [_thread_in_vm, id=20848, stack(0xb644c000,0xb649c000)] > > Threads with active compile tasks: > > VM state: not at safepoint (not fully initialized) > > VM Mutex/Monitor currently owned by a thread: None > > GC Precious Log: > CPUs: 4 total, 4 available > Memory: 3827M > Large Page Support: Disabled > NUMA Support: Disabled > Compressed Oops: Disabled > Heap Region Size: 1M > Heap Min Capacity: 6M > Heap Initial Capacity: 60M > Heap Max Capacity: 958M > Pre-touch: Disabled > Parallel Workers: 4 > Concurrent Workers: 1 > Concurrent Refinement Workers: 4 > Periodic GC: Disabled > > Heap: > garbage-first heap total 61440K, used 0K [0x78200000, 0xb4000000) > region size 1024K, 1 young (1024K), 0 survivors (0K) > Metaspace used 0K, capacity 0K, committed 0K, reserved 4400K > > Heap Regions: E=young(eden), S=young(survivor), O=old, HS=humongous(starts), HC=humongous(continues), CS=collection set, F=free, OA=open archive, CA=closed archive, TAMS=top-at-mark-start (previous, next) > | 0|0x78200000, 0x78200000, 0x78300000| 0%| F| |TAMS 0x78200000, 0x78200000| Untracked > | 1|0x78300000, 0x78300000, 0x78400000| 0%| F| |TAMS 0x78300000, 0x78300000| Untracked > | 2|0x78400000, 0x78400000, 0x78500000| 0%| F| |TAMS 0x78400000, 0x78400000| Untracked > | 3|0x78500000, 0x78500000, 0x78600000| 0%| F| |TAMS 0x78500000, 0x78500000| Untracked > | 4|0x78600000, 0x78600000, 0x78700000| 0%| F| |TAMS 0x78600000, 0x78600000| Untracked > | 
5|0x78700000, 0x78700000, 0x78800000| 0%| F| |TAMS 0x78700000, 0x78700000| Untracked > | 6|0x78800000, 0x78800000, 0x78900000| 0%| F| |TAMS 0x78800000, 0x78800000| Untracked > | 7|0x78900000, 0x78900000, 0x78a00000| 0%| F| |TAMS 0x78900000, 0x78900000| Untracked > | 8|0x78a00000, 0x78a00000, 0x78b00000| 0%| F| |TAMS 0x78a00000, 0x78a00000| Untracked > | 9|0x78b00000, 0x78b00000, 0x78c00000| 0%| F| |TAMS 0x78b00000, 0x78b00000| Untracked > | 10|0x78c00000, 0x78c00000, 0x78d00000| 0%| F| |TAMS 0x78c00000, 0x78c00000| Untracked > | 11|0x78d00000, 0x78d00000, 0x78e00000| 0%| F| |TAMS 0x78d00000, 0x78d00000| Untracked > | 12|0x78e00000, 0x78e00000, 0x78f00000| 0%| F| |TAMS 0x78e00000, 0x78e00000| Untracked > | 13|0x78f00000, 0x78f00000, 0x79000000| 0%| F| |TAMS 0x78f00000, 0x78f00000| Untracked > | 14|0x79000000, 0x79000000, 0x79100000| 0%| F| |TAMS 0x79000000, 0x79000000| Untracked > | 15|0x79100000, 0x79100000, 0x79200000| 0%| F| |TAMS 0x79100000, 0x79100000| Untracked > | 16|0x79200000, 0x79200000, 0x79300000| 0%| F| |TAMS 0x79200000, 0x79200000| Untracked > | 17|0x79300000, 0x79300000, 0x79400000| 0%| F| |TAMS 0x79300000, 0x79300000| Untracked > | 18|0x79400000, 0x79400000, 0x79500000| 0%| F| |TAMS 0x79400000, 0x79400000| Untracked > | 19|0x79500000, 0x79500000, 0x79600000| 0%| F| |TAMS 0x79500000, 0x79500000| Untracked > | 20|0x79600000, 0x79600000, 0x79700000| 0%| F| |TAMS 0x79600000, 0x79600000| Untracked > | 21|0x79700000, 0x79700000, 0x79800000| 0%| F| |TAMS 0x79700000, 0x79700000| Untracked > | 22|0x79800000, 0x79800000, 0x79900000| 0%| F| |TAMS 0x79800000, 0x79800000| Untracked > | 23|0x79900000, 0x79900000, 0x79a00000| 0%| F| |TAMS 0x79900000, 0x79900000| Untracked > | 24|0x79a00000, 0x79a00000, 0x79b00000| 0%| F| |TAMS 0x79a00000, 0x79a00000| Untracked > | 25|0x79b00000, 0x79b00000, 0x79c00000| 0%| F| |TAMS 0x79b00000, 0x79b00000| Untracked > | 26|0x79c00000, 0x79c00000, 0x79d00000| 0%| F| |TAMS 0x79c00000, 0x79c00000| Untracked > | 27|0x79d00000, 
0x79d00000, 0x79e00000| 0%| F| |TAMS 0x79d00000, 0x79d00000| Untracked > | 28|0x79e00000, 0x79e00000, 0x79f00000| 0%| F| |TAMS 0x79e00000, 0x79e00000| Untracked > | 29|0x79f00000, 0x79f00000, 0x7a000000| 0%| F| |TAMS 0x79f00000, 0x79f00000| Untracked > | 30|0x7a000000, 0x7a000000, 0x7a100000| 0%| F| |TAMS 0x7a000000, 0x7a000000| Untracked > | 31|0x7a100000, 0x7a100000, 0x7a200000| 0%| F| |TAMS 0x7a100000, 0x7a100000| Untracked > | 32|0x7a200000, 0x7a200000, 0x7a300000| 0%| F| |TAMS 0x7a200000, 0x7a200000| Untracked > | 33|0x7a300000, 0x7a300000, 0x7a400000| 0%| F| |TAMS 0x7a300000, 0x7a300000| Untracked > | 34|0x7a400000, 0x7a400000, 0x7a500000| 0%| F| |TAMS 0x7a400000, 0x7a400000| Untracked > | 35|0x7a500000, 0x7a500000, 0x7a600000| 0%| F| |TAMS 0x7a500000, 0x7a500000| Untracked > | 36|0x7a600000, 0x7a600000, 0x7a700000| 0%| F| |TAMS 0x7a600000, 0x7a600000| Untracked > | 37|0x7a700000, 0x7a700000, 0x7a800000| 0%| F| |TAMS 0x7a700000, 0x7a700000| Untracked > | 38|0x7a800000, 0x7a800000, 0x7a900000| 0%| F| |TAMS 0x7a800000, 0x7a800000| Untracked > | 39|0x7a900000, 0x7a900000, 0x7aa00000| 0%| F| |TAMS 0x7a900000, 0x7a900000| Untracked > | 40|0x7aa00000, 0x7aa00000, 0x7ab00000| 0%| F| |TAMS 0x7aa00000, 0x7aa00000| Untracked > | 41|0x7ab00000, 0x7ab00000, 0x7ac00000| 0%| F| |TAMS 0x7ab00000, 0x7ab00000| Untracked > | 42|0x7ac00000, 0x7ac00000, 0x7ad00000| 0%| F| |TAMS 0x7ac00000, 0x7ac00000| Untracked > | 43|0x7ad00000, 0x7ad00000, 0x7ae00000| 0%| F| |TAMS 0x7ad00000, 0x7ad00000| Untracked > | 44|0x7ae00000, 0x7ae00000, 0x7af00000| 0%| F| |TAMS 0x7ae00000, 0x7ae00000| Untracked > | 45|0x7af00000, 0x7af00000, 0x7b000000| 0%| F| |TAMS 0x7af00000, 0x7af00000| Untracked > | 46|0x7b000000, 0x7b000000, 0x7b100000| 0%| F| |TAMS 0x7b000000, 0x7b000000| Untracked > | 47|0x7b100000, 0x7b100000, 0x7b200000| 0%| F| |TAMS 0x7b100000, 0x7b100000| Untracked > | 48|0x7b200000, 0x7b200000, 0x7b300000| 0%| F| |TAMS 0x7b200000, 0x7b200000| Untracked > | 49|0x7b300000, 0x7b300000, 
0x7b400000| 0%| F| |TAMS 0x7b300000, 0x7b300000| Untracked > | 50|0x7b400000, 0x7b400000, 0x7b500000| 0%| F| |TAMS 0x7b400000, 0x7b400000| Untracked > | 51|0x7b500000, 0x7b500000, 0x7b600000| 0%| F| |TAMS 0x7b500000, 0x7b500000| Untracked > | 52|0x7b600000, 0x7b600000, 0x7b700000| 0%| F| |TAMS 0x7b600000, 0x7b600000| Untracked > | 53|0x7b700000, 0x7b700000, 0x7b800000| 0%| F| |TAMS 0x7b700000, 0x7b700000| Untracked > | 54|0x7b800000, 0x7b800000, 0x7b900000| 0%| F| |TAMS 0x7b800000, 0x7b800000| Untracked > | 55|0x7b900000, 0x7b900000, 0x7ba00000| 0%| F| |TAMS 0x7b900000, 0x7b900000| Untracked > | 56|0x7ba00000, 0x7ba00000, 0x7bb00000| 0%| F| |TAMS 0x7ba00000, 0x7ba00000| Untracked > | 57|0x7bb00000, 0x7bb00000, 0x7bc00000| 0%| F| |TAMS 0x7bb00000, 0x7bb00000| Untracked > | 58|0x7bc00000, 0x7bc00000, 0x7bd00000| 0%| F| |TAMS 0x7bc00000, 0x7bc00000| Untracked > | 59|0x7bd00000, 0x7bd42908, 0x7be00000| 26%| E| |TAMS 0x7bd00000, 0x7bd00000| Complete > > Card table byte_map: [0x78021000,0x78200000] _byte_map_base: 0x77c60000 > > Marking Bits (Prev, Next): (CMBitMap*) 0xb6377de0, (CMBitMap*) 0xb6377e00 > Prev Bits: [0x76f4a000, 0x77e42000) > Next Bits: [0x76052000, 0x76f4a000) > > GC Heap History (0 events): > No events > > Deoptimization events (0 events): > No events > > Classes unloaded (0 events): > No events > > Classes redefined (0 events): > No events > > Internal exceptions (0 events): > No events > > Events (2 events): > Event: 0.005 Protecting memory [0xb644c000,0xb644f000] with protection modes 0 > Event: 0.007 Loaded shared library /workspace/build/linux-arm-server-release/support/interim-image/lib/libjava.so > > > Dynamic libraries: > 00454000-00455000 r-xp 00000000 b3:02 1685994 /workspace/build/linux-arm-server-release/support/interim-image/bin/java > 00464000-00465000 r--p 00000000 b3:02 1685994 /workspace/build/linux-arm-server-release/support/interim-image/bin/java > 00465000-00466000 rw-p 00001000 b3:02 1685994 
/workspace/build/linux-arm-server-release/support/interim-image/bin/java > 01293000-012b4000 rw-p 00000000 00:00 0 [heap] > 74ad8000-74f24000 ---p 00000000 00:00 0 > 74f27000-74f29000 rwxp 00001000 b3:02 1810901 /workspace/build/linux-arm-server-release/support/link_opt/classlist.jsa > 74f29000-75700000 rw-p 00003000 b3:02 1810901 /workspace/build/linux-arm-server-release/support/link_opt/classlist.jsa > 75700000-75721000 rw-p 00000000 00:00 0 > 75721000-75800000 ---p 00000000 00:00 0 > 7587e000-7587f000 ---p 00000000 00:00 0 > 7587f000-75900000 rw-p 00000000 00:00 0 > 75900000-75921000 rw-p 00000000 00:00 0 > 75921000-75a00000 ---p 00000000 00:00 0 > 75a23000-75a7e000 rw-p 00000000 00:00 0 > 75a7e000-75a7f000 ---p 00000000 00:00 0 > 75a7f000-75b00000 rw-p 00000000 00:00 0 > 75b00000-75b21000 rw-p 00000000 00:00 0 > 75b21000-75c00000 ---p 00000000 00:00 0 > 75c00000-75c21000 rw-p 00000000 00:00 0 > 75c21000-75d00000 ---p 00000000 00:00 0 > 75d7e000-75d7f000 ---p 00000000 00:00 0 > 75d7f000-75e00000 rw-p 00000000 00:00 0 > 75e00000-75e21000 rw-p 00000000 00:00 0 > 75e21000-75f00000 ---p 00000000 00:00 0 > 75f0d000-75f4e000 rw-p 00000000 00:00 0 > 75f4e000-75f4f000 ---p 00000000 00:00 0 > 75f4f000-75fd0000 rw-p 00000000 00:00 0 > 75fd0000-75fd1000 ---p 00000000 00:00 0 > 75fd1000-76142000 rw-p 00000000 00:00 0 > 76142000-76f4a000 ---p 00000000 00:00 0 > 76f4a000-7703a000 rw-p 00000000 00:00 0 > 7703a000-77e42000 ---p 00000000 00:00 0 > 77e42000-77e60000 rw-p 00000000 00:00 0 > 77e60000-78021000 ---p 00000000 00:00 0 > 78021000-7803f000 rw-p 00000000 00:00 0 > 7803f000-78200000 ---p 00000000 00:00 0 > 78200000-7be00000 rw-p 00000000 00:00 0 > 7be00000-b4000000 ---p 00000000 00:00 0 > b4001000-b402f000 rw-p 00000000 00:00 0 > b402f000-b41f0000 ---p 00000000 00:00 0 > b41f0000-b41f3000 rw-p 00000000 00:00 0 > b41f3000-b4230000 ---p 00000000 00:00 0 > b4230000-b43b0000 rwxp 00000000 00:00 0 > b43b0000-b6230000 ---p 00000000 00:00 0 > b6230000-b6246000 r-xp 00000000 b3:02 
659335 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjava.so > b6246000-b6255000 ---p 00016000 b3:02 659335 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjava.so > b6255000-b6256000 r--p 00015000 b3:02 659335 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjava.so > b6256000-b6257000 rw-p 00016000 b3:02 659335 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjava.so > b6257000-b625e000 r-xp 00000000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so > b625e000-b626d000 ---p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so > b626d000-b626e000 r--p 00006000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so > b626e000-b626f000 rw-p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so > b626f000-b6275000 rw-p 00000000 00:00 0 > b6275000-b6282000 r-xp 00000000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so > b6282000-b6291000 ---p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so > b6291000-b6292000 r--p 0000c000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so > b6292000-b6293000 rw-p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so > b6293000-b6295000 rw-p 00000000 00:00 0 > b6295000-b629c000 r-xp 00000000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so > b629c000-b62ab000 ---p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so > b62ab000-b62ac000 r--p 00006000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so > b62ac000-b62ad000 rw-p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so > b62ad000-b6300000 r--s 00000000 b3:02 659315 /workspace/build/linux-arm-server-release/support/interim-image/lib/modules > b6300000-b63f1000 rw-p 00000000 00:00 0 > b63f1000-b6400000 ---p 00000000 00:00 0 > b640f000-b6417000 rw-s 00000000 b3:02 2475046 /tmp/hsperfdata_root/20847 > b6417000-b641c000 r-xp 00000000 b3:02 2708511 
/lib/arm-linux-gnueabihf/libnss_compat-2.27.so > b641c000-b642b000 ---p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so > b642b000-b642c000 r--p 00004000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so > b642c000-b642d000 rw-p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so > b642d000-b643b000 r-xp 00000000 b3:02 659333 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjimage.so > b643b000-b644a000 ---p 0000e000 b3:02 659333 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjimage.so > b644a000-b644b000 r--p 0000d000 b3:02 659333 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjimage.so > b644b000-b644c000 rw-p 0000e000 b3:02 659333 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjimage.so > b644c000-b644f000 ---p 00000000 00:00 0 > b644f000-b649c000 rw-p 00000000 00:00 0 > b649c000-b650b000 r-xp 00000000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so > b650b000-b651b000 ---p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so > b651b000-b651c000 r--p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so > b651c000-b651d000 rw-p 00070000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so > b651d000-b6cdf000 r-xp 00000000 b3:02 1686001 /workspace/build/linux-arm-server-release/support/interim-image/lib/server/libjvm.so > b6cdf000-b6cee000 ---p 007c2000 b3:02 1686001 /workspace/build/linux-arm-server-release/support/interim-image/lib/server/libjvm.so > b6cee000-b6d30000 r--p 007c1000 b3:02 1686001 /workspace/build/linux-arm-server-release/support/interim-image/lib/server/libjvm.so > b6d30000-b6d47000 rw-p 00803000 b3:02 1686001 /workspace/build/linux-arm-server-release/support/interim-image/lib/server/libjvm.so > b6d47000-b6d8a000 rw-p 00000000 00:00 0 > b6d8a000-b6d9b000 r-xp 00000000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so > b6d9b000-b6dab000 ---p 00011000 b3:02 2708524 
/lib/arm-linux-gnueabihf/libpthread-2.27.so > b6dab000-b6dac000 r--p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so > b6dac000-b6dad000 rw-p 00012000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so > b6dad000-b6daf000 rw-p 00000000 00:00 0 > b6daf000-b6db1000 r-xp 00000000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so > b6db1000-b6dc0000 ---p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so > b6dc0000-b6dc1000 r--p 00001000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so > b6dc1000-b6dc2000 rw-p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so > b6dc2000-b6ddb000 r-xp 00000000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 > b6ddb000-b6dea000 ---p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 > b6dea000-b6deb000 r--p 00018000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 > b6deb000-b6dec000 rw-p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 > b6dec000-b6ece000 r-xp 00000000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so > b6ece000-b6ede000 ---p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so > b6ede000-b6ee0000 r--p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so > b6ee0000-b6ee1000 rw-p 000e4000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so > b6ee1000-b6ee4000 rw-p 00000000 00:00 0 > b6ee4000-b6eee000 r-xp 00000000 b3:02 659325 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjli.so > b6eee000-b6efe000 ---p 0000a000 b3:02 659325 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjli.so > b6efe000-b6eff000 r--p 0000a000 b3:02 659325 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjli.so > b6eff000-b6f00000 rw-p 0000b000 b3:02 659325 /workspace/build/linux-arm-server-release/support/interim-image/lib/libjli.so > b6f00000-b6f18000 r-xp 00000000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so > b6f1f000-b6f21000 rw-p 00000000 00:00 0 > 
b6f23000-b6f24000 ---p 00000000 00:00 0 > b6f24000-b6f25000 r--p 00000000 00:00 0 > b6f25000-b6f26000 ---p 00000000 00:00 0 > b6f26000-b6f28000 rw-p 00000000 00:00 0 > b6f28000-b6f29000 r--p 00018000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so > b6f29000-b6f2a000 rw-p 00019000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so > beefd000-bef1e000 rw-p 00000000 00:00 0 [stack] > befd8000-befd9000 r-xp 00000000 00:00 0 [sigpage] > befd9000-befda000 r--p 00000000 00:00 0 [vvar] > befda000-befdb000 r-xp 00000000 00:00 0 [vdso] > ffff0000-ffff1000 r-xp 00000000 00:00 0 [vectors] > > > VM Arguments: > jvm_args: -XX:DumpLoadedClassList=/workspace/build/linux-arm-server-release/support/link_opt/classlist.raw.2 -XX:SharedClassListFile=/workspace/build/linux-arm-server-release/support/link_opt/classlist.interim -XX:SharedArchiveFile=/workspace/build/linux-arm-server-release/support/link_opt/classlist.jsa -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true -Duser.language=en -Duser.country=US --module-path=/workspace/build/linux-arm-server-release/support/classlist.jar > java_command: build.tools.classlist.HelloClasslist > java_class_path (initial): /workspace/build/linux-arm-server-release/support/classlist.jar > Launcher Type: SUN_STANDARD > > [Global flags] > uint ConcGCThreads = 1 {product} {ergonomic} > ccstr DumpLoadedClassList = /workspace/build/linux-arm-server-release/support/link_opt/classlist.raw.2 {product} {command line} > uint G1ConcRefinementThreads = 4 {product} {ergonomic} > size_t G1HeapRegionSize = 1048576 {product} {ergonomic} > uintx GCDrainStackTargetSize = 64 {product} {ergonomic} > size_t InitialHeapSize = 62914560 {product} {ergonomic} > size_t MarkStackSize = 32768 {product} {ergonomic} > size_t MaxHeapSize = 1004535808 {product} {ergonomic} > size_t MaxNewSize = 601882624 {product} {ergonomic} > size_t MinHeapDeltaBytes = 1048576 {product} {ergonomic} > size_t MinHeapSize = 6291456 {product} {ergonomic} > uintx NonProfiledCodeHeapSize = 0 {pd 
product} {ergonomic} > uintx ProfiledCodeHeapSize = 0 {pd product} {ergonomic} > ccstr SharedArchiveFile = /workspace/build/linux-arm-server-release/support/link_opt/classlist.jsa {product} {command line} > ccstr SharedClassListFile = /workspace/build/linux-arm-server-release/support/link_opt/classlist.interim {product} {command line} > size_t SoftMaxHeapSize = 1004535808 {manageable} {ergonomic} > bool UseG1GC = true {product} {ergonomic} > > Logging: > Log output configuration: > #0: stdout all=warning uptime,level,tags > #1: stderr all=off uptime,level,tags > > Environment Variables: > JAVA_HOME=/opt/java/openjdk > PATH=/opt/java/openjdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin > LC_ALL=C > > Signal Handlers: > SIGSEGV: [libjvm.so+0x71d0e1], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO > SIGBUS: [libjvm.so+0x71d0e1], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO > SIGFPE: [libjvm.so+0x71d0e1], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO > SIGPIPE: [libjvm.so+0x6549d1], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO > SIGXFSZ: [libjvm.so+0x6549d1], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO > SIGILL: [libjvm.so+0x71d0e1], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO > SIGUSR2: [libjvm.so+0x6548c1], sa_mask[0]=00000000000000000000000000000000, sa_flags=SA_RESTART|SA_SIGINFO > SIGHUP: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none > SIGINT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none > SIGTERM: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none > SIGQUIT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none > > > --------------- S Y S T E M --------------- > > OS: > DISTRIB_ID=Ubuntu > DISTRIB_RELEASE=18.04 > DISTRIB_CODENAME=bionic > DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS" > uname: 
Linux 5.4.51-v7l+ #1333 SMP Mon Aug 10 16:51:40 BST 2020 armv7l > OS uptime: 14 days 4:29 hours > libc: glibc 2.27 NPTL 2.27 > rlimit (soft/hard): STACK 8192k/infinity , CORE infinity/infinity , NPROC infinity/infinity , NOFILE 1048576/1048576 , AS infinity/infinity , CPU infinity/infinity , DATA infinity/infinity , FSIZE infinity/infinity , MEMLOCK 64k/64k > load average: 5.03 3.88 3.74 > > /proc/meminfo: > MemTotal: 3919812 kB > MemFree: 1065720 kB > MemAvailable: 3390780 kB > Buffers: 131112 kB > Cached: 2181160 kB > SwapCached: 0 kB > Active: 1343692 kB > Inactive: 1274264 kB > Active(anon): 248060 kB > Inactive(anon): 69168 kB > Active(file): 1095632 kB > Inactive(file): 1205096 kB > Unevictable: 16 kB > Mlocked: 16 kB > HighTotal: 3264512 kB > HighFree: 854016 kB > LowTotal: 655300 kB > LowFree: 211704 kB > SwapTotal: 102396 kB > SwapFree: 102396 kB > Dirty: 29260 kB > Writeback: 0 kB > AnonPages: 305752 kB > Mapped: 120900 kB > Shmem: 16892 kB > KReclaimable: 186028 kB > Slab: 209376 kB > SReclaimable: 186028 kB > SUnreclaim: 23348 kB > KernelStack: 2608 kB > PageTables: 3244 kB > NFS_Unstable: 0 kB > Bounce: 0 kB > WritebackTmp: 0 kB > CommitLimit: 2062300 kB > Committed_AS: 1266456 kB > VmallocTotal: 245760 kB > VmallocUsed: 5520 kB > VmallocChunk: 0 kB > Percpu: 512 kB > CmaTotal: 262144 kB > CmaFree: 170068 kB > > /sys/kernel/mm/transparent_hugepage/enabled: > /sys/kernel/mm/transparent_hugepage/defrag (defrag/compaction efforts parameter): > > Process Memory: > Virtual Size: 1084744K (peak: 1084996K) > Resident Set Size: 15988K (peak: 16088K) (anon: 7936K, file: 8052K, shmem: 0K) > Swapped out: 0K > C-Heap outstanding allocations: 994K > > /proc/sys/kernel/threads-max (system-wide limit on the number of threads): 57119 > /proc/sys/vm/max_map_count (maximum number of memory map areas a process may have): 65530 > /proc/sys/kernel/pid_max (system-wide limit on number of process identifiers): 32768 > > Steal ticks since vm start: 0 > Steal ticks percentage 
since vm start: 0.000 > > CPU: total 4 (initial active 4) (ARMv7), vfp, vfp3-32, simd, mp_ext > /proc/cpuinfo: > processor : 0 > model name : ARMv7 Processor rev 3 (v7l) > BogoMIPS : 270.00 > Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 > CPU implementer : 0x41 > CPU architecture: 7 > CPU variant : 0x0 > CPU part : 0xd08 > CPU revision : 3 > > processor : 1 > model name : ARMv7 Processor rev 3 (v7l) > BogoMIPS : 270.00 > Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 > CPU implementer : 0x41 > CPU architecture: 7 > CPU variant : 0x0 > CPU part : 0xd08 > CPU revision : 3 > > processor : 2 > model name : ARMv7 Processor rev 3 (v7l) > BogoMIPS : 270.00 > Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 > CPU implementer : 0x41 > CPU architecture: 7 > CPU variant : 0x0 > CPU part : 0xd08 > CPU revision : 3 > > processor : 3 > model name : ARMv7 Processor rev 3 (v7l) > BogoMIPS : 270.00 > Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 > CPU implementer : 0x41 > CPU architecture: 7 > CPU variant : 0x0 > CPU part : 0xd08 > CPU revision : 3 > > Hardware : BCM2711 > Revision : c03111 > Serial : 100000001c47254f > Model : Raspberry Pi 4 Model B Rev 1.1 > > Online cpus: 0-3 > Offline cpus: > > Memory: 4k page, physical 3919812k(1065720k free), swap 102396k(102396k free) > > vm_info: OpenJDK Server VM (16-internal+0-adhoc..workspace) for linux-arm JRE (16-internal+0-adhoc..workspace), built on Oct 12 2020 16:24:13 by "" with gcc 7.5.0 > > END. 
From mcimadamore at openjdk.java.net Mon Oct 12 20:18:50 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Mon, 12 Oct 2020 20:18:50 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v8] In-Reply-To: References: Message-ID: > This patch contains the changes associated with the third incubation round of the foreign memory access API > (see JEP 393 [1]). This iteration focuses on improving the usability of the API in 3 main ways: > * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from > multiple threads > * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee > that the memory will be deallocated, eventually > * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class > has been added, which defines several useful dereference routines; these are really just thin wrappers around memory > access var handles, but they make the barrier to entry for using this API somewhat lower. > > A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not > the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link > to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit > of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which > wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as > dereferencing memory is concerned. You cannot dereference memory if you don't have a segment.
This improves usability > in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; > secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can > use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done > by calling `MemoryAddress::asSegmentRestricted`). A list of the API, implementation and test changes is provided > below. If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be > happy to point at existing discussions, and/or to provide the feedback required. A big thanks to Erik Osterlund, > Vladimir Ivanov and David Holmes, without whom the work on shared memory segments would not have been possible; also I'd > like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. Thanks Maurizio > Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html > Specdiff: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254163 > > > > ### API Changes > > * `MemorySegment` > * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) > * added a no-arg factory for a native restricted segment representing entire native heap > * rename `withOwnerThread` to `handoff` > * add new `share` method, to create shared segments > * add new `registerCleaner` method, to register a segment against a cleaner > * add more helpers to create arrays from a segment e.g.
`toIntArray` > * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) > * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) > * `MemoryAddress` > * drop `segment` accessor > * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative > to a given segment > * `MemoryAccess` > * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a > carrier is one of the usual suspects (a Java primitive, minus `boolean`); the access mode can be simple (e.g. access > base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte > offset, or a high-level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. > `getByteAtOffset` vs. `getByteAtIndex`). > * `MemoryHandles` > * drop `withOffset` combinator > * drop `withStride` combinator > * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which > it is easy to derive all the other handles using plain var handle combinators. > * `Addressable` > * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both > `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2], to add more implementations. Clients > can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. > * `MemoryLayouts` > * A new layout, for machine addresses, has been added to the mix. > > > > ### Implementation changes > > There are two main things to discuss here: support for shared segments, and the general simplification of the memory > access var handle support. > #### Shared segments > > The support for shared segments cuts in pretty deep in the VM.
Support for shared segments is notoriously hard to > achieve, at least in a way that guarantees optimal access performance. This is caused by the fact that, if a segment > is shared, it would be possible for a thread to close it while another is accessing it. After considering several > options (see [3]), we zeroed in on an approach which is inspired by a happy idea that Andrew Haley had (and that he > reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world > (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently with a > close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and > the access itself (otherwise it would be possible for the accessing thread to stop right before an unsafe call). > It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. Sadly, none of > these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, > we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check > whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of > stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). The question is, then, > once we detect that a thread is accessing the very segment we're about to close, what should happen? We first > experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it > fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the > machinery for async exceptions is a bit fragile (e.g.
not all the code in hotspot checks for async exceptions); to > minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread > is accessing the segment being closed. As written in the javadoc, this doesn't mean that clients should just catch and > try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should > be treated as such. In terms of gritty implementation, we needed to centralize memory access routines in a single > place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in > addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` > annotation, which tells the VM that something important is going on. To achieve this, we created a new (autogenerated) > class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, > like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is > tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during > access (which is important when registering segments against cleaners). Of course, to make memory access safe, memory > access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead > of unsafe, so that a liveness check can be triggered (in case a scope is present). `ScopedMemoryAccess` has a > `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed > successfully. 
`MemoryScope` (now significantly simplified from what we had before) has two > implementations, one for confined segments and one for shared segments; the main difference between the two is what > happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared > segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or > `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` > state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. #### > Memory access var handles overhaul The key realization here was that if all memory access var handles took a > coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle > form. This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var > handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that > e.g. additional offset is injected into a base memory access var handle. This also helped in simplifying the > implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level > access to the innards of the memory access var handle. All that code is now gone. #### Test changes Not much to see > here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, > since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` > functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the > microbenchmarks also needed some tweaks - and some of them were updated to also test performance in the shared > segment case.
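The shared-segment close protocol described above (`ALIVE` to `CLOSING`, then either `CLOSED` or back to `ALIVE`) can be sketched roughly as follows. This is only an illustrative sketch, not the JDK's `MemoryScope` code: the class name `SharedScopeSketch`, the method `checkValidState`, and the `BooleanSupplier` standing in for the thread-local handshake are all assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of the state transitions described in the text;
// names and structure are illustrative and not taken from the JDK sources.
final class SharedScopeSketch {
    static final int ALIVE = 0, CLOSING = 1, CLOSED = 2;

    private final AtomicInteger state = new AtomicInteger(ALIVE);

    // Stand-in (assumption) for the thread-local handshake: returns true when
    // no other thread was found in the middle of a scoped access.
    private final BooleanSupplier handshake;

    SharedScopeSketch(BooleanSupplier handshake) {
        this.handshake = handshake;
    }

    // As described in the text, the scope still reports alive while CLOSING.
    boolean isAlive() {
        return state.get() != CLOSED;
    }

    // Liveness check run before each memory access; fails unless ALIVE,
    // so it also fails during the CLOSING window.
    void checkValidState() {
        if (state.get() != ALIVE) {
            throw new IllegalStateException("Already closed");
        }
    }

    void close() {
        // Only one closer may move the scope from ALIVE to CLOSING.
        if (!state.compareAndSet(ALIVE, CLOSING)) {
            throw new IllegalStateException("Already closed or closing");
        }
        boolean success = handshake.getAsBoolean();
        // Settle on CLOSED if the handshake succeeded, otherwise roll back.
        state.set(success ? CLOSED : ALIVE);
        if (!success) {
            throw new IllegalStateException("Segment is being accessed by another thread");
        }
    }
}
```

With a handshake that reports success, `close()` leaves the scope `CLOSED`; with one that reports a concurrent access, `close()` throws and the scope returns to `ALIVE`, matching the "close might fail" strategy in the description.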
[1] - https://openjdk.java.net/jeps/393 [2] - https://openjdk.java.net/jeps/389 [3] - > https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html [4] - https://openjdk.java.net/jeps/312 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Simplify example in the toplevel javadoc ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/548/files - new: https://git.openjdk.java.net/jdk/pull/548/files/75e406c0..d14d06a4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=06-07 Stats: 14 lines in 1 file changed: 4 ins; 5 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/548.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548 PR: https://git.openjdk.java.net/jdk/pull/548 From hoffmann at mountainminds.com Mon Oct 12 20:34:04 2020 From: hoffmann at mountainminds.com (Marc Hoffmann) Date: Mon, 12 Oct 2020 22:34:04 +0200 Subject: arm32 builds continue to fail for me after 8253540 and 8253901 In-Reply-To: <56ff08d5-a4e5-788a-1c29-02f76e8755d2@redhat.com> References: <56ff08d5-a4e5-788a-1c29-02f76e8755d2@redhat.com> Message-ID: <17F91692-4F3D-4FAA-AB94-361B6C84F982@mountainminds.com> Hi Aleksey, hi Boris, for me the crash is always reproducible: Every single build after 77a0f3999afa322b64643afd4a161164440af975 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function fails on arm32 (built on Ubuntu in Docker on a Raspberry Pi 4). Before this commit I haven't encountered any failures. Here is the hs_err file with --enable-debug (reproduced with current master c7f00640627eab38b77d23d07876cf0247fa18f3).
Cheers, -marc # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (/workspace/src/hotspot/share/asm/register.hpp:160), pid=14700, tid=14705 # assert(a != b && a != c && a != d && a != e && b != c && b != d && b != e && c != d && c != e && d != e) failed: registers must be different: a=0x00000002, b=0x00000003, c=0x00000000, d=0x00000000, e=0x0000000c # # JRE version: (16.0) (fastdebug build ) # Java VM: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace, mixed mode, g1 gc, linux-arm) # Problematic frame: # V [libjvm.so+0x7571fc] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63 # # Core dump will be written. Default location: /workspace/make/core # # --------------- S U M M A R Y ------------ Command Line: -Xms64M -Xmx768M --add-exports=java.base/jdk.internal.module=ALL-UNNAMED build.tools.jigsaw.AddPackagesAttribute /workspace/build/linux-arm-server-fastdebug/jdk Host: 20431585315d, rev 3 (v7l), 4 cores, 3G, Ubuntu 18.04.3 LTS Time: Mon Oct 12 20:22:15 2020 UTC elapsed time: 0.144243 seconds (0d 0h 0m 0s) --------------- T H R E A D --------------- Current thread (0xb5b16460): JavaThread "Unknown thread" [_thread_in_vm, id=14705, stack(0xb5c6e000,0xb5cbe000)] Stack: [0xb5c6e000,0xb5cbe000], sp=0xb5cbc2d0, free space=312k Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x7571fc] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63 Registers: r0 = 0x00000003 r1 = 0x000000a0 r2 = 0x00000002 r3 = 0x00000000 r4 = 0xb5b168b0 r5 = 0x0000000c r6 = 0x00000000 r7 = 0xb5cbc2e8 r8 = 0xb6db1fa8 r9 = 0xb5cbc760 r10 = 0xe3520000 fp = 0xb6db1fa8 r12 = 0xb6ff8000 sp = 0xb5cbc2d0 lr = 0x00000058 pc = 0xb64961fc cpsr = 0x200f0030 Top of Stack: (sp=0xb5cbc2d0) 0xb5cbc2d0: 00000002 00000003 00000000 00000000 0xb5cbc2e0: 0000000c 00000048 0000006e 00000000 0xb5cbc2f0: 00000000 00000000 00000000 0000007c 0xb5cbc300: 
00000000 00000077 b5cbc378 00000000 0xb5cbc310: b5cbc380 0000000f b6db1fa8 b5cbc340 0xb5cbc320: b5b168b0 b6db1fa8 b5cbc4c4 b6008a2b 0xb5cbc330: b5cbc454 b5cbc348 0000000f b61971cf 0xb5cbc340: b5cbc380 b5cbc3b0 b5b168b0 b5cbc388 Instructions: (pc=0xb64961fc) 0xb64960fc: 440be9c7 a034f8c7 f5e06139 68ebf7ed 0xb649610c: f040689a 46184164 1180f441 68996011 0xb649611c: f5e03104 4b12f781 46284622 1003f858 0xb649612c: f998f2e4 f64f68eb f2ce7210 6899122f 0xb649613c: 600a4618 31046899 f76ef5e0 46284631 0xb649614c: f4e6f69f f5e04630 f507f79b 46bd7786 0xb649615c: 8ff0e8bd 0091bfd0 00006ee4 000059d0 0xb649616c: 00007d1c 4a084b07 b480447b 589baf00 0xb649617c: b91b781b f85d46bd 47707b04 f85d46bd 0xb649618c: e7197b04 0091be30 00006a24 bf182900 0xb649619c: f1a1290c e92d0202 b0f34ff0 2301bf08 0xb64961ac: bf18af06 f8df2300 2a018268 461abf8c 0xb64961bc: 0201f043 f04f460e 44f833ff 30d0f8c7 0xb64961cc: f8c74604 23003140 333de9c7 30fcf887 0xb64961dc: 3359e9c7 316cf887 4a8eb1da 0e58f04f 0xb64961ec: 250c2003 1002f858 f8d12202 21a0c000 0xb64961fc: e000f88c 2000e9cd 6302e9cd 4b874a86 0xb649620c: 447a4887 9504447b f58f4478 f00cfaa3 0xb649621c: 68e2f2d5 4340f44f 33a0f2ce 0a04f04f 0xb649622c: 7980f04f 0b00f04f 46106891 600b2501 0xb649623c: 44516891 f6f0f5e0 4a7b497a 3001f858 0xb649624c: 0140f107 a048f8c7 4608643e f8c7681b 0xb649625c: 60fb904c f858647b f8c72002 f8c7b054 0xb649626c: 60bab058 e9c73208 653abb19 66fd607a 0xb649627c: f732f5e0 689a68e3 4164f040 f4414618 0xb649628c: 60111181 44516899 f6c6f5e0 0110f107 0xb649629c: 68fb687a f8c74608 623aa018 6304e9c7 0xb64962ac: 901cf8c7 bb09e9c7 bb0de9c7 f5e063fd 0xb64962bc: 68e3f713 f0406899 46184264 4240f442 0xb64962cc: 6899600a f1074451 f5e00ad0 4b57f6a5 0xb64962dc: 3003f858 2b00781b 8093f040 f04f68bb 0xb64962ec: 68f97c80 33082500 c0acf8c7 0b01f04f --------------- P R O C E S S --------------- uid : 0 euid : 0 gid : 0 egid : 0 umask: 0022 (----w--w-) Threads class SMR info: _java_thread_list=0xb6e56078, length=0, elements={ } 
_java_thread_list_alloc_cnt=1, _java_thread_list_free_cnt=0, _java_thread_list_max=0, _nested_thread_list_max=0 _delete_lock_wait_cnt=0, _delete_lock_wait_max=0 _to_delete_list_cnt=0, _to_delete_list_max=0 Java Threads: ( => current thread ) Other Threads: 0xb5b73188 GCTaskThread "GC Thread#0" [stack: 0x81d00000,0x81d80000] [id=14706] 0xb5b77dc0 ConcurrentGCThread "G1 Main Marker" [stack: 0x81c7e000,0x81cfe000] [id=14707] 0xb5b790c0 ConcurrentGCThread "G1 Conc#0" [stack: 0x81a80000,0x81b00000] [id=14708] 0xb5bde230 ConcurrentGCThread "G1 Refine#0" [stack: 0x81780000,0x81800000] [id=14709] 0xb5bdf488 ConcurrentGCThread "G1 Service" [stack: 0x81580000,0x81600000] [id=14710] =>0xb5b16460 (exited) JavaThread "Unknown thread" [_thread_in_vm, id=14705, stack(0xb5c6e000,0xb5cbe000)] Threads with active compile tasks: VM state: not at safepoint (not fully initialized) VM Mutex/Monitor currently owned by a thread: None GC Precious Log: CPUs: 4 total, 4 available Memory: 3827M Large Page Support: Disabled NUMA Support: Disabled Compressed Oops: Disabled Heap Region Size: 1M Heap Min Capacity: 64M Heap Initial Capacity: 64M Heap Max Capacity: 768M Pre-touch: Disabled Parallel Workers: 4 Concurrent Workers: 1 Concurrent Refinement Workers: 4 Periodic GC: Disabled Heap: garbage-first heap total 65536K, used 0K [0x83a00000, 0xb3a00000) region size 1024K, 1 young (1024K), 0 survivors (0K) Metaspace used 944K, capacity 2200K, committed 2200K, reserved 4400K Heap Regions: E=young(eden), S=young(survivor), O=old, HS=humongous(starts), HC=humongous(continues), CS=collection set, F=free, OA=open archive, CA=closed archive, TAMS=top-at-mark-start (previous, next) | 0|0x83a00000, 0x83a00000, 0x83b00000| 0%| F| |TAMS 0x83a00000, 0x83a00000| Untracked | 1|0x83b00000, 0x83b00000, 0x83c00000| 0%| F| |TAMS 0x83b00000, 0x83b00000| Untracked | 2|0x83c00000, 0x83c00000, 0x83d00000| 0%| F| |TAMS 0x83c00000, 0x83c00000| Untracked | 3|0x83d00000, 0x83d00000, 0x83e00000| 0%| F| |TAMS 0x83d00000, 
0x83d00000| Untracked | 4|0x83e00000, 0x83e00000, 0x83f00000| 0%| F| |TAMS 0x83e00000, 0x83e00000| Untracked | 5|0x83f00000, 0x83f00000, 0x84000000| 0%| F| |TAMS 0x83f00000, 0x83f00000| Untracked | 6|0x84000000, 0x84000000, 0x84100000| 0%| F| |TAMS 0x84000000, 0x84000000| Untracked | 7|0x84100000, 0x84100000, 0x84200000| 0%| F| |TAMS 0x84100000, 0x84100000| Untracked | 8|0x84200000, 0x84200000, 0x84300000| 0%| F| |TAMS 0x84200000, 0x84200000| Untracked | 9|0x84300000, 0x84300000, 0x84400000| 0%| F| |TAMS 0x84300000, 0x84300000| Untracked | 10|0x84400000, 0x84400000, 0x84500000| 0%| F| |TAMS 0x84400000, 0x84400000| Untracked | 11|0x84500000, 0x84500000, 0x84600000| 0%| F| |TAMS 0x84500000, 0x84500000| Untracked | 12|0x84600000, 0x84600000, 0x84700000| 0%| F| |TAMS 0x84600000, 0x84600000| Untracked | 13|0x84700000, 0x84700000, 0x84800000| 0%| F| |TAMS 0x84700000, 0x84700000| Untracked | 14|0x84800000, 0x84800000, 0x84900000| 0%| F| |TAMS 0x84800000, 0x84800000| Untracked | 15|0x84900000, 0x84900000, 0x84a00000| 0%| F| |TAMS 0x84900000, 0x84900000| Untracked | 16|0x84a00000, 0x84a00000, 0x84b00000| 0%| F| |TAMS 0x84a00000, 0x84a00000| Untracked | 17|0x84b00000, 0x84b00000, 0x84c00000| 0%| F| |TAMS 0x84b00000, 0x84b00000| Untracked | 18|0x84c00000, 0x84c00000, 0x84d00000| 0%| F| |TAMS 0x84c00000, 0x84c00000| Untracked | 19|0x84d00000, 0x84d00000, 0x84e00000| 0%| F| |TAMS 0x84d00000, 0x84d00000| Untracked | 20|0x84e00000, 0x84e00000, 0x84f00000| 0%| F| |TAMS 0x84e00000, 0x84e00000| Untracked | 21|0x84f00000, 0x84f00000, 0x85000000| 0%| F| |TAMS 0x84f00000, 0x84f00000| Untracked | 22|0x85000000, 0x85000000, 0x85100000| 0%| F| |TAMS 0x85000000, 0x85000000| Untracked | 23|0x85100000, 0x85100000, 0x85200000| 0%| F| |TAMS 0x85100000, 0x85100000| Untracked | 24|0x85200000, 0x85200000, 0x85300000| 0%| F| |TAMS 0x85200000, 0x85200000| Untracked | 25|0x85300000, 0x85300000, 0x85400000| 0%| F| |TAMS 0x85300000, 0x85300000| Untracked | 26|0x85400000, 0x85400000, 0x85500000| 0%| F| 
|TAMS 0x85400000, 0x85400000| Untracked | 27|0x85500000, 0x85500000, 0x85600000| 0%| F| |TAMS 0x85500000, 0x85500000| Untracked | 28|0x85600000, 0x85600000, 0x85700000| 0%| F| |TAMS 0x85600000, 0x85600000| Untracked | 29|0x85700000, 0x85700000, 0x85800000| 0%| F| |TAMS 0x85700000, 0x85700000| Untracked | 30|0x85800000, 0x85800000, 0x85900000| 0%| F| |TAMS 0x85800000, 0x85800000| Untracked | 31|0x85900000, 0x85900000, 0x85a00000| 0%| F| |TAMS 0x85900000, 0x85900000| Untracked | 32|0x85a00000, 0x85a00000, 0x85b00000| 0%| F| |TAMS 0x85a00000, 0x85a00000| Untracked | 33|0x85b00000, 0x85b00000, 0x85c00000| 0%| F| |TAMS 0x85b00000, 0x85b00000| Untracked | 34|0x85c00000, 0x85c00000, 0x85d00000| 0%| F| |TAMS 0x85c00000, 0x85c00000| Untracked | 35|0x85d00000, 0x85d00000, 0x85e00000| 0%| F| |TAMS 0x85d00000, 0x85d00000| Untracked | 36|0x85e00000, 0x85e00000, 0x85f00000| 0%| F| |TAMS 0x85e00000, 0x85e00000| Untracked | 37|0x85f00000, 0x85f00000, 0x86000000| 0%| F| |TAMS 0x85f00000, 0x85f00000| Untracked | 38|0x86000000, 0x86000000, 0x86100000| 0%| F| |TAMS 0x86000000, 0x86000000| Untracked | 39|0x86100000, 0x86100000, 0x86200000| 0%| F| |TAMS 0x86100000, 0x86100000| Untracked | 40|0x86200000, 0x86200000, 0x86300000| 0%| F| |TAMS 0x86200000, 0x86200000| Untracked | 41|0x86300000, 0x86300000, 0x86400000| 0%| F| |TAMS 0x86300000, 0x86300000| Untracked | 42|0x86400000, 0x86400000, 0x86500000| 0%| F| |TAMS 0x86400000, 0x86400000| Untracked | 43|0x86500000, 0x86500000, 0x86600000| 0%| F| |TAMS 0x86500000, 0x86500000| Untracked | 44|0x86600000, 0x86600000, 0x86700000| 0%| F| |TAMS 0x86600000, 0x86600000| Untracked | 45|0x86700000, 0x86700000, 0x86800000| 0%| F| |TAMS 0x86700000, 0x86700000| Untracked | 46|0x86800000, 0x86800000, 0x86900000| 0%| F| |TAMS 0x86800000, 0x86800000| Untracked | 47|0x86900000, 0x86900000, 0x86a00000| 0%| F| |TAMS 0x86900000, 0x86900000| Untracked | 48|0x86a00000, 0x86a00000, 0x86b00000| 0%| F| |TAMS 0x86a00000, 0x86a00000| Untracked | 49|0x86b00000, 
0x86b00000, 0x86c00000| 0%| F| |TAMS 0x86b00000, 0x86b00000| Untracked | 50|0x86c00000, 0x86c00000, 0x86d00000| 0%| F| |TAMS 0x86c00000, 0x86c00000| Untracked | 51|0x86d00000, 0x86d00000, 0x86e00000| 0%| F| |TAMS 0x86d00000, 0x86d00000| Untracked | 52|0x86e00000, 0x86e00000, 0x86f00000| 0%| F| |TAMS 0x86e00000, 0x86e00000| Untracked | 53|0x86f00000, 0x86f00000, 0x87000000| 0%| F| |TAMS 0x86f00000, 0x86f00000| Untracked | 54|0x87000000, 0x87000000, 0x87100000| 0%| F| |TAMS 0x87000000, 0x87000000| Untracked | 55|0x87100000, 0x87100000, 0x87200000| 0%| F| |TAMS 0x87100000, 0x87100000| Untracked | 56|0x87200000, 0x87200000, 0x87300000| 0%| F| |TAMS 0x87200000, 0x87200000| Untracked | 57|0x87300000, 0x87300000, 0x87400000| 0%| F| |TAMS 0x87300000, 0x87300000| Untracked | 58|0x87400000, 0x87400000, 0x87500000| 0%| F| |TAMS 0x87400000, 0x87400000| Untracked | 59|0x87500000, 0x87500000, 0x87600000| 0%| F| |TAMS 0x87500000, 0x87500000| Untracked | 60|0x87600000, 0x87600000, 0x87700000| 0%| F| |TAMS 0x87600000, 0x87600000| Untracked | 61|0x87700000, 0x87700000, 0x87800000| 0%| F| |TAMS 0x87700000, 0x87700000| Untracked | 62|0x87800000, 0x87800000, 0x87900000| 0%| F| |TAMS 0x87800000, 0x87800000| Untracked | 63|0x87900000, 0x87942908, 0x87a00000| 26%| E| |TAMS 0x87900000, 0x87900000| Complete Card table byte_map: [0x83700000,0x83880000] _byte_map_base: 0x832e3000 Marking Bits (Prev, Next): (CMBitMap*) 0xb5b74324, (CMBitMap*) 0xb5b74344 Prev Bits: [0x82980000, 0x83580000) Next Bits: [0x81d80000, 0x82980000) GC Heap History (0 events): No events Deoptimization events (0 events): No events Classes unloaded (0 events): No events Classes redefined (0 events): No events Internal exceptions (0 events): No events Events (20 events): Event: 0.113 loading class java/lang/Character Event: 0.114 loading class java/lang/Character done Event: 0.114 loading class java/lang/Float Event: 0.115 loading class java/lang/Number Event: 0.115 loading class java/lang/Number done Event: 0.115 loading 
class java/lang/Float done Event: 0.115 loading class java/lang/Double Event: 0.116 loading class java/lang/Double done Event: 0.116 loading class java/lang/Byte Event: 0.116 loading class java/lang/Byte done Event: 0.116 loading class java/lang/Short Event: 0.117 loading class java/lang/Short done Event: 0.117 loading class java/lang/Integer Event: 0.118 loading class java/lang/Integer done Event: 0.118 loading class java/lang/Long Event: 0.119 loading class java/lang/Long done Event: 0.119 loading class java/util/Iterator Event: 0.119 loading class java/util/Iterator done Event: 0.119 loading class java/lang/reflect/RecordComponent Event: 0.119 loading class java/lang/reflect/RecordComponent done Dynamic libraries: 00410000-00411000 r-xp 00000000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java 00420000-00421000 r--p 00000000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java 00421000-00422000 rw-p 00001000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java 019b6000-019d7000 rw-p 00000000 00:00 0 [heap] 809c9000-80e00000 rw-p 00000000 00:00 0 80e00000-80e8e000 rw-p 00000000 00:00 0 80e8e000-80f00000 ---p 00000000 00:00 0 80fb4000-811da000 rw-p 00000000 00:00 0 811da000-81400000 ---p 00000000 00:00 0 81400000-81421000 rw-p 00000000 00:00 0 81421000-81500000 ---p 00000000 00:00 0 8157e000-8157f000 ---p 00000000 00:00 0 8157f000-81600000 rw-p 00000000 00:00 0 81600000-81621000 rw-p 00000000 00:00 0 81621000-81700000 ---p 00000000 00:00 0 8177e000-8177f000 ---p 00000000 00:00 0 8177f000-81800000 rw-p 00000000 00:00 0 81800000-81821000 rw-p 00000000 00:00 0 81821000-81900000 ---p 00000000 00:00 0 81900000-81921000 rw-p 00000000 00:00 0 81921000-81a00000 ---p 00000000 00:00 0 81a7e000-81a7f000 ---p 00000000 00:00 0 81a7f000-81b00000 rw-p 00000000 00:00 0 81b00000-81b21000 rw-p 00000000 00:00 0 81b21000-81c00000 ---p 00000000 00:00 0 81c21000-81c7c000 rw-p 00000000 00:00 0 81c7c000-81c7d000 ---p 00000000 00:00 0 
81c7d000-81cfe000 rw-p 00000000 00:00 0 81cfe000-81cff000 ---p 00000000 00:00 0 81cff000-81e80000 rw-p 00000000 00:00 0 81e80000-82980000 ---p 00000000 00:00 0 82980000-82a80000 rw-p 00000000 00:00 0 82a80000-83580000 ---p 00000000 00:00 0 83580000-835a0000 rw-p 00000000 00:00 0 835a0000-83700000 ---p 00000000 00:00 0 83700000-83720000 rw-p 00000000 00:00 0 83720000-83880000 ---p 00000000 00:00 0 83880000-838a0000 rw-p 00000000 00:00 0 838a0000-83a00000 ---p 00000000 00:00 0 83a00000-87a00000 rw-p 00000000 00:00 0 87a00000-b3a00000 ---p 00000000 00:00 0 b3a25000-b3a76000 rw-p 00000000 00:00 0 b3a76000-b3ab3000 ---p 00000000 00:00 0 b3ab3000-b3c33000 rwxp 00000000 00:00 0 b3c33000-b5ab3000 ---p 00000000 00:00 0 b5ab3000-b5ac8000 r-xp 00000000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so b5ac8000-b5ad8000 ---p 00015000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so b5ad8000-b5ad9000 r--p 00015000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so b5ad9000-b5ada000 rw-p 00016000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so b5ada000-b5ae2000 rw-s 00000000 b3:02 2576900 /tmp/hsperfdata_root/14700 b5ae2000-b5ae9000 r-xp 00000000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so b5ae9000-b5af8000 ---p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so b5af8000-b5af9000 r--p 00006000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so b5af9000-b5afa000 rw-p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so b5afa000-b5b00000 rw-p 00000000 00:00 0 b5b00000-b5c00000 rw-p 00000000 00:00 0 b5c00000-b5c0d000 r-xp 00000000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so b5c0d000-b5c1c000 ---p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so b5c1c000-b5c1d000 r--p 0000c000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so b5c1d000-b5c1e000 rw-p 0000d000 b3:02 2708509 
/lib/arm-linux-gnueabihf/libnsl-2.27.so b5c1e000-b5c20000 rw-p 00000000 00:00 0 b5c20000-b5c27000 r-xp 00000000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so b5c27000-b5c36000 ---p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so b5c36000-b5c37000 r--p 00006000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so b5c37000-b5c38000 rw-p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so b5c38000-b5c3d000 r-xp 00000000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so b5c3d000-b5c4c000 ---p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so b5c4c000-b5c4d000 r--p 00004000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so b5c4d000-b5c4e000 rw-p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so b5c4e000-b5c5d000 r-xp 00000000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so b5c5d000-b5c6c000 ---p 0000f000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so b5c6c000-b5c6d000 r--p 0000e000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so b5c6d000-b5c6e000 rw-p 0000f000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so b5c6e000-b5c71000 ---p 00000000 00:00 0 b5c71000-b5cbe000 rw-p 00000000 00:00 0 b5cbe000-b5d2d000 r-xp 00000000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so b5d2d000-b5d3d000 ---p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so b5d3d000-b5d3e000 r--p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so b5d3e000-b5d3f000 rw-p 00070000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so b5d3f000-b6d56000 r-xp 00000000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so b6d56000-b6d65000 ---p 01017000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so b6d65000-b6dba000 r--p 01016000 b3:02 144078 
/workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so b6dba000-b6dd2000 rw-p 0106b000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so b6dd2000-b6e5e000 rw-p 00000000 00:00 0 b6e5e000-b6e6f000 r-xp 00000000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so b6e6f000-b6e7f000 ---p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so b6e7f000-b6e80000 r--p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so b6e80000-b6e81000 rw-p 00012000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so b6e81000-b6e83000 rw-p 00000000 00:00 0 b6e83000-b6e85000 r-xp 00000000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so b6e85000-b6e94000 ---p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so b6e94000-b6e95000 r--p 00001000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so b6e95000-b6e96000 rw-p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so b6e96000-b6eaf000 r-xp 00000000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 b6eaf000-b6ebe000 ---p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 b6ebe000-b6ebf000 r--p 00018000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 b6ebf000-b6ec0000 rw-p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 b6ec0000-b6fa2000 r-xp 00000000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so b6fa2000-b6fb2000 ---p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so b6fb2000-b6fb4000 r--p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so b6fb4000-b6fb5000 rw-p 000e4000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so b6fb5000-b6fb8000 rw-p 00000000 00:00 0 b6fb8000-b6fc2000 r-xp 00000000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so b6fc2000-b6fd1000 ---p 0000a000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so b6fd1000-b6fd2000 r--p 00009000 b3:02 144083 
/workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so b6fd2000-b6fd3000 rw-p 0000a000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so b6fd3000-b6feb000 r-xp 00000000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so b6ff2000-b6ff4000 rw-p 00000000 00:00 0 b6ff6000-b6ff7000 ---p 00000000 00:00 0 b6ff7000-b6ff8000 r--p 00000000 00:00 0 b6ff8000-b6ff9000 rwxp 00000000 00:00 0 b6ff9000-b6ffb000 rw-p 00000000 00:00 0 b6ffb000-b6ffc000 r--p 00018000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so b6ffc000-b6ffd000 rw-p 00019000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so bed97000-bedb8000 rw-p 00000000 00:00 0 [stack] beeb0000-beeb1000 r-xp 00000000 00:00 0 [sigpage] beeb1000-beeb2000 r--p 00000000 00:00 0 [vvar] beeb2000-beeb3000 r-xp 00000000 00:00 0 [vdso] ffff0000-ffff1000 r-xp 00000000 00:00 0 [vectors] VM Arguments: jvm_args: -Xms64M -Xmx768M --add-exports=java.base/jdk.internal.module=ALL-UNNAMED java_command: build.tools.jigsaw.AddPackagesAttribute /workspace/build/linux-arm-server-fastdebug/jdk java_class_path (initial): /workspace/build/linux-arm-server-fastdebug/buildtools/tools_jigsaw_classes Launcher Type: SUN_STANDARD [Global flags] uint ConcGCThreads = 1 {product} {ergonomic} Number of threads concurrent gc will use uint G1ConcRefinementThreads = 4 {product} {ergonomic} The number of parallel rem set update threads. Will be set ergonomically by default. size_t G1HeapRegionSize = 1048576 {product} {ergonomic} Size of the G1 regions. 
uintx GCDrainStackTargetSize = 64 {product} {ergonomic} Number of entries we will try to leave on the stack during parallel gc size_t InitialHeapSize = 67108864 {product} {command line} Initial heap size (in bytes); zero means use ergonomics size_t MarkStackSize = 32768 {product} {ergonomic} Size of marking stack size_t MaxHeapSize = 805306368 {product} {command line} Maximum heap size (in bytes) size_t MaxNewSize = 482344960 {product} {ergonomic} Maximum new generation size (in bytes), max_uintx means set ergonomically size_t MinHeapDeltaBytes = 1048576 {product} {ergonomic} The minimum change in heap space due to GC (in bytes) size_t MinHeapSize = 67108864 {product} {command line} Minimum heap size (in bytes); zero means use ergonomics uintx NonProfiledCodeHeapSize = 0 {pd product} {ergonomic} Size of code heap with non-profiled methods (in bytes) uintx ProfiledCodeHeapSize = 0 {pd product} {ergonomic} Size of code heap with profiled methods (in bytes) size_t SoftMaxHeapSize = 805306368 {manageable} {ergonomic} Soft limit for maximum heap size (in bytes) bool UseG1GC = true {product} {ergonomic} Use the Garbage-First garbage collector Logging: Log output configuration: #0: stdout all=warning uptime,level,tags #1: stderr all=off uptime,level,tags Environment Variables: JAVA_HOME=/opt/java/openjdk PATH=/opt/java/openjdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin LC_ALL=C Signal Handlers: SIGSEGV: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO SIGBUS: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO SIGFPE: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO SIGPIPE: [libjvm.so+0xc9aa9d], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO SIGXFSZ: [libjvm.so+0xc9aa9d], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO SIGILL: [libjvm.so+0xe19e65], 
sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO SIGUSR2: [libjvm.so+0xc9ad95], sa_mask[0]=00000000000000000000000000000000, sa_flags=SA_RESTART|SA_SIGINFO SIGHUP: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none SIGINT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none SIGTERM: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none SIGQUIT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none --------------- S Y S T E M --------------- OS: DISTRIB_ID=Ubuntu DISTRIB_RELEASE=18.04 DISTRIB_CODENAME=bionic DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS" uname: Linux 20431585315d 5.4.51-v7l+ #1333 SMP Mon Aug 10 16:51:40 BST 2020 armv7l OS uptime: 14 days 7:59 hours libc: glibc 2.27 NPTL 2.27 rlimit (soft/hard): STACK 8192k/infinity , CORE infinity/infinity , NPROC infinity/infinity , NOFILE 1048576/1048576 , AS infinity/infinity , CPU infinity/infinity , DATA infinity/infinity , FSIZE infinity/infinity , MEMLOCK 64k/64k load average: 3.37 3.26 3.09 /proc/meminfo: MemTotal: 3919812 kB MemFree: 1255688 kB MemAvailable: 3518740 kB Buffers: 134316 kB Cached: 2117828 kB SwapCached: 0 kB Active: 1266624 kB Inactive: 1167412 kB Active(anon): 110360 kB Inactive(anon): 80744 kB Active(file): 1156264 kB Inactive(file): 1086668 kB Unevictable: 16 kB Mlocked: 16 kB HighTotal: 3264512 kB HighFree: 1038848 kB LowTotal: 655300 kB LowFree: 216840 kB SwapTotal: 102396 kB SwapFree: 102396 kB Dirty: 24916 kB Writeback: 0 kB AnonPages: 181884 kB Mapped: 125864 kB Shmem: 16892 kB KReclaimable: 181816 kB Slab: 205164 kB SReclaimable: 181816 kB SUnreclaim: 23348 kB KernelStack: 2240 kB PageTables: 2684 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 2062300 kB Committed_AS: 1125176 kB VmallocTotal: 245760 kB VmallocUsed: 5520 kB VmallocChunk: 0 kB Percpu: 512 kB CmaTotal: 262144 kB CmaFree: 171244 kB /sys/kernel/mm/transparent_hugepage/enabled: /sys/kernel/mm/transparent_hugepage/defrag 
(defrag/compaction efforts parameter): Process Memory: Virtual Size: 888828K (peak: 888828K) Resident Set Size: 25020K (peak: 25020K) (anon: 11372K, file: 13648K, shmem: 0K) Swapped out: 0K C-Heap outstanding allocations: 1636K /proc/sys/kernel/threads-max (system-wide limit on the number of threads): 57119 /proc/sys/vm/max_map_count (maximum number of memory map areas a process may have): 65530 /proc/sys/kernel/pid_max (system-wide limit on number of process identifiers): 32768 Steal ticks since vm start: 0 Steal ticks percentage since vm start: 0.000 CPU: total 4 (initial active 4) (ARMv7), vfp, vfp3-32, simd, mp_ext /proc/cpuinfo: processor : 0 model name : ARMv7 Processor rev 3 (v7l) BogoMIPS : 270.00 Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 CPU implementer : 0x41 CPU architecture: 7 CPU variant : 0x0 CPU part : 0xd08 CPU revision : 3 processor : 1 model name : ARMv7 Processor rev 3 (v7l) BogoMIPS : 270.00 Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 CPU implementer : 0x41 CPU architecture: 7 CPU variant : 0x0 CPU part : 0xd08 CPU revision : 3 processor : 2 model name : ARMv7 Processor rev 3 (v7l) BogoMIPS : 270.00 Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 CPU implementer : 0x41 CPU architecture: 7 CPU variant : 0x0 CPU part : 0xd08 CPU revision : 3 processor : 3 model name : ARMv7 Processor rev 3 (v7l) BogoMIPS : 270.00 Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 CPU implementer : 0x41 CPU architecture: 7 CPU variant : 0x0 CPU part : 0xd08 CPU revision : 3 Hardware : BCM2711 Revision : c03111 Serial : 100000001c47254f Model : Raspberry Pi 4 Model B Rev 1.1 Online cpus: 0-3 Offline cpus: Memory: 4k page, physical 3919812k(1255688k free), swap 102396k(102396k free) vm_info: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace) for 
linux-arm JRE (16-internal+0-adhoc..workspace), built on Oct 12 2020 19:49:51 by "" with gcc 7.5.0 END. > On 12. Oct 2020, at 20:24, Aleksey Shipilev wrote: > > Hi, > > On 10/12/20 8:12 PM, Marc Hoffmann wrote: >> Please find the build log and the hs_err file for commit fd0cb98ed03c6214c02ccd3503c1e6d77065a428 attached. > > Please try to build with fastdebug (./configure --enable-debug), so that JVM asserts meaningfully somewhere? > >> Is there any additional information I can provide to help getting these builds fixed again? > > I am seeing plenty of weird x86_32 crashes since last week. Pretty sure some of them would manifest on ARM32 as well. This is why building with fastdebug is the next step: it maps out the bug symptoms. > > -- > Thanks, > -Aleksey > From github.com+51754783+coreyashford at openjdk.java.net Mon Oct 12 21:41:37 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Mon, 12 Oct 2020 21:41:37 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v5] In-Reply-To: References: Message-ID: > This patch set encompasses the following commits: > > - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for > the intrinsic. The API is similar to the existing encodeBlock intrinsic. > - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic > implementation > - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. > - Adds a JMH microbenchmark for both Base64 encoding and decoding. > - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding.
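For readers following the Base64 thread: the sketch below exercises only the public java.util.Base64 API, which is the path a decodeBlock intrinsic would sit behind. It is an illustration, not code from the patch. The class name Base64RoundTrip and its helper methods are made up for this example, and whether an intrinsic is actually used at run time depends on the platform and the JIT, which this code neither checks nor assumes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class for illustration; not part of the JDK or of the patch.
public class Base64RoundTrip {
    // Encode raw bytes to a Base64 string using the basic alphabet.
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Decode a Base64 string back to raw bytes.
    // Base64.Decoder.decode throws IllegalArgumentException on invalid input.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    // True when decoding the encoded form yields the original bytes.
    static boolean roundTrips(byte[] raw) {
        return Arrays.equals(raw, decode(encode(raw)));
    }

    public static void main(String[] args) {
        // A very short input, an empty input, and a larger buffer: the thread
        // discusses regressions that showed up only for small source lengths.
        byte[] small = "hi".getBytes(StandardCharsets.UTF_8);
        byte[] large = new byte[4096];
        for (int i = 0; i < large.length; i++) {
            large[i] = (byte) i;
        }
        System.out.println(encode(small)); // prints "aGk="
        System.out.println(roundTrips(small) && roundTrips(new byte[0]) && roundTrips(large));
    }
}
```

Checking short and empty inputs alongside a large one is deliberate: a block-oriented fast path has to fall back correctly when the input is smaller than one block.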
CoreyAshford has updated the pull request incrementally with one additional commit since the last revision: Per Martin Doerr's v4 review: fix regression, and speed up return time for buffers that are too small - Check for case where the result of subtracting 12 from the source length produces a negative number. To do this efficiently, I added the instruction definition for mcrxrx, which is implemented on Power9+. - Rearrange the code so that minimal initialization is performed before checking the size, so that the intrinsic can return quickly in the event that the buffer is too small to process. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/293/files - new: https://git.openjdk.java.net/jdk/pull/293/files/164fa2a9..b5acb75c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=293&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=293&range=03-04 Stats: 54 lines in 3 files changed: 33 ins; 19 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/293.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/293/head:pull/293 PR: https://git.openjdk.java.net/jdk/pull/293 From github.com+51754783+coreyashford at openjdk.java.net Mon Oct 12 21:41:38 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Mon, 12 Oct 2020 21:41:38 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4] In-Reply-To: References: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> Message-ID: <7-p-Kc9lQyyuoWdNtmgbXbwkxsgk4oQGKmFSCcMpvnU=.97810c01-3200-4767-bbd4-35d53c2bc5ca@github.com> On Mon, 12 Oct 2020 11:06:23 GMT, Martin Doerr wrote: >> CoreyAshford has updated the pull request incrementally with two additional commits since the last revision: >> >> - TestBase64.java: fix comment to correctly reflect actual intrinsic names.
>> >> The intrinsic names that are visible with -XX:+PrintCompilation are encode >> and decode, rather than encodeBlock and decodeBlock. >> - stubGenerator_ppc.cpp: fix regression caused by change to using loop counter >> >> My original fix didn't account for the case where sl < block_size. In the >> event sl < block_size, the shifted sl will become zero, so it should >> jump to the code that computes how much data was processed - 0 - and return. > > Test java/util/Base64/TestBase64.java failed on Power9: > JavaTest Message: Test threw exception: java.lang.RuntimeException: Base64 decoding(String) failed! > Seed from RandomFactory = -8714459054005749075L > > java.lang.RuntimeException: Base64 decoding(String) failed! > at TestBase64.checkEqual(TestBase64.java:523) > at TestBase64.test(TestBase64.java:185) > at TestBase64.main(TestBase64.java:61) This latest push passes the regression test. I thought I had run it last time, though, which confuses me. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From github.com+51754783+coreyashford at openjdk.java.net Mon Oct 12 22:03:20 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Mon, 12 Oct 2020 22:03:20 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4] In-Reply-To: <7-p-Kc9lQyyuoWdNtmgbXbwkxsgk4oQGKmFSCcMpvnU=.97810c01-3200-4767-bbd4-35d53c2bc5ca@github.com> References: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> <7-p-Kc9lQyyuoWdNtmgbXbwkxsgk4oQGKmFSCcMpvnU=.97810c01-3200-4767-bbd4-35d53c2bc5ca@github.com> Message-ID: <6Voyfr_s-ieyRA-8Rtvvpz7tkhhicA8sY2d2KTp3Kmw=.fa256bae-2143-4b43-bfea-5837ad31eb6a@github.com> On Mon, 12 Oct 2020 21:38:36 GMT, CoreyAshford wrote: >> Test java/util/Base64/TestBase64.java failed on Power9: >> JavaTest Message: Test threw exception: java.lang.RuntimeException: Base64 decoding(String) failed! 
>> Seed from RandomFactory = -8714459054005749075L >> >> java.lang.RuntimeException: Base64 decoding(String) failed! >> at TestBase64.checkEqual(TestBase64.java:523) >> at TestBase64.test(TestBase64.java:185) >> at TestBase64.main(TestBase64.java:61) > > This latest push passes the intrinsic regression test. I had run the intrinsic TestBase64 regression test on the > previous push, but not the one in utils. Interesting. Somehow it didn't occur to me that there could be a problem > there if the intrinsic TestBase64 test passed. I will check into the other regression test. Don't review this latest > push just yet. Ok, all is clear. I just ran `jdk/java/util/Base64/TestBase64.java` which passes as well. Please review again when convenient. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From iveresov at openjdk.java.net Mon Oct 12 22:19:22 2020 From: iveresov at openjdk.java.net (Igor Veresov) Date: Mon, 12 Oct 2020 22:19:22 GMT Subject: RFR: 8253402: Convert vmSymbols::SID to enum class [v4] In-Reply-To: <9m7uFY5ij94oj3SQ9pTHNq-tsw0NnPHDVqHhznmAuOo=.bc75f19e-0527-4962-8b19-b178cfa8e572@github.com> References: <9m7uFY5ij94oj3SQ9pTHNq-tsw0NnPHDVqHhznmAuOo=.bc75f19e-0527-4962-8b19-b178cfa8e572@github.com> Message-ID: On Sun, 11 Oct 2020 22:54:21 GMT, Ioi Lam wrote: >> Convert `vmSymbols::SID` to an `enum class` to provide better type safety. >> >> - The original enum type `vmSymbols::SID` cannot be forward-declared. I moved it out of the `vmSymbols` class and >> renamed, so now it can be forward-declared as `enum class vmSymbolID : int;`, without including the large vmSymbols.hpp >> file. >> - This also breaks the mutual dependency between the `vmSymbols` and `vmIntrinsics` classes. Now the declaration of >> `vmIntrinsics` can be moved from vmSymbols.hpp to vmIntrinsics.hpp, where it naturally belongs. 
>> - Type-safe enumeration (contributed by Kim Barrett) >> for (vmSymbolsIterator it = vmSymbolsRange.begin(); it != vmSymbolsRange.end(); ++it) { >> vmSymbolID index = *it; .... >> } >> - I moved `vmSymbols::_symbols[]` to `Symbol::_vm_symbols[]`, and made it accessible via `Symbol::vm_symbol_at()`. This >> way, header files (e.g. fieldInfo.hpp) that need to convert from `vmSymbolID` to `Symbol*` don't need to include the >> large vmSymbols.hpp file. >> - I changed the `VM_SYMBOL_ENUM_NAME` macro so that the users don't need to explicitly add the `vmSymbolID::` scope. >> - I removed many unnecessary casts between `int` and `vmSymbolID`. >> - The remaining casts are done via `vmSymbol::as_int()` and `vmSymbols::as_SID()` with range checks. >> >> ----- >> If this is successful, I will do the same for `vmIntrinsics::ID`. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > added missing #include from enumIterator.hpp Marked as reviewed by iveresov (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/276 From valeriep at openjdk.java.net Mon Oct 12 23:00:12 2020 From: valeriep at openjdk.java.net (Valerie Peng) Date: Mon, 12 Oct 2020 23:00:12 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v5] In-Reply-To: References: <4NM17B6l4GvNgCbmmQTUcnfZTA6G-IEc85O8jH_q-xA=.63b10da7-bab7-44bc-a4c8-0a675aca45c0@github.com> Message-ID: On Mon, 12 Oct 2020 07:02:05 GMT, Fei Yang wrote: > > > > I have looked at the java security changes, i.e. src/java.base/share/classes/sun/security/provider/SHA3.java. It looks > > fine. > > @valeriepeng : I see you are not listed under "Reviewers" commit message part, could you please press the magic > button(s)(approve?) so you get the credit? Thanks. It's fine, the part I reviewed is only a small part of the changes, so I will leave the reviewer approval up to the hotspot team.
Thanks, Valerie ------------- PR: https://git.openjdk.java.net/jdk/pull/207 From rkennke at openjdk.java.net Mon Oct 12 23:01:24 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 12 Oct 2020 23:01:24 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v8] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference > and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). Workloads > that make heavy use of such weak references will therefore potentially cause significant GC pauses. There are 3 main > items that contribute to pause time linear to number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through the subgraph of objects that are reachable only through FinalReference. Notice that this is > theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly > reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires extending the marking > bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever > marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all > objects starting from the referent.
When marking encounters finalizably reachable objects while marking strongly, it > will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter > of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for > FinalReference, marking stops there, and does not mark through the referent. 2. Concurrent processing is performed > after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and > depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list > (from where it will be processed by the Java reference handler thread). In addition to that, we must ensure that no > referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers > in Reference.get() intrinsics that return NULL when the referent is not reachable. Testing: hotspot_gc_shenandoah > (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions Roman Kennke has updated the pull request incrementally with four additional commits since the last revision: - Remove unnecessary par_is_marked* methods - Carry precise liveness and reachability information in ObjArrayChunkTask - Don't allow safepoints when acquiring the Heap_lock for reference enqueuing - Don't LRB when fetching the referent: we must avoid accidentally making finalizably reachable objects strongly reachable ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/505/files - new: https://git.openjdk.java.net/jdk/pull/505/files/34ca4991..ee7412e2 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=06-07 Stats: 146 lines in 9 files changed: 21 ins; 96 del; 29 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk
pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From psandoz at openjdk.java.net Mon Oct 12 23:12:36 2020 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Mon, 12 Oct 2020 23:12:36 GMT Subject: RFR: 8223347: Integration of Vector API (Incubator) [v2] In-Reply-To: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> References: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> Message-ID: > This pull request is for integration of the Vector API. It was previously reviewed under conditions when mercurial was > used for the source code control system. Review threads can be found here (searching for issue number 8223347 in the > title): https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-April/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-May/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-July/thread.html > > If mercurial was still being used the code would be pushed directly, once the CSR is approved. However, in this case a > pull request is required and needs explicit reviewer approval. Between the final review and this pull request no code > has changed, except for that related to merging. Paul Sandoz has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains seven commits: - HotspotIntrinsicCandidate to IntrinsicCandidate - Merge master - Fix permissions - Fix permissions - Merge master - Vector API new files - Integration of Vector API (Incubator) ------------- Changes: https://git.openjdk.java.net/jdk/pull/367/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=367&range=01 Stats: 295150 lines in 336 files changed: 292957 ins; 1062 del; 1131 mod Patch: https://git.openjdk.java.net/jdk/pull/367.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/367/head:pull/367 PR: https://git.openjdk.java.net/jdk/pull/367 From kvn at openjdk.java.net Mon Oct 12 23:29:24 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 12 Oct 2020 23:29:24 GMT Subject: RFR: 8253402: Convert vmSymbols::SID to enum class [v4] In-Reply-To: <9m7uFY5ij94oj3SQ9pTHNq-tsw0NnPHDVqHhznmAuOo=.bc75f19e-0527-4962-8b19-b178cfa8e572@github.com> References: <9m7uFY5ij94oj3SQ9pTHNq-tsw0NnPHDVqHhznmAuOo=.bc75f19e-0527-4962-8b19-b178cfa8e572@github.com> Message-ID: On Sun, 11 Oct 2020 22:54:21 GMT, Ioi Lam wrote: >> Convert `vmSymbols::SID` to an `enum class` to provide better type safety. >> >> - The original enum type `vmSymbols::SID` cannot be forward-declared. I moved it out of the `vmSymbols` class and >> renamed, so now it can be forward-declared as `enum class vmSymbolID : int;`, without including the large vmSymbols.hpp >> file. >> - This also breaks the mutual dependency between the `vmSymbols` and `vmIntrinsics` classes. Now the declaration of >> `vmIntrinsics` can be moved from vmSymbols.hpp to vmIntrinsics.hpp, where it naturally belongs. >> - Type-safe enumeration (contributed by Kim Barrett) >> for (vmSymbolsIterator it = vmSymbolsRange.begin(); it != vmSymbolsRange.end(); ++it) { >> vmSymbolID index = *it; .... >> } >> - I moved `vmSymbols::_symbols[]` to `Symbol::_vm_symbols[]`, and made it accessible via `Symbol::vm_symbol_at()`. This >> way, header files (e.g. 
fieldInfo.hpp) that need to convert from `vmSymbolID` to `Symbol*` don't need to include the >> large vmSymbols.hpp file. >> - I changed the `VM_SYMBOL_ENUM_NAME` macro so that the users don't need to explicitly add the `vmSymbolID::` scope. >> - I removed many unnecessary casts between `int` and `vmSymbolID`. >> - The remaining casts are done via `vmSymbol::as_int()` and `vmSymbols::as_SID()` with range checks. >> >> ----- >> If this is successful, I will do the same for `vmIntrinsics::ID`. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > added missing #include from enumIterator.hpp Marked as reviewed by kvn (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/276 From boris.ulasevich at bell-sw.com Tue Oct 13 06:48:20 2020 From: boris.ulasevich at bell-sw.com (Boris Ulasevich) Date: Tue, 13 Oct 2020 09:48:20 +0300 Subject: arm32 builds continue to fail for me after 8253540 and 8253901 In-Reply-To: <17F91692-4F3D-4FAA-AB94-361B6C84F982@mountainminds.com> References: <56ff08d5-a4e5-788a-1c29-02f76e8755d2@redhat.com> <17F91692-4F3D-4FAA-AB94-361B6C84F982@mountainminds.com> Message-ID: Hi Marc, I created JDK-8254661 for the issue. I would love to fix it, but still can't reproduce the crash (even on Raspberry Pi). What configuration do you have? 
The following sequence works Ok for me: pi at raspberrypi $ git clone https://github.com/openjdk/jdk pi at raspberrypi $ cd jdk pi at raspberrypi $ bash configure --with-boot-jdk=/home/pi/jdk-15 pi at raspberrypi $ make Your debug build shows that I did not fix the assert_different_registers in InterpreterMacroAssembler::unlock_object() body (and the function comment by the way!), though with eyeballing I do not see what is wrong for Rlock=R0: https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/arm/interp_masm_arm.cpp#L1000 regards, Boris On Mon, Oct 12, 2020 at 11:34 PM Marc Hoffmann wrote: > > Hi Aleksey, hi Boris, > > for me the crash is always reproducible: Every single build after > > 77a0f3999afa322b64643afd4a161164440af975 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function > > fails on arm32 (build on ubuntu in docker on a raspberry pi 4). Before this commit I haven't encountered any failures. > > Here is the hs_err file with --enable-debug (reproduced with current master c7f00640627eab38b77d23d07876cf0247fa18f3). > > Cheers, > -marc > > > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/workspace/src/hotspot/share/asm/register.hpp:160), pid=14700, tid=14705 > # assert(a != b && a != c && a != d && a != e && b != c && b != d && b != e && c != d && c != e && d != e) failed: registers must be different: a=0x00000002, b=0x00000003, c=0x00000000, d=0x00000000, e=0x0000000c > # > # JRE version: (16.0) (fastdebug build ) > # Java VM: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace, mixed mode, g1 gc, linux-arm) > # Problematic frame: > # V [libjvm.so+0x7571fc] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63 > # > # Core dump will be written.
Default location: /workspace/make/core > # > # > > --------------- S U M M A R Y ------------ > > Command Line: -Xms64M -Xmx768M --add-exports=java.base/jdk.internal.module=ALL-UNNAMED build.tools.jigsaw.AddPackagesAttribute /workspace/build/linux-arm-server-fastdebug/jdk > > Host: 20431585315d, rev 3 (v7l), 4 cores, 3G, Ubuntu 18.04.3 LTS > Time: Mon Oct 12 20:22:15 2020 UTC elapsed time: 0.144243 seconds (0d 0h 0m 0s) > > --------------- T H R E A D --------------- > > Current thread (0xb5b16460): JavaThread "Unknown thread" [_thread_in_vm, id=14705, stack(0xb5c6e000,0xb5cbe000)] > > Stack: [0xb5c6e000,0xb5cbe000], sp=0xb5cbc2d0, free space=312k > Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x7571fc] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63 > > Registers: > r0 = 0x00000003 > r1 = 0x000000a0 > r2 = 0x00000002 > r3 = 0x00000000 > r4 = 0xb5b168b0 > r5 = 0x0000000c > r6 = 0x00000000 > r7 = 0xb5cbc2e8 > r8 = 0xb6db1fa8 > r9 = 0xb5cbc760 > r10 = 0xe3520000 > fp = 0xb6db1fa8 > r12 = 0xb6ff8000 > sp = 0xb5cbc2d0 > lr = 0x00000058 > pc = 0xb64961fc > cpsr = 0x200f0030 > > Top of Stack: (sp=0xb5cbc2d0) > 0xb5cbc2d0: 00000002 00000003 00000000 00000000 > 0xb5cbc2e0: 0000000c 00000048 0000006e 00000000 > 0xb5cbc2f0: 00000000 00000000 00000000 0000007c > 0xb5cbc300: 00000000 00000077 b5cbc378 00000000 > 0xb5cbc310: b5cbc380 0000000f b6db1fa8 b5cbc340 > 0xb5cbc320: b5b168b0 b6db1fa8 b5cbc4c4 b6008a2b > 0xb5cbc330: b5cbc454 b5cbc348 0000000f b61971cf > 0xb5cbc340: b5cbc380 b5cbc3b0 b5b168b0 b5cbc388 > > Instructions: (pc=0xb64961fc) > 0xb64960fc: 440be9c7 a034f8c7 f5e06139 68ebf7ed > 0xb649610c: f040689a 46184164 1180f441 68996011 > 0xb649611c: f5e03104 4b12f781 46284622 1003f858 > 0xb649612c: f998f2e4 f64f68eb f2ce7210 6899122f > 0xb649613c: 600a4618 31046899 f76ef5e0 46284631 > 0xb649614c: f4e6f69f f5e04630 f507f79b 46bd7786 > 0xb649615c: 8ff0e8bd 0091bfd0 
00006ee4 000059d0 > 0xb649616c: 00007d1c 4a084b07 b480447b 589baf00 > 0xb649617c: b91b781b f85d46bd 47707b04 f85d46bd > 0xb649618c: e7197b04 0091be30 00006a24 bf182900 > 0xb649619c: f1a1290c e92d0202 b0f34ff0 2301bf08 > 0xb64961ac: bf18af06 f8df2300 2a018268 461abf8c > 0xb64961bc: 0201f043 f04f460e 44f833ff 30d0f8c7 > 0xb64961cc: f8c74604 23003140 333de9c7 30fcf887 > 0xb64961dc: 3359e9c7 316cf887 4a8eb1da 0e58f04f > 0xb64961ec: 250c2003 1002f858 f8d12202 21a0c000 > 0xb64961fc: e000f88c 2000e9cd 6302e9cd 4b874a86 > 0xb649620c: 447a4887 9504447b f58f4478 f00cfaa3 > 0xb649621c: 68e2f2d5 4340f44f 33a0f2ce 0a04f04f > 0xb649622c: 7980f04f 0b00f04f 46106891 600b2501 > 0xb649623c: 44516891 f6f0f5e0 4a7b497a 3001f858 > 0xb649624c: 0140f107 a048f8c7 4608643e f8c7681b > 0xb649625c: 60fb904c f858647b f8c72002 f8c7b054 > 0xb649626c: 60bab058 e9c73208 653abb19 66fd607a > 0xb649627c: f732f5e0 689a68e3 4164f040 f4414618 > 0xb649628c: 60111181 44516899 f6c6f5e0 0110f107 > 0xb649629c: 68fb687a f8c74608 623aa018 6304e9c7 > 0xb64962ac: 901cf8c7 bb09e9c7 bb0de9c7 f5e063fd > 0xb64962bc: 68e3f713 f0406899 46184264 4240f442 > 0xb64962cc: 6899600a f1074451 f5e00ad0 4b57f6a5 > 0xb64962dc: 3003f858 2b00781b 8093f040 f04f68bb > 0xb64962ec: 68f97c80 33082500 c0acf8c7 0b01f04f > > > > --------------- P R O C E S S --------------- > > uid : 0 euid : 0 gid : 0 egid : 0 > > umask: 0022 (----w--w-) > > Threads class SMR info: > _java_thread_list=0xb6e56078, length=0, elements={ > } > _java_thread_list_alloc_cnt=1, _java_thread_list_free_cnt=0, _java_thread_list_max=0, _nested_thread_list_max=0 > _delete_lock_wait_cnt=0, _delete_lock_wait_max=0 > _to_delete_list_cnt=0, _to_delete_list_max=0 > > Java Threads: ( => current thread ) > > Other Threads: > 0xb5b73188 GCTaskThread "GC Thread#0" [stack: 0x81d00000,0x81d80000] [id=14706] > 0xb5b77dc0 ConcurrentGCThread "G1 Main Marker" [stack: 0x81c7e000,0x81cfe000] [id=14707] > 0xb5b790c0 ConcurrentGCThread "G1 Conc#0" [stack: 0x81a80000,0x81b00000] 
[id=14708] > 0xb5bde230 ConcurrentGCThread "G1 Refine#0" [stack: 0x81780000,0x81800000] [id=14709] > 0xb5bdf488 ConcurrentGCThread "G1 Service" [stack: 0x81580000,0x81600000] [id=14710] > > =>0xb5b16460 (exited) JavaThread "Unknown thread" [_thread_in_vm, id=14705, stack(0xb5c6e000,0xb5cbe000)] > > Threads with active compile tasks: > > VM state: not at safepoint (not fully initialized) > > VM Mutex/Monitor currently owned by a thread: None > > GC Precious Log: > CPUs: 4 total, 4 available > Memory: 3827M > Large Page Support: Disabled > NUMA Support: Disabled > Compressed Oops: Disabled > Heap Region Size: 1M > Heap Min Capacity: 64M > Heap Initial Capacity: 64M > Heap Max Capacity: 768M > Pre-touch: Disabled > Parallel Workers: 4 > Concurrent Workers: 1 > Concurrent Refinement Workers: 4 > Periodic GC: Disabled > > Heap: > garbage-first heap total 65536K, used 0K [0x83a00000, 0xb3a00000) > region size 1024K, 1 young (1024K), 0 survivors (0K) > Metaspace used 944K, capacity 2200K, committed 2200K, reserved 4400K > > Heap Regions: E=young(eden), S=young(survivor), O=old, HS=humongous(starts), HC=humongous(continues), CS=collection set, F=free, OA=open archive, CA=closed archive, TAMS=top-at-mark-start (previous, next) > | 0|0x83a00000, 0x83a00000, 0x83b00000| 0%| F| |TAMS 0x83a00000, 0x83a00000| Untracked > | 1|0x83b00000, 0x83b00000, 0x83c00000| 0%| F| |TAMS 0x83b00000, 0x83b00000| Untracked > | 2|0x83c00000, 0x83c00000, 0x83d00000| 0%| F| |TAMS 0x83c00000, 0x83c00000| Untracked > | 3|0x83d00000, 0x83d00000, 0x83e00000| 0%| F| |TAMS 0x83d00000, 0x83d00000| Untracked > | 4|0x83e00000, 0x83e00000, 0x83f00000| 0%| F| |TAMS 0x83e00000, 0x83e00000| Untracked > | 5|0x83f00000, 0x83f00000, 0x84000000| 0%| F| |TAMS 0x83f00000, 0x83f00000| Untracked > | 6|0x84000000, 0x84000000, 0x84100000| 0%| F| |TAMS 0x84000000, 0x84000000| Untracked > | 7|0x84100000, 0x84100000, 0x84200000| 0%| F| |TAMS 0x84100000, 0x84100000| Untracked > | 8|0x84200000, 0x84200000, 0x84300000| 0%| F| 
|TAMS 0x84200000, 0x84200000| Untracked > | 9|0x84300000, 0x84300000, 0x84400000| 0%| F| |TAMS 0x84300000, 0x84300000| Untracked > | 10|0x84400000, 0x84400000, 0x84500000| 0%| F| |TAMS 0x84400000, 0x84400000| Untracked > | 11|0x84500000, 0x84500000, 0x84600000| 0%| F| |TAMS 0x84500000, 0x84500000| Untracked > | 12|0x84600000, 0x84600000, 0x84700000| 0%| F| |TAMS 0x84600000, 0x84600000| Untracked > | 13|0x84700000, 0x84700000, 0x84800000| 0%| F| |TAMS 0x84700000, 0x84700000| Untracked > | 14|0x84800000, 0x84800000, 0x84900000| 0%| F| |TAMS 0x84800000, 0x84800000| Untracked > | 15|0x84900000, 0x84900000, 0x84a00000| 0%| F| |TAMS 0x84900000, 0x84900000| Untracked > | 16|0x84a00000, 0x84a00000, 0x84b00000| 0%| F| |TAMS 0x84a00000, 0x84a00000| Untracked > | 17|0x84b00000, 0x84b00000, 0x84c00000| 0%| F| |TAMS 0x84b00000, 0x84b00000| Untracked > | 18|0x84c00000, 0x84c00000, 0x84d00000| 0%| F| |TAMS 0x84c00000, 0x84c00000| Untracked > | 19|0x84d00000, 0x84d00000, 0x84e00000| 0%| F| |TAMS 0x84d00000, 0x84d00000| Untracked > | 20|0x84e00000, 0x84e00000, 0x84f00000| 0%| F| |TAMS 0x84e00000, 0x84e00000| Untracked > | 21|0x84f00000, 0x84f00000, 0x85000000| 0%| F| |TAMS 0x84f00000, 0x84f00000| Untracked > | 22|0x85000000, 0x85000000, 0x85100000| 0%| F| |TAMS 0x85000000, 0x85000000| Untracked > | 23|0x85100000, 0x85100000, 0x85200000| 0%| F| |TAMS 0x85100000, 0x85100000| Untracked > | 24|0x85200000, 0x85200000, 0x85300000| 0%| F| |TAMS 0x85200000, 0x85200000| Untracked > | 25|0x85300000, 0x85300000, 0x85400000| 0%| F| |TAMS 0x85300000, 0x85300000| Untracked > | 26|0x85400000, 0x85400000, 0x85500000| 0%| F| |TAMS 0x85400000, 0x85400000| Untracked > | 27|0x85500000, 0x85500000, 0x85600000| 0%| F| |TAMS 0x85500000, 0x85500000| Untracked > | 28|0x85600000, 0x85600000, 0x85700000| 0%| F| |TAMS 0x85600000, 0x85600000| Untracked > | 29|0x85700000, 0x85700000, 0x85800000| 0%| F| |TAMS 0x85700000, 0x85700000| Untracked > | 30|0x85800000, 0x85800000, 0x85900000| 0%| F| |TAMS 0x85800000, 
0x85800000| Untracked > | 31|0x85900000, 0x85900000, 0x85a00000| 0%| F| |TAMS 0x85900000, 0x85900000| Untracked > | 32|0x85a00000, 0x85a00000, 0x85b00000| 0%| F| |TAMS 0x85a00000, 0x85a00000| Untracked > | 33|0x85b00000, 0x85b00000, 0x85c00000| 0%| F| |TAMS 0x85b00000, 0x85b00000| Untracked > | 34|0x85c00000, 0x85c00000, 0x85d00000| 0%| F| |TAMS 0x85c00000, 0x85c00000| Untracked > | 35|0x85d00000, 0x85d00000, 0x85e00000| 0%| F| |TAMS 0x85d00000, 0x85d00000| Untracked > | 36|0x85e00000, 0x85e00000, 0x85f00000| 0%| F| |TAMS 0x85e00000, 0x85e00000| Untracked > | 37|0x85f00000, 0x85f00000, 0x86000000| 0%| F| |TAMS 0x85f00000, 0x85f00000| Untracked > | 38|0x86000000, 0x86000000, 0x86100000| 0%| F| |TAMS 0x86000000, 0x86000000| Untracked > | 39|0x86100000, 0x86100000, 0x86200000| 0%| F| |TAMS 0x86100000, 0x86100000| Untracked > | 40|0x86200000, 0x86200000, 0x86300000| 0%| F| |TAMS 0x86200000, 0x86200000| Untracked > | 41|0x86300000, 0x86300000, 0x86400000| 0%| F| |TAMS 0x86300000, 0x86300000| Untracked > | 42|0x86400000, 0x86400000, 0x86500000| 0%| F| |TAMS 0x86400000, 0x86400000| Untracked > | 43|0x86500000, 0x86500000, 0x86600000| 0%| F| |TAMS 0x86500000, 0x86500000| Untracked > | 44|0x86600000, 0x86600000, 0x86700000| 0%| F| |TAMS 0x86600000, 0x86600000| Untracked > | 45|0x86700000, 0x86700000, 0x86800000| 0%| F| |TAMS 0x86700000, 0x86700000| Untracked > | 46|0x86800000, 0x86800000, 0x86900000| 0%| F| |TAMS 0x86800000, 0x86800000| Untracked > | 47|0x86900000, 0x86900000, 0x86a00000| 0%| F| |TAMS 0x86900000, 0x86900000| Untracked > | 48|0x86a00000, 0x86a00000, 0x86b00000| 0%| F| |TAMS 0x86a00000, 0x86a00000| Untracked > | 49|0x86b00000, 0x86b00000, 0x86c00000| 0%| F| |TAMS 0x86b00000, 0x86b00000| Untracked > | 50|0x86c00000, 0x86c00000, 0x86d00000| 0%| F| |TAMS 0x86c00000, 0x86c00000| Untracked > | 51|0x86d00000, 0x86d00000, 0x86e00000| 0%| F| |TAMS 0x86d00000, 0x86d00000| Untracked > | 52|0x86e00000, 0x86e00000, 0x86f00000| 0%| F| |TAMS 0x86e00000, 0x86e00000| 
Untracked > | 53|0x86f00000, 0x86f00000, 0x87000000| 0%| F| |TAMS 0x86f00000, 0x86f00000| Untracked > | 54|0x87000000, 0x87000000, 0x87100000| 0%| F| |TAMS 0x87000000, 0x87000000| Untracked > | 55|0x87100000, 0x87100000, 0x87200000| 0%| F| |TAMS 0x87100000, 0x87100000| Untracked > | 56|0x87200000, 0x87200000, 0x87300000| 0%| F| |TAMS 0x87200000, 0x87200000| Untracked > | 57|0x87300000, 0x87300000, 0x87400000| 0%| F| |TAMS 0x87300000, 0x87300000| Untracked > | 58|0x87400000, 0x87400000, 0x87500000| 0%| F| |TAMS 0x87400000, 0x87400000| Untracked > | 59|0x87500000, 0x87500000, 0x87600000| 0%| F| |TAMS 0x87500000, 0x87500000| Untracked > | 60|0x87600000, 0x87600000, 0x87700000| 0%| F| |TAMS 0x87600000, 0x87600000| Untracked > | 61|0x87700000, 0x87700000, 0x87800000| 0%| F| |TAMS 0x87700000, 0x87700000| Untracked > | 62|0x87800000, 0x87800000, 0x87900000| 0%| F| |TAMS 0x87800000, 0x87800000| Untracked > | 63|0x87900000, 0x87942908, 0x87a00000| 26%| E| |TAMS 0x87900000, 0x87900000| Complete > > Card table byte_map: [0x83700000,0x83880000] _byte_map_base: 0x832e3000 > > Marking Bits (Prev, Next): (CMBitMap*) 0xb5b74324, (CMBitMap*) 0xb5b74344 > Prev Bits: [0x82980000, 0x83580000) > Next Bits: [0x81d80000, 0x82980000) > > GC Heap History (0 events): > No events > > Deoptimization events (0 events): > No events > > Classes unloaded (0 events): > No events > > Classes redefined (0 events): > No events > > Internal exceptions (0 events): > No events > > Events (20 events): > Event: 0.113 loading class java/lang/Character > Event: 0.114 loading class java/lang/Character done > Event: 0.114 loading class java/lang/Float > Event: 0.115 loading class java/lang/Number > Event: 0.115 loading class java/lang/Number done > Event: 0.115 loading class java/lang/Float done > Event: 0.115 loading class java/lang/Double > Event: 0.116 loading class java/lang/Double done > Event: 0.116 loading class java/lang/Byte > Event: 0.116 loading class java/lang/Byte done > Event: 0.116 loading 
class java/lang/Short > Event: 0.117 loading class java/lang/Short done > Event: 0.117 loading class java/lang/Integer > Event: 0.118 loading class java/lang/Integer done > Event: 0.118 loading class java/lang/Long > Event: 0.119 loading class java/lang/Long done > Event: 0.119 loading class java/util/Iterator > Event: 0.119 loading class java/util/Iterator done > Event: 0.119 loading class java/lang/reflect/RecordComponent > Event: 0.119 loading class java/lang/reflect/RecordComponent done > > > Dynamic libraries: > 00410000-00411000 r-xp 00000000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java > 00420000-00421000 r--p 00000000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java > 00421000-00422000 rw-p 00001000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java > 019b6000-019d7000 rw-p 00000000 00:00 0 [heap] > 809c9000-80e00000 rw-p 00000000 00:00 0 > 80e00000-80e8e000 rw-p 00000000 00:00 0 > 80e8e000-80f00000 ---p 00000000 00:00 0 > 80fb4000-811da000 rw-p 00000000 00:00 0 > 811da000-81400000 ---p 00000000 00:00 0 > 81400000-81421000 rw-p 00000000 00:00 0 > 81421000-81500000 ---p 00000000 00:00 0 > 8157e000-8157f000 ---p 00000000 00:00 0 > 8157f000-81600000 rw-p 00000000 00:00 0 > 81600000-81621000 rw-p 00000000 00:00 0 > 81621000-81700000 ---p 00000000 00:00 0 > 8177e000-8177f000 ---p 00000000 00:00 0 > 8177f000-81800000 rw-p 00000000 00:00 0 > 81800000-81821000 rw-p 00000000 00:00 0 > 81821000-81900000 ---p 00000000 00:00 0 > 81900000-81921000 rw-p 00000000 00:00 0 > 81921000-81a00000 ---p 00000000 00:00 0 > 81a7e000-81a7f000 ---p 00000000 00:00 0 > 81a7f000-81b00000 rw-p 00000000 00:00 0 > 81b00000-81b21000 rw-p 00000000 00:00 0 > 81b21000-81c00000 ---p 00000000 00:00 0 > 81c21000-81c7c000 rw-p 00000000 00:00 0 > 81c7c000-81c7d000 ---p 00000000 00:00 0 > 81c7d000-81cfe000 rw-p 00000000 00:00 0 > 81cfe000-81cff000 ---p 00000000 00:00 0 > 81cff000-81e80000 rw-p 00000000 00:00 0 > 81e80000-82980000 ---p 
00000000 00:00 0 > 82980000-82a80000 rw-p 00000000 00:00 0 > 82a80000-83580000 ---p 00000000 00:00 0 > 83580000-835a0000 rw-p 00000000 00:00 0 > 835a0000-83700000 ---p 00000000 00:00 0 > 83700000-83720000 rw-p 00000000 00:00 0 > 83720000-83880000 ---p 00000000 00:00 0 > 83880000-838a0000 rw-p 00000000 00:00 0 > 838a0000-83a00000 ---p 00000000 00:00 0 > 83a00000-87a00000 rw-p 00000000 00:00 0 > 87a00000-b3a00000 ---p 00000000 00:00 0 > b3a25000-b3a76000 rw-p 00000000 00:00 0 > b3a76000-b3ab3000 ---p 00000000 00:00 0 > b3ab3000-b3c33000 rwxp 00000000 00:00 0 > b3c33000-b5ab3000 ---p 00000000 00:00 0 > b5ab3000-b5ac8000 r-xp 00000000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so > b5ac8000-b5ad8000 ---p 00015000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so > b5ad8000-b5ad9000 r--p 00015000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so > b5ad9000-b5ada000 rw-p 00016000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so > b5ada000-b5ae2000 rw-s 00000000 b3:02 2576900 /tmp/hsperfdata_root/14700 > b5ae2000-b5ae9000 r-xp 00000000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so > b5ae9000-b5af8000 ---p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so > b5af8000-b5af9000 r--p 00006000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so > b5af9000-b5afa000 rw-p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so > b5afa000-b5b00000 rw-p 00000000 00:00 0 > b5b00000-b5c00000 rw-p 00000000 00:00 0 > b5c00000-b5c0d000 r-xp 00000000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so > b5c0d000-b5c1c000 ---p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so > b5c1c000-b5c1d000 r--p 0000c000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so > b5c1d000-b5c1e000 rw-p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so > b5c1e000-b5c20000 rw-p 00000000 00:00 0 > b5c20000-b5c27000 r-xp 
00000000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so > b5c27000-b5c36000 ---p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so > b5c36000-b5c37000 r--p 00006000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so > b5c37000-b5c38000 rw-p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so > b5c38000-b5c3d000 r-xp 00000000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so > b5c3d000-b5c4c000 ---p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so > b5c4c000-b5c4d000 r--p 00004000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so > b5c4d000-b5c4e000 rw-p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so > b5c4e000-b5c5d000 r-xp 00000000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so > b5c5d000-b5c6c000 ---p 0000f000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so > b5c6c000-b5c6d000 r--p 0000e000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so > b5c6d000-b5c6e000 rw-p 0000f000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so > b5c6e000-b5c71000 ---p 00000000 00:00 0 > b5c71000-b5cbe000 rw-p 00000000 00:00 0 > b5cbe000-b5d2d000 r-xp 00000000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so > b5d2d000-b5d3d000 ---p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so > b5d3d000-b5d3e000 r--p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so > b5d3e000-b5d3f000 rw-p 00070000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so > b5d3f000-b6d56000 r-xp 00000000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so > b6d56000-b6d65000 ---p 01017000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so > b6d65000-b6dba000 r--p 01016000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so > b6dba000-b6dd2000 rw-p 0106b000 b3:02 144078 
/workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so > b6dd2000-b6e5e000 rw-p 00000000 00:00 0 > b6e5e000-b6e6f000 r-xp 00000000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so > b6e6f000-b6e7f000 ---p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so > b6e7f000-b6e80000 r--p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so > b6e80000-b6e81000 rw-p 00012000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so > b6e81000-b6e83000 rw-p 00000000 00:00 0 > b6e83000-b6e85000 r-xp 00000000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so > b6e85000-b6e94000 ---p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so > b6e94000-b6e95000 r--p 00001000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so > b6e95000-b6e96000 rw-p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so > b6e96000-b6eaf000 r-xp 00000000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 > b6eaf000-b6ebe000 ---p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 > b6ebe000-b6ebf000 r--p 00018000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 > b6ebf000-b6ec0000 rw-p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 > b6ec0000-b6fa2000 r-xp 00000000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so > b6fa2000-b6fb2000 ---p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so > b6fb2000-b6fb4000 r--p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so > b6fb4000-b6fb5000 rw-p 000e4000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so > b6fb5000-b6fb8000 rw-p 00000000 00:00 0 > b6fb8000-b6fc2000 r-xp 00000000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so > b6fc2000-b6fd1000 ---p 0000a000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so > b6fd1000-b6fd2000 r--p 00009000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so > b6fd2000-b6fd3000 rw-p 0000a000 b3:02 144083 
/workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so > b6fd3000-b6feb000 r-xp 00000000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so > b6ff2000-b6ff4000 rw-p 00000000 00:00 0 > b6ff6000-b6ff7000 ---p 00000000 00:00 0 > b6ff7000-b6ff8000 r--p 00000000 00:00 0 > b6ff8000-b6ff9000 rwxp 00000000 00:00 0 > b6ff9000-b6ffb000 rw-p 00000000 00:00 0 > b6ffb000-b6ffc000 r--p 00018000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so > b6ffc000-b6ffd000 rw-p 00019000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so > bed97000-bedb8000 rw-p 00000000 00:00 0 [stack] > beeb0000-beeb1000 r-xp 00000000 00:00 0 [sigpage] > beeb1000-beeb2000 r--p 00000000 00:00 0 [vvar] > beeb2000-beeb3000 r-xp 00000000 00:00 0 [vdso] > ffff0000-ffff1000 r-xp 00000000 00:00 0 [vectors] > > > VM Arguments: > jvm_args: -Xms64M -Xmx768M --add-exports=java.base/jdk.internal.module=ALL-UNNAMED > java_command: build.tools.jigsaw.AddPackagesAttribute /workspace/build/linux-arm-server-fastdebug/jdk > java_class_path (initial): /workspace/build/linux-arm-server-fastdebug/buildtools/tools_jigsaw_classes > Launcher Type: SUN_STANDARD > > [Global flags] > uint ConcGCThreads = 1 {product} {ergonomic} Number of threads concurrent gc will use > uint G1ConcRefinementThreads = 4 {product} {ergonomic} The number of parallel rem set update threads. Will be set ergonomically by default. > size_t G1HeapRegionSize = 1048576 {product} {ergonomic} Size of the G1 regions. 
> uintx GCDrainStackTargetSize = 64 {product} {ergonomic} Number of entries we will try to leave on the stack during parallel gc > size_t InitialHeapSize = 67108864 {product} {command line} Initial heap size (in bytes); zero means use ergonomics > size_t MarkStackSize = 32768 {product} {ergonomic} Size of marking stack > size_t MaxHeapSize = 805306368 {product} {command line} Maximum heap size (in bytes) > size_t MaxNewSize = 482344960 {product} {ergonomic} Maximum new generation size (in bytes), max_uintx means set ergonomically > size_t MinHeapDeltaBytes = 1048576 {product} {ergonomic} The minimum change in heap space due to GC (in bytes) > size_t MinHeapSize = 67108864 {product} {command line} Minimum heap size (in bytes); zero means use ergonomics > uintx NonProfiledCodeHeapSize = 0 {pd product} {ergonomic} Size of code heap with non-profiled methods (in bytes) > uintx ProfiledCodeHeapSize = 0 {pd product} {ergonomic} Size of code heap with profiled methods (in bytes) > size_t SoftMaxHeapSize = 805306368 {manageable} {ergonomic} Soft limit for maximum heap size (in bytes) > bool UseG1GC = true {product} {ergonomic} Use the Garbage-First garbage collector > > Logging: > Log output configuration: > #0: stdout all=warning uptime,level,tags > #1: stderr all=off uptime,level,tags > > Environment Variables: > JAVA_HOME=/opt/java/openjdk > PATH=/opt/java/openjdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin > LC_ALL=C > > Signal Handlers: > SIGSEGV: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO > SIGBUS: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO > SIGFPE: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO > SIGPIPE: [libjvm.so+0xc9aa9d], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO > SIGXFSZ: [libjvm.so+0xc9aa9d], sa_mask[0]=11111111011111111101111111111110, 
sa_flags=SA_RESTART|SA_SIGINFO > SIGILL: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO > SIGUSR2: [libjvm.so+0xc9ad95], sa_mask[0]=00000000000000000000000000000000, sa_flags=SA_RESTART|SA_SIGINFO > SIGHUP: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none > SIGINT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none > SIGTERM: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none > SIGQUIT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none > > > --------------- S Y S T E M --------------- > > OS: > DISTRIB_ID=Ubuntu > DISTRIB_RELEASE=18.04 > DISTRIB_CODENAME=bionic > DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS" > uname: Linux 20431585315d 5.4.51-v7l+ #1333 SMP Mon Aug 10 16:51:40 BST 2020 armv7l > OS uptime: 14 days 7:59 hours > libc: glibc 2.27 NPTL 2.27 > rlimit (soft/hard): STACK 8192k/infinity , CORE infinity/infinity , NPROC infinity/infinity , NOFILE 1048576/1048576 , AS infinity/infinity , CPU infinity/infinity , DATA infinity/infinity , FSIZE infinity/infinity , MEMLOCK 64k/64k > load average: 3.37 3.26 3.09 > > /proc/meminfo: > MemTotal: 3919812 kB > MemFree: 1255688 kB > MemAvailable: 3518740 kB > Buffers: 134316 kB > Cached: 2117828 kB > SwapCached: 0 kB > Active: 1266624 kB > Inactive: 1167412 kB > Active(anon): 110360 kB > Inactive(anon): 80744 kB > Active(file): 1156264 kB > Inactive(file): 1086668 kB > Unevictable: 16 kB > Mlocked: 16 kB > HighTotal: 3264512 kB > HighFree: 1038848 kB > LowTotal: 655300 kB > LowFree: 216840 kB > SwapTotal: 102396 kB > SwapFree: 102396 kB > Dirty: 24916 kB > Writeback: 0 kB > AnonPages: 181884 kB > Mapped: 125864 kB > Shmem: 16892 kB > KReclaimable: 181816 kB > Slab: 205164 kB > SReclaimable: 181816 kB > SUnreclaim: 23348 kB > KernelStack: 2240 kB > PageTables: 2684 kB > NFS_Unstable: 0 kB > Bounce: 0 kB > WritebackTmp: 0 kB > CommitLimit: 2062300 kB > Committed_AS: 1125176 kB > VmallocTotal: 245760 kB > 
VmallocUsed: 5520 kB > VmallocChunk: 0 kB > Percpu: 512 kB > CmaTotal: 262144 kB > CmaFree: 171244 kB > > /sys/kernel/mm/transparent_hugepage/enabled: > /sys/kernel/mm/transparent_hugepage/defrag (defrag/compaction efforts parameter): > > Process Memory: > Virtual Size: 888828K (peak: 888828K) > Resident Set Size: 25020K (peak: 25020K) (anon: 11372K, file: 13648K, shmem: 0K) > Swapped out: 0K > C-Heap outstanding allocations: 1636K > > /proc/sys/kernel/threads-max (system-wide limit on the number of threads): 57119 > /proc/sys/vm/max_map_count (maximum number of memory map areas a process may have): 65530 > /proc/sys/kernel/pid_max (system-wide limit on number of process identifiers): 32768 > > Steal ticks since vm start: 0 > Steal ticks percentage since vm start: 0.000 > > CPU: total 4 (initial active 4) (ARMv7), vfp, vfp3-32, simd, mp_ext > /proc/cpuinfo: > processor : 0 > model name : ARMv7 Processor rev 3 (v7l) > BogoMIPS : 270.00 > Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 > CPU implementer : 0x41 > CPU architecture: 7 > CPU variant : 0x0 > CPU part : 0xd08 > CPU revision : 3 > > processor : 1 > model name : ARMv7 Processor rev 3 (v7l) > BogoMIPS : 270.00 > Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 > CPU implementer : 0x41 > CPU architecture: 7 > CPU variant : 0x0 > CPU part : 0xd08 > CPU revision : 3 > > processor : 2 > model name : ARMv7 Processor rev 3 (v7l) > BogoMIPS : 270.00 > Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 > CPU implementer : 0x41 > CPU architecture: 7 > CPU variant : 0x0 > CPU part : 0xd08 > CPU revision : 3 > > processor : 3 > model name : ARMv7 Processor rev 3 (v7l) > BogoMIPS : 270.00 > Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 > CPU implementer : 0x41 > CPU architecture: 7 > CPU variant : 0x0 > CPU part : 0xd08 > 
CPU revision : 3 > > Hardware : BCM2711 > Revision : c03111 > Serial : 100000001c47254f > Model : Raspberry Pi 4 Model B Rev 1.1 > > Online cpus: 0-3 > Offline cpus: > > Memory: 4k page, physical 3919812k(1255688k free), swap 102396k(102396k free) > > vm_info: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace) for linux-arm JRE (16-internal+0-adhoc..workspace), built on Oct 12 2020 19:49:51 by "" with gcc 7.5.0 > > END. > > > > > > On 12. Oct 2020, at 20:24, Aleksey Shipilev wrote: > > > > Hi, > > > > On 10/12/20 8:12 PM, Marc Hoffmann wrote: > >> Please find the build log and the hs_err file for commit fd0cb98ed03c6214c02ccd3503c1e6d77065a428 attached. > > > > Please try to build with fastdebug (./configure --enable-debug), so that JVM asserts meaningfully somewhere? > > > >> Is there any additional information I can provide to help getting these builds fixed again? > > > > I am seeing plenty of weird x86_32 crashes since last week. Pretty sure some of them would manifest on ARM32 as well. This is why building with fastdebug is the next step: it maps out the bug symptoms. > > > > -- > > Thanks, > > -Aleksey > > > From iklam at openjdk.java.net Tue Oct 13 06:51:17 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 13 Oct 2020 06:51:17 GMT Subject: RFR: 8254365: ciMethod.hpp should not include methodHandles.hpp Message-ID: ciMethod.hpp includes methodHandles.hpp. This is probably a typo as ciMethod.hpp doesn't use the MethodHandles class. Instead, it uses methodHandle which is declared in runtime/handles.hpp. As usual, I had to fix a few .cpp files that used the MethodHandles class but did not explicitly include methodHandles.hpp. Tested with mach5 build tiers 1-5.
------------- Commit messages: - 8254365: ciMethod.hpp should not include methodHandles.hpp Changes: https://git.openjdk.java.net/jdk/pull/623/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=623&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254365 Stats: 23 lines in 21 files changed: 20 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/623.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/623/head:pull/623 PR: https://git.openjdk.java.net/jdk/pull/623 From dholmes at openjdk.java.net Tue Oct 13 07:36:11 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 13 Oct 2020 07:36:11 GMT Subject: RFR: 8254365: ciMethod.hpp should not include methodHandles.hpp In-Reply-To: References: Message-ID: On Tue, 13 Oct 2020 06:18:40 GMT, Ioi Lam wrote: > ciMethod.hpp includes methodHandles.hpp. This is probably a typo as ciMethod.hpp doesn't use the MethodHandles class. > Instead, it uses methodHandle which is declared in runtime/handles.hpp. > As usual, I had to fix a few .cpp files that used the MethodHandles class but did not explicitly include > methodHandles.hpp. > Tested with mach5 build tiers 1-5. Seems okay. There are a couple of changes unrelated to the bug synopsis. :) Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/623 From hoffmann at mountainminds.com Tue Oct 13 08:40:44 2020 From: hoffmann at mountainminds.com (Marc Hoffmann) Date: Tue, 13 Oct 2020 10:40:44 +0200 Subject: arm32 builds continue to fail for me after 8253540 and 8253901 In-Reply-To: References: <56ff08d5-a4e5-788a-1c29-02f76e8755d2@redhat.com> <17F91692-4F3D-4FAA-AB94-361B6C84F982@mountainminds.com> Message-ID: <43292960-64A5-4AA4-862F-45088B5103DB@mountainminds.com> Hi Boris, I use this docker file for the build: https://github.com/marchof/PiCI/blob/master/jdk/docker/Dockerfile Regards, -marc > On 13. 
Oct 2020, at 08:48, Boris Ulasevich wrote: > > Hi Marc, > > I created JDK-8254661 for the issue. I would love to fix it, but still > can't reproduce the crash (even on Raspberry Pi). > What configuration do you have? The following sequence works Ok for me: > pi@raspberrypi $ git clone https://github.com/openjdk/jdk > pi@raspberrypi $ cd jdk > pi@raspberrypi $ bash configure --with-boot-jdk=/home/pi/jdk-15 > pi@raspberrypi $ make > > Your debug build shows that I did not fix the > assert_different_registers in > InterpreterMacroAssembler::unlock_object() > body (and the function comment by the way!), though with eyeballing I > do not see what is wrong for Rlock=R0: > https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/arm/interp_masm_arm.cpp#L1000 > > regards, > Boris > > On Mon, Oct 12, 2020 at 11:34 PM Marc Hoffmann > wrote: >> >> Hi Aleksey, hi Boris, >> >> for me the crash is always reproducible: Every single build after >> >> 77a0f3999afa322b64643afd4a161164440af975 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function >> >> fails on arm32 (build on ubuntu in docker on a raspberry pi 4). Before this commit I haven't encountered any failures. >> >> Here is the hs_err file with --enable-debug (reproduced with current master c7f00640627eab38b77d23d07876cf0247fa18f3).
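Editorial aside for readers of the hs_err log in this message: the failing assert is a pairwise-distinctness check over five register arguments. The sketch below is not HotSpot's implementation; `all_different` and `first_clash` are hypothetical helpers, and it assumes the values printed in the assertion message (a=0x2, b=0x3, c=0x0, d=0x0, e=0xc) are register encodings. Under that assumption the check fails because the third and fourth arguments are the same register.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>

// Hypothetical stand-in for a "registers must be different" check:
// returns true only if every pair of encodings differs.
template <std::size_t N>
bool all_different(const std::array<std::uintptr_t, N>& regs) {
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = i + 1; j < N; ++j) {
            if (regs[i] == regs[j]) {
                return false;
            }
        }
    }
    return true;
}

// Returns the indices of the first clashing pair, or {N, N} if none clash.
template <std::size_t N>
std::pair<std::size_t, std::size_t> first_clash(
        const std::array<std::uintptr_t, N>& regs) {
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = i + 1; j < N; ++j) {
            if (regs[i] == regs[j]) {
                return {i, j};
            }
        }
    }
    return {N, N};
}
```

Feeding in the five values from the log, `first_clash` reports indices 2 and 3 (arguments c and d), which matches the suspicion that two arguments of the check in `unlock_object()` resolve to the same register.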
>> >> Cheers, >> -marc >> >> >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (/workspace/src/hotspot/share/asm/register.hpp:160), pid=14700, tid=14705 >> # assert(a != b && a != c && a != d && a != e && b != c && b != d && b != e && c != d && c != e && d != e) failed: registers must be different: a=0x00000002, b=0x00000003, c=0x00000000, d=0x00000000, e=0x0000000c >> # >> # JRE version: (16.0) (fastdebug build ) >> # Java VM: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace, mixed mode, g1 gc, linux-arm) >> # Problematic frame: >> # V [libjvm.so+0x7571fc] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63 >> # >> # Core dump will be written. Default location: /workspace/make/core >> # >> # >> >> --------------- S U M M A R Y ------------ >> >> Command Line: -Xms64M -Xmx768M --add-exports=java.base/jdk.internal.module=ALL-UNNAMED build.tools.jigsaw.AddPackagesAttribute /workspace/build/linux-arm-server-fastdebug/jdk >> >> Host: 20431585315d, rev 3 (v7l), 4 cores, 3G, Ubuntu 18.04.3 LTS >> Time: Mon Oct 12 20:22:15 2020 UTC elapsed time: 0.144243 seconds (0d 0h 0m 0s) >> >> --------------- T H R E A D --------------- >> >> Current thread (0xb5b16460): JavaThread "Unknown thread" [_thread_in_vm, id=14705, stack(0xb5c6e000,0xb5cbe000)] >> >> Stack: [0xb5c6e000,0xb5cbe000], sp=0xb5cbc2d0, free space=312k >> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x7571fc] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63 >> >> Registers: >> r0 = 0x00000003 >> r1 = 0x000000a0 >> r2 = 0x00000002 >> r3 = 0x00000000 >> r4 = 0xb5b168b0 >> r5 = 0x0000000c >> r6 = 0x00000000 >> r7 = 0xb5cbc2e8 >> r8 = 0xb6db1fa8 >> r9 = 0xb5cbc760 >> r10 = 0xe3520000 >> fp = 0xb6db1fa8 >> r12 = 0xb6ff8000 >> sp = 0xb5cbc2d0 >> lr = 0x00000058 >> pc = 0xb64961fc >> cpsr = 0x200f0030 >> >> Top of Stack: 
(sp=0xb5cbc2d0) >> 0xb5cbc2d0: 00000002 00000003 00000000 00000000 >> 0xb5cbc2e0: 0000000c 00000048 0000006e 00000000 >> 0xb5cbc2f0: 00000000 00000000 00000000 0000007c >> 0xb5cbc300: 00000000 00000077 b5cbc378 00000000 >> 0xb5cbc310: b5cbc380 0000000f b6db1fa8 b5cbc340 >> 0xb5cbc320: b5b168b0 b6db1fa8 b5cbc4c4 b6008a2b >> 0xb5cbc330: b5cbc454 b5cbc348 0000000f b61971cf >> 0xb5cbc340: b5cbc380 b5cbc3b0 b5b168b0 b5cbc388 >> >> Instructions: (pc=0xb64961fc) >> 0xb64960fc: 440be9c7 a034f8c7 f5e06139 68ebf7ed >> 0xb649610c: f040689a 46184164 1180f441 68996011 >> 0xb649611c: f5e03104 4b12f781 46284622 1003f858 >> 0xb649612c: f998f2e4 f64f68eb f2ce7210 6899122f >> 0xb649613c: 600a4618 31046899 f76ef5e0 46284631 >> 0xb649614c: f4e6f69f f5e04630 f507f79b 46bd7786 >> 0xb649615c: 8ff0e8bd 0091bfd0 00006ee4 000059d0 >> 0xb649616c: 00007d1c 4a084b07 b480447b 589baf00 >> 0xb649617c: b91b781b f85d46bd 47707b04 f85d46bd >> 0xb649618c: e7197b04 0091be30 00006a24 bf182900 >> 0xb649619c: f1a1290c e92d0202 b0f34ff0 2301bf08 >> 0xb64961ac: bf18af06 f8df2300 2a018268 461abf8c >> 0xb64961bc: 0201f043 f04f460e 44f833ff 30d0f8c7 >> 0xb64961cc: f8c74604 23003140 333de9c7 30fcf887 >> 0xb64961dc: 3359e9c7 316cf887 4a8eb1da 0e58f04f >> 0xb64961ec: 250c2003 1002f858 f8d12202 21a0c000 >> 0xb64961fc: e000f88c 2000e9cd 6302e9cd 4b874a86 >> 0xb649620c: 447a4887 9504447b f58f4478 f00cfaa3 >> 0xb649621c: 68e2f2d5 4340f44f 33a0f2ce 0a04f04f >> 0xb649622c: 7980f04f 0b00f04f 46106891 600b2501 >> 0xb649623c: 44516891 f6f0f5e0 4a7b497a 3001f858 >> 0xb649624c: 0140f107 a048f8c7 4608643e f8c7681b >> 0xb649625c: 60fb904c f858647b f8c72002 f8c7b054 >> 0xb649626c: 60bab058 e9c73208 653abb19 66fd607a >> 0xb649627c: f732f5e0 689a68e3 4164f040 f4414618 >> 0xb649628c: 60111181 44516899 f6c6f5e0 0110f107 >> 0xb649629c: 68fb687a f8c74608 623aa018 6304e9c7 >> 0xb64962ac: 901cf8c7 bb09e9c7 bb0de9c7 f5e063fd >> 0xb64962bc: 68e3f713 f0406899 46184264 4240f442 >> 0xb64962cc: 6899600a f1074451 f5e00ad0 4b57f6a5 >> 
0xb64962dc: 3003f858 2b00781b 8093f040 f04f68bb >> 0xb64962ec: 68f97c80 33082500 c0acf8c7 0b01f04f >> >> >> >> --------------- P R O C E S S --------------- >> >> uid : 0 euid : 0 gid : 0 egid : 0 >> >> umask: 0022 (----w--w-) >> >> Threads class SMR info: >> _java_thread_list=0xb6e56078, length=0, elements={ >> } >> _java_thread_list_alloc_cnt=1, _java_thread_list_free_cnt=0, _java_thread_list_max=0, _nested_thread_list_max=0 >> _delete_lock_wait_cnt=0, _delete_lock_wait_max=0 >> _to_delete_list_cnt=0, _to_delete_list_max=0 >> >> Java Threads: ( => current thread ) >> >> Other Threads: >> 0xb5b73188 GCTaskThread "GC Thread#0" [stack: 0x81d00000,0x81d80000] [id=14706] >> 0xb5b77dc0 ConcurrentGCThread "G1 Main Marker" [stack: 0x81c7e000,0x81cfe000] [id=14707] >> 0xb5b790c0 ConcurrentGCThread "G1 Conc#0" [stack: 0x81a80000,0x81b00000] [id=14708] >> 0xb5bde230 ConcurrentGCThread "G1 Refine#0" [stack: 0x81780000,0x81800000] [id=14709] >> 0xb5bdf488 ConcurrentGCThread "G1 Service" [stack: 0x81580000,0x81600000] [id=14710] >> >> =>0xb5b16460 (exited) JavaThread "Unknown thread" [_thread_in_vm, id=14705, stack(0xb5c6e000,0xb5cbe000)] >> >> Threads with active compile tasks: >> >> VM state: not at safepoint (not fully initialized) >> >> VM Mutex/Monitor currently owned by a thread: None >> >> GC Precious Log: >> CPUs: 4 total, 4 available >> Memory: 3827M >> Large Page Support: Disabled >> NUMA Support: Disabled >> Compressed Oops: Disabled >> Heap Region Size: 1M >> Heap Min Capacity: 64M >> Heap Initial Capacity: 64M >> Heap Max Capacity: 768M >> Pre-touch: Disabled >> Parallel Workers: 4 >> Concurrent Workers: 1 >> Concurrent Refinement Workers: 4 >> Periodic GC: Disabled >> >> Heap: >> garbage-first heap total 65536K, used 0K [0x83a00000, 0xb3a00000) >> region size 1024K, 1 young (1024K), 0 survivors (0K) >> Metaspace used 944K, capacity 2200K, committed 2200K, reserved 4400K >> >> Heap Regions: E=young(eden), S=young(survivor), O=old, HS=humongous(starts), 
HC=humongous(continues), CS=collection set, F=free, OA=open archive, CA=closed archive, TAMS=top-at-mark-start (previous, next) >> | 0|0x83a00000, 0x83a00000, 0x83b00000| 0%| F| |TAMS 0x83a00000, 0x83a00000| Untracked >> | 1|0x83b00000, 0x83b00000, 0x83c00000| 0%| F| |TAMS 0x83b00000, 0x83b00000| Untracked >> | 2|0x83c00000, 0x83c00000, 0x83d00000| 0%| F| |TAMS 0x83c00000, 0x83c00000| Untracked >> | 3|0x83d00000, 0x83d00000, 0x83e00000| 0%| F| |TAMS 0x83d00000, 0x83d00000| Untracked >> | 4|0x83e00000, 0x83e00000, 0x83f00000| 0%| F| |TAMS 0x83e00000, 0x83e00000| Untracked >> | 5|0x83f00000, 0x83f00000, 0x84000000| 0%| F| |TAMS 0x83f00000, 0x83f00000| Untracked >> | 6|0x84000000, 0x84000000, 0x84100000| 0%| F| |TAMS 0x84000000, 0x84000000| Untracked >> | 7|0x84100000, 0x84100000, 0x84200000| 0%| F| |TAMS 0x84100000, 0x84100000| Untracked >> | 8|0x84200000, 0x84200000, 0x84300000| 0%| F| |TAMS 0x84200000, 0x84200000| Untracked >> | 9|0x84300000, 0x84300000, 0x84400000| 0%| F| |TAMS 0x84300000, 0x84300000| Untracked >> | 10|0x84400000, 0x84400000, 0x84500000| 0%| F| |TAMS 0x84400000, 0x84400000| Untracked >> | 11|0x84500000, 0x84500000, 0x84600000| 0%| F| |TAMS 0x84500000, 0x84500000| Untracked >> | 12|0x84600000, 0x84600000, 0x84700000| 0%| F| |TAMS 0x84600000, 0x84600000| Untracked >> | 13|0x84700000, 0x84700000, 0x84800000| 0%| F| |TAMS 0x84700000, 0x84700000| Untracked >> | 14|0x84800000, 0x84800000, 0x84900000| 0%| F| |TAMS 0x84800000, 0x84800000| Untracked >> | 15|0x84900000, 0x84900000, 0x84a00000| 0%| F| |TAMS 0x84900000, 0x84900000| Untracked >> | 16|0x84a00000, 0x84a00000, 0x84b00000| 0%| F| |TAMS 0x84a00000, 0x84a00000| Untracked >> | 17|0x84b00000, 0x84b00000, 0x84c00000| 0%| F| |TAMS 0x84b00000, 0x84b00000| Untracked >> | 18|0x84c00000, 0x84c00000, 0x84d00000| 0%| F| |TAMS 0x84c00000, 0x84c00000| Untracked >> | 19|0x84d00000, 0x84d00000, 0x84e00000| 0%| F| |TAMS 0x84d00000, 0x84d00000| Untracked >> | 20|0x84e00000, 0x84e00000, 0x84f00000| 0%| F| |TAMS 
0x84e00000, 0x84e00000| Untracked >> | 21|0x84f00000, 0x84f00000, 0x85000000| 0%| F| |TAMS 0x84f00000, 0x84f00000| Untracked >> | 22|0x85000000, 0x85000000, 0x85100000| 0%| F| |TAMS 0x85000000, 0x85000000| Untracked >> | 23|0x85100000, 0x85100000, 0x85200000| 0%| F| |TAMS 0x85100000, 0x85100000| Untracked >> | 24|0x85200000, 0x85200000, 0x85300000| 0%| F| |TAMS 0x85200000, 0x85200000| Untracked >> | 25|0x85300000, 0x85300000, 0x85400000| 0%| F| |TAMS 0x85300000, 0x85300000| Untracked >> | 26|0x85400000, 0x85400000, 0x85500000| 0%| F| |TAMS 0x85400000, 0x85400000| Untracked >> | 27|0x85500000, 0x85500000, 0x85600000| 0%| F| |TAMS 0x85500000, 0x85500000| Untracked >> | 28|0x85600000, 0x85600000, 0x85700000| 0%| F| |TAMS 0x85600000, 0x85600000| Untracked >> | 29|0x85700000, 0x85700000, 0x85800000| 0%| F| |TAMS 0x85700000, 0x85700000| Untracked >> | 30|0x85800000, 0x85800000, 0x85900000| 0%| F| |TAMS 0x85800000, 0x85800000| Untracked >> | 31|0x85900000, 0x85900000, 0x85a00000| 0%| F| |TAMS 0x85900000, 0x85900000| Untracked >> | 32|0x85a00000, 0x85a00000, 0x85b00000| 0%| F| |TAMS 0x85a00000, 0x85a00000| Untracked >> | 33|0x85b00000, 0x85b00000, 0x85c00000| 0%| F| |TAMS 0x85b00000, 0x85b00000| Untracked >> | 34|0x85c00000, 0x85c00000, 0x85d00000| 0%| F| |TAMS 0x85c00000, 0x85c00000| Untracked >> | 35|0x85d00000, 0x85d00000, 0x85e00000| 0%| F| |TAMS 0x85d00000, 0x85d00000| Untracked >> | 36|0x85e00000, 0x85e00000, 0x85f00000| 0%| F| |TAMS 0x85e00000, 0x85e00000| Untracked >> | 37|0x85f00000, 0x85f00000, 0x86000000| 0%| F| |TAMS 0x85f00000, 0x85f00000| Untracked >> | 38|0x86000000, 0x86000000, 0x86100000| 0%| F| |TAMS 0x86000000, 0x86000000| Untracked >> | 39|0x86100000, 0x86100000, 0x86200000| 0%| F| |TAMS 0x86100000, 0x86100000| Untracked >> | 40|0x86200000, 0x86200000, 0x86300000| 0%| F| |TAMS 0x86200000, 0x86200000| Untracked >> | 41|0x86300000, 0x86300000, 0x86400000| 0%| F| |TAMS 0x86300000, 0x86300000| Untracked >> | 42|0x86400000, 0x86400000, 0x86500000| 0%| F| 
|TAMS 0x86400000, 0x86400000| Untracked >> | 43|0x86500000, 0x86500000, 0x86600000| 0%| F| |TAMS 0x86500000, 0x86500000| Untracked >> | 44|0x86600000, 0x86600000, 0x86700000| 0%| F| |TAMS 0x86600000, 0x86600000| Untracked >> | 45|0x86700000, 0x86700000, 0x86800000| 0%| F| |TAMS 0x86700000, 0x86700000| Untracked >> | 46|0x86800000, 0x86800000, 0x86900000| 0%| F| |TAMS 0x86800000, 0x86800000| Untracked >> | 47|0x86900000, 0x86900000, 0x86a00000| 0%| F| |TAMS 0x86900000, 0x86900000| Untracked >> | 48|0x86a00000, 0x86a00000, 0x86b00000| 0%| F| |TAMS 0x86a00000, 0x86a00000| Untracked >> | 49|0x86b00000, 0x86b00000, 0x86c00000| 0%| F| |TAMS 0x86b00000, 0x86b00000| Untracked >> | 50|0x86c00000, 0x86c00000, 0x86d00000| 0%| F| |TAMS 0x86c00000, 0x86c00000| Untracked >> | 51|0x86d00000, 0x86d00000, 0x86e00000| 0%| F| |TAMS 0x86d00000, 0x86d00000| Untracked >> | 52|0x86e00000, 0x86e00000, 0x86f00000| 0%| F| |TAMS 0x86e00000, 0x86e00000| Untracked >> | 53|0x86f00000, 0x86f00000, 0x87000000| 0%| F| |TAMS 0x86f00000, 0x86f00000| Untracked >> | 54|0x87000000, 0x87000000, 0x87100000| 0%| F| |TAMS 0x87000000, 0x87000000| Untracked >> | 55|0x87100000, 0x87100000, 0x87200000| 0%| F| |TAMS 0x87100000, 0x87100000| Untracked >> | 56|0x87200000, 0x87200000, 0x87300000| 0%| F| |TAMS 0x87200000, 0x87200000| Untracked >> | 57|0x87300000, 0x87300000, 0x87400000| 0%| F| |TAMS 0x87300000, 0x87300000| Untracked >> | 58|0x87400000, 0x87400000, 0x87500000| 0%| F| |TAMS 0x87400000, 0x87400000| Untracked >> | 59|0x87500000, 0x87500000, 0x87600000| 0%| F| |TAMS 0x87500000, 0x87500000| Untracked >> | 60|0x87600000, 0x87600000, 0x87700000| 0%| F| |TAMS 0x87600000, 0x87600000| Untracked >> | 61|0x87700000, 0x87700000, 0x87800000| 0%| F| |TAMS 0x87700000, 0x87700000| Untracked >> | 62|0x87800000, 0x87800000, 0x87900000| 0%| F| |TAMS 0x87800000, 0x87800000| Untracked >> | 63|0x87900000, 0x87942908, 0x87a00000| 26%| E| |TAMS 0x87900000, 0x87900000| Complete >> >> Card table byte_map: 
[0x83700000,0x83880000] _byte_map_base: 0x832e3000 >> >> Marking Bits (Prev, Next): (CMBitMap*) 0xb5b74324, (CMBitMap*) 0xb5b74344 >> Prev Bits: [0x82980000, 0x83580000) >> Next Bits: [0x81d80000, 0x82980000) >> >> GC Heap History (0 events): >> No events >> >> Deoptimization events (0 events): >> No events >> >> Classes unloaded (0 events): >> No events >> >> Classes redefined (0 events): >> No events >> >> Internal exceptions (0 events): >> No events >> >> Events (20 events): >> Event: 0.113 loading class java/lang/Character >> Event: 0.114 loading class java/lang/Character done >> Event: 0.114 loading class java/lang/Float >> Event: 0.115 loading class java/lang/Number >> Event: 0.115 loading class java/lang/Number done >> Event: 0.115 loading class java/lang/Float done >> Event: 0.115 loading class java/lang/Double >> Event: 0.116 loading class java/lang/Double done >> Event: 0.116 loading class java/lang/Byte >> Event: 0.116 loading class java/lang/Byte done >> Event: 0.116 loading class java/lang/Short >> Event: 0.117 loading class java/lang/Short done >> Event: 0.117 loading class java/lang/Integer >> Event: 0.118 loading class java/lang/Integer done >> Event: 0.118 loading class java/lang/Long >> Event: 0.119 loading class java/lang/Long done >> Event: 0.119 loading class java/util/Iterator >> Event: 0.119 loading class java/util/Iterator done >> Event: 0.119 loading class java/lang/reflect/RecordComponent >> Event: 0.119 loading class java/lang/reflect/RecordComponent done >> >> >> Dynamic libraries: >> 00410000-00411000 r-xp 00000000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java >> 00420000-00421000 r--p 00000000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java >> 00421000-00422000 rw-p 00001000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java >> 019b6000-019d7000 rw-p 00000000 00:00 0 [heap] >> 809c9000-80e00000 rw-p 00000000 00:00 0 >> 80e00000-80e8e000 rw-p 00000000 00:00 0 >> 
80e8e000-80f00000 ---p 00000000 00:00 0 >> 80fb4000-811da000 rw-p 00000000 00:00 0 >> 811da000-81400000 ---p 00000000 00:00 0 >> 81400000-81421000 rw-p 00000000 00:00 0 >> 81421000-81500000 ---p 00000000 00:00 0 >> 8157e000-8157f000 ---p 00000000 00:00 0 >> 8157f000-81600000 rw-p 00000000 00:00 0 >> 81600000-81621000 rw-p 00000000 00:00 0 >> 81621000-81700000 ---p 00000000 00:00 0 >> 8177e000-8177f000 ---p 00000000 00:00 0 >> 8177f000-81800000 rw-p 00000000 00:00 0 >> 81800000-81821000 rw-p 00000000 00:00 0 >> 81821000-81900000 ---p 00000000 00:00 0 >> 81900000-81921000 rw-p 00000000 00:00 0 >> 81921000-81a00000 ---p 00000000 00:00 0 >> 81a7e000-81a7f000 ---p 00000000 00:00 0 >> 81a7f000-81b00000 rw-p 00000000 00:00 0 >> 81b00000-81b21000 rw-p 00000000 00:00 0 >> 81b21000-81c00000 ---p 00000000 00:00 0 >> 81c21000-81c7c000 rw-p 00000000 00:00 0 >> 81c7c000-81c7d000 ---p 00000000 00:00 0 >> 81c7d000-81cfe000 rw-p 00000000 00:00 0 >> 81cfe000-81cff000 ---p 00000000 00:00 0 >> 81cff000-81e80000 rw-p 00000000 00:00 0 >> 81e80000-82980000 ---p 00000000 00:00 0 >> 82980000-82a80000 rw-p 00000000 00:00 0 >> 82a80000-83580000 ---p 00000000 00:00 0 >> 83580000-835a0000 rw-p 00000000 00:00 0 >> 835a0000-83700000 ---p 00000000 00:00 0 >> 83700000-83720000 rw-p 00000000 00:00 0 >> 83720000-83880000 ---p 00000000 00:00 0 >> 83880000-838a0000 rw-p 00000000 00:00 0 >> 838a0000-83a00000 ---p 00000000 00:00 0 >> 83a00000-87a00000 rw-p 00000000 00:00 0 >> 87a00000-b3a00000 ---p 00000000 00:00 0 >> b3a25000-b3a76000 rw-p 00000000 00:00 0 >> b3a76000-b3ab3000 ---p 00000000 00:00 0 >> b3ab3000-b3c33000 rwxp 00000000 00:00 0 >> b3c33000-b5ab3000 ---p 00000000 00:00 0 >> b5ab3000-b5ac8000 r-xp 00000000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >> b5ac8000-b5ad8000 ---p 00015000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >> b5ad8000-b5ad9000 r--p 00015000 b3:02 144091 
/workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >> b5ad9000-b5ada000 rw-p 00016000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >> b5ada000-b5ae2000 rw-s 00000000 b3:02 2576900 /tmp/hsperfdata_root/14700 >> b5ae2000-b5ae9000 r-xp 00000000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >> b5ae9000-b5af8000 ---p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >> b5af8000-b5af9000 r--p 00006000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >> b5af9000-b5afa000 rw-p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >> b5afa000-b5b00000 rw-p 00000000 00:00 0 >> b5b00000-b5c00000 rw-p 00000000 00:00 0 >> b5c00000-b5c0d000 r-xp 00000000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >> b5c0d000-b5c1c000 ---p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >> b5c1c000-b5c1d000 r--p 0000c000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >> b5c1d000-b5c1e000 rw-p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >> b5c1e000-b5c20000 rw-p 00000000 00:00 0 >> b5c20000-b5c27000 r-xp 00000000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >> b5c27000-b5c36000 ---p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >> b5c36000-b5c37000 r--p 00006000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >> b5c37000-b5c38000 rw-p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >> b5c38000-b5c3d000 r-xp 00000000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >> b5c3d000-b5c4c000 ---p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >> b5c4c000-b5c4d000 r--p 00004000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >> b5c4d000-b5c4e000 rw-p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >> b5c4e000-b5c5d000 r-xp 00000000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >> 
b5c5d000-b5c6c000 ---p 0000f000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >> b5c6c000-b5c6d000 r--p 0000e000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >> b5c6d000-b5c6e000 rw-p 0000f000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >> b5c6e000-b5c71000 ---p 00000000 00:00 0 >> b5c71000-b5cbe000 rw-p 00000000 00:00 0 >> b5cbe000-b5d2d000 r-xp 00000000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >> b5d2d000-b5d3d000 ---p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >> b5d3d000-b5d3e000 r--p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >> b5d3e000-b5d3f000 rw-p 00070000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >> b5d3f000-b6d56000 r-xp 00000000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >> b6d56000-b6d65000 ---p 01017000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >> b6d65000-b6dba000 r--p 01016000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >> b6dba000-b6dd2000 rw-p 0106b000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >> b6dd2000-b6e5e000 rw-p 00000000 00:00 0 >> b6e5e000-b6e6f000 r-xp 00000000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >> b6e6f000-b6e7f000 ---p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >> b6e7f000-b6e80000 r--p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >> b6e80000-b6e81000 rw-p 00012000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >> b6e81000-b6e83000 rw-p 00000000 00:00 0 >> b6e83000-b6e85000 r-xp 00000000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >> b6e85000-b6e94000 ---p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >> b6e94000-b6e95000 r--p 00001000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >> b6e95000-b6e96000 rw-p 00002000 
b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >> b6e96000-b6eaf000 r-xp 00000000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >> b6eaf000-b6ebe000 ---p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >> b6ebe000-b6ebf000 r--p 00018000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >> b6ebf000-b6ec0000 rw-p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >> b6ec0000-b6fa2000 r-xp 00000000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >> b6fa2000-b6fb2000 ---p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >> b6fb2000-b6fb4000 r--p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >> b6fb4000-b6fb5000 rw-p 000e4000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >> b6fb5000-b6fb8000 rw-p 00000000 00:00 0 >> b6fb8000-b6fc2000 r-xp 00000000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >> b6fc2000-b6fd1000 ---p 0000a000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >> b6fd1000-b6fd2000 r--p 00009000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >> b6fd2000-b6fd3000 rw-p 0000a000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >> b6fd3000-b6feb000 r-xp 00000000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so >> b6ff2000-b6ff4000 rw-p 00000000 00:00 0 >> b6ff6000-b6ff7000 ---p 00000000 00:00 0 >> b6ff7000-b6ff8000 r--p 00000000 00:00 0 >> b6ff8000-b6ff9000 rwxp 00000000 00:00 0 >> b6ff9000-b6ffb000 rw-p 00000000 00:00 0 >> b6ffb000-b6ffc000 r--p 00018000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so >> b6ffc000-b6ffd000 rw-p 00019000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so >> bed97000-bedb8000 rw-p 00000000 00:00 0 [stack] >> beeb0000-beeb1000 r-xp 00000000 00:00 0 [sigpage] >> beeb1000-beeb2000 r--p 00000000 00:00 0 [vvar] >> beeb2000-beeb3000 r-xp 00000000 00:00 0 [vdso] >> ffff0000-ffff1000 r-xp 00000000 00:00 0 [vectors] >> >> >> VM Arguments: >> 
jvm_args: -Xms64M -Xmx768M --add-exports=java.base/jdk.internal.module=ALL-UNNAMED >> java_command: build.tools.jigsaw.AddPackagesAttribute /workspace/build/linux-arm-server-fastdebug/jdk >> java_class_path (initial): /workspace/build/linux-arm-server-fastdebug/buildtools/tools_jigsaw_classes >> Launcher Type: SUN_STANDARD >> >> [Global flags] >> uint ConcGCThreads = 1 {product} {ergonomic} Number of threads concurrent gc will use >> uint G1ConcRefinementThreads = 4 {product} {ergonomic} The number of parallel rem set update threads. Will be set ergonomically by default. >> size_t G1HeapRegionSize = 1048576 {product} {ergonomic} Size of the G1 regions. >> uintx GCDrainStackTargetSize = 64 {product} {ergonomic} Number of entries we will try to leave on the stack during parallel gc >> size_t InitialHeapSize = 67108864 {product} {command line} Initial heap size (in bytes); zero means use ergonomics >> size_t MarkStackSize = 32768 {product} {ergonomic} Size of marking stack >> size_t MaxHeapSize = 805306368 {product} {command line} Maximum heap size (in bytes) >> size_t MaxNewSize = 482344960 {product} {ergonomic} Maximum new generation size (in bytes), max_uintx means set ergonomically >> size_t MinHeapDeltaBytes = 1048576 {product} {ergonomic} The minimum change in heap space due to GC (in bytes) >> size_t MinHeapSize = 67108864 {product} {command line} Minimum heap size (in bytes); zero means use ergonomics >> uintx NonProfiledCodeHeapSize = 0 {pd product} {ergonomic} Size of code heap with non-profiled methods (in bytes) >> uintx ProfiledCodeHeapSize = 0 {pd product} {ergonomic} Size of code heap with profiled methods (in bytes) >> size_t SoftMaxHeapSize = 805306368 {manageable} {ergonomic} Soft limit for maximum heap size (in bytes) >> bool UseG1GC = true {product} {ergonomic} Use the Garbage-First garbage collector >> >> Logging: >> Log output configuration: >> #0: stdout all=warning uptime,level,tags >> #1: stderr all=off uptime,level,tags >> >> Environment 
Variables: >> JAVA_HOME=/opt/java/openjdk >> PATH=/opt/java/openjdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin >> LC_ALL=C >> >> Signal Handlers: >> SIGSEGV: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGBUS: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGFPE: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGPIPE: [libjvm.so+0xc9aa9d], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGXFSZ: [libjvm.so+0xc9aa9d], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGILL: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGUSR2: [libjvm.so+0xc9ad95], sa_mask[0]=00000000000000000000000000000000, sa_flags=SA_RESTART|SA_SIGINFO >> SIGHUP: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >> SIGINT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >> SIGTERM: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >> SIGQUIT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >> >> >> --------------- S Y S T E M --------------- >> >> OS: >> DISTRIB_ID=Ubuntu >> DISTRIB_RELEASE=18.04 >> DISTRIB_CODENAME=bionic >> DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS" >> uname: Linux 20431585315d 5.4.51-v7l+ #1333 SMP Mon Aug 10 16:51:40 BST 2020 armv7l >> OS uptime: 14 days 7:59 hours >> libc: glibc 2.27 NPTL 2.27 >> rlimit (soft/hard): STACK 8192k/infinity , CORE infinity/infinity , NPROC infinity/infinity , NOFILE 1048576/1048576 , AS infinity/infinity , CPU infinity/infinity , DATA infinity/infinity , FSIZE infinity/infinity , MEMLOCK 64k/64k >> load average: 3.37 3.26 3.09 >> >> /proc/meminfo: >> MemTotal: 3919812 kB >> MemFree: 1255688 kB >> MemAvailable: 3518740 kB >> Buffers: 134316 kB >> Cached: 2117828 kB >> 
SwapCached: 0 kB >> Active: 1266624 kB >> Inactive: 1167412 kB >> Active(anon): 110360 kB >> Inactive(anon): 80744 kB >> Active(file): 1156264 kB >> Inactive(file): 1086668 kB >> Unevictable: 16 kB >> Mlocked: 16 kB >> HighTotal: 3264512 kB >> HighFree: 1038848 kB >> LowTotal: 655300 kB >> LowFree: 216840 kB >> SwapTotal: 102396 kB >> SwapFree: 102396 kB >> Dirty: 24916 kB >> Writeback: 0 kB >> AnonPages: 181884 kB >> Mapped: 125864 kB >> Shmem: 16892 kB >> KReclaimable: 181816 kB >> Slab: 205164 kB >> SReclaimable: 181816 kB >> SUnreclaim: 23348 kB >> KernelStack: 2240 kB >> PageTables: 2684 kB >> NFS_Unstable: 0 kB >> Bounce: 0 kB >> WritebackTmp: 0 kB >> CommitLimit: 2062300 kB >> Committed_AS: 1125176 kB >> VmallocTotal: 245760 kB >> VmallocUsed: 5520 kB >> VmallocChunk: 0 kB >> Percpu: 512 kB >> CmaTotal: 262144 kB >> CmaFree: 171244 kB >> >> /sys/kernel/mm/transparent_hugepage/enabled: >> /sys/kernel/mm/transparent_hugepage/defrag (defrag/compaction efforts parameter): >> >> Process Memory: >> Virtual Size: 888828K (peak: 888828K) >> Resident Set Size: 25020K (peak: 25020K) (anon: 11372K, file: 13648K, shmem: 0K) >> Swapped out: 0K >> C-Heap outstanding allocations: 1636K >> >> /proc/sys/kernel/threads-max (system-wide limit on the number of threads): 57119 >> /proc/sys/vm/max_map_count (maximum number of memory map areas a process may have): 65530 >> /proc/sys/kernel/pid_max (system-wide limit on number of process identifiers): 32768 >> >> Steal ticks since vm start: 0 >> Steal ticks percentage since vm start: 0.000 >> >> CPU: total 4 (initial active 4) (ARMv7), vfp, vfp3-32, simd, mp_ext >> /proc/cpuinfo: >> processor : 0 >> model name : ARMv7 Processor rev 3 (v7l) >> BogoMIPS : 270.00 >> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >> CPU implementer : 0x41 >> CPU architecture: 7 >> CPU variant : 0x0 >> CPU part : 0xd08 >> CPU revision : 3 >> >> processor : 1 >> model name : ARMv7 Processor rev 3 (v7l) 
>> BogoMIPS : 270.00 >> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >> CPU implementer : 0x41 >> CPU architecture: 7 >> CPU variant : 0x0 >> CPU part : 0xd08 >> CPU revision : 3 >> >> processor : 2 >> model name : ARMv7 Processor rev 3 (v7l) >> BogoMIPS : 270.00 >> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >> CPU implementer : 0x41 >> CPU architecture: 7 >> CPU variant : 0x0 >> CPU part : 0xd08 >> CPU revision : 3 >> >> processor : 3 >> model name : ARMv7 Processor rev 3 (v7l) >> BogoMIPS : 270.00 >> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >> CPU implementer : 0x41 >> CPU architecture: 7 >> CPU variant : 0x0 >> CPU part : 0xd08 >> CPU revision : 3 >> >> Hardware : BCM2711 >> Revision : c03111 >> Serial : 100000001c47254f >> Model : Raspberry Pi 4 Model B Rev 1.1 >> >> Online cpus: 0-3 >> Offline cpus: >> >> Memory: 4k page, physical 3919812k(1255688k free), swap 102396k(102396k free) >> >> vm_info: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace) for linux-arm JRE (16-internal+0-adhoc..workspace), built on Oct 12 2020 19:49:51 by "" with gcc 7.5.0 >> >> END. >> >> >> >> >>> On 12. Oct 2020, at 20:24, Aleksey Shipilev wrote: >>> >>> Hi, >>> >>> On 10/12/20 8:12 PM, Marc Hoffmann wrote: >>>> Please find the build log and the hs_err file for commit fd0cb98ed03c6214c02ccd3503c1e6d77065a428 attached. >>> >>> Please try to build with fastdebug (./configure --enable-debug), so that JVM asserts meaninfully somewhere? >>> >>>> Is there any additional information I can provide to help getting these builds fixed again? >>> >>> I am seeing plenty of weird x86_32 crashes since last week. Pretty sure some of them would manifest on ARM32 as well. This is why building with fastdebug is the next step: it maps out the bug symptoms. 
>>> >>> -- >>> Thanks, >>> -Aleksey >>> >> From github.com+4146708+a74nh at openjdk.java.net Tue Oct 13 08:49:59 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Tue, 13 Oct 2020 08:49:59 GMT Subject: RFR: 8221554: aarch64 cross-modifying code [v4] In-Reply-To: References: Message-ID: > The AArch64 port uses maybe_isb in places where an ISB might be required > because the code may have safepointed. These maybe_isbs are very conservative > and are used in many places where a safepoint has not happened. > > cross_modify_fence was added in common code to place a barrier in all the > places after a safepoint has occurred. All the uses of it are in common code, > yet it remains unimplemented on AArch64. > > This set of patches implements cross_modify_fence for AArch64 and reconsiders > every use of maybe_isb, discarding many of them. In addition, it introduces > a new diagnostic option, which when enabled on AArch64 tests the correct > usage of the barriers. > > The advantage of this patch is threefold: > * Reducing the number of ISBs - giving a theoretical performance improvement. > * Use of common code instead of backend-specific code. > * Additional test diagnostic options > > Patch 1: Split cross_modify_fence > ================================= > This is simply refactoring work split out to simplify the other two patches. > > instruction_fence() is provided by each target and simply places > a fence for the instruction stream. > > cross_modify_fence() is now a member of JavaThread and just calls > instruction_fence. This function will be extended in Patch 3. > > Patch 2: Use cross_modify_fence instead of maybe_isb > ==================================================== > > The [n] References refer to the comments for cross_modify_fence in > thread.hpp.
>
> These are all the existing uses of maybe_isb in the AArch64 target:
>
> 1) Instances of Java code calling a VM function
> * This encapsulates the changes to:
> ** MacroAssembler::call_VM_leaf_base()
> ** generate_fast_get_int_field0()
> ** stubGenerator_aarch64 generate_throw_exception()
> ** sharedRuntime_aarch64 generate_handler_blob()
> ** SharedRuntime::generate_resolve_blob()
> ** C1 LIR_Assembler::rt_call
> ** C1 StubAssembler::call_RT(): used by generate_exception_throw,
>    generate_handle_exception, generate_code_for.
> ** OptoRuntime::generate_exception_blob()
> * Any changes will be caught due to calls to [2] or [3] by the VM function.
> * Any calls that do not call [2] or [3] do not require an ISB.
> * This patch is more optimal for these cases.
>
> 2) Instances of Java code calling a JNI function
> * This encapsulates the changes to:
> ** SharedRuntime::generate_native_wrapper()
> ** TemplateInterpreterGenerator::generate_native_entry()
> * A safepoint still in progress after the call will be caught by [4].
> * An ISB is still required for the case where there was a safepoint
>   but it completed during the call. This happens if the code doesn't
>   branch on safepoint_in_progress.
> * In the SharedRuntime version, the two possible calls to
>   reguard_yellow_pages and complete_monitor_unlocking_C are after the thread
>   goes back into its original state, so are covered by [2] and [3], the
>   same as a normal VM call.
> * This patch is only more optimal for the two post-JNI calls.
>
> 3) Patching functions
> * This encapsulates the changes to:
> ** patch_callers_callsite() (called by gen_c2i_adapter())
> * This results in code being patched, but does not safepoint.
> * Therefore an ISB is required.
> * This patch introduces no change here.
>
> 4) C1 MacroAssembler::emit_static_call_stub()
> * Calls ISB (not maybe_isb)
> * By design, the patching does not require the up-to-date
>   destination for proper functioning.
> * However, the ISB makes it most likely that the new destination will
>   be picked up.
> * This patch introduces no change here.
>
> Patch 3: Add cross modify fence verification
> ============================================
>
> The VerifyCrossModifyFence diagnostic flag enables verification of the correct
> usage of instruction barriers. It can safely be enabled on any Java run.
>
> Enabling it will cause the following:
>
> * Once all threads have been brought to a safepoint, each thread will be
>   marked.
>
> * On a cross_modify_fence and safepoint_fence the mark for that thread
>   will be cleared.
>
> * On entry to a method and in a safepoint poll, the thread is checked.
>   If it is marked, then the code will error.

Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:

  Remove inlasm_isb define

  Change-Id: I2d0ef8a78292dac875f3f65d2253981cdb7a497a

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/428/files
  - new: https://git.openjdk.java.net/jdk/pull/428/files/022c60e4..338eca42

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=428&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=428&range=02-03

  Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/428.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/428/head:pull/428

PR: https://git.openjdk.java.net/jdk/pull/428

From github.com+4146708+a74nh at openjdk.java.net Tue Oct 13 08:50:02 2020
From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward)
Date: Tue, 13 Oct 2020 08:50:02 GMT
Subject: RFR: 8221554: aarch64 cross-modifying code [v3]
In-Reply-To:
References:
Message-ID:

On Mon, 12 Oct 2020 15:04:22 GMT, Alan Hayward wrote:

>> src/hotspot/os_cpu/linux_aarch64/orderAccess_linux_aarch64.hpp line 35:
>>
>>> 33: #define inlasm_isb() asm volatile("isb" : : : "memory")
>>> 34:
>>> 35: // Implementation of class OrderAccess.
>> >> This #define of inlasm_isb() looks wrong. Surely it should be in the body of OrderAccess::cross_modify_fence_impl() >> given that it's not used anywhere else. All the extra indirection does is confuse the reader. > > Agreed. It was designed to fit with my patch which did the same for the dmb's - but I've closed that patch. Will fix. Updated ------------- PR: https://git.openjdk.java.net/jdk/pull/428 From github.com+4146708+a74nh at openjdk.java.net Tue Oct 13 09:03:10 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Tue, 13 Oct 2020 09:03:10 GMT Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: <35eLsMpWmcCUoiEWhnYdSpZNmvLy4ra56Qtd6eRW574=.4e7c9278-3e0d-457d-9c15-eef45bae9755@github.com> Message-ID: On Mon, 12 Oct 2020 13:50:51 GMT, Alan Hayward wrote: >> @a74nh Please do not force-push commits on an open PR as it breaks the commit history and prevents reviewers from >> seeing what has changed since they last reviewed things. If you need to "rebase" you can just merge your branch with an >> updated master branch and push the merge commit to your personal fork. The skara tooling will flatten the commits into >> a single clean commit when integration happens. Thanks. > >> @a74nh Please do not force-push commits on an open PR as it breaks the commit history and prevents reviewers from >> seeing what has changed since they last reviewed things. If you need to "rebase" you can just merge your branch with an >> updated master branch and push the merge commit to your personal fork. The skara tooling will flatten the commits into >> a single clean commit when integration happens. Thanks. > > Not a fan of working with merge commits and I feel it gets muddled when you have history on top of a patch series (as > opposed to a single patch). However, understood - I'll make sure to merge instead of force pushing next time. 
> _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_
>
> So, the good news and the bad news:
>
> Moving to cross_modify_fence reduces the number of ISBs from
> 3,840,210 maybe_isb()s to 74,538 cross_modify_fence()s on my
> poster child application, which is recompiling all of java.base.
>
> However, this is a program that runs for 187,501,798,979 insns,
> so we've reduced the proportion of ISBs from 0.002% to 0.00004%.
> I guess that's worth having, but I doubt that the improvement
> would ever have been above the noise level.

Thanks for testing this out.
"Kills 98% of ISBs" is the marketing headline then.
Sadly less effective than I was hoping - which explains my testing results.

>
> On the good side, this at least makes AArch64 more like other
> targets.
>

Yes. This should give all the goodness from using common code (simpler, stronger code, etc.).

To be clear, the four original reasons for the patch were:
* Use common code/interfaces where possible
* Reduce ISBs where AArch64 was being too cautious
* Add ISBs if there are any paths without them (there weren't)
* Confirm the above changes are safe.
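
For anyone wanting to re-check the proportions quoted above, here is a small arithmetic sketch. It only uses the three counts from Andrew's message; the class and method names are made up for illustration and are not part of the patch.

```java
public class IsbProportion {
    // Counts taken from the message above (recompiling all of java.base).
    static final long MAYBE_ISB_COUNT = 3_840_210L;
    static final long CROSS_MODIFY_FENCE_COUNT = 74_538L;
    static final long TOTAL_INSNS = 187_501_798_979L;

    /** Share of all executed instructions that were ISBs, as a percentage. */
    static double percentOfInstructions(long isbCount) {
        return 100.0 * isbCount / TOTAL_INSNS;
    }

    /** Percentage of the original ISBs that no longer execute. */
    static double percentRemoved() {
        return 100.0 * (MAYBE_ISB_COUNT - CROSS_MODIFY_FENCE_COUNT) / MAYBE_ISB_COUNT;
    }

    public static void main(String[] args) {
        System.out.printf("before:  %.5f%% of instructions%n",
                percentOfInstructions(MAYBE_ISB_COUNT));
        System.out.printf("after:   %.5f%% of instructions%n",
                percentOfInstructions(CROSS_MODIFY_FENCE_COUNT));
        System.out.printf("removed: %.1f%% of ISBs%n", percentRemoved());
    }
}
```

The results round to the figures in the thread: about 0.002% before, about 0.00004% after, and about 98% of ISBs removed.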
------------- PR: https://git.openjdk.java.net/jdk/pull/428 From stefank at openjdk.java.net Tue Oct 13 09:33:24 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 13 Oct 2020 09:33:24 GMT Subject: RFR: 8254668: JVMTI process frames on thread without started processing In-Reply-To: References: Message-ID: <6LGnxhwhNPi4SdhuSvt5RF-jQzQ7QNPR7ndVc6Tt6HE=.7d468b7c-df34-4c44-823a-0e95f7abae32@github.com> On Tue, 13 Oct 2020 09:25:55 GMT, Stefan Karlsson wrote: > I hit the following assert in some tests runs that I've been doing: > # Internal Error (/home/stefank/git/alt/open/src/hotspot/share/runtime/stackWatermark.inline.hpp:67), pid=828170, > tid=828734 # assert(processing_started()) failed: Processing should already have started > > The stack traces for this has been: > Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x1626d75] StackWatermarkSet::on_iteration(JavaThread*, frame const&)+0xd5 > V [libjvm.so+0xad791a] frame::sender(RegisterMap*) const+0x7a > V [libjvm.so+0xacd3f8] frame::real_sender(RegisterMap*) const+0x18 > V [libjvm.so+0x1804c4a] vframe::sender() const+0xea > V [libjvm.so+0x175f47b] JavaThread::last_java_vframe(RegisterMap*)+0x5b > V [libjvm.so+0x10e10fc] JvmtiEnvBase::vframeFor(JavaThread*, int)+0x4c > V [libjvm.so+0x10e6972] JvmtiEnvBase::check_top_frame(Thread*, JavaThread*, jvalue, TosState, Handle*)+0xe2 > V [libjvm.so+0x10e759c] JvmtiEnvBase::force_early_return(JavaThread*, jvalue, TosState)+0x11c > V [libjvm.so+0x105b8f5] jvmti_ForceEarlyReturnObject+0x215 > V [libjvm.so+0x1626d75] StackWatermarkSet::on_iteration(JavaThread*, frame const&)+0xd5 > V [libjvm.so+0xad791a] frame::sender(RegisterMap*) const+0x7a > V [libjvm.so+0xacd3f8] frame::real_sender(RegisterMap*) const+0x18 > V [libjvm.so+0x1804c4a] vframe::sender() const+0xea > V [libjvm.so+0x1804d00] vframe::java_sender() const+0x10 > V [libjvm.so+0x10e1115] JvmtiEnvBase::vframeFor(JavaThread*, 
int)+0x65
> V [libjvm.so+0x10d475f] JvmtiEnv::NotifyFramePop(JavaThread*, int)+0x9f
> V [libjvm.so+0x106b6aa] jvmti_NotifyFramePop+0x23a
> The code inspects the top frame of a suspended Java thread. However, there's nothing in the code that starts the
> watermark processing of the thread, so the code asserts when sender calls on_iteration.
> We only have to call start_processing/on_iteration when oops are being read. The failing code does *not* inspect any
> oops, so I turn off the on_iteration call by setting process_frame to false.
> To notify the readers of the code that vframeFor doesn't process the oops, I've renamed the function to
> vframeForNoProcess to give a visual cue.
> I found this bug when running this command line:
> makec ../build/fastdebug/ test TEST=test/hotspot/jtreg/vmTestbase/nsk/jvmti
> JTREG="JAVA_OPTIONS=-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=1 -XX:ZFragmentationLimit=0.01"
> JTREG_EXTRA_PROBLEM_LISTS=ProblemList-zgc.txt
> Five tests consistently assert with this command line. All tests pass
> with the proposed fix.
> Recommendations of tests to run are welcome. I intend to get this run through tier1-3, but haven't yet.

Notifying @reinrich @fisk since I think they have been looking into similar problems.
------------- PR: https://git.openjdk.java.net/jdk/pull/627 From stefank at openjdk.java.net Tue Oct 13 09:33:24 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 13 Oct 2020 09:33:24 GMT Subject: RFR: 8254668: JVMTI process frames on thread without started processing Message-ID: I hit the following assert in some tests runs that I've been doing: # Internal Error (/home/stefank/git/alt/open/src/hotspot/share/runtime/stackWatermark.inline.hpp:67), pid=828170, tid=828734 # assert(processing_started()) failed: Processing should already have started The stack traces for this has been: Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x1626d75] StackWatermarkSet::on_iteration(JavaThread*, frame const&)+0xd5 V [libjvm.so+0xad791a] frame::sender(RegisterMap*) const+0x7a V [libjvm.so+0xacd3f8] frame::real_sender(RegisterMap*) const+0x18 V [libjvm.so+0x1804c4a] vframe::sender() const+0xea V [libjvm.so+0x175f47b] JavaThread::last_java_vframe(RegisterMap*)+0x5b V [libjvm.so+0x10e10fc] JvmtiEnvBase::vframeFor(JavaThread*, int)+0x4c V [libjvm.so+0x10e6972] JvmtiEnvBase::check_top_frame(Thread*, JavaThread*, jvalue, TosState, Handle*)+0xe2 V [libjvm.so+0x10e759c] JvmtiEnvBase::force_early_return(JavaThread*, jvalue, TosState)+0x11c V [libjvm.so+0x105b8f5] jvmti_ForceEarlyReturnObject+0x215 V [libjvm.so+0x1626d75] StackWatermarkSet::on_iteration(JavaThread*, frame const&)+0xd5 V [libjvm.so+0xad791a] frame::sender(RegisterMap*) const+0x7a V [libjvm.so+0xacd3f8] frame::real_sender(RegisterMap*) const+0x18 V [libjvm.so+0x1804c4a] vframe::sender() const+0xea V [libjvm.so+0x1804d00] vframe::java_sender() const+0x10 V [libjvm.so+0x10e1115] JvmtiEnvBase::vframeFor(JavaThread*, int)+0x65 V [libjvm.so+0x10d475f] JvmtiEnv::NotifyFramePop(JavaThread*, int)+0x9f V [libjvm.so+0x106b6aa] jvmti_NotifyFramePop+0x23a The code inspects the top frame of a suspended java thread. 
However, there's nothing in the code that starts the watermark processing of the thread, so the code asserts when sender calls on_iteration.

We only have to call start_processing/on_iteration when oops are being read. The failing code does *not* inspect any oops, so I turn off the on_iteration call by setting process_frame to false.

To notify the readers of the code that vframeFor doesn't process the oops, I've renamed the function to vframeForNoProcess to give a visual cue.

I found this bug when running this command line:
makec ../build/fastdebug/ test TEST=test/hotspot/jtreg/vmTestbase/nsk/jvmti JTREG="JAVA_OPTIONS=-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=1 -XX:ZFragmentationLimit=0.01" JTREG_EXTRA_PROBLEM_LISTS=ProblemList-zgc.txt

Five tests consistently assert with this command line. All tests pass with the proposed fix.

Recommendations of tests to run are welcome. I intend to get this run through tier1-3, but haven't yet.

-------------

Commit messages:
 - 8254668: JVMTI process frames on thread without started processing

Changes: https://git.openjdk.java.net/jdk/pull/627/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=627&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8254668
  Stats: 9 lines in 3 files changed: 1 ins; 0 del; 8 mod
  Patch: https://git.openjdk.java.net/jdk/pull/627.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/627/head:pull/627

PR: https://git.openjdk.java.net/jdk/pull/627

From avoitylov at openjdk.java.net Tue Oct 13 09:39:23 2020
From: avoitylov at openjdk.java.net (Aleksei Voitylov)
Date: Tue, 13 Oct 2020 09:39:23 GMT
Subject: Integrated: JDK-8247589: Implementation of Alpine Linux/x64 Port
In-Reply-To:
References:
Message-ID: <1UYbWcHetnzMfkPLjmtJp2XUoUdhpGmVBOkPRZ1JhqM=.56033536-ce88-45b7-8d9d-3570819bd063@github.com>

On Mon, 7 Sep 2020 11:23:28 GMT, Aleksei Voitylov wrote:

> continuing the review thread from here https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/068546.html
>
>> The
download side of using JNI in these tests is that it complicates the >> setup a bit for those that run jtreg directly and/or just build the JDK >> and not the test libraries. You could reduce this burden a bit by >> limiting the load library/isMusl check to Linux only, meaning isMusl >> would not be called on other platforms. >> >> The alternative you suggest above might indeed be better. I assume you >> don't mean splitting the tests but rather just adding a second @test >> description so that the vm.musl case runs the test with a system >> property that allows the test know the expected load library path behavior. > > I have updated the PR to split the two tests in multiple @test s. > >> The updated comment in java_md.c in this looks good. A minor comment on >> Platform.isBusybox is Files.isSymbolicLink returning true implies that >> the link exists so no need to check for exists too. Also the >> if-then-else style for the new class in ProcessBuilder/Basic.java is >> inconsistent with the rest of the test so it stands out. > > Thank you, these changes are done in the updated PR. > >> Given the repo transition this weekend then I assume you'll create a PR >> for the final review at least. Also I see JEP 386 hasn't been targeted >> yet but I assume Boris, as owner, will propose-to-target and wait for it >> to be targeted before it is integrated. > > Yes. How can this be best accomplished with the new git workflow? > - we can continue the review process till the end and I will request the integration to happen only after the JEP is > targeted. I guess this step is now done by typing "slash integrate" in a comment. > - we can pause the review process now until the JEP is targeted. > > In the first case I'm kindly asking the Reviewers who already chimed in on that to re-confirm the review here. This pull request has now been integrated. 
Changeset: 63009f90 Author: Aleksei Voitylov Committer: Alexander Scherbatiy URL: https://git.openjdk.java.net/jdk/commit/63009f90 Stats: 403 lines in 30 files changed: 348 ins; 17 del; 38 mod 8247589: Implementation of Alpine Linux/x64 Port Co-authored-by: Mikael Vidstedt Co-authored-by: Alexander Scherbatiy Co-authored-by: Axel Siebenborn Co-authored-by: Aleksei Voitylov Reviewed-by: alanb, erikj, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/49 From avoitylov at openjdk.java.net Tue Oct 13 09:54:24 2020 From: avoitylov at openjdk.java.net (Aleksei Voitylov) Date: Tue, 13 Oct 2020 09:54:24 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port [v6] In-Reply-To: References: Message-ID: On Fri, 9 Oct 2020 06:02:04 GMT, David Holmes wrote: >> Aleksei Voitylov has updated the pull request with a new target base due to a merge or a rebase. The pull request now >> contains three commits: >> - Merge branch 'master' into JDK-8247589 >> - JDK-8247589: Implementation of Alpine Linux/x64 Port >> - JDK-8247589: Implementation of Alpine Linux/x64 Port > > Marked as reviewed by dholmes (Reviewer). Thanks everyone! ------------- PR: https://git.openjdk.java.net/jdk/pull/49 From mcimadamore at openjdk.java.net Tue Oct 13 10:27:43 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Tue, 13 Oct 2020 10:27:43 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v9] In-Reply-To: References: Message-ID: > This patch contains the changes associated with the third incubation round of the foreign memory access API incubation > (see JEP 393 [1]). 
This iteration focus on improving the usability of the API in 3 main ways: > * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from > multiple threads > * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee > that the memory will be deallocated, eventually > * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class > has been added, which defines several useful dereference routines; these are really just thin wrappers around memory > access var handles, but they make the barrier of entry for using this API somewhat lower. > > A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not > the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link > to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit > of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which > wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as > dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability > in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; > secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can > use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done > by calling `MemoryAddress::asSegmentRestricted`). A list of the API, implementation and test changes is provided > below. 
If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be > happy to point at existing discussions, and/or to provide the feedback required. A big thank to Erik Osterlund, > Vladimir Ivanov and David Holmes, without whom the work on shared memory segment would not have been possible; also I'd > like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. Thanks Maurizio > Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html > Specdiff: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254163 > > > > ### API Changes > > * `MemorySegment` > * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) > * added a no-arg factory for a native restricted segment representing entire native heap > * rename `withOwnerThread` to `handoff` > * add new `share` method, to create shared segments > * add new `registerCleaner` method, to register a segment against a cleaner > * add more helpers to create arrays from a segment e.g. `toIntArray` > * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) > * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) > * `MemoryAddress` > * drop `segment` accessor > * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative > to a given segment > * `MemoryAccess` > * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a > carrier is one of the usual suspect (a Java primitive, minus `boolean`); the access mode can be simple (e.g. 
access > base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte > offset,or a high level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. > `getByteAtOffset` vs `getByteAtIndex`). > * `MemoryHandles` > * drop `withOffset` combinator > * drop `withStride` combinator > * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which > it is easy to derive all the other handles using plain var handle combinators. > * `Addressable` > * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both > `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients > can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. > * `MemoryLayouts` > * A new layout, for machine addresses, has been added to the mix. > > > > ### Implementation changes > > There are two main things to discuss here: support for shared segments, and the general simplification of the memory > access var handle support. > #### Shared segments > > The support for shared segments cuts in pretty deep in the VM. Support for shared segments is notoriously hard to > achieve, at least in a way that guarantees optimal access performances. This is caused by the fact that, if a segment > is shared, it would be possible for a thread to close it while another is accessing it. After considering several > options (see [3]), we zeroed onto an approach which is inspired by an happy idea that Andrew Haley had (and that he > reminded me of at this year OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world > (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently to a > close operation. 
For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and > the access itself (otherwise it would be possible for the accessing thread to stop just right before an unsafe call). > It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. Sadly, none of > these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, > we noted that, if we could mark so called *scoped* method with an annotation, it would be very simply to check as to > whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of > stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). The question is, then, > once we detect that a thread is accessing the very segment we're about to close, what should happen? We first > experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it > fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the > machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to > minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread > is accessing the segment being closed. As written in the javadoc, this doesn't mean that clients should just catch and > try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should > be treated as such. In terms of gritty implementation, we needed to centralize memory access routines in a single > place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in > addition, also provided a liveness check. 
This way we could mark all these routines with the special `@Scoped` > annotation, which tells the VM that something important is going on. To achieve this, we created a new (autogenerated) > class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, > like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is > tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during > access (which is important when registering segments against cleaners). Of course, to make memory access safe, memory > access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead > of unsafe, so that a liveness check can be triggered (in case a scope is present). `ScopedMemoryAccess` has a > `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed > successfully. The implementation of `MemoryScope` (now significantly simplified from what we had before), has two > implementations, one for confined segments and one for shared segments; the main difference between the two is what > happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared > segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or > `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` > state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. #### > Memory access var handles overhaul The key realization here was that if all memory access var handles took a > coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle > form. 
This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var > handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that > e.g. additional offset is injected into a base memory access var handle. This also helped in simplifying the > implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level > access on the innards of the memory access var handle. All that code is now gone. #### Test changes Not much to see > here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, > since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` > functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the > microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared > segment case. [1] - https://openjdk.java.net/jeps/393 [2] - https://openjdk.java.net/jeps/389 [3] - > https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html [4] - https://openjdk.java.net/jeps/312 Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 10 additional commits since the last revision: - Merge branch 'master' into 8254162 - Simplify example in the toplevel javadoc - Tweak support for mapped memory segments - Tweak referenced to MemoryAddressProxy in Utils.java - Fix performance issue with "small" segment mismatch - Address review comments - Fix indent in GensrcScopedMemoryAccess.gmk - Address review comments - Add modified files - RFR 8254162: Implementation of Foreign-Memory Access API (Third Incubator) This patch contains the changes associated with the third incubation round of the foreign memory access API incubation (see JEP 393 [1]). This iteration focus on improving the usability of the API in 3 main ways: * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from multiple threads * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee that the memory will be deallocated, eventually * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class has been added, which defines several useful dereference routines; these are really just thin wrappers around memory access var handles, but they make the barrier of entry for using this API somewhat lower. A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as dereferencing memory is concerned. 
You cannot dereference memory if you don't have a segment. This improves usability in a number of ways: first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done by calling `MemoryAddress::asSegmentRestricted`).

A list of the API, implementation and test changes is provided below. If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be happy to point at existing discussions, and/or to provide the feedback required.

A big thank you to Erik Osterlund, Vladimir Ivanov and David Holmes, without whom the work on shared memory segments would not have been possible.

Thanks
Maurizio

Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html

Specdiff: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html

CSR: https://bugs.openjdk.java.net/browse/JDK-8254163

### API Changes

* `MemorySegment`
  * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below)
  * added a no-arg factory for a native restricted segment representing entire native heap
  * rename `withOwnerThread` to `handoff`
  * add new `share` method, to create shared segments
  * add new `registerCleaner` method, to register a segment against a cleaner
  * add more helpers to create arrays from a segment e.g. `toIntArray`
  * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors)
  * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`)
* `MemoryAddress`
  * drop `segment` accessor
  * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative to a given segment
* `MemoryAccess`
  * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a carrier is one of the usual suspects (a Java primitive, minus `boolean`); the access mode can be simple (e.g. access the base address of a given segment), or indexed, in which case the accessor takes a segment and either a low-level byte offset, or a high-level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. `getByteAtOffset` vs. `getByteAtIndex`).
* `MemoryHandles`
  * drop `withOffset` combinator
  * drop `withStride` combinator
  * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which it is easy to derive all the other handles using plain var handle combinators.
* `Addressable`
  * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2], to add more implementations. Clients can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`.
* `MemoryLayouts`
  * A new layout, for machine addresses, has been added to the mix.

### Implementation changes

There are two main things to discuss here: support for shared segments, and the general simplification of the memory access var handle support.

#### Shared segments

The support for shared segments cuts in pretty deep in the VM. Support for shared segments is notoriously hard to achieve, at least in a way that guarantees optimal access performance.
This is caused by the fact that, if a segment is shared, it would be possible for a thread to close it while another is accessing it. After considering several options (see [3]), we zeroed in on an approach inspired by a happy idea that Andrew Haley had (and that he reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world (e.g. with a GC pause) while a segment is closed, we could then prevent segments from being accessed concurrently with a close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and the access itself (otherwise it would be possible for the accessing thread to stop just right before an unsafe call). It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints.

Sadly, neither of these conditions seems to hold in the current implementation, so we needed to resort to a bit of creativity. First, we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]).

The question is, then, once we detect that a thread is accessing the very segment we're about to close, what should happen? We first experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread is accessing the segment being closed.
As written in the javadoc, this doesn't mean that clients should just catch and try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should be treated as such.

In terms of gritty implementation, we needed to centralize memory access routines in a single place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` annotation, which tells the VM that something important is going on.

To achieve this, we created a new (autogenerated) class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during access (which is important when registering segments against cleaners). Of course, to make memory access safe, memory access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead of unsafe, so that a liveness check can be triggered (in case a scope is present).

`ScopedMemoryAccess` has a `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed successfully.

`MemoryScope` (now significantly simplified from what we had before) has two implementations, one for confined segments and one for shared segments; the main difference between the two is what happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or `ALIVE` depending on whether the handshake was successful or not.
Note that when a shared segment is in the `CLOSING` state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail.

#### Memory access var handles overhaul

The key realization here was that if all memory access var handles took a coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle form. This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that e.g. an additional offset is injected into a base memory access var handle. This also helped in simplifying the implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level access to the innards of the memory access var handle. All that code is now gone.

#### Test changes

Not much to see here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared segment case.
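
To make the shared-segment close protocol described above a little more concrete, here is a minimal sketch of the state transitions (`ALIVE` to `CLOSING`, then either `CLOSED` or back to `ALIVE`). It is a toy model only, not the code in this patch: the class name `SharedScopeModel`, its method names and the `BooleanSupplier` standing in for the thread-local handshake are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;

/** Toy model of a shared scope; hypothetical, not the JDK's MemoryScope. */
final class SharedScopeModel {
    enum State { ALIVE, CLOSING, CLOSED }

    private final AtomicReference<State> state = new AtomicReference<>(State.ALIVE);

    /** Reported liveness: only a fully closed scope counts as dead. */
    boolean isAlive() {
        return state.get() != State.CLOSED;
    }

    /** Check performed before each access: anything but ALIVE is rejected. */
    void checkValidState() {
        if (state.get() != State.ALIVE) {
            throw new IllegalStateException("Already closed");
        }
    }

    /**
     * Attempts to close the scope. The handshake is a stand-in (an assumption)
     * for the VM handshake: it returns true when no thread was found in the
     * middle of a scoped access.
     */
    void close(BooleanSupplier handshake) {
        if (!state.compareAndSet(State.ALIVE, State.CLOSING)) {
            throw new IllegalStateException("Already closed or closing");
        }
        boolean success = false;
        try {
            success = handshake.getAsBoolean();
        } finally {
            // Either commit the close or roll back to ALIVE.
            state.set(success ? State.CLOSED : State.ALIVE);
        }
        if (!success) {
            throw new IllegalStateException("Segment is being accessed by another thread");
        }
    }

    public static void main(String[] args) {
        SharedScopeModel scope = new SharedScopeModel();
        try {
            scope.close(() -> false); // simulate a thread caught mid-access
        } catch (IllegalStateException e) {
            System.out.println("close failed, still alive: " + scope.isAlive());
        }
        scope.close(() -> true);      // no concurrent access this time
        System.out.println("alive after successful close: " + scope.isAlive());
    }
}
```

In this model a failed `close` leaves the scope usable, which mirrors the point made above that such a failure signals a synchronization bug in client code rather than something to retry in a loop.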
[1] - https://openjdk.java.net/jeps/393
[2] - https://openjdk.java.net/jeps/389
[3] - https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html
[4] - https://openjdk.java.net/jeps/312

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/548/files
  - new: https://git.openjdk.java.net/jdk/pull/548/files/d14d06a4..8815d941

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=08
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=07-08

  Stats: 22231 lines in 447 files changed: 12727 ins; 6345 del; 3159 mod
  Patch: https://git.openjdk.java.net/jdk/pull/548.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548

PR: https://git.openjdk.java.net/jdk/pull/548

From mcimadamore at openjdk.java.net  Tue Oct 13 11:06:32 2020
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Tue, 13 Oct 2020 11:06:32 GMT
Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v10]
In-Reply-To:
References:
Message-ID:

> This patch contains the changes associated with the third incubation round of the foreign memory access API (see JEP 393 [1]). This iteration focuses on improving the usability of the API in 3 main ways:
> * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from multiple threads
> * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee that the memory will be deallocated, eventually
> * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class has been added, which defines several useful dereference routines; these are really just thin wrappers around memory access var handles, but they make the barrier to entry for using this API somewhat lower.
>
> A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability in a number of ways: first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done by calling `MemoryAddress::asSegmentRestricted`).
>
> A list of the API, implementation and test changes is provided below. If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be happy to point at existing discussions, and/or to provide the feedback required.
>
> A big thank you to Erik Osterlund, Vladimir Ivanov and David Holmes, without whom the work on shared memory segments would not have been possible; also I'd like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey.
>
> Thanks
> Maurizio
>
> Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html
>
> Specdiff: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html
>
> CSR: https://bugs.openjdk.java.net/browse/JDK-8254163
>
> ### API Changes
>
> * `MemorySegment`
>   * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below)
>   * added a no-arg factory for a native restricted segment representing entire native heap
>   * rename `withOwnerThread` to `handoff`
>   * add new `share` method, to create shared segments
>   * add new `registerCleaner` method, to register a segment against a cleaner
>   * add more helpers to create arrays from a segment e.g. `toIntArray`
>   * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors)
>   * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`)
> * `MemoryAddress`
>   * drop `segment` accessor
>   * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative to a given segment
> * `MemoryAccess`
>   * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a carrier is one of the usual suspects (a Java primitive, minus `boolean`); the access mode can be simple (e.g. access the base address of a given segment), or indexed, in which case the accessor takes a segment and either a low-level byte offset, or a high-level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. `getByteAtOffset` vs. `getByteAtIndex`).
> * `MemoryHandles`
>   * drop `withOffset` combinator
>   * drop `withStride` combinator
>   * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which it is easy to derive all the other handles using plain var handle combinators.
> * `Addressable`
>   * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2], to add more implementations. Clients can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`.
> * `MemoryLayouts`
>   * A new layout, for machine addresses, has been added to the mix.
>
> ### Implementation changes
>
> There are two main things to discuss here: support for shared segments, and the general simplification of the memory access var handle support.
>
> #### Shared segments
>
> The support for shared segments cuts in pretty deep in the VM. Support for shared segments is notoriously hard to achieve, at least in a way that guarantees optimal access performance. This is caused by the fact that, if a segment is shared, it would be possible for a thread to close it while another is accessing it. After considering several options (see [3]), we zeroed in on an approach inspired by a happy idea that Andrew Haley had (and that he reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world (e.g. with a GC pause) while a segment is closed, we could then prevent segments from being accessed concurrently with a close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and the access itself (otherwise it would be possible for the accessing thread to stop just right before an unsafe call).
> It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints.
>
> Sadly, neither of these conditions seems to hold in the current implementation, so we needed to resort to a bit of creativity. First, we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]).
>
> The question is, then, once we detect that a thread is accessing the very segment we're about to close, what should happen? We first experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread is accessing the segment being closed. As written in the javadoc, this doesn't mean that clients should just catch and try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should be treated as such.
>
> In terms of gritty implementation, we needed to centralize memory access routines in a single place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` annotation, which tells the VM that something important is going on.
>
> To achieve this, we created a new (autogenerated) class, called `ScopedMemoryAccess`.
> This class contains all the main memory access primitives (including bulk access, like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during access (which is important when registering segments against cleaners). Of course, to make memory access safe, memory access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead of unsafe, so that a liveness check can be triggered (in case a scope is present).
>
> `ScopedMemoryAccess` has a `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed successfully.
>
> `MemoryScope` (now significantly simplified from what we had before) has two implementations, one for confined segments and one for shared segments; the main difference between the two is what happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail.
>
> #### Memory access var handles overhaul
>
> The key realization here was that if all memory access var handles took a coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle form. This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that e.g. an additional offset is injected into a base memory access var handle. This also helped in simplifying the implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level access to the innards of the memory access var handle. All that code is now gone.
>
> #### Test changes
>
> Not much to see here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared segment case.
>
> [1] - https://openjdk.java.net/jeps/393
> [2] - https://openjdk.java.net/jeps/389
> [3] - https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html
> [4] - https://openjdk.java.net/jeps/312

Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision:

  Remove spurious check on MemoryScope::confineTo
  Added tests to make sure no spurious exception is thrown when:
  * handing off a segment from A to A
  * sharing an already shared segment

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/548/files
  - new: https://git.openjdk.java.net/jdk/pull/548/files/8815d941..8fb8ff2f

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=09
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=08-09

  Stats: 27 lines in 2 files changed: 16 ins; 8 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/548.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548

PR: https://git.openjdk.java.net/jdk/pull/548

From mcimadamore at openjdk.java.net  Tue Oct 13 11:23:33 2020
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Tue, 13 Oct 2020 11:23:33
GMT
Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v11]
In-Reply-To:
References:
Message-ID:

Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 12 additional commits since the last revision:

 - Merge branch 'master' into 8254162
 - Remove spurious check on MemoryScope::confineTo
   Added tests to make sure no spurious exception is thrown when:
   * handing off a segment from A to A
   * sharing an already shared segment
 - Merge branch 'master' into 8254162
 - Simplify example in the toplevel javadoc
 - Tweak support for mapped memory segments
 - Tweak referenced to MemoryAddressProxy in Utils.java
 - Fix performance issue with "small" segment mismatch
 - Address review comments
 - Fix indent in GensrcScopedMemoryAccess.gmk
 - Address review comments
 - ...
and 2 more: https://git.openjdk.java.net/jdk/compare/f7d78c01...e866bb23 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/548/files - new: https://git.openjdk.java.net/jdk/pull/548/files/8fb8ff2f..e866bb23 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=10 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=09-10 Stats: 605 lines in 49 files changed: 426 ins; 116 del; 63 mod Patch: https://git.openjdk.java.net/jdk/pull/548.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548 PR: https://git.openjdk.java.net/jdk/pull/548 From eosterlund at openjdk.java.net Tue Oct 13 11:30:12 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 13 Oct 2020 11:30:12 GMT Subject: RFR: 8254668: JVMTI process frames on thread without started processing In-Reply-To: References: Message-ID: <8n43lPCxk0mlCPtfksiLfvwLAMet4xMHmdyvD8tV7m4=.5954b90b-be0d-4cbd-bcca-f57905025218@github.com> On Tue, 13 Oct 2020 09:25:55 GMT, Stefan Karlsson wrote: > I hit the following assert in some tests runs that I've been doing: > # Internal Error (/home/stefank/git/alt/open/src/hotspot/share/runtime/stackWatermark.inline.hpp:67), pid=828170, > tid=828734 # assert(processing_started()) failed: Processing should already have started > > The stack traces for this has been: > Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x1626d75] StackWatermarkSet::on_iteration(JavaThread*, frame const&)+0xd5 > V [libjvm.so+0xad791a] frame::sender(RegisterMap*) const+0x7a > V [libjvm.so+0xacd3f8] frame::real_sender(RegisterMap*) const+0x18 > V [libjvm.so+0x1804c4a] vframe::sender() const+0xea > V [libjvm.so+0x175f47b] JavaThread::last_java_vframe(RegisterMap*)+0x5b > V [libjvm.so+0x10e10fc] JvmtiEnvBase::vframeFor(JavaThread*, int)+0x4c > V [libjvm.so+0x10e6972] JvmtiEnvBase::check_top_frame(Thread*, JavaThread*, jvalue, TosState, 
Handle*)+0xe2 > V [libjvm.so+0x10e759c] JvmtiEnvBase::force_early_return(JavaThread*, jvalue, TosState)+0x11c > V [libjvm.so+0x105b8f5] jvmti_ForceEarlyReturnObject+0x215 > V [libjvm.so+0x1626d75] StackWatermarkSet::on_iteration(JavaThread*, frame const&)+0xd5 > V [libjvm.so+0xad791a] frame::sender(RegisterMap*) const+0x7a > V [libjvm.so+0xacd3f8] frame::real_sender(RegisterMap*) const+0x18 > V [libjvm.so+0x1804c4a] vframe::sender() const+0xea > V [libjvm.so+0x1804d00] vframe::java_sender() const+0x10 > V [libjvm.so+0x10e1115] JvmtiEnvBase::vframeFor(JavaThread*, int)+0x65 > V [libjvm.so+0x10d475f] JvmtiEnv::NotifyFramePop(JavaThread*, int)+0x9f > V [libjvm.so+0x106b6aa] jvmti_NotifyFramePop+0x23a > The code inspects the top frame of a suspended java thread. However, there's nothing in the code that starts the > watermark processing of the thread, so the code asserts when sender calls on_iteration. > We only have to call start_processing/on_iteration when oops are being read. The failing code does *not* inspect any > oops, so I turn off the on_iteration call by setting process_frame to false. > To notify the readers of the code that vframeFor doesn't process the oops, I've renamed the function to > vframeForNoProcess to give a visual cue. > I found this bug when running this command line: > makec ../build/fastdebug/ test TEST=test/hotspot/jtreg/vmTestbase/nsk/jvmti > JTREG="JAVA_OPTIONS=-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=1 -XX:ZFragmentationLimit=0.01" > JTREG_EXTRA_PROBLEM_LISTS=ProblemList-zgc.txt Five tests consistently assert with this command line. All tests pass > with the proposed fix. > Recommendations of tests to run are welcome. I intend to get this run through tier1-3, but haven't yet. Looks good. Thanks for fixing this. ------------- Marked as reviewed by eosterlund (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/627 From coleenp at openjdk.java.net Tue Oct 13 11:33:08 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 13 Oct 2020 11:33:08 GMT Subject: RFR: 8254365: ciMethod.hpp should not include methodHandles.hpp In-Reply-To: References: Message-ID: <_D_TBPo4W4CO6M9hgpPzifk7nE8pl2llm_Kd11_TMFE=.a242232e-ea2d-4109-b0cf-433e082739a3@github.com> On Tue, 13 Oct 2020 06:18:40 GMT, Ioi Lam wrote: > ciMethod.hpp includes methodHandles.hpp. This is probably a typo as ciMethod.hpp doesn't use the MethodHandles class. > Instead, it uses methodHandle which is declared in runtime/handles.hpp. > As usual, I had to fix a few .cpp files that used the MethodHandles class but did not explicitly include > methodHandles.hpp. > Tested with mach5 build tiers 1-5. Marked as reviewed by coleenp (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/623 From rrich at openjdk.java.net Tue Oct 13 12:51:09 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Tue, 13 Oct 2020 12:51:09 GMT Subject: RFR: 8254668: JVMTI process frames on thread without started processing In-Reply-To: References: Message-ID: On Tue, 13 Oct 2020 09:25:55 GMT, Stefan Karlsson wrote: > I hit the following assert in some tests runs that I've been doing: > # Internal Error (/home/stefank/git/alt/open/src/hotspot/share/runtime/stackWatermark.inline.hpp:67), pid=828170, > tid=828734 # assert(processing_started()) failed: Processing should already have started > > The stack traces for this has been: > Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x1626d75] StackWatermarkSet::on_iteration(JavaThread*, frame const&)+0xd5 > V [libjvm.so+0xad791a] frame::sender(RegisterMap*) const+0x7a > V [libjvm.so+0xacd3f8] frame::real_sender(RegisterMap*) const+0x18 > V [libjvm.so+0x1804c4a] vframe::sender() const+0xea > V [libjvm.so+0x175f47b] 
JavaThread::last_java_vframe(RegisterMap*)+0x5b > V [libjvm.so+0x10e10fc] JvmtiEnvBase::vframeFor(JavaThread*, int)+0x4c > V [libjvm.so+0x10e6972] JvmtiEnvBase::check_top_frame(Thread*, JavaThread*, jvalue, TosState, Handle*)+0xe2 > V [libjvm.so+0x10e759c] JvmtiEnvBase::force_early_return(JavaThread*, jvalue, TosState)+0x11c > V [libjvm.so+0x105b8f5] jvmti_ForceEarlyReturnObject+0x215 > V [libjvm.so+0x1626d75] StackWatermarkSet::on_iteration(JavaThread*, frame const&)+0xd5 > V [libjvm.so+0xad791a] frame::sender(RegisterMap*) const+0x7a > V [libjvm.so+0xacd3f8] frame::real_sender(RegisterMap*) const+0x18 > V [libjvm.so+0x1804c4a] vframe::sender() const+0xea > V [libjvm.so+0x1804d00] vframe::java_sender() const+0x10 > V [libjvm.so+0x10e1115] JvmtiEnvBase::vframeFor(JavaThread*, int)+0x65 > V [libjvm.so+0x10d475f] JvmtiEnv::NotifyFramePop(JavaThread*, int)+0x9f > V [libjvm.so+0x106b6aa] jvmti_NotifyFramePop+0x23a > The code inspects the top frame of a suspended java thread. However, there's nothing in the code that starts the > watermark processing of the thread, so the code asserts when sender calls on_iteration. > We only have to call start_processing/on_iteration when oops are being read. The failing code does *not* inspect any > oops, so I turn off the on_iteration call by setting process_frame to false. > To notify the readers of the code that vframeFor doesn't process the oops, I've renamed the function to > vframeForNoProcess to give a visual cue. > I found this bug when running this command line: > makec ../build/fastdebug/ test TEST=test/hotspot/jtreg/vmTestbase/nsk/jvmti > JTREG="JAVA_OPTIONS=-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=1 -XX:ZFragmentationLimit=0.01" > JTREG_EXTRA_PROBLEM_LISTS=ProblemList-zgc.txt Five tests consistently assert with this command line. All tests pass > with the proposed fix. > Recommendations of tests to run are welcome. I intend to get this run through tier1-3, but haven't yet. Hi Stefan, thanks for fixing.
With this change the assertion in PR #119 does not fail anymore. The fix looks good to me but I'm not a ZGC expert, nor a Reviewer :) src/hotspot/share/prims/jvmtiEnvBase.cpp line 559: > 557: > 558: // return the vframe on the specified thread and depth, NULL if no such frame > 559: // The thread and the oops in the returned might not have been process. s/the returned/the returned vframe/ ------------- Marked as reviewed by rrich (Committer). PR: https://git.openjdk.java.net/jdk/pull/627 From ysuenaga at openjdk.java.net Tue Oct 13 13:05:20 2020 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Tue, 13 Oct 2020 13:05:20 GMT Subject: RFR: 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails Message-ID: Originally filed at AdoptOpenJDK: https://github.com/AdoptOpenJDK/openjdk-tests/issues/1162 The test fails on 32-bit Windows with: java.lang.IllegalStateException: WB_IncMetaspaceCapacityUntilGC: could not increase capacity until GC due to contention with another thread at sun.hotspot.WhiteBox.incMetaspaceCapacityUntilGC(Native Method) at TestCapacityUntilGCWrapAround.main(TestCapacityUntilGCWrapAround.java:51) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) at java.lang.Thread.run(Thread.java:748) `TestCapacityUntilGCWrapAround` passes `4GB - 1` to `incMetaspaceCapacityUntilGC()`. It seems to be too big. This code also seems to want to check the behavior when `_capacity_until_gc` overflows. The WhiteBox test would throw an ISE when that happens, so we need to handle it correctly.
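To illustrate why an increment of `4GB - 1` is problematic on a 32-bit build, here is a minimal Java sketch of unsigned 32-bit wraparound. It assumes that `size_t` is 32 bits wide on the failing platform; the class and method names are hypothetical, and this is not the HotSpot code nor the fix in the PR.

```java
/** Hypothetical model of a 32-bit size_t counter; not HotSpot code. */
final class WrapAroundSketch {
    /** Adds two values modulo 2^32, the way unsigned 32-bit arithmetic wraps. */
    static int addWrapping(int current, int increment) {
        return current + increment;
    }

    /**
     * Returns true if current + increment does not fit in 32 unsigned bits.
     * An unsigned sum that is smaller than one of its operands has wrapped.
     */
    static boolean overflows(int current, int increment) {
        return Integer.compareUnsigned(current + increment, current) < 0;
    }
}
```

With a current value of 16 MB and an increment of `0xFFFFFFFF` (that is, `4GB - 1`), the wrapped sum is one byte below the current value, so the capacity would appear to shrink unless the overflow is detected.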
------------- Commit messages: - 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails Changes: https://git.openjdk.java.net/jdk/pull/628/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=628&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8226236 Stats: 36 lines in 2 files changed: 17 ins; 12 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/628.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/628/head:pull/628 PR: https://git.openjdk.java.net/jdk/pull/628 From shade at openjdk.java.net Tue Oct 13 13:05:20 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 13 Oct 2020 13:05:20 GMT Subject: RFR: 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails In-Reply-To: References: Message-ID: <_o0fSSs7xQBbZjhfb-Z3cBmLpHAcDeErcxSH-E_17P8=.8d962dba-c353-4e9c-8776-1c5c2739a2d1@github.com> On Tue, 13 Oct 2020 10:00:37 GMT, Yasumasa Suenaga wrote: > Originally filed at AdoptOpenJDK: > https://github.com/AdoptOpenJDK/openjdk-tests/issues/1162 > > The test fails on 32bit windows with: > > java.lang.IllegalStateException: WB_IncMetaspaceCapacityUntilGC: could not increase capacity until GC due to contention > with another thread > at sun.hotspot.WhiteBox.incMetaspaceCapacityUntilGC(Native Method) > at TestCapacityUntilGCWrapAround.main(TestCapacityUntilGCWrapAround.java:51) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) > at java.lang.Thread.run(Thread.java:748) > > `TestCapacityUntilGCWrapAround` passes `4GB - 1` to `incMetaspaceCapacityUntilGC()`. It seems to be too big. 
> And also this code seems to want to check the behavior when `_capacity_until_gc` is overflown. White box test would > throw ISE when it hapen. So we need to handle it correctly. Current patch makes `gc/metaspace` tests pass on `x86_32` for me. They used to fail as described. ------------- PR: https://git.openjdk.java.net/jdk/pull/628 From ysuenaga at openjdk.java.net Tue Oct 13 13:05:20 2020 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Tue, 13 Oct 2020 13:05:20 GMT Subject: RFR: 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails In-Reply-To: <_o0fSSs7xQBbZjhfb-Z3cBmLpHAcDeErcxSH-E_17P8=.8d962dba-c353-4e9c-8776-1c5c2739a2d1@github.com> References: <_o0fSSs7xQBbZjhfb-Z3cBmLpHAcDeErcxSH-E_17P8=.8d962dba-c353-4e9c-8776-1c5c2739a2d1@github.com> Message-ID: <1yw3nA7BTPT1CzZA6p6yk8ZIkkKQOUvxtPIpRrNU8DQ=.5800404e-e699-4973-b4c6-f14ec319722e@github.com> On Tue, 13 Oct 2020 10:12:13 GMT, Aleksey Shipilev wrote: >> Originally filed at AdoptOpenJDK: >> https://github.com/AdoptOpenJDK/openjdk-tests/issues/1162 >> >> The test fails on 32bit windows with: >> >> java.lang.IllegalStateException: WB_IncMetaspaceCapacityUntilGC: could not increase capacity until GC due to contention >> with another thread >> at sun.hotspot.WhiteBox.incMetaspaceCapacityUntilGC(Native Method) >> at TestCapacityUntilGCWrapAround.main(TestCapacityUntilGCWrapAround.java:51) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> at java.lang.reflect.Method.invoke(Method.java:498) >> at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) >> at java.lang.Thread.run(Thread.java:748) >> >> `TestCapacityUntilGCWrapAround` passes `4GB - 1` to `incMetaspaceCapacityUntilGC()`. It seems to be too big. 
>> And also this code seems to want to check the behavior when `_capacity_until_gc` is overflown. White box test would >> throw ISE when it hapen. So we need to handle it correctly. > > Current patch makes `gc/metaspace` tests pass on `x86_32` for me. They used to fail as described. Thanks @shipilev ! Can you approve this change? ------------- PR: https://git.openjdk.java.net/jdk/pull/628 From shade at openjdk.java.net Tue Oct 13 13:05:20 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 13 Oct 2020 13:05:20 GMT Subject: RFR: 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails In-Reply-To: <1yw3nA7BTPT1CzZA6p6yk8ZIkkKQOUvxtPIpRrNU8DQ=.5800404e-e699-4973-b4c6-f14ec319722e@github.com> References: <_o0fSSs7xQBbZjhfb-Z3cBmLpHAcDeErcxSH-E_17P8=.8d962dba-c353-4e9c-8776-1c5c2739a2d1@github.com> <1yw3nA7BTPT1CzZA6p6yk8ZIkkKQOUvxtPIpRrNU8DQ=.5800404e-e699-4973-b4c6-f14ec319722e@github.com> Message-ID: On Tue, 13 Oct 2020 13:00:11 GMT, Yasumasa Suenaga wrote: >> Current patch makes `gc/metaspace` tests pass on `x86_32` for me. They used to fail as described. > > Thanks @shipilev ! Can you approve this change? I'd prefer @tstuefe and @tschatzl to have a look. 
------------- PR: https://git.openjdk.java.net/jdk/pull/628 From stefank at openjdk.java.net Tue Oct 13 13:06:29 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 13 Oct 2020 13:06:29 GMT Subject: RFR: 8254668: JVMTI process frames on thread without started processing [v2] In-Reply-To: References: Message-ID: > I hit the following assert in some tests runs that I've been doing: > # Internal Error (/home/stefank/git/alt/open/src/hotspot/share/runtime/stackWatermark.inline.hpp:67), pid=828170, > tid=828734 # assert(processing_started()) failed: Processing should already have started > > The stack traces for this has been: > Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x1626d75] StackWatermarkSet::on_iteration(JavaThread*, frame const&)+0xd5 > V [libjvm.so+0xad791a] frame::sender(RegisterMap*) const+0x7a > V [libjvm.so+0xacd3f8] frame::real_sender(RegisterMap*) const+0x18 > V [libjvm.so+0x1804c4a] vframe::sender() const+0xea > V [libjvm.so+0x175f47b] JavaThread::last_java_vframe(RegisterMap*)+0x5b > V [libjvm.so+0x10e10fc] JvmtiEnvBase::vframeFor(JavaThread*, int)+0x4c > V [libjvm.so+0x10e6972] JvmtiEnvBase::check_top_frame(Thread*, JavaThread*, jvalue, TosState, Handle*)+0xe2 > V [libjvm.so+0x10e759c] JvmtiEnvBase::force_early_return(JavaThread*, jvalue, TosState)+0x11c > V [libjvm.so+0x105b8f5] jvmti_ForceEarlyReturnObject+0x215 > V [libjvm.so+0x1626d75] StackWatermarkSet::on_iteration(JavaThread*, frame const&)+0xd5 > V [libjvm.so+0xad791a] frame::sender(RegisterMap*) const+0x7a > V [libjvm.so+0xacd3f8] frame::real_sender(RegisterMap*) const+0x18 > V [libjvm.so+0x1804c4a] vframe::sender() const+0xea > V [libjvm.so+0x1804d00] vframe::java_sender() const+0x10 > V [libjvm.so+0x10e1115] JvmtiEnvBase::vframeFor(JavaThread*, int)+0x65 > V [libjvm.so+0x10d475f] JvmtiEnv::NotifyFramePop(JavaThread*, int)+0x9f > V [libjvm.so+0x106b6aa] jvmti_NotifyFramePop+0x23a > The code 
inspects the top frame of a suspended java thread. However, there's nothing in the code that starts the > watermark processing of the thread, so the code asserts when sender calls on_iteration. > We only have to call start_processing/on_iteration when oops are being read. The failing code does *not* inspect any > oops, so I turn off the on_iteration call by setting process_frame to false. > To notify the readers of the code that vframeFor doesn't process the oops, I've renamed the function to > vframeForNoProcess to give a visual cue. > I found this bug when running this command line: > makec ../build/fastdebug/ test TEST=test/hotspot/jtreg/vmTestbase/nsk/jvmti > JTREG="JAVA_OPTIONS=-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=1 -XX:ZFragmentationLimit=0.01" > JTREG_EXTRA_PROBLEM_LISTS=ProblemList-zgc.txt Five tests consistently assert with this command line. All tests pass > with the proposed fix. > Recommendations of tests to run are welcome. I intend to get this run through tier1-3, but haven't yet.
Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: Review 1 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/627/files - new: https://git.openjdk.java.net/jdk/pull/627/files/587fd354..00c1b25f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=627&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=627&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/627.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/627/head:pull/627 PR: https://git.openjdk.java.net/jdk/pull/627 From stefank at openjdk.java.net Tue Oct 13 13:06:30 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 13 Oct 2020 13:06:30 GMT Subject: RFR: 8254668: JVMTI process frames on thread without started processing [v2] In-Reply-To: References: Message-ID: <3dSL6O7WvOI3GBrnN9vB_wZgq-XE1S6vihRUhsnBUgs=.1e159afe-891c-4fa4-8d40-9da2e2a3cd4a@github.com> On Tue, 13 Oct 2020 12:48:05 GMT, Richard Reingruber wrote: >> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: >> >> Review 1 > > Hi Stefan, > > thanks for fixing. > > With this change the assertion in pr #119 does not fail anymore. > > The fix looks good to me but I'm not an ZGC expert, neither a Reviewer :) Thanks @fisk and @reinrich for reviewing. > src/hotspot/share/prims/jvmtiEnvBase.cpp line 559: > >> 557: >> 558: // return the vframe on the specified thread and depth, NULL if no such frame >> 559: // The thread and the oops in the returned might not have been process. > > s/the returned/the returned vframe/ Thanks for noticing. 
------------- PR: https://git.openjdk.java.net/jdk/pull/627 From stuefe at openjdk.java.net Tue Oct 13 13:13:19 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 13 Oct 2020 13:13:19 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v5] In-Reply-To: References: <6iVRP-20baz0_46SouR-dj9SyspR5QvaL9iJMdeipDE=.92688b4e-ebd3-4681-8e63-a4aee752c407@github.com> <_XaA5cQEInPMn5Q5gj2y7AFCRprFQiYfI6BeUN49FhA=.9f17ae05-b37e-4f40-a83f-fd34aa812575@github.com> Message-ID: On Fri, 9 Oct 2020 17:11:23 GMT, Anton Kozlov wrote: > > GrowableArray maybe not the best choice here since e.g. it requires you to search twice on add. A better solution may > > be a specialized BST. > > I assume amount of executable mappings to be small. Depends on if exec parameter available at reserve, it is either > only a single one for the CodeCache (see below) or plus several more for mappings with unknown mode (that were not > committed yet) > > IMHO too heavvy weight for a platform only change. > > If there are other uses for such a solution (managing memory regions, melting them together, splitting them maybe on > > remove) we should not support setting and clearing exec on commit but only on a per-mapping base. > > It is more simple when the whole mapping is executable or not. We don't need to split/merge on commit/uncommit then. > But we need do to something when os::release_memory is called on a submapping of a mapping with unknown status. Like on > AIX, uncommit is made https://github.com/openjdk/jdk/blob/master/src/hotspot/os/aix/os_aix.cpp#L2096. But here for > macOS, I'm trying to avoid any change of behavior for non-exec mappings. If the exec parameter is provided for reserve > (as it eventually would be), then we don't need splitting/merging at all. This is what the latest patch is about. 
I > haven't tested that thoroughly yet, but eventually it would be possible to deduce correct exec values for os::reserve > based on subsequent os::commit. If we make a step back, we have exec parameter known for reserve and commit, I also > pretty sure that it is possible to deduce that for any uncommit (which was one of the initial concerns) Let's agree on > some plan how to attack the problem? I would like to distinguish the work toward MAP_JIT and improving interface. Not > sure what should come first. Are you still opposing to have exec parameter in os::reserve/commit/uncommit and > obligating callers to provide consistent exec values for each, at least at this phase? I mean, eventually we will have > a platform-dependent `handle_t` for mapping or equivalent. Like if we provide size of the whole mapping (the context) > for each commit_memory on AIX, we won't need to do the bookkeeping. What if os::commit to take ReservedSpace and do > something conservative when that is not provided?

Are there any users of executable memory which cannot live with anonymous mapping on whatever address with small pages? Does anyone need large pages or a specific wish address? If not, maybe we really should introduce a (reserve|commit|uncommit|release)_executable_memory() at least temporarily, as you suggested. At least that would be clear, and could provide a clear starting point for a new interface.
-------------

PR: https://git.openjdk.java.net/jdk/pull/294

From mcimadamore at openjdk.java.net Tue Oct 13 13:29:28 2020
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Tue, 13 Oct 2020 13:29:28 GMT
Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator)
Message-ID:

This patch contains the changes associated with the first incubation round of the foreign linker access API (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and associated pull request [3]).

The main goal of this API is to provide a way to call native functions from Java code without the need for intermediate JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. I suggest reading the writeup [4] I put together a few weeks ago, which illustrates what the foreign linker support is, and how it should be used by clients.

Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews. For this reason, I'm attaching a webrev which contains only the differences between this PR and the memory access PR. I will be periodically uploading new webrevs, as new iterations come out, to try and make the life of reviewers as simple as possible.

Many thanks to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the hotspot changes you see here, and without their help, the foreign linker support wouldn't be what it is today. As usual, many thanks to Paul Sandoz, who provided many insights (often by trying the bits first hand).

Thanks
Maurizio

Webrev: http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev

Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html

Specdiff (relative to [3]): http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html

CSR: https://bugs.openjdk.java.net/browse/JDK-8254232

### API Changes

The API changes are actually rather slim:

* `LibraryLookup`
  * This class allows clients to look up symbols in native libraries; the interface is fairly simple: you can load a library by name, or absolute path, and then look up symbols in that library.
* `FunctionDescriptor`
  * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory layouts for the function arguments/return type. A function descriptor is used to describe the signature of a native function.
* `CLinker`
  * This is the real star of the show. A `CLinker` has two main methods: `downcallHandle` and `upcallStub`. The first takes a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a `MethodHandle` instance which can be used to call the target native symbol. The second takes an existing method handle and a `FunctionDescriptor`, and returns a new `MemorySegment` corresponding to a code stub allocated by the VM which acts as a trampoline from native code to the user-provided method handle. This is very useful for implementing upcalls.
  * This class also contains the various layout constants that should be used by clients when describing native signatures (e.g. `C_LONG` and friends); these layouts contain additional ABI classification information (in the form of layout attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the native call to take place.
  * Finally, this class provides some helper functions, e.g. so that clients can convert Java strings into C strings and back.
* `NativeScope`
  * This is a helper class which allows clients to group together logically related allocations; that is, rather than allocating separate memory segments using separate *try-with-resource* constructs, a `NativeScope` allows clients to use a _single_ block, and allocate all the required segments there. This is not only a usability boost, but also a performance boost, since not all allocation requests will be turned into `malloc` calls.
* `MemorySegment`
  * Only one method added here - namely `handoff(NativeScope)`, which allows a segment to be transferred onto an existing native scope.

### Safety

The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle. For instance, the description of the native signature might be wrong (e.g. have too many arguments) - and the runtime has, in the general case, no way to detect such mismatches. For these reasons, obtaining a `CLinker` instance is a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as is the case for other restricted methods in the foreign memory API).

### Implementation changes

The Java changes associated with `LibraryLookup` are relatively straightforward; the only interesting thing to note here is that library loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same restrictions which apply to JNI library loading (e.g. the same library cannot be loaded by different classloaders).

As for `NativeScope`, the changes are again relatively straightforward; it is an API which sits neatly on top of the foreign memory access API, providing some kind of allocation service which shares the same underlying memory segment(s), and turns an allocation request into a segment slice, which is a much less expensive operation.
`NativeScope` comes in two variants: there are native scopes for which the allocation size is known a priori, and native scopes which can grow - these two schemes are implemented by two separate subclasses of `AbstractNativeScopeImpl`.

Of course the bulk of the changes are to support the `CLinker` downcall/upcall routines. These changes cut pretty deep into the JVM; I'll briefly summarize the goal of some of these changes - for further details, Jorn has put together a detailed writeup which explains the rationale behind the VM support, with some references to the code [5].

The main idea behind the foreign linker is to infer, given a Java method type (expressed as a `MethodType` instance) and the description of the signature of a native function (expressed as a `FunctionDescriptor` instance), a _recipe_ that can be used to turn a Java call into the corresponding native call targeting the requested native function.

This inference scheme can be defined in a pretty straightforward fashion by looking at the various ABI specifications (for instance, see [6] for the SysV ABI, which is the one used on Linux/Mac). The various `CallArranger` classes, of which we have a flavor for each supported platform, do exactly that kind of inference.

For the inference process to work, we need to attach extra information to memory layouts; it is no longer sufficient to know e.g. that a layout is 32/64 bits - we need to know whether it is meant to represent a floating point value, or an integral value; this knowledge is required because floating points are passed in different registers by most ABIs. For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which contain the required classification attributes (e.g. a `CLinker.TypeKind` enum value). The runtime extracts this attribute, and performs classification accordingly.
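
To make the role of the classification attribute concrete, here is a heavily simplified sketch of how scalar arguments could be assigned to registers under the SysV x86-64 convention, where integral and floating-point arguments draw from separate register files. `ClassifierSketch` and `Kind` are hypothetical names introduced for illustration; the real `CallArranger` classes also handle structs, varargs and return values, none of which is modeled here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, scalar-only model of SysV x86-64 argument assignment. */
final class ClassifierSketch {
    /** Stand-in for the classification attribute attached to a layout. */
    enum Kind { INTEGER, FLOAT }

    // Argument registers defined by the SysV x86-64 calling convention.
    private static final String[] INT_REGS = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"};
    private static final String[] FLOAT_REGS =
            {"xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7"};

    /** Returns, for each argument, the register or stack slot it would occupy. */
    static List<String> assign(List<Kind> args) {
        List<String> locations = new ArrayList<>();
        int nextInt = 0, nextFloat = 0, nextStack = 0;
        for (Kind kind : args) {
            if (kind == Kind.INTEGER && nextInt < INT_REGS.length) {
                locations.add(INT_REGS[nextInt++]);
            } else if (kind == Kind.FLOAT && nextFloat < FLOAT_REGS.length) {
                locations.add(FLOAT_REGS[nextFloat++]);
            } else {
                // Registers of the required kind are exhausted: spill to an 8-byte stack slot.
                locations.add("stack+" + (8 * nextStack++));
            }
        }
        return locations;
    }
}
```

Two layouts of the same size can therefore end up in different places, which is why the size alone is not enough for the runtime to build a call recipe.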
A native call is decomposed into a sequence of basic, primitive operations, called `Binding` (see the great javadoc on the `Binding.java` class for more info). There are many such bindings - for instance the `Move` binding is used to move a value into a specific machine register/stack slot. So, the main job of the various `CallArranger` classes is to determine, given a Java `MethodType` and `FunctionDescriptor`, what is the set of bindings associated with the downcall/upcall.

At the heart of the foreign linker support is the `ProgrammableInvoker` class. This class effectively generates a `MethodHandle` which follows the steps described by the various bindings obtained by `CallArranger`. There are actually various strategies to interpret these bindings - listed below:

* basic interpreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see `BindingInterpreter`), except for the `Move` bindings. For these bindings, the move is implemented by allocating a *buffer* (whose size is ABI specific) and by moving all the lowered values into positions within this buffer. The buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them in their expected registers/stack slots (note that each position in the buffer corresponds to a different register). This is the most general invocation mode, and the most "customizable" one, but also the slowest - since for every call there is some extra allocation which takes place.
* specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based interpreter, we generate a method handle chain which effectively interprets all the bindings (again, except `Move` ones).
* intrinsified mode; this is typically used in combination with the specialized interpreted mode described above (although it can also be used with the Java-based binding interpreter).
The goal here is to remove the buffer allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (e.g. the set of bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field), then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go for an intermediate buffer. This gives us back performance that is on par with JNI.

For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there. We plan to add support for intrinsified modes there as well, which should considerably boost performance (probably well beyond what JNI can offer at the moment, since the upcall support in JNI is not very well optimized). Again, for more reading on the internals of the foreign linker support, please refer to [5].

#### Test changes

Many new tests have been added to validate the foreign linker support; we have high level tests (see `StdLibTest`) which aim at testing the linker from the perspective of code that clients could write. But we also have deeper combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI implementation. There are also some great tests (see the `callarranger` folder) which test the various `CallArranger`s for all the possible platforms; these tests adopt more of a white-box approach - that is, instead of treating the linker machinery as a black box and verifying that the support works by checking that the native call returned the results we expected, these tests aim at checking that the set of bindings generated by the call arranger is correct. This also means that we can test the classification logic for Windows, Mac and Linux regardless of the platform we're executing on.

Some additional microbenchmarks have been added to compare the performance of downcall/upcall with JNI.
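As a rough picture of the basic interpreted mode described earlier, the sketch below runs a tiny recipe on a stack-based machine and lets each `Move` write into a slot of an intermediate buffer. It is a heavily simplified assumption of how such an interpreter could look, not the real `BindingInterpreter`: `BindingSketch` and the `LoadArg` binding are hypothetical, values are restricted to `long`, and the buffer is simply returned instead of being handed to a VM stub.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only; the real Binding set is richer than this.
final class BindingSketch {
    interface Binding {}

    // Pushes the Java argument at the given index onto the operand stack (made-up binding).
    static final class LoadArg implements Binding {
        final int index;
        LoadArg(int index) { this.index = index; }
    }

    // Pops a value and stores it into the buffer slot standing for one register/stack slot.
    static final class Move implements Binding {
        final int slot;
        Move(int slot) { this.slot = slot; }
    }

    // Interprets the recipe and returns the filled buffer; a fresh buffer is allocated per call,
    // which mirrors the extra allocation that makes this mode the slowest one.
    static long[] interpret(List<Binding> recipe, long[] args, int bufferSlots) {
        long[] buffer = new long[bufferSlots];
        Deque<Long> stack = new ArrayDeque<>();
        for (Binding binding : recipe) {
            if (binding instanceof LoadArg) {
                stack.push(args[((LoadArg) binding).index]);
            } else if (binding instanceof Move) {
                buffer[((Move) binding).slot] = stack.pop();
            } else {
                throw new IllegalArgumentException("Unknown binding: " + binding);
            }
        }
        return buffer;
    }
}
```

In this picture, the intrinsified mode amounts to skipping the `buffer` array altogether when the recipe is a constant, because the target of every `Move` is then known when code is generated.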
[1] - https://openjdk.java.net/jeps/389 [2] - https://openjdk.java.net/jeps/393 [3] - https://git.openjdk.java.net/jdk/pull/548 [4] - https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md [5] - http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html ------------- Commit messages: - Fix more whitespaces - Fix whitespaces - Remove rejected file - More updates - Add new files - Merge with master - Merge branch 'master' into 8254162 - Remove spurious check on MemoryScope::confineTo - Merge branch 'master' into 8254162 - Simplify example in the toplevel javadoc - ... and 8 more: https://git.openjdk.java.net/jdk/compare/5d6a6255...7cf0ef09 Changes: https://git.openjdk.java.net/jdk/pull/634/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254231 Stats: 75468 lines in 264 files changed: 72636 ins; 1596 del; 1236 mod Patch: https://git.openjdk.java.net/jdk/pull/634.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/634/head:pull/634 PR: https://git.openjdk.java.net/jdk/pull/634 From aph at redhat.com Tue Oct 13 14:16:52 2020 From: aph at redhat.com (Andrew Haley) Date: Tue, 13 Oct 2020 15:16:52 +0100 Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: <35eLsMpWmcCUoiEWhnYdSpZNmvLy4ra56Qtd6eRW574=.4e7c9278-3e0d-457d-9c15-eef45bae9755@github.com> Message-ID: <7d41cbf1-a435-bbf2-e8ec-9f6a2059458a@redhat.com> On 13/10/2020 10:03, Alan Hayward wrote: >> On the good side, this at least makes AArch64 more like other >> targets. >> > Yes. This should give all the goodness from using common code > (simpler, stronger code, etc etc) Well, maybe. My thinking was that the common cross_modify_fence() code was untested and possibly incorrect, whereas I was pretty sure maybe_isb() was used wherever necessary and in fact conservatively even where it wasn't actually necessary. 
So I didn't really want AArch64 to be the victim. :-) > To be clear, the four original reasons for the patch were: > *Use common code/interfaces where possible > *Reduce ISBs where AArch64 was being too cautious > *Add ISBs if there's any paths without them (there weren't) > *Confirm the above changes are safe. Yep. One interesting thing is that while the AArch64 non-coherent Icache and the rules for "Concurrent modification and execution of instructions" are annoying, they don't make a practical difference to the execution time of Java programs. It's a pain to have to program all of this stuff, but in performance terms the decision of the Arm architects has been vindicated. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From rkennke at openjdk.java.net Tue Oct 13 14:30:30 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 13 Oct 2020 14:30:30 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v9] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference > and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). Workloads > that make heavy use of such weak references will therefore potentially cause significant GC pauses. There are 3 main > items that contribute to pause time linear to number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through the subgraph of objects that are reachable only through FinalReference. Notice that this is > theoretically only bounded by the live data set size. 
> - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly > reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires to extend the marking > bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever > marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all > objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it > will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter > of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for > FinalReference, marking stops there, and does not mark through the referent. 2. Concurrent processing is performed > after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and > depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list > (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no > referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers > in Reference.get() intrinsics that return NULL when the referent is not reachable. 
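The two kinds of reachability described in step 1 can be sketched with a small single-threaded marker. This is only an illustration under simplifying assumptions, not Shenandoah's implementation: the real marking is concurrent and records the two states in the marking bitmap, whereas here `TwoLevelMarkSketch`, `Node`, `FinalRef`, `WeakRef` and `mark` are hypothetical names, marks live in a map, and a single list stands in for the thread-local 'discovered' lists.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; single-threaded and map-based instead of bitmap-based.
final class TwoLevelMarkSketch {
    // Declaration order matters: STRONG compares greater than FINALIZABLE.
    enum Mark { FINALIZABLE, STRONG }

    static class Node {
        final List<Node> fields = new ArrayList<>();
    }

    // Stand-ins for FinalReference and for the other Reference subclasses.
    static final class FinalRef extends Node { Node referent; }
    static final class WeakRef extends Node { Node referent; }

    private static final class Item {
        final Node node;
        final Mark level;
        Item(Node node, Mark level) { this.node = node; this.level = level; }
    }

    static Map<Node, Mark> mark(List<Node> roots, List<Node> discovered) {
        Map<Node, Mark> marks = new IdentityHashMap<>();
        Deque<Item> work = new ArrayDeque<>();
        for (Node root : roots) {
            work.push(new Item(root, Mark.STRONG));
        }
        while (!work.isEmpty()) {
            Item item = work.pop();
            Mark old = marks.get(item.node);
            if (old != null && old.compareTo(item.level) >= 0) {
                continue; // already marked at least this strongly
            }
            // Either a first visit, or an upgrade from FINALIZABLE to STRONG.
            marks.put(item.node, item.level);
            if (old == null && (item.node instanceof FinalRef || item.node instanceof WeakRef)) {
                discovered.add(item.node);
            }
            if (item.node instanceof FinalRef) {
                // Mark through the referent, but only with 'finalizable' reachability.
                Node referent = ((FinalRef) item.node).referent;
                if (referent != null) {
                    work.push(new Item(referent, Mark.FINALIZABLE));
                }
            }
            // For WeakRef, marking stops at the reference: the referent is not visited.
            for (Node field : item.node.fields) {
                work.push(new Item(field, item.level));
            }
        }
        return marks;
    }
}
```

After marking, a referent that is absent from the map or only `FINALIZABLE` is not strongly reachable, which is the condition the later concurrent processing phase would look at when deciding whether to drop or enqueue a discovered reference.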
Testing: hotspot_gc_shenadoah > (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions Roman Kennke has updated the pull request incrementally with three additional commits since the last revision: - Explicitely use concurrent vs stw reference processing, don't rely on is_at_shenandoah_safepoint() - Exclude Shenandoah from TestSoftReferencesBehaviorOnOOME.java, it doesn't play with concurrent reference processing - Remove wrong assert from ShenandoahReferenceProcessor ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/505/files - new: https://git.openjdk.java.net/jdk/pull/505/files/ee7412e2..2c4b1cbd Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=08 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=07-08 Stats: 17 lines in 5 files changed: 6 ins; 2 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From iklam at openjdk.java.net Tue Oct 13 15:44:30 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 13 Oct 2020 15:44:30 GMT Subject: RFR: 8254365: ciMethod.hpp should not include methodHandles.hpp In-Reply-To: References: Message-ID: On Tue, 13 Oct 2020 07:33:18 GMT, David Holmes wrote: > Seems okay. There are a couple of changes unrelated to the bug synopsis. :) > > Thanks, > David There are 2 things that I didn't mention: - java.cpp needs vmThread.hpp (for the zero build only) after this fix. vmThread.hpp used to be recursively included via methodHandles.hpp -> entry_zero.hpp -> zeroInterpreter.hpp -> abstractInterpreter.hpp -> vmThread.hpp. - ciEnv.hpp needs systemDictionary.hpp after this fix. systemDictionary.hpp used to be recursively included via methodHandles.hpp. 
------------- PR: https://git.openjdk.java.net/jdk/pull/623 From kbarrett at openjdk.java.net Tue Oct 13 16:04:32 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 13 Oct 2020 16:04:32 GMT Subject: RFR: 8253402: Convert vmSymbols::SID to enum class [v4] In-Reply-To: <9m7uFY5ij94oj3SQ9pTHNq-tsw0NnPHDVqHhznmAuOo=.bc75f19e-0527-4962-8b19-b178cfa8e572@github.com> References: <9m7uFY5ij94oj3SQ9pTHNq-tsw0NnPHDVqHhznmAuOo=.bc75f19e-0527-4962-8b19-b178cfa8e572@github.com> Message-ID: On Sun, 11 Oct 2020 22:54:21 GMT, Ioi Lam wrote: >> Convert `vmSymbols::SID` to an `enum class` to provide better type safety. >> >> - The original enum type `vmSymbols::SID` cannot be forward-declared. I moved it out of the `vmSymbols` class and >> renamed, so now it can be forward-declared as `enum class vmSymbolID : int;`, without including the large vmSymbols.hpp >> file. >> - This also breaks the mutual dependency between the `vmSymbols` and `vmIntrinsics` classes. Now the declaration of >> `vmIntrinsics` can be moved from vmSymbols.hpp to vmIntrinsics.hpp, where it naturally belongs. >> - Type-safe enumeration (contributed by Kim Barrett) >> for (vmSymbolsIterator it = vmSymbolsRange.begin(); it != vmSymbolsRange.end(); ++it) { >> vmSymbolID index = *it; .... >> } >> - I moved `vmSymbols::_symbols[]` to `Symbol::_vm_symbols[]`, and made it accessible via `Symbol::vm_symbol_at()`. This >> way, header files (e.g. fieldInfo.hpp) that need to convert from `vmSymbolID` to `Symbol*` don't need to include the >> large vmSymbols.hpp file. >> - I changed the `VM_SYMBOL_ENUM_NAME` macro so that the users don't need to explicitly add the `vmSymbolID::` scope. >> - I removed many unnecessary casts between `int` and `vmSymbolID`. >> - The remaining casts are done via `vmSymbol::as_int()` and `vmSymbols::as_SID()` with range checks. >> >> ----- >> If this is successful, I will do the same for `vmIntrinsics::ID`. 
> > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > added missing #include from enumIterator.hpp Changes requested by kbarrett (Reviewer). src/hotspot/share/classfile/vmSymbols.hpp line 714: > 712: } > 713: static constexpr bool is_valid_id(vmSymbolID sid) { > 714: return (static_cast<int>(sid) >= FIRST_SID && static_cast<int>(sid) < SID_LIMIT); Why not just `return is_valid_id(static_cast<int>(sid));` src/hotspot/share/classfile/vmSymbols.hpp line 728: > 726: > 727: static constexpr int number_of_symbols() { > 728: return SID_LIMIT; SID_LIMIT includes NO_SID, so this seems to be off-by-one. src/hotspot/share/classfile/vmSymbols.hpp line 732: > 730: > 731: enum { > 732: log2_SID_LIMIT = 11 // checked by an assert at start-up [pre-existing, could be followup] Why isn't this checked with a static assert right here? `SID_LIMIT <= (1 << log2_SID_LIMIT)` is (and always has been) compile-time checkable. And log2_SID_LIMIT should no longer be an enum. (Note that the value isn't currently constexpr-calculable because round_up_power_of_2 can't currently be constexpr.) src/hotspot/share/utilities/enumIterator.hpp line 91: > 89: > 90: template<typename T> > 91: class EnumIterationTraits : AllStatic { A comment for this class might be helpful. Something like "A helper class for EnumIterator, computing some additional information the iterator uses, based on T and EnumeratorRange." The main point being that users of enum iteration don't need to think about this. src/hotspot/share/utilities/enumIterator.hpp line 56: > 54: // > 55: // > 56: // EnumIterationTraits -- defines the static range of all possible values of the enum. This traits class isn't really interesting to users of this facility; it's more of an implementation detail. See my comment below on the definition. It's EnumeratorRange and (especially) ENUMERATOR_RANGE that are interesting to users. 
src/hotspot/share/utilities/enumIterator.hpp line 138: > 136: } > 137: > 138: // True if the enumerators designate the same value. "True if the iterators designate the same enumeration value." (Sorry about that.) Similarly below for operator!=. ------------- PR: https://git.openjdk.java.net/jdk/pull/276 From psandoz at openjdk.java.net Tue Oct 13 16:14:40 2020 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Tue, 13 Oct 2020 16:14:40 GMT Subject: RFR: 8223347: Integration of Vector API (Incubator) [v3] In-Reply-To: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> References: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> Message-ID: <-wiRtZZKucOjqFnqeDjVm3B8BaThwGyDdt4aFo9t2-g=.2b4350f4-4704-4857-82e4-7e014898b2da@github.com> > This pull request is for integration of the Vector API. It was previously reviewed under conditions when mercurial was > used for the source code control system. Review threads can be found here (searching for issue number 8223347 in the > title): https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-April/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-May/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-July/thread.html > > If mercurial was still being used the code would be pushed directly, once the CSR is approved. However, in this case a > pull request is required and needs explicit reviewer approval. Between the final review and this pull request no code > has changed, except for that related to merging. 
Paul Sandoz has updated the pull request incrementally with one additional commit since the last revision: Fix related to merge ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/367/files - new: https://git.openjdk.java.net/jdk/pull/367/files/9cca17b8..d5acb4ff Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=367&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=367&range=01-02 Stats: 76 lines in 1 file changed: 0 ins; 0 del; 76 mod Patch: https://git.openjdk.java.net/jdk/pull/367.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/367/head:pull/367 PR: https://git.openjdk.java.net/jdk/pull/367 From pchilanomate at openjdk.java.net Tue Oct 13 16:49:11 2020 From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo) Date: Tue, 13 Oct 2020 16:49:11 GMT Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: <35eLsMpWmcCUoiEWhnYdSpZNmvLy4ra56Qtd6eRW574=.4e7c9278-3e0d-457d-9c15-eef45bae9755@github.com> Message-ID: On Tue, 13 Oct 2020 09:00:38 GMT, Alan Hayward wrote: >>> @a74nh Please do not force-push commits on an open PR as it breaks the commit history and prevents reviewers from >>> seeing what has changed since they last reviewed things. If you need to "rebase" you can just merge your branch with an >>> updated master branch and push the merge commit to your personal fork. The skara tooling will flatten the commits into >>> a single clean commit when integration happens. Thanks. >> >> Not a fan of working with merge commits and I feel it gets muddled when you have history on top of a patch series (as >> opposed to a single patch). However, understood - I'll make sure to merge instead of force pushing next time. 
> >> _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ >> >> So, the good news and the bad news: >> >> Moving to cross_modify_fence reduces the number of ISBs from >> 3,840,210 maybe_isb()s to 74,538 cross_modify_fence()s on my >> poster child application, which is recompiling all of java.base. >> >> However, this is a program that runs for 187,501,798,979 insns, >> so we've reduced the proportion of ISBs from 0.002% to 0.00004%. >> I guess that's worth having, but I doubt that the improvement >> would ever have been above the noise level. > > Thanks for testing this out. > "Kills 98% of ISBs" is the marketing headline then. > Sadly less effective than I was hoping - which explains my testing results. > >> >> On the good side, this at least makes AArch64 more like other >> targets. >> > > Yes. This should give all the goodness from using common code > (simpler, stronger code, etc etc) > > To be clear, the four original reasons for the patch were: > *Use common code/interfaces where possible > *Reduce ISBs where AArch64 was being too cautious > *Add ISBs if theres any paths without them (there weren't) > *Confirm the above changes are safe. Regarding the use of cross_modify_fence(), I filed a bug last week to remove some uneeded uses of them in common code (https://bugs.openjdk.java.net/browse/JDK-8254264). Just a heads up before I send the RFR since I see some reference to them in the added comments. 
Thanks, Patricio ------------- PR: https://git.openjdk.java.net/jdk/pull/428 From rrich at openjdk.java.net Tue Oct 13 17:13:24 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Tue, 13 Oct 2020 17:13:24 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v9] In-Reply-To: References: <5O9n8cKBJyjhp2cNOVD2PcpKQiXqEs5BJjkW1lH-5EM=.044a510e-6517-4564-a3db-00c7951f0b22@github.com> Message-ID: On Sun, 11 Oct 2020 07:20:07 GMT, Richard Reingruber wrote: >>> >>> >>> I tried to run testing with latest changes and latest JDK and build failed: >>> src/hotspot/share/runtime/escapeBarrier.cpp:310:35: error: no matching function for call to >>> 'StackFrameStream::StackFrameStream(JavaThread*&)' 310 | StackFrameStream fst(deoptee); >> >> I noticed this too. I wanted to test with ZGC before pushing the small >> fix. Unfortunately I get >> >> # Internal Error (/priv/d038402/git/reinrich/jdk_ea_new/src/hotspot/share/runtime/stackWatermark.inline.hpp:67), >> pid=90890, tid=90912 # assert(processing_started()) failed: Processing should already have started >> >> [...] 
>> >> Current thread (0x00007f749c25b1c0): JavaThread "JDWP Transport Listener: dt_socket" daemon [_thread_in_vm, id=90912, >> stack(0x00007f7474c9f000,0x00007f7474da0000)] _threads_hazard_ptr=0x00007f749c2b00c0, _nested_threads_hazard_ptr_cnt=0 >> Stack: [0x00007f7474c9f000,0x00007f7474da0000], sp=0x00007f7474d9c240, free space=1012k >> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x15b3255] StackWatermarkSet::on_iteration(JavaThread*, frame const&)+0xa5 >> V [libjvm.so+0xa1024f] frame::sender(RegisterMap*) const+0x13f >> V [libjvm.so+0xa048f8] frame::real_sender(RegisterMap*) const+0x18 >> V [libjvm.so+0x176261b] vframe::sender() const+0xeb >> V [libjvm.so+0x16cd56b] JavaThread::last_java_vframe(RegisterMap*)+0x5b >> V [libjvm.so+0xfa7a56] JvmtiEnvBase::vframeFor(JavaThread*, int)+0x46 >> V [libjvm.so+0xfab8e5] JvmtiEnvBase::check_top_frame(JavaThread*, JavaThread*, jvalue, TosState, Handle*)+0x1f5 >> V [libjvm.so+0xfac13e] JvmtiEnvBase::force_early_return(JavaThread*, jvalue, TosState)+0x15e >> V [libjvm.so+0xf36fa8] jvmti_ForceEarlyReturnLong+0x258 >> C [libjdwp.so+0xa8b3] forceEarlyReturn+0x293 >> C [libjdwp.so+0x12945] debugLoop_run+0x1f5 >> C [libjdwp.so+0x25bb3] attachThread+0x33 >> V [libjvm.so+0xfcf524] JvmtiAgentThread::call_start_function()+0x1d4 >> V [libjvm.so+0x16cc8f7] JavaThread::thread_main_inner()+0x247 >> V [libjvm.so+0x16d1ce8] Thread::call_run()+0xf8 >> V [libjvm.so+0x12dd75e] thread_native_entry(Thread*)+0x10e >> >> In the test case >> `EAForceEarlyReturnOfInlinedMethodWithScalarReplacedObjectsReallocFailure` of the >> new test `jdk/com/sun/jdi/EATests.java` >> >> So far I do not have an indication that the failure is caused by this change but >> when I run the test with -XX:-DoEscapeAnalysis then the test succeeds. >> >> I need to look more into it. Wish I was a ZGC expert :) >> >> Anyway I pushed the build fix. Tests succeed with default GC. 
> > The crash described above happens after JDK-8253180 > (https://github.com/openjdk/jdk/commit/b9873e18330b7e43ca47bc1c0655e7ab20828f7a) when executing `EATests.java` with > ZGC: make run-test TEST=test/jdk/com/sun/jdi/EATests.java JTREG=VM_OPTIONS=-XX:+UseZGC > > My understanding of JDK-8253180 (and ZGC) is rather vague. To me it looks as if stackwalks outside of a > safepoint/handshake on suspended threads are currently not supported. It would be my understanding that > `StackWatermarkSet::start_processing()` needs to be called before walking the stack of a thread. Currently this is only > done in preparation of a safepoint or handshake. `JvmtiEnvBase::check_top_frame()` walks the stack of a suspended > thread without safepoint/handshake. This triggers the crash in my opinion. When `StackWatermarkSet::start_processing()` > is called before the test succeeds. I will ask Erik Österlund about this. The issues with ZGC concurrent thread stack processing will be resolved with #627 ------------- PR: https://git.openjdk.java.net/jdk/pull/119 From psandoz at openjdk.java.net Tue Oct 13 17:34:37 2020 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Tue, 13 Oct 2020 17:34:37 GMT Subject: RFR: 8223347: Integration of Vector API (Incubator) [v4] In-Reply-To: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> References: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> Message-ID: > This pull request is for integration of the Vector API. It was previously reviewed under conditions when mercurial was > used for the source code control system. 
Review threads can be found here (searching for issue number 8223347 in the > title): https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-April/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-May/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-July/thread.html > > If mercurial was still being used the code would be pushed directly, once the CSR is approved. However, in this case a > pull request is required and needs explicit reviewer approval. Between the final review and this pull request no code > has changed, except for that related to merging. Paul Sandoz has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: - Merge master - Fix related to merge - HotspotIntrinsicCandidate to IntrinsicCandidate - Merge master - Fix permissions - Fix permissions - Merge master - Vector API new files - Integration of Vector API (Incubator) ------------- Changes: https://git.openjdk.java.net/jdk/pull/367/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=367&range=03 Stats: 295107 lines in 336 files changed: 292957 ins; 1062 del; 1088 mod Patch: https://git.openjdk.java.net/jdk/pull/367.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/367/head:pull/367 PR: https://git.openjdk.java.net/jdk/pull/367 From jbhateja at openjdk.java.net Tue Oct 13 17:54:22 2020 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Tue, 13 Oct 2020 17:54:22 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v2] In-Reply-To: References: Message-ID: > Summary: > > 1) Partial in-lining technique avoids call overhead penalty for small array copy operations with size less than 32 > bytes. 
2) At runtime, a conditional check based on copy length either calls an array-copy stub or executes an optimized > instruction sequence using AVX-512 masked instructions emitted at the call site. 3) New runtime flag > ArrayCopyPartialInlineSize=0/32(default)/64 bytes determines the maximum size for partial in-lining. 4) Based on the > perf results seen in benchmarks currently partial in-lining is performed only for arraycopy involving sub-word types > (bool/byte/char/short). Once PR-61 gets integrated we can extend this patch to cover all the primitive types. > Performance Results: > System : CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz > Micros : test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java > ArrayCopyPartialInlineSize : 32 > > JMH | Block Size | Baseline (ns/op) | Partial Inling (ns/op) | Gain > -- | -- | -- | -- | -- > ArrayCopyAligned.testByte | 1 | 5.417 | 2.696 | 2.009272997 > ArrayCopyAligned.testByte | 3 | 5.494 | 2.702 | 2.03330866 > ArrayCopyAligned.testByte | 5 | 5.417 | 2.637 | 2.05422829 > ArrayCopyAligned.testByte | 10 | 5.343 | 2.703 | 1.976692564 > ArrayCopyAligned.testByte | 20 | 5.837 | 2.636 | 2.214339909 > ArrayCopyAligned.testByte | 70 | 5.86 | 6 | 0.976666667 > ArrayCopyAligned.testByte | 150 | 6.766 | 6.906 | 0.979727773 > ArrayCopyAligned.testByte | 300 | 7.605 | 7.952 | 0.956363179 > ArrayCopyAligned.testByte | 600 | 11.989 | 12.007 | 0.998500874 > ArrayCopyAligned.testByte | 1200 | 16.447 | 16.585 | 0.991679228 > ArrayCopyAligned.testChar | 1 | 5.02 | 2.828 | 1.775106082 > ArrayCopyAligned.testChar | 3 | 5.129 | 2.762 | 1.85698769 > ArrayCopyAligned.testChar | 5 | 5.041 | 2.762 | 1.82512672 > ArrayCopyAligned.testChar | 10 | 5.716 | 2.762 | 2.069514844 > ArrayCopyAligned.testChar | 20 | 5.111 | 5.399 | 0.946656788 > ArrayCopyAligned.testChar | 70 | 6.271 | 6.242 | 1.004645947 > ArrayCopyAligned.testChar | 150 | 7.45 | 7.599 | 0.980392157 > ArrayCopyAligned.testChar | 300 | 9.904 | 10.112 | 0.97943038 > 
ArrayCopyAligned.testChar | 600 | 17.131 | 17.167 | 0.997902953 > ArrayCopyAligned.testChar | 1200 | 29.556 | 29.851 | 0.990117584 > ArrayCopyUnalignedBoth.testByte | 1 | 5.419 | 2.702 | 2.005551443 > ArrayCopyUnalignedBoth.testByte | 3 | 5.558 | 2.636 | 2.108497724 > ArrayCopyUnalignedBoth.testByte | 5 | 5.43 | 2.636 | 2.059939302 > ArrayCopyUnalignedBoth.testByte | 10 | 5.378 | 2.637 | 2.039438756 > ArrayCopyUnalignedBoth.testByte | 20 | 5.914 | 2.636 | 2.243550836 > ArrayCopyUnalignedBoth.testByte | 70 | 5.882 | 5.954 | 0.987907289 > ArrayCopyUnalignedBoth.testByte | 150 | 6.784 | 6.88 | 0.986046512 > ArrayCopyUnalignedBoth.testByte | 300 | 7.635 | 7.968 | 0.958207831 > ArrayCopyUnalignedBoth.testByte | 600 | 12.226 | 12.129 | 1.007997362 > ArrayCopyUnalignedBoth.testByte | 1200 | 16.992 | 20.717 | 0.820195974 > ArrayCopyUnalignedBoth.testChar | 1 | 5.019 | 2.828 | 1.774752475 > ArrayCopyUnalignedBoth.testChar | 3 | 5.163 | 2.763 | 1.868621064 > ArrayCopyUnalignedBoth.testChar | 5 | 5.042 | 2.827 | 1.783516095 > ArrayCopyUnalignedBoth.testChar | 10 | 5.718 | 2.828 | 2.021923621 > ArrayCopyUnalignedBoth.testChar | 20 | 5.111 | 5.404 | 0.945780903 > ArrayCopyUnalignedBoth.testChar | 70 | 6.367 | 6.235 | 1.02117081 > ArrayCopyUnalignedBoth.testChar | 150 | 7.367 | 8.269 | 0.890917886 > ArrayCopyUnalignedBoth.testChar | 300 | 10.358 | 10.642 | 0.973313287 > ArrayCopyUnalignedBoth.testChar | 600 | 20.84 | 17.522 | 1.189361945 > ArrayCopyUnalignedBoth.testChar | 1200 | 31.895 | 31.892 | 1.000094067 > ArrayCopyUnalignedDst.testByte | 1 | 5.455 | 2.637 | 2.068638604 > ArrayCopyUnalignedDst.testByte | 3 | 5.562 | 2.702 | 2.058475204 > ArrayCopyUnalignedDst.testByte | 5 | 5.427 | 2.702 | 2.008512213 > ArrayCopyUnalignedDst.testByte | 10 | 5.367 | 2.696 | 1.990727003 > ArrayCopyUnalignedDst.testByte | 20 | 5.839 | 2.637 | 2.214258627 > ArrayCopyUnalignedDst.testByte | 70 | 5.888 | 5.968 | 0.986595174 > ArrayCopyUnalignedDst.testByte | 150 | 6.785 | 6.773 | 1.001771741 > 
ArrayCopyUnalignedDst.testByte | 300 | 7.606 | 7.972 | 0.954089313 > ArrayCopyUnalignedDst.testByte | 600 | 11.986 | 21.195 | 0.565510734 > ArrayCopyUnalignedDst.testByte | 1200 | 16.54 | 16.784 | 0.985462345 > ArrayCopyUnalignedDst.testChar | 1 | 5.02 | 2.827 | 1.775733994 > ArrayCopyUnalignedDst.testChar | 3 | 5.131 | 2.762 | 1.857711803 > ArrayCopyUnalignedDst.testChar | 5 | 5.038 | 2.762 | 1.82404055 > ArrayCopyUnalignedDst.testChar | 10 | 5.718 | 2.762 | 2.070238957 > ArrayCopyUnalignedDst.testChar | 20 | 5.113 | 5.401 | 0.946676541 > ArrayCopyUnalignedDst.testChar | 70 | 6.222 | 6.214 | 1.001287416 > ArrayCopyUnalignedDst.testChar | 150 | 7.367 | 8.125 | 0.906707692 > ArrayCopyUnalignedDst.testChar | 300 | 10.204 | 10.082 | 1.012100774 > ArrayCopyUnalignedDst.testChar | 600 | 16.978 | 17.135 | 0.990837467 > ArrayCopyUnalignedDst.testChar | 1200 | 32.351 | 31.996 | 1.011095137 > ArrayCopyUnalignedSrc.testByte | 1 | 5.414 | 2.696 | 2.008160237 > ArrayCopyUnalignedSrc.testByte | 3 | 5.494 | 2.637 | 2.083428138 > ArrayCopyUnalignedSrc.testByte | 5 | 5.431 | 2.637 | 2.059537353 > ArrayCopyUnalignedSrc.testByte | 10 | 5.344 | 2.703 | 1.977062523 > ArrayCopyUnalignedSrc.testByte | 20 | 5.834 | 2.696 | 2.163946588 > ArrayCopyUnalignedSrc.testByte | 70 | 5.883 | 6.009 | 0.979031453 > ArrayCopyUnalignedSrc.testByte | 150 | 6.729 | 6.87 | 0.979475983 > ArrayCopyUnalignedSrc.testByte | 300 | 7.603 | 7.97 | 0.953952321 > ArrayCopyUnalignedSrc.testByte | 600 | 12.004 | 12.16 | 0.987171053 > ArrayCopyUnalignedSrc.testByte | 1200 | 16.534 | 16.643 | 0.9934507 > ArrayCopyUnalignedSrc.testChar | 1 | 5.021 | 2.762 | 1.81788559 > ArrayCopyUnalignedSrc.testChar | 3 | 5.13 | 2.762 | 1.857349747 > ArrayCopyUnalignedSrc.testChar | 5 | 5.042 | 2.827 | 1.783516095 > ArrayCopyUnalignedSrc.testChar | 10 | 5.726 | 2.761 | 2.073886273 > ArrayCopyUnalignedSrc.testChar | 20 | 5.112 | 5.401 | 0.94649139 > ArrayCopyUnalignedSrc.testChar | 70 | 6.113 | 6.227 | 0.981692629 > 
ArrayCopyUnalignedSrc.testChar | 150 | 7.493 | 7.888 | 0.949923935 > ArrayCopyUnalignedSrc.testChar | 300 | 10.234 | 10.501 | 0.97457385 > ArrayCopyUnalignedSrc.testChar | 600 | 17.175 | 17.142 | 1.001925096 > ArrayCopyUnalignedSrc.testChar | 1200 | 31.926 | 31.987 | 0.998092975 > > Detailed Reports: > Baseline : > [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt) > WithOpt : > [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt) Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 - 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions. ------------- Changes: https://git.openjdk.java.net/jdk/pull/302/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=302&range=01 Stats: 516 lines in 23 files changed: 495 ins; 0 del; 21 mod Patch: https://git.openjdk.java.net/jdk/pull/302.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/302/head:pull/302 PR: https://git.openjdk.java.net/jdk/pull/302 From jbhateja at openjdk.java.net Tue Oct 13 18:03:27 2020 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Tue, 13 Oct 2020 18:03:27 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v3] In-Reply-To: References: Message-ID: <94qadtiTzSkdsJAc_8IWrLxpBvmfiBXMf_W9Z965P80=.9a59a5db-2209-4007-94bb-16ccd8ff0b77@github.com> > Summary: > > 1) Partial in-lining technique avoids call overhead penalty for small array copy operations with size less than 32 > bytes. 
2) At runtime, a conditional check based on copy length either calls an array-copy stub or executes an optimized > instruction sequence using AVX-512 masked instructions emitted at the call site. 3) New runtime flag > ArrayCopyPartialInlineSize=0/32(default)/64 bytes determines the maximum size for partial in-lining. 4) Based on the > perf results seen in benchmarks currently partial in-lining is performed only for arraycopy involving sub-word types > (bool/byte/char/short). Once PR-61 gets integrated we can extend this patch to cover all the primitive types. > Performance Results: > System : CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz > Micros : test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java > ArrayCopyPartialInlineSize : 32 > > JMH | Block Size | Baseline (ns/op) | Partial Inling (ns/op) | Gain > -- | -- | -- | -- | -- > ArrayCopyAligned.testByte | 1 | 5.417 | 2.696 | 2.009272997 > ArrayCopyAligned.testByte | 3 | 5.494 | 2.702 | 2.03330866 > ArrayCopyAligned.testByte | 5 | 5.417 | 2.637 | 2.05422829 > ArrayCopyAligned.testByte | 10 | 5.343 | 2.703 | 1.976692564 > ArrayCopyAligned.testByte | 20 | 5.837 | 2.636 | 2.214339909 > ArrayCopyAligned.testByte | 70 | 5.86 | 6 | 0.976666667 > ArrayCopyAligned.testByte | 150 | 6.766 | 6.906 | 0.979727773 > ArrayCopyAligned.testByte | 300 | 7.605 | 7.952 | 0.956363179 > ArrayCopyAligned.testByte | 600 | 11.989 | 12.007 | 0.998500874 > ArrayCopyAligned.testByte | 1200 | 16.447 | 16.585 | 0.991679228 > ArrayCopyAligned.testChar | 1 | 5.02 | 2.828 | 1.775106082 > ArrayCopyAligned.testChar | 3 | 5.129 | 2.762 | 1.85698769 > ArrayCopyAligned.testChar | 5 | 5.041 | 2.762 | 1.82512672 > ArrayCopyAligned.testChar | 10 | 5.716 | 2.762 | 2.069514844 > ArrayCopyAligned.testChar | 20 | 5.111 | 5.399 | 0.946656788 > ArrayCopyAligned.testChar | 70 | 6.271 | 6.242 | 1.004645947 > ArrayCopyAligned.testChar | 150 | 7.45 | 7.599 | 0.980392157 > ArrayCopyAligned.testChar | 300 | 9.904 | 10.112 | 0.97943038 > 
ArrayCopyAligned.testChar | 600 | 17.131 | 17.167 | 0.997902953 > ArrayCopyAligned.testChar | 1200 | 29.556 | 29.851 | 0.990117584 > ArrayCopyUnalignedBoth.testByte | 1 | 5.419 | 2.702 | 2.005551443 > ArrayCopyUnalignedBoth.testByte | 3 | 5.558 | 2.636 | 2.108497724 > ArrayCopyUnalignedBoth.testByte | 5 | 5.43 | 2.636 | 2.059939302 > ArrayCopyUnalignedBoth.testByte | 10 | 5.378 | 2.637 | 2.039438756 > ArrayCopyUnalignedBoth.testByte | 20 | 5.914 | 2.636 | 2.243550836 > ArrayCopyUnalignedBoth.testByte | 70 | 5.882 | 5.954 | 0.987907289 > ArrayCopyUnalignedBoth.testByte | 150 | 6.784 | 6.88 | 0.986046512 > ArrayCopyUnalignedBoth.testByte | 300 | 7.635 | 7.968 | 0.958207831 > ArrayCopyUnalignedBoth.testByte | 600 | 12.226 | 12.129 | 1.007997362 > ArrayCopyUnalignedBoth.testByte | 1200 | 16.992 | 20.717 | 0.820195974 > ArrayCopyUnalignedBoth.testChar | 1 | 5.019 | 2.828 | 1.774752475 > ArrayCopyUnalignedBoth.testChar | 3 | 5.163 | 2.763 | 1.868621064 > ArrayCopyUnalignedBoth.testChar | 5 | 5.042 | 2.827 | 1.783516095 > ArrayCopyUnalignedBoth.testChar | 10 | 5.718 | 2.828 | 2.021923621 > ArrayCopyUnalignedBoth.testChar | 20 | 5.111 | 5.404 | 0.945780903 > ArrayCopyUnalignedBoth.testChar | 70 | 6.367 | 6.235 | 1.02117081 > ArrayCopyUnalignedBoth.testChar | 150 | 7.367 | 8.269 | 0.890917886 > ArrayCopyUnalignedBoth.testChar | 300 | 10.358 | 10.642 | 0.973313287 > ArrayCopyUnalignedBoth.testChar | 600 | 20.84 | 17.522 | 1.189361945 > ArrayCopyUnalignedBoth.testChar | 1200 | 31.895 | 31.892 | 1.000094067 > ArrayCopyUnalignedDst.testByte | 1 | 5.455 | 2.637 | 2.068638604 > ArrayCopyUnalignedDst.testByte | 3 | 5.562 | 2.702 | 2.058475204 > ArrayCopyUnalignedDst.testByte | 5 | 5.427 | 2.702 | 2.008512213 > ArrayCopyUnalignedDst.testByte | 10 | 5.367 | 2.696 | 1.990727003 > ArrayCopyUnalignedDst.testByte | 20 | 5.839 | 2.637 | 2.214258627 > ArrayCopyUnalignedDst.testByte | 70 | 5.888 | 5.968 | 0.986595174 > ArrayCopyUnalignedDst.testByte | 150 | 6.785 | 6.773 | 1.001771741 > 
ArrayCopyUnalignedDst.testByte | 300 | 7.606 | 7.972 | 0.954089313 > ArrayCopyUnalignedDst.testByte | 600 | 11.986 | 21.195 | 0.565510734 > ArrayCopyUnalignedDst.testByte | 1200 | 16.54 | 16.784 | 0.985462345 > ArrayCopyUnalignedDst.testChar | 1 | 5.02 | 2.827 | 1.775733994 > ArrayCopyUnalignedDst.testChar | 3 | 5.131 | 2.762 | 1.857711803 > ArrayCopyUnalignedDst.testChar | 5 | 5.038 | 2.762 | 1.82404055 > ArrayCopyUnalignedDst.testChar | 10 | 5.718 | 2.762 | 2.070238957 > ArrayCopyUnalignedDst.testChar | 20 | 5.113 | 5.401 | 0.946676541 > ArrayCopyUnalignedDst.testChar | 70 | 6.222 | 6.214 | 1.001287416 > ArrayCopyUnalignedDst.testChar | 150 | 7.367 | 8.125 | 0.906707692 > ArrayCopyUnalignedDst.testChar | 300 | 10.204 | 10.082 | 1.012100774 > ArrayCopyUnalignedDst.testChar | 600 | 16.978 | 17.135 | 0.990837467 > ArrayCopyUnalignedDst.testChar | 1200 | 32.351 | 31.996 | 1.011095137 > ArrayCopyUnalignedSrc.testByte | 1 | 5.414 | 2.696 | 2.008160237 > ArrayCopyUnalignedSrc.testByte | 3 | 5.494 | 2.637 | 2.083428138 > ArrayCopyUnalignedSrc.testByte | 5 | 5.431 | 2.637 | 2.059537353 > ArrayCopyUnalignedSrc.testByte | 10 | 5.344 | 2.703 | 1.977062523 > ArrayCopyUnalignedSrc.testByte | 20 | 5.834 | 2.696 | 2.163946588 > ArrayCopyUnalignedSrc.testByte | 70 | 5.883 | 6.009 | 0.979031453 > ArrayCopyUnalignedSrc.testByte | 150 | 6.729 | 6.87 | 0.979475983 > ArrayCopyUnalignedSrc.testByte | 300 | 7.603 | 7.97 | 0.953952321 > ArrayCopyUnalignedSrc.testByte | 600 | 12.004 | 12.16 | 0.987171053 > ArrayCopyUnalignedSrc.testByte | 1200 | 16.534 | 16.643 | 0.9934507 > ArrayCopyUnalignedSrc.testChar | 1 | 5.021 | 2.762 | 1.81788559 > ArrayCopyUnalignedSrc.testChar | 3 | 5.13 | 2.762 | 1.857349747 > ArrayCopyUnalignedSrc.testChar | 5 | 5.042 | 2.827 | 1.783516095 > ArrayCopyUnalignedSrc.testChar | 10 | 5.726 | 2.761 | 2.073886273 > ArrayCopyUnalignedSrc.testChar | 20 | 5.112 | 5.401 | 0.94649139 > ArrayCopyUnalignedSrc.testChar | 70 | 6.113 | 6.227 | 0.981692629 > 
ArrayCopyUnalignedSrc.testChar | 150 | 7.493 | 7.888 | 0.949923935 > ArrayCopyUnalignedSrc.testChar | 300 | 10.234 | 10.501 | 0.97457385 > ArrayCopyUnalignedSrc.testChar | 600 | 17.175 | 17.142 | 1.001925096 > ArrayCopyUnalignedSrc.testChar | 1200 | 31.926 | 31.987 | 0.998092975 > > Detailed Reports: > Baseline : > [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt) > WithOpt : > [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt) Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Replacing explicit type checks with existing type checking routines ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/302/files - new: https://git.openjdk.java.net/jdk/pull/302/files/9ab77283..2679fe66 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=302&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=302&range=01-02 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/302.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/302/head:pull/302 PR: https://git.openjdk.java.net/jdk/pull/302 From iklam at openjdk.java.net Tue Oct 13 18:15:12 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 13 Oct 2020 18:15:12 GMT Subject: RFR: 8253402: Convert vmSymbols::SID to enum class [v4] In-Reply-To: References: <9m7uFY5ij94oj3SQ9pTHNq-tsw0NnPHDVqHhznmAuOo=.bc75f19e-0527-4962-8b19-b178cfa8e572@github.com> Message-ID: On Tue, 13 Oct 2020 15:42:41 GMT, Kim Barrett wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> added missing #include from enumIterator.hpp > > src/hotspot/share/classfile/vmSymbols.hpp line 732: > >> 730: >> 731: enum { >> 732: log2_SID_LIMIT = 11 // 
checked by an assert at start-up > > [pre-existing, could be followup] > Why isn't this checked with a static assert right here? > `SID_LIMIT <= (1 << log2_SID_LIMIT)` > is (and always has been) compile-time checkable. And log2_SID_LIMIT should no longer be an enum. (Note that the value > isn't currently constexpr-calculable because round_up_power_of_2 can't currently be constexpr.) The existing code groups several asserts together. I think this makes the intention clearer. I'll leave it for now. void vmSymbols::initialize(TRAPS) { assert(SID_LIMIT <= (1< (1< References: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> <7-p-Kc9lQyyuoWdNtmgbXbwkxsgk4oQGKmFSCcMpvnU=.97810c01-3200-4767-bbd4-35d53c2bc5ca@github.com> <6Voyfr_s-ieyRA-8Rtvvpz7tkhhicA8sY2d2KTp3Kmw=.fa256bae-2143-4b43-bfea-5837ad31eb6a@github.com> Message-ID: On Mon, 12 Oct 2020 22:00:24 GMT, CoreyAshford wrote: >> This latest push passes the intrinsic regression test. I had run the intrinsic TestBase64 regression test on the >> previous push, but not the one in utils. Interesting. Somehow it didn't occur to me that there could be a problem >> there if the intrinsic TestBase64 test passed. I will check into the other regression test. Don't review this latest >> push just yet. > > Ok, all is clear. I just ran `jdk/java/util/Base64/TestBase64.java` which passes as well. Please review again when > convenient. Hi Corey, thanks for taking some stuff out of the "too short" path. There may be a performance regression when decoding many short arrays because of the stub call overhead and the usage of the slower part of the Java implementation. We could do it a little better in many cases to compute the maximum possible iteration count i: i = (sl - sp) / block_size if (i * block_size > sl - 12) i-- if (i <= 0) return 0 What do you think? I don't think branch prediction hints are helpful for the "too short" check.
And we should better use CCR1 instead of CCR2 which is specified as non-volatile. Did you already find a 2nd reviewer for the PPC64 part? Best regards, Martin From: CoreyAshford Sent: Tuesday, 13 October 2020 00:01 To: openjdk/jdk Cc: Doerr, Martin ; Mention Subject: Re: [openjdk/jdk] 8248188: Add IntrinsicCandidate and API for Base64 decoding (#293) Ok, all is clear. I just ran jdk/java/util/Base64/TestBase64.java which passes as well. Please review again when convenient. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From stuefe at openjdk.java.net Tue Oct 13 20:06:18 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 13 Oct 2020 20:06:18 GMT Subject: RFR: 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails In-Reply-To: References: Message-ID: On Tue, 13 Oct 2020 10:00:37 GMT, Yasumasa Suenaga wrote: > Originally filed at AdoptOpenJDK: > https://github.com/AdoptOpenJDK/openjdk-tests/issues/1162 > > The test fails on 32bit windows with: > > java.lang.IllegalStateException: WB_IncMetaspaceCapacityUntilGC: could not increase capacity until GC due to contention > with another thread > at sun.hotspot.WhiteBox.incMetaspaceCapacityUntilGC(Native Method) > at TestCapacityUntilGCWrapAround.main(TestCapacityUntilGCWrapAround.java:51) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) > at java.lang.Thread.run(Thread.java:748) > > `TestCapacityUntilGCWrapAround` passes `4GB - 1` to `incMetaspaceCapacityUntilGC()`. It seems to be too big.
> And also this code seems to want to check the behavior when `_capacity_until_gc` overflows. The white box test would > throw ISE when that happens. So we need to handle it correctly. Hi Yasumasa, sorry, I do not understand what the error is yet, and to me the patch seems wrong in a number of places. Both the test and the whitebox function seem to be doing what they are supposed to do. Is this error intermittent or always reproducible? If the latter, since when (the code seems old)? Have the AdoptOpenJDK folks done some pre-analysis? ---- From just looking at the code (I cannot build 32-bit right now), WB_IncMetaspaceCapacityUntilGC() seems correct to me. - in comes a jlong with 4G-1, so FFFF-FFFF - it fits into a 32-bit size_t so it's fine - we calculate aligned_inc by aligning the value down somewhat, probably a page: FFFF-F000 - we feed this value FFFF-F000 as *inc* value to MetaspaceGC::inc_capacity_until_GC - it returns false. We throw an exception. In MetaspaceGC::inc_capacity_until_GC: bool MetaspaceGC::inc_capacity_until_GC(size_t v, size_t* new_cap_until_GC, size_t* old_cap_until_GC, bool* can_retry) { assert_is_aligned(v, Metaspace::commit_alignment()); size_t old_capacity_until_GC = _capacity_until_GC; size_t new_value = old_capacity_until_GC + v; if (new_value < old_capacity_until_GC) { // The addition wrapped around, set new_value to aligned max value. new_value = align_down(max_uintx, Metaspace::commit_alignment()); } if (new_value > MaxMetaspaceSize) { if (can_retry != NULL) { *can_retry = false; } return false; } We enter with size_t v = FFFF-F000. We calculate the new value. Since new value = old value + inc, this will overflow. We detect the overflow and correct the new value to be max_uintx aligned down by commit alignment, which will be smaller than MaxMetaspaceSize: the test does not set it, therefore it is max_uintx. So far all good IIUC.
if (can_retry != NULL) { *can_retry = true; } size_t prev_value = Atomic::cmpxchg(&_capacity_until_GC, old_capacity_until_GC, new_value); if (old_capacity_until_GC != prev_value) { return false; } if (new_cap_until_GC != NULL) { *new_cap_until_GC = new_value; } if (old_cap_until_GC != NULL) { *old_cap_until_GC = old_capacity_until_GC; } return true; From looking at this code, the only way I can see this function returning false would be if a concurrent thread modified this threshold, which should be super rare. Hence my initial question about whether the failure is intermittent. If it is reproducible, I do not think I understand it yet. Thanks, Thomas src/hotspot/share/prims/whitebox.cpp line 1745: > 1743: WB_ENTRY(jlong, WB_IncMetaspaceCapacityUntilGC(JNIEnv* env, jobject wb, jlong inc)) > 1744: size_t max_size_t = (size_t) -1; > 1745: if ((size_t) inc > max_size_t) { Sorry, I think this is just wrong; the original code is correct (besides the use of -1 instead of SIZE_MAX, which would have been nicer). size_t is an unsigned 32-bit type on 32-bit platforms, so the comparison will always be false. The original code used a 64-bit jlong for the comparison, which is correct. Also, the first comparison for inc < 0 is needed to make the second comparison work. test/hotspot/jtreg/gc/metaspace/TestCapacityUntilGCWrapAround.java line 32: > 30: * @modules java.base/jdk.internal.misc > 31: * java.management > 32: * @requires vm.bits == 32 This test is for a specific 32-bit condition; it does not make sense on 64-bit. test/hotspot/jtreg/gc/metaspace/TestCapacityUntilGCWrapAround.java line 46: > 44: private static void incMetaspaceCapacityUntilGCTest(WhiteBox wb) { > 45: long before = wb.metaspaceCapacityUntilGC(); > 46: long after = wb.incMetaspaceCapacityUntilGC(100 * MB); The test only makes sense with a 4G increment. AFAIU the counter is supposed to overflow, and cap at MaxMetaspaceSize instead. ------------- Changes requested by stuefe (Reviewer).
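The wrap-around handling and the range check discussed in this review can be tried out in isolation. The following is a minimal standalone sketch and not the HotSpot code: `size32_t`, `align_down_to`, `inc_capacity_clamped` and `fits_in_size_t` are hypothetical names, a 32-bit `size_t` is modelled with `uint32_t` so the behavior can be reproduced on a 64-bit host, and a 4K commit alignment is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for size_t on a 32-bit platform.
typedef uint32_t size32_t;

// Align value down to a power-of-two alignment (assumed here: 4K).
static size32_t align_down_to(size32_t value, size32_t alignment) {
  return value & ~(alignment - 1);
}

// Same shape as the overflow handling quoted in the review: if old + inc
// wraps around, clamp to the aligned maximum instead of keeping the small
// wrapped value.
static size32_t inc_capacity_clamped(size32_t old_cap, size32_t inc,
                                     size32_t alignment) {
  size32_t new_value = old_cap + inc;  // unsigned addition wraps modulo 2^32
  if (new_value < old_cap) {           // the addition wrapped around
    new_value = align_down_to(UINT32_MAX, alignment);
  }
  return new_value;
}

// Range check performed on the 64-bit signed value before any narrowing.
// Casting first, as in `(size32_t) inc > max`, can never be true because
// the cast result is already limited to the 32-bit range.
static bool fits_in_size_t(int64_t inc) {
  return inc >= 0 && inc <= (int64_t) UINT32_MAX;
}
```

With an old capacity of `0x01000000` and an increment of `0xFFFFF000`, the raw sum wraps to `0x00FFF000`, and the clamp replaces it with `0xFFFFF000`.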
PR: https://git.openjdk.java.net/jdk/pull/628 From iklam at openjdk.java.net Tue Oct 13 20:51:26 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 13 Oct 2020 20:51:26 GMT Subject: RFR: 8253402: Convert vmSymbols::SID to enum class [v5] In-Reply-To: References: Message-ID: > Convert `vmSymbols::SID` to an `enum class` to provide better type safety. > > - The original enum type `vmSymbols::SID` cannot be forward-declared. I moved it out of the `vmSymbols` class and > renamed, so now it can be forward-declared as `enum class vmSymbolID : int;`, without including the large vmSymbols.hpp > file. > - This also breaks the mutual dependency between the `vmSymbols` and `vmIntrinsics` classes. Now the declaration of > `vmIntrinsics` can be moved from vmSymbols.hpp to vmIntrinsics.hpp, where it naturally belongs. > - Type-safe enumeration (contributed by Kim Barrett) > for (vmSymbolsIterator it = vmSymbolsRange.begin(); it != vmSymbolsRange.end(); ++it) { > vmSymbolID index = *it; .... > } > - I moved `vmSymbols::_symbols[]` to `Symbol::_vm_symbols[]`, and made it accessible via `Symbol::vm_symbol_at()`. This > way, header files (e.g. fieldInfo.hpp) that need to convert from `vmSymbolID` to `Symbol*` don't need to include the > large vmSymbols.hpp file. > - I changed the `VM_SYMBOL_ENUM_NAME` macro so that the users don't need to explicitly add the `vmSymbolID::` scope. > - I removed many unnecessary casts between `int` and `vmSymbolID`. > - The remaining casts are done via `vmSymbol::as_int()` and `vmSymbols::as_SID()` with range checks. > > ----- > If this is successful, I will do the same for `vmIntrinsics::ID`. 
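The forward-declaration and range-checked cast points in the description above can be sketched as follows. This is a reduced illustration under stated assumptions, not the code from the PR: `SymbolID`, its enumerators, `is_valid_index`, `as_int`, `as_symbol_id`, `symbol_name` and `count_symbols` are all made up for this example.

```cpp
#include <cassert>

// Opaque declaration: allowed because the underlying type is fixed, so other
// headers can mention the type without seeing the enumerator list.
enum class SymbolID : int;

// Declared using only the forward declaration above.
const char* symbol_name(SymbolID id);

// Full definition (enumerators are invented for this sketch).
enum class SymbolID : int {
  object_name = 0,
  string_name,
  thread_name,
  LIMIT  // one past the last valid ID
};

// Range-checked conversions replace ad-hoc casts between int and the enum.
constexpr bool is_valid_index(int index) {
  return index >= 0 && index < static_cast<int>(SymbolID::LIMIT);
}

inline int as_int(SymbolID id) {
  int index = static_cast<int>(id);
  assert(is_valid_index(index));
  return index;
}

inline SymbolID as_symbol_id(int index) {
  assert(is_valid_index(index));
  return static_cast<SymbolID>(index);
}

const char* symbol_name(SymbolID id) {
  static const char* const names[] = {
    "java/lang/Object", "java/lang/String", "java/lang/Thread"
  };
  return names[as_int(id)];
}

// Walks every valid ID through the checked conversion and counts them.
// A plain `SymbolID id = 1;` would not compile with a scoped enum.
inline int count_symbols() {
  int count = 0;
  for (int i = 0; is_valid_index(i); ++i) {
    SymbolID id = as_symbol_id(i);
    (void) id;
    ++count;
  }
  return count;
}
```

The type safety comes from the scoped enum refusing implicit conversions to and from `int`, so every crossing between the two has to go through a helper where a range check can live.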
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: Addressed review comments by Kim Barrett ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/276/files - new: https://git.openjdk.java.net/jdk/pull/276/files/9ddca08f..662ad8b2 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=276&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=276&range=03-04 Stats: 13 lines in 2 files changed: 4 ins; 1 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/276.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/276/head:pull/276 PR: https://git.openjdk.java.net/jdk/pull/276 From iklam at openjdk.java.net Tue Oct 13 20:51:27 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 13 Oct 2020 20:51:27 GMT Subject: RFR: 8253402: Convert vmSymbols::SID to enum class [v4] In-Reply-To: References: <9m7uFY5ij94oj3SQ9pTHNq-tsw0NnPHDVqHhznmAuOo=.bc75f19e-0527-4962-8b19-b178cfa8e572@github.com> Message-ID: On Tue, 13 Oct 2020 15:35:45 GMT, Kim Barrett wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> added missing #include from enumIterator.hpp > > src/hotspot/share/classfile/vmSymbols.hpp line 728: > >> 726: >> 727: static constexpr int number_of_symbols() { >> 728: return SID_LIMIT; > > SID_LIMIT includes NO_SID, so this seems to be off-by-one. I changed NO_SID to -1 so SID_LIMIT no longer includes NO_SID. 
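The effect of moving the sentinel to -1, together with the compile-time check suggested in the review, can be sketched in miniature. This is an illustration only: `MiniSID`, its enumerators and the value chosen for `log2_SID_LIMIT` are assumptions for this example and do not reflect the actual contents of vmSymbols.hpp.

```cpp
#include <cassert>

// Hypothetical miniature of the ID enum. With the "no symbol" sentinel at -1,
// valid IDs run from 0 to SID_LIMIT - 1 and SID_LIMIT equals the symbol count.
enum class MiniSID : int {
  NO_SID = -1,
  FIRST_SID = 0,
  sym_a = FIRST_SID,
  sym_b,
  sym_c,
  SID_LIMIT
};

constexpr int number_of_symbols() {
  return static_cast<int>(MiniSID::SID_LIMIT);
}

// Number of bits reserved to encode an ID (made-up value for this sketch).
constexpr int log2_SID_LIMIT = 2;

// Compile-time form of the start-up assert discussed in the review.
static_assert(number_of_symbols() <= (1 << log2_SID_LIMIT),
              "log2_SID_LIMIT is too small to encode every symbol ID");
```

Adding a fifth symbol to this sketch without raising `log2_SID_LIMIT` would fail at compile time rather than at VM start-up.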
------------- PR: https://git.openjdk.java.net/jdk/pull/276 From github.com+51754783+coreyashford at openjdk.java.net Tue Oct 13 21:02:21 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Tue, 13 Oct 2020 21:02:21 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4] In-Reply-To: References: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> <7-p-Kc9lQyyuoWdNtmgbXbwkxsgk4oQGKmFSCcMpvnU=.97810c01-3200-4767-bbd4-35d53c2bc5ca@github.com> <6Voyfr_s-ieyRA-8Rtvvpz7tkhhicA8sY2d2KTp3Kmw=.fa256bae-2143-4b43-bfea-5837ad31eb6a@github.com> Message-ID: On Tue, 13 Oct 2020 19:56:42 GMT, Martin Doerr wrote: > Hi Corey, thanks for taking some stuff out of the "too short" path. There may be a performance regression when decoding > many short arrays because of the stub call overhead and the usage of the slower part of the Java implementation. We > could do it a little better in many cases to compute the maximum possible iteration count i: i = (sl - sp) / block_size > if (i * block_size > sl - 12) i-- if (i <= 0) return 0 What do you think? Are you thinking of a case where that produces a higher iteration count? It looks effectively the same to me. > I don't think branch prediction hints are helpful for the "too short" check. My thinking is that most of the time when the intrinsic is called, it will not take the early exit, but I suppose when it is processing a sub-block_size buffer, it will return early every time. I will remove the hints. > And we should better use CCR1 instead of CCR2 which is specified as non-volatile. Ah, I should have checked the calling conventions. I thought all of the CR* regs are volatile. I will fix that. > Did you already find a 2nd reviewer for the PPC64 part? Your original comment said "2nd review", so I thought you meant you need to review it again after the changes. So, no, I haven't looked for or found a second reviewer. Any suggestions?
The folks on the team here have been busy with other work. Btw, I'm off today, so I will push commits to the above-mentioned issues tomorrow. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From github.com+69653380+katyapav at openjdk.java.net Tue Oct 13 22:13:18 2020 From: github.com+69653380+katyapav at openjdk.java.net (Ekaterina Pavlova) Date: Tue, 13 Oct 2020 22:13:18 GMT Subject: RFR: 8223347: Integration of Vector API (Incubator) [v4] In-Reply-To: References: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> Message-ID: <1Nola-xPBRFEk3kdLVEoJ1sO8U4OpFldV2sq_Ta4Jxs=.075fcfb4-ebba-4351-9265-e41066b5da96@github.com> On Mon, 12 Oct 2020 12:56:10 GMT, Erik Joelsson wrote: >> Paul Sandoz has updated the pull request with a new target base due to a merge or a rebase. The pull request now >> contains ten commits: >> - Merge master >> - Fix related to merge >> - HotspotIntrinsicCandidate to IntrinsicCandidate >> - Merge master >> - Fix permissions >> - Fix permissions >> - Merge master >> - Vector API new files >> - Integration of Vector API (Incubator) > > Build changes look good. There are several gc tests crashed in panama-vector tier3 testing which seems are not observed in openjdk repo. The crashes look like: # assert(oopDesc::is_oop(obj)) failed: not an oop: 0xfffffffffffffff1 # # JRE version: Java(TM) SE Runtime Environment (16.0+3) (fastdebug build 16-panama+3-216) # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 16-panama+3-216, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) # Problematic frame: # V [libjvm.so+0xd8ef94] HandleArea::allocate_handle(oop)+0x144 and the issue is actually tracked by JDK-8233199. This issue needs to be at least analyzed before integrating Vector API. 
------------- PR: https://git.openjdk.java.net/jdk/pull/367 From luhenry at microsoft.com Tue Oct 13 22:34:10 2020 From: luhenry at microsoft.com (Ludovic Henry) Date: Tue, 13 Oct 2020 22:34:10 +0000 Subject: [aarch64-port-dev ] [jdk11u] 8253947: Implementation: JEP 388: Windows AArch64 Support In-Reply-To: <88ffa3cc-2a72-2211-f69b-9fd281d66b75@redhat.com> References: <99a84e5c-0838-7281-8eed-f6bf7c6342f1@redhat.com> <957d85208f5d4bcab0d83d3f65ee4995@azul.com> <88ffa3cc-2a72-2211-f69b-9fd281d66b75@redhat.com> Message-ID: Hi, To more easily share status and progress, I've pushed our patches to openjdk/aarch64-port:jdk11-windows [1]. Any feedback and PRs are welcome. > Note that I am *not* ruling out AArch64 support in JDK 11. If it can be done cleanly and safely that will be great. However, to begin with, it will not go into OpenJDK 11u. The stability of the main release branch of OpenJDK is far too important for it to be broken by an untested new port. Could you define more specifically what you mean by "done cleanly and safely"? Overall, I understand your cautious approach, as the last thing we want is to destabilize JDK 11. I'll keep following up periodically to make sure that this gets merged at some point, but I understand the pushback and why the bar is set so high. Thank you, Ludovic [1] https://github.com/openjdk/aarch64-port/commits/jdk11-windows From per.liden at oracle.com Tue Oct 13 22:47:10 2020 From: per.liden at oracle.com (Per Liden) Date: Wed, 14 Oct 2020 00:47:10 +0200 Subject: Re: CFV: New HotSpot Group Member: Erik Österlund In-Reply-To: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> References: <21E7AB55-21D1-4C13-8B22-8C1FE2B60FD5@oracle.com> Message-ID: <7c806b4d-6bb5-8b2d-e8df-489bdab91a7c@oracle.com> Vote: yes /Per On 10/8/20 11:24 AM, Kim Barrett wrote: > I hereby nominate Erik Österlund to Membership in the HotSpot Group.
> > Erik has been a JDK Reviewer and member of the Oracle GC team for several > years, currently working on ZGC, though his reach and influence extends > significantly beyond that project. He has made many substantial > contributions [1] including (most recently) JEP 376: ZGC: Concurrent > Thread-Stack Processing. > > Votes are due by Friday, 23-Oct-2020 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. > > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Erik+%C3%96sterlund%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From kvn at openjdk.java.net Wed Oct 14 00:28:23 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 14 Oct 2020 00:28:23 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v10] In-Reply-To: References: Message-ID: On Sat, 10 Oct 2020 08:34:23 GMT, Richard Reingruber wrote: >> Hi, >> >> this is the continuation of the review of the implementation for: >> >> https://bugs.openjdk.java.net/browse/JDK-8227745 >> https://bugs.openjdk.java.net/browse/JDK-8233915 >> >> It allows for JIT optimizations based on escape analysis even if JVMTI agents acquire capabilities to access references >> to objects that are subject to such optimizations, e.g. scalar replacement. The implementation reverts such >> optimizations just before access very much as when switching from JIT compiled execution to the interpreter, aka >> "deoptimization". Webrev.8 was the last one before before the transition to Git/Github: >> >> http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ >> >> Thanks, Richard. > > Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now > contains 21 commits: > - The constructor of StackFrameStream takes more parameters after JDK-8253180 > - Merge branch 'master' into JDK-8227745 > - Merge branch 'master' into JDK-8227745 > - Merge branch 'master' into JDK-8227745 > - Factorized fragment out of EscapeBarrier::deoptimize_objects_internal into new method in compiledVFrame. > - More smaller changes proposed by Serguei. > - jvmtiDeferredUpdates.hpp: remove forward declarations. > - jvmtiDeferredLocalVariable: move member variables to the beginning of the class definition. > - jvmtiDeferredUpdates.hpp: add/remove empty lines and improve indentation. > - Merge branch 'master' into JDK-8227745 > - ... and 11 more: https://git.openjdk.java.net/jdk/compare/aaa0a2a0...06b139a9 Good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/119 From sviswanathan at openjdk.java.net Wed Oct 14 00:37:22 2020 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Wed, 14 Oct 2020 00:37:22 GMT Subject: RFR: 8223347: Integration of Vector API (Incubator) [v4] In-Reply-To: <1Nola-xPBRFEk3kdLVEoJ1sO8U4OpFldV2sq_Ta4Jxs=.075fcfb4-ebba-4351-9265-e41066b5da96@github.com> References: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> <1Nola-xPBRFEk3kdLVEoJ1sO8U4OpFldV2sq_Ta4Jxs=.075fcfb4-ebba-4351-9265-e41066b5da96@github.com> Message-ID: On Tue, 13 Oct 2020 21:29:52 GMT, Ekaterina Pavlova wrote: >> Build changes look good. > > There are several gc tests crashed in panama-vector tier3 testing which seems are not observed in openjdk repo. 
> The crashes look like: > # assert(oopDesc::is_oop(obj)) failed: not an oop: 0xfffffffffffffff1 > # > # JRE version: Java(TM) SE Runtime Environment (16.0+3) (fastdebug build 16-panama+3-216) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 16-panama+3-216, mixed mode, sharing, tiered, compressed oops, > g1 gc, linux-amd64) # Problematic frame: > # V [libjvm.so+0xd8ef94] HandleArea::allocate_handle(oop)+0x144 > > and the issue is actually tracked by JDK-8233199. > > This issue needs to be at least analyzed before integrating Vector API. @katyapav Is the failure observed on vector-unstable branch of panama-vector? The code in this pull request is from vector-unstable branch. The bug report https://bugs.openjdk.java.net/browse/JDK-8233199 refers to repo-valhalla and not panama-vector:vector-unstable. @PaulSandoz is doing final testing of the pull request today before integration tomorrow hopefully. ------------- PR: https://git.openjdk.java.net/jdk/pull/367 From github.com+69653380+katyapav at openjdk.java.net Wed Oct 14 00:50:25 2020 From: github.com+69653380+katyapav at openjdk.java.net (Ekaterina Pavlova) Date: Wed, 14 Oct 2020 00:50:25 GMT Subject: RFR: 8223347: Integration of Vector API (Incubator) [v4] In-Reply-To: References: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> <1Nola-xPBRFEk3kdLVEoJ1sO8U4OpFldV2sq_Ta4Jxs=.075fcfb4-ebba-4351-9265-e41066b5da96@github.com> Message-ID: <57GgbYK9zqtp_hlgSwgHG-vN0th0LEmLksSntjQ7mW8=.3c2a2d5f-e407-41ff-a2b1-3d89043144a2@github.com> On Wed, 14 Oct 2020 00:34:04 GMT, Sandhya Viswanathan wrote: >> There are several gc tests crashed in panama-vector tier3 testing which seems are not observed in openjdk repo. 
>> The crashes look like: >> # assert(oopDesc::is_oop(obj)) failed: not an oop: 0xfffffffffffffff1 >> # >> # JRE version: Java(TM) SE Runtime Environment (16.0+3) (fastdebug build 16-panama+3-216) >> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 16-panama+3-216, mixed mode, sharing, tiered, compressed oops, >> g1 gc, linux-amd64) # Problematic frame: >> # V [libjvm.so+0xd8ef94] HandleArea::allocate_handle(oop)+0x144 >> >> and the issue is actually tracked by JDK-8233199. >> >> This issue needs to be at least analyzed before integrating Vector API. > > @katyapav Is the failure observed on vector-unstable branch of panama-vector? > The code in this pull request is from vector-unstable branch. > The bug report https://bugs.openjdk.java.net/browse/JDK-8233199 refers to repo-valhalla and not > panama-vector:vector-unstable. @PaulSandoz is doing final testing of the pull request today before integration tomorrow > hopefully. @sviswa7 you are right, the failure is observed on vector-unstable branch of panama-vector. I referred to JDK-8233199 because it seems both panama-vector and valhalla-repo have the same issue/crash. @PaulSandoz also mentioned that panama-vector was not in sync with master and this is perhaps the issue is in vector-unstable. He said that he tested the PR separately and didn't observe this issue in the PR. ------------- PR: https://git.openjdk.java.net/jdk/pull/367 From kbarrett at openjdk.java.net Wed Oct 14 02:07:14 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 14 Oct 2020 02:07:14 GMT Subject: RFR: 8253402: Convert vmSymbols::SID to enum class [v5] In-Reply-To: References: Message-ID: On Tue, 13 Oct 2020 20:51:26 GMT, Ioi Lam wrote: >> Convert `vmSymbols::SID` to an `enum class` to provide better type safety. >> >> - The original enum type `vmSymbols::SID` cannot be forward-declared. 
I moved it out of the `vmSymbols` class and >> renamed, so now it can be forward-declared as `enum class vmSymbolID : int;`, without including the large vmSymbols.hpp >> file. >> - This also breaks the mutual dependency between the `vmSymbols` and `vmIntrinsics` classes. Now the declaration of >> `vmIntrinsics` can be moved from vmSymbols.hpp to vmIntrinsics.hpp, where it naturally belongs. >> - Type-safe enumeration (contributed by Kim Barrett) >> for (vmSymbolsIterator it = vmSymbolsRange.begin(); it != vmSymbolsRange.end(); ++it) { >> vmSymbolID index = *it; .... >> } >> - I moved `vmSymbols::_symbols[]` to `Symbol::_vm_symbols[]`, and made it accessible via `Symbol::vm_symbol_at()`. This >> way, header files (e.g. fieldInfo.hpp) that need to convert from `vmSymbolID` to `Symbol*` don't need to include the >> large vmSymbols.hpp file. >> - I changed the `VM_SYMBOL_ENUM_NAME` macro so that the users don't need to explicitly add the `vmSymbolID::` scope. >> - I removed many unnecessary casts between `int` and `vmSymbolID`. >> - The remaining casts are done via `vmSymbol::as_int()` and `vmSymbols::as_SID()` with range checks. >> >> ----- >> If this is successful, I will do the same for `vmIntrinsics::ID`. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Addressed review comments by Kim Barrett Marked as reviewed by kbarrett (Reviewer). 
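A minimal standalone sketch of the forward-declaration and range-checked conversion ideas from the description above. This is not the HotSpot code: the enumerator names other than `NO_SID` and `SID_LIMIT`, the `FieldNameRef` struct, and the free functions `as_int` and `as_vm_symbol_id` are made up for illustration.

```cpp
#include <cassert>

// An enum class with a fixed underlying type can be forward-declared,
// so a header can name the type without pulling in the full definition.
enum class vmSymbolID : int;

// Hypothetical header-side user: only the forward declaration is needed here.
struct FieldNameRef {
  vmSymbolID name_id;
};

// The full definition (in HotSpot this would live in the large header).
// The two symbol enumerators are placeholders, not real vmSymbols entries.
enum class vmSymbolID : int {
  NO_SID = 0,        // 0, matching the existing code
  example_symbol_a,
  example_symbol_b,
  SID_LIMIT
};

// Range-checked conversions, standing in for the as_int()/as_SID() helpers
// mentioned in the description (names and checks here are assumptions).
inline int as_int(vmSymbolID id) {
  int value = static_cast<int>(id);
  assert(value >= 0 && value < static_cast<int>(vmSymbolID::SID_LIMIT));
  return value;
}

inline vmSymbolID as_vm_symbol_id(int value) {
  assert(value >= 0 && value < static_cast<int>(vmSymbolID::SID_LIMIT));
  return static_cast<vmSymbolID>(value);
}
```

An implicit conversion such as `int i = vmSymbolID::NO_SID;` does not compile with an `enum class`, which is the type-safety gain the description refers to.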
------------- PR: https://git.openjdk.java.net/jdk/pull/276 From github.com+51754783+coreyashford at openjdk.java.net Wed Oct 14 02:08:18 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Wed, 14 Oct 2020 02:08:18 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4] In-Reply-To: References: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> <7-p-Kc9lQyyuoWdNtmgbXbwkxsgk4oQGKmFSCcMpvnU=.97810c01-3200-4767-bbd4-35d53c2bc5ca@github.com> <6Voyfr_s-ieyRA-8Rtvvpz7tkhhicA8sY2d2KTp3Kmw=.fa256bae-2143-4b43-bfea-5837ad31eb6a@github.com> Message-ID: On Tue, 13 Oct 2020 20:59:01 GMT, CoreyAshford wrote: > > > Did you already find a 2nd reviewer for the PPC64 part? > > Your original comment said "2nd review", so I thought you meant you need to review it again after the changes. So, no, > I haven't looked for or found a second reviewer. Any suggestions? The folks on the team here have been busy with other > work. I am actively asking for some help here, so maybe within a few days I can get a 2nd reviewer. 
------------- PR: https://git.openjdk.java.net/jdk/pull/293 From ysuenaga at openjdk.java.net Wed Oct 14 03:20:15 2020 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 14 Oct 2020 03:20:15 GMT Subject: RFR: 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails In-Reply-To: References: Message-ID: On Tue, 13 Oct 2020 20:02:56 GMT, Thomas Stuefe wrote: >> Originally filed at AdoptOpenJDK: >> https://github.com/AdoptOpenJDK/openjdk-tests/issues/1162 >> >> The test fails on 32bit windows with: >> >> java.lang.IllegalStateException: WB_IncMetaspaceCapacityUntilGC: could not increase capacity until GC due to contention >> with another thread >> at sun.hotspot.WhiteBox.incMetaspaceCapacityUntilGC(Native Method) >> at TestCapacityUntilGCWrapAround.main(TestCapacityUntilGCWrapAround.java:51) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> at java.lang.reflect.Method.invoke(Method.java:498) >> at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) >> at java.lang.Thread.run(Thread.java:748) >> >> `TestCapacityUntilGCWrapAround` passes `4GB - 1` to `incMetaspaceCapacityUntilGC()`. It seems to be too big. >> And also this code seems to want to check the behavior when `_capacity_until_gc` is overflown. White box test would >> throw ISE when it hapen. So we need to handle it correctly. > > Hi Yasumasa, > > sorry, I do not understand what the error is yet and to me the patch does seem wrong in a number of places. > > Both test and whitebox function seem to be doing what they are supposed to do. > > Is this error intermittent or always reproducable? If the latter, since when (since the code seems old)? > > Have the AdoptOpenJDK folks done some pre-analysis? 
> > ---- > > From just looking at the code (cannot build 32bit right now), WB_IncMetaspaceCapacityUntilGC() seems correct to me. > > - in comes a jlong with 4G-1, so FFFF-FFFF > - it fits into a 32bit size_t so its fine > - we calculate aligned_inc by aligning the value down somewhat, probably a page: FFFF-F000 > - we feed this value FFFF-F000 as *inc* value to MetaspaceGC::inc_capacity_until_GC > - it returns false. We throw an exception. > > In MetaspaceGC::inc_capacity_until_GC: > > bool MetaspaceGC::inc_capacity_until_GC(size_t v, size_t* new_cap_until_GC, size_t* old_cap_until_GC, bool* can_retry) { > assert_is_aligned(v, Metaspace::commit_alignment()); > > size_t old_capacity_until_GC = _capacity_until_GC; > size_t new_value = old_capacity_until_GC + v; > > if (new_value < old_capacity_until_GC) { > // The addition wrapped around, set new_value to aligned max value. > new_value = align_down(max_uintx, Metaspace::commit_alignment()); > } > > if (new_value > MaxMetaspaceSize) { > if (can_retry != NULL) { > *can_retry = false; > } > return false; > } > > we enter with size_t v = FFFF-F000 > > We calc the new value. Since new value = old value + inc, this will overflow > > We sense the overflow and correct the new value to be max_uintx aligned down by commit alignment. > > Which will be smaller than MaxMetaspaceSize. Test does not set it therefore it is max_uintx. > > So far all good IIUC. > > > if (can_retry != NULL) { > *can_retry = true; > } > size_t prev_value = Atomic::cmpxchg(&_capacity_until_GC, old_capacity_until_GC, new_value); > > if (old_capacity_until_GC != prev_value) { > return false; > } > > if (new_cap_until_GC != NULL) { > *new_cap_until_GC = new_value; > } > if (old_cap_until_GC != NULL) { > *old_cap_until_GC = old_capacity_until_GC; > } > return true; > > From looking at this code, the only way I can see this function returning false would be if a concurrent thread > modified this threshold. Which should be super rare. 
> Hence my initial question about intermittentness. If it's reproducible I do not think I understand it yet. > > Thanks, Thomas I think TestCapacityUntilGCWrapAround.java has two problems: 1. It might attempt to increase metaspace size over MaxMetaspaceSize (this PR fixes it) 2. Overflow test would always fail because `MetaspaceGC::inc_capacity_until_GC()` always returns `false` This test was introduced in JDK-8049599. I cannot see the details because the JBS ticket is closed, but we can see its commit: https://github.com/openjdk/jdk/commit/6f4355a3a6be2482c388c2022fc7c3c4b2f481c5 I enabled this test on 64bit platforms because these problems might happen on them. If you agree with the fix for 1., I will add the test for 2. to this PR. ------------- PR: https://git.openjdk.java.net/jdk/pull/628 From iklam at openjdk.java.net Wed Oct 14 03:23:24 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 14 Oct 2020 03:23:24 GMT Subject: RFR: 8253402: Convert vmSymbols::SID to enum class [v6] In-Reply-To: References: Message-ID: > Convert `vmSymbols::SID` to an `enum class` to provide better type safety. > > - The original enum type `vmSymbols::SID` cannot be forward-declared. I moved it out of the `vmSymbols` class and > renamed it, so now it can be forward-declared as `enum class vmSymbolID : int;`, without including the large vmSymbols.hpp > file. > - This also breaks the mutual dependency between the `vmSymbols` and `vmIntrinsics` classes. Now the declaration of > `vmIntrinsics` can be moved from vmSymbols.hpp to vmIntrinsics.hpp, where it naturally belongs. > - Type-safe enumeration (contributed by Kim Barrett) > for (vmSymbolsIterator it = vmSymbolsRange.begin(); it != vmSymbolsRange.end(); ++it) { > vmSymbolID index = *it; .... > } > - I moved `vmSymbols::_symbols[]` to `Symbol::_vm_symbols[]`, and made it accessible via `Symbol::vm_symbol_at()`. This > way, header files (e.g.
fieldInfo.hpp) that need to convert from `vmSymbolID` to `Symbol*` don't need to include the > large vmSymbols.hpp file. > - I changed the `VM_SYMBOL_ENUM_NAME` macro so that the users don't need to explicitly add the `vmSymbolID::` scope. > - I removed many unnecessary casts between `int` and `vmSymbolID`. > - The remaining casts are done via `vmSymbol::as_int()` and `vmSymbols::as_SID()` with range checks. > > ----- > If this is successful, I will do the same for `vmIntrinsics::ID`. Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: - Merge branch 'master' into 8253402-convert-vmsymbols-sid-to-enum-class - revert NO_SID to 0 due to assert(Symbol::_vm_symbols[NO_SID] == NULL) - Addressed review comments by Kim Barrett - added missing #include from enumIterator.hpp - Use 2-style EnumIterator - Merge master into 8253402-convert-vmsymbols-sid-to-enum-class - more vmEnums.hpp fixes; fixed minimal VM build - Merge branch 'master' into 8253402-convert-vmsymbols-sid-to-enum-class - Moved forward declaration of vmSymbolID to vmEnums.hpp - clean up whitespaces and removed useless comment - ... 
and 2 more: https://git.openjdk.java.net/jdk/compare/ba5dc67a...60178eae ------------- Changes: https://git.openjdk.java.net/jdk/pull/276/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=276&range=05 Stats: 799 lines in 29 files changed: 478 ins; 144 del; 177 mod Patch: https://git.openjdk.java.net/jdk/pull/276.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/276/head:pull/276 PR: https://git.openjdk.java.net/jdk/pull/276 From iklam at openjdk.java.net Wed Oct 14 03:32:18 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 14 Oct 2020 03:32:18 GMT Subject: RFR: 8253402: Convert vmSymbols::SID to enum class [v4] In-Reply-To: References: <9m7uFY5ij94oj3SQ9pTHNq-tsw0NnPHDVqHhznmAuOo=.bc75f19e-0527-4962-8b19-b178cfa8e572@github.com> Message-ID: On Tue, 13 Oct 2020 20:46:45 GMT, Ioi Lam wrote: >> src/hotspot/share/classfile/vmSymbols.hpp line 728: >> >>> 726: >>> 727: static constexpr int number_of_symbols() { >>> 728: return SID_LIMIT; >> >> SID_LIMIT includes NO_SID, so this seems to be off-by-one. > > I changed NO_SID to -1 so SID_LIMIT no longer includes NO_SID. I needed to revert NO_SID to 0. There's an assert in vmSymbols.cpp for `Symbol::_vm_symbols[NO_SID] == NULL`. This is pre-existing code and I don't understand why, but I should leave it as. As you can see in the old code, NO_SID is also counted as "part of the symbols" - static vmSymbols::SID vm_symbol_index[vmSymbols::SID_LIMIT]; + static vmSymbolID vm_symbol_index[vmSymbols::number_of_symbols()]; ------------- PR: https://git.openjdk.java.net/jdk/pull/276 From iklam at openjdk.java.net Wed Oct 14 04:57:24 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 14 Oct 2020 04:57:24 GMT Subject: RFR: 8254365: ciMethod.hpp should not include methodHandles.hpp [v2] In-Reply-To: References: Message-ID: <8FGKzLhxmojnmEfkFxFNpDDcR_z4iQvpE9jkposWsxs=.a14ab3aa-6f3c-40c7-b7a7-2c1c81e9a77e@github.com> > ciMethod.hpp includes methodHandles.hpp. 
This is probably a typo as ciMethod.hpp doesn't use the MethodHandles class. > Instead, it uses methodHandle which is declared in runtime/handles.hpp. > As usual, I had to fix a few .cpp files that used the MethodHandles class but did not explicitly include > methodHandles.hpp. > Tested with mach5 build tiers 1-5. Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: - Merge branch 'master' into 8254365-ciMethod-hpp-shouldnt-include-methodHandles - 8254365: ciMethod.hpp should not include methodHandles.hpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/623/files - new: https://git.openjdk.java.net/jdk/pull/623/files/187dc78a..3496d800 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=623&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=623&range=00-01 Stats: 4576 lines in 116 files changed: 1648 ins; 2160 del; 768 mod Patch: https://git.openjdk.java.net/jdk/pull/623.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/623/head:pull/623 PR: https://git.openjdk.java.net/jdk/pull/623 From iklam at openjdk.java.net Wed Oct 14 05:03:12 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 14 Oct 2020 05:03:12 GMT Subject: Integrated: 8254365: ciMethod.hpp should not include methodHandles.hpp In-Reply-To: References: Message-ID: On Tue, 13 Oct 2020 06:18:40 GMT, Ioi Lam wrote: > ciMethod.hpp includes methodHandles.hpp. This is probably a typo as ciMethod.hpp doesn't use the MethodHandles class. > Instead, it uses methodHandle which is declared in runtime/handles.hpp. > As usual, I had to fix a few .cpp files that used the MethodHandles class but did not explicitly include > methodHandles.hpp. > Tested with mach5 build tiers 1-5. This pull request has now been integrated. 
Changeset: a0980373 Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/a0980373 Stats: 23 lines in 21 files changed: 20 ins; 0 del; 3 mod 8254365: ciMethod.hpp should not include methodHandles.hpp Reviewed-by: dholmes, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/623 From stuefe at openjdk.java.net Wed Oct 14 06:40:16 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 14 Oct 2020 06:40:16 GMT Subject: RFR: 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails In-Reply-To: References: Message-ID: On Wed, 14 Oct 2020 03:17:47 GMT, Yasumasa Suenaga wrote: >> Hi Yasumasa, >> >> sorry, I do not understand what the error is yet and to me the patch does seem wrong in a number of places. >> >> Both test and whitebox function seem to be doing what they are supposed to do. >> >> Is this error intermittent or always reproducable? If the latter, since when (since the code seems old)? >> >> Have the AdoptOpenJDK folks done some pre-analysis? >> >> ---- >> >> From just looking at the code (cannot build 32bit right now), WB_IncMetaspaceCapacityUntilGC() seems correct to me. >> >> - in comes a jlong with 4G-1, so FFFF-FFFF >> - it fits into a 32bit size_t so its fine >> - we calculate aligned_inc by aligning the value down somewhat, probably a page: FFFF-F000 >> - we feed this value FFFF-F000 as *inc* value to MetaspaceGC::inc_capacity_until_GC >> - it returns false. We throw an exception. >> >> In MetaspaceGC::inc_capacity_until_GC: >> >> bool MetaspaceGC::inc_capacity_until_GC(size_t v, size_t* new_cap_until_GC, size_t* old_cap_until_GC, bool* can_retry) { >> assert_is_aligned(v, Metaspace::commit_alignment()); >> >> size_t old_capacity_until_GC = _capacity_until_GC; >> size_t new_value = old_capacity_until_GC + v; >> >> if (new_value < old_capacity_until_GC) { >> // The addition wrapped around, set new_value to aligned max value. 
>> new_value = align_down(max_uintx, Metaspace::commit_alignment()); >> } >> >> if (new_value > MaxMetaspaceSize) { >> if (can_retry != NULL) { >> *can_retry = false; >> } >> return false; >> } >> >> we enter with size_t v = FFFF-F000 >> >> We calc the new value. Since new value = old value + inc, this will overflow >> >> We sense the overflow and correct the new value to be max_uintx aligned down by commit alignment. >> >> Which will be smaller than MaxMetaspaceSize. Test does not set it therefore it is max_uintx. >> >> So far all good IIUC. >> >> >> if (can_retry != NULL) { >> *can_retry = true; >> } >> size_t prev_value = Atomic::cmpxchg(&_capacity_until_GC, old_capacity_until_GC, new_value); >> >> if (old_capacity_until_GC != prev_value) { >> return false; >> } >> >> if (new_cap_until_GC != NULL) { >> *new_cap_until_GC = new_value; >> } >> if (old_cap_until_GC != NULL) { >> *old_cap_until_GC = old_capacity_until_GC; >> } >> return true; >> >> From looking at this code, the only way I can see this function returning false would be if a concurrent thread >> modified this threshold. Which should be super rare. >> Hence my initial question about intermittentness. If its reproducable I do not think I understand it yet. >> >> Thanks, Thomas > > I think TestCapacityUntilGCWrapAround.java has two problems: > > 1. It might attempt to increase metaspace size over MaxMetaspaceSize (this PR fixes it) > 2. Overflow test would always fail because `MetaspaceGC::inc_capacity_until_GC()` always returns `false` > > This test has introduced in JDK-8049599, but I cannot know details because JBS ticket is closed, but we can see its > commit: https://github.com/openjdk/jdk/commit/6f4355a3a6be2482c388c2022fc7c3c4b2f481c5 I enabled this test on 64bit > platforms because these problems might happen on them. > If you agree with the fix for 1. , I will add the test for 2. to this PR. I found @shipilev's analysis in the bug report and understand better what happens now. 
I do not think this is a test error, the test is correct and shows a real issue. Just pasting my JBS comment here: The test verifies that calling MetaspaceGC::inc_capacity_until_GC() repeatedly will not overflow the gc threshold. 1) MetaspaceGC::ergo_initialize() MaxMetaspaceSize, if left unspecified, is supposed to be "infinite". In reality it defaults to max_uintx. But it gets aligned down by Metaspace::reserve_alignment. That one is either one of os::vm_allocation_granularity or 4 pages, whatever is larger. The 4 pages thing I recently added with JDK-8245707 to shake loose misuse of this alignment value (exactly cases like this). So on Windows, Metaspace::reserve_alignment has always been os::vm_allocation_granularity (64K). On Linux, it was always 4K, until recently it switched to 16K. This explains the 16K aligned value we see for MaxMetaspaceSize in Alexeys test: 4294950912 (FFFFC000). 2) MetaspaceGC::inc_capacity_until_GC() The increase value causes an overflow - its supposed to do that. Overflow gets handled by: if (new_value < old_capacity_until_GC) { // The addition wrapped around, set new_value to aligned max value. new_value = align_down(max_uintx, Metaspace::commit_alignment()); } which sets the new threshold at max_uintx aligned down by Metaspace::commit_alignment(), which is just os::vm_page size. So the new value is 4294963200, resp. FFFFF000 So the problem is that one value gets aligned down by Metaspace::reserve_alignment() (os::vm_allocation_granularity()), the other by Metaspace::commit_alignment (os::vm_page_size()). Then we compare those values. So the test is okay. It showed us a real issue. I am not 100% sure what the correct behavior would be. Maybe increasing the gc threshold beyond MaxMetaspaceSize should not be an error at all, but we should just cap out at that value? Possible solutions: A) As stated above, just cap. 
B) All that alignment business is actually unnecessary and we could remove it for clarity after JEP387 is out of the door. MaxMetaspaceSize does not need to be aligned to anything, neither does the GC threshold. C) As a small workaround, at (2) one probably could use Metaspace::reserve_alignment, not commit_alignment. That makes no sense semantically but would remove this error. (This must have been always a problem on Windows, and only recently on Linux, right?) ------------- PR: https://git.openjdk.java.net/jdk/pull/628 From github.com+4146708+a74nh at openjdk.java.net Wed Oct 14 08:23:11 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Wed, 14 Oct 2020 08:23:11 GMT Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: <35eLsMpWmcCUoiEWhnYdSpZNmvLy4ra56Qtd6eRW574=.4e7c9278-3e0d-457d-9c15-eef45bae9755@github.com> Message-ID: On Tue, 13 Oct 2020 16:46:42 GMT, Patricio Chilano Mateo wrote: > Regarding the use of cross_modify_fence(), I filed a bug last week to remove some unneeded uses of them in common code > (https://bugs.openjdk.java.net/browse/JDK-8254264). Just a heads up before I send the RFR since I see some reference to > them in the added comments. I'm going to assume your change is just a two-line change (removing the cross_modify_fence's), and I'll test that on top of my patches using the VerifyCrossModifyFence flag - I'll give it a run of everything, which can take a while. Plus I'll manually look at the code to make sure I'm happy. I think it makes sense that your patch goes in first, then I can rebase and update code comments too. Let me know your pull request once you've raised it.
------------- PR: https://git.openjdk.java.net/jdk/pull/428 From stuefe at openjdk.java.net Wed Oct 14 09:03:16 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 14 Oct 2020 09:03:16 GMT Subject: RFR: 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails In-Reply-To: References: Message-ID: On Wed, 14 Oct 2020 06:36:59 GMT, Thomas Stuefe wrote: >> I think TestCapacityUntilGCWrapAround.java has two problems: >> >> 1. It might attempt to increase metaspace size over MaxMetaspaceSize (this PR fixes it) >> 2. Overflow test would always fail because `MetaspaceGC::inc_capacity_until_GC()` always returns `false` >> >> This test has introduced in JDK-8049599, but I cannot know details because JBS ticket is closed, but we can see its >> commit: https://github.com/openjdk/jdk/commit/6f4355a3a6be2482c388c2022fc7c3c4b2f481c5 I enabled this test on 64bit >> platforms because these problems might happen on them. >> If you agree with the fix for 1. , I will add the test for 2. to this PR. > > I found @shipilev's analysis in the bug report and understand better what happens now. > > I do not think this is a test error, the test is correct and shows a real issue. Just pasting my JBS comment here: > > The test verifies that calling MetaspaceGC::inc_capacity_until_GC() repeatedly will not overflow the gc threshold. > > 1) MetaspaceGC::ergo_initialize() > > MaxMetaspaceSize, if left unspecified, is supposed to be "infinite". In reality it defaults to max_uintx. But it gets > aligned down by Metaspace::reserve_alignment. That one is either one of os::vm_allocation_granularity or 4 pages, > whatever is larger. The 4 pages thing I recently added with JDK-8245707 to shake loose misuse of this alignment value > (exactly cases like this). So on Windows, Metaspace::reserve_alignment has always been os::vm_allocation_granularity > (64K). On Linux, it was always 4K, until recently it switched to 16K. 
This explains the 16K aligned value we see for > MaxMetaspaceSize in Alexeys test: 4294950912 (FFFFC000). > 2) MetaspaceGC::inc_capacity_until_GC() > > The increase value causes an overflow - its supposed to do that. Overflow gets handled by: > > if (new_value < old_capacity_until_GC) { > // The addition wrapped around, set new_value to aligned max value. > new_value = align_down(max_uintx, Metaspace::commit_alignment()); > } > > which sets the new threshold at max_uintx aligned down by Metaspace::commit_alignment(), which is just os::vm_page size. > > So the new value is 4294963200, resp. FFFFF000 > > So the problem is that one value gets aligned down by Metaspace::reserve_alignment() (os::vm_allocation_granularity()), > the other by Metaspace::commit_alignment (os::vm_page_size()). Then we compare those values. > So the test is okay. It showed us a real issue. I am not 100% sure what the correct behavior would be. Maybe increasing > the gc threshold beyond MaxMetaspaceSize should not be an error at all, but we should just cap out at that value? > Possible solutions: > > A) As stated above, just cap. > > b) All that alignment business is actually unnecessary and we could remove it for clarity after JEP387 is out of the > door. MaxMetaspaceSize does not need to be aligned to anything, neither does the GC threshold. > C) as a small workaround, at (2) one probably could use Metaspace::reserve_alignment, not commit_alignment. That makes > no sense semantically but would remove this error. > (This must have been always a problem on Windows, and only recently on Linux, right?) A minimal fix - which preserves the current behavior as much as possible - could be: --- a/src/hotspot/share/memory/metaspace.cpp +++ b/src/hotspot/share/memory/metaspace.cpp @@ -152,7 +152,7 @@ bool MetaspaceGC::inc_capacity_until_GC(size_t v, size_t* new_cap_until_GC, size if (new_value < old_capacity_until_GC) { // The addition wrapped around, set new_value to aligned max value. 
- new_value = align_down(max_uintx, Metaspace::commit_alignment()); + new_value = align_down(max_uintx, Metaspace::reserve_alignment()); } if (new_value > MaxMetaspaceSize) { This preserves the current behavior: - increasing the gc threshold beyond MaxMetaspaceSize should be an error - on 32bit, increasing the gc threshold beyond the scope of the 32bit counter should be tolerated and result in a value capped at the end of 32bit; unless MaxMetaspaceSize is set and lower than max, in which case it would be an error. Note that I have no possibility to test this since the 32bit build on Windows still seems broken. ------------- PR: https://git.openjdk.java.net/jdk/pull/628 From ysuenaga at openjdk.java.net Wed Oct 14 09:34:13 2020 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 14 Oct 2020 09:34:13 GMT Subject: RFR: 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails In-Reply-To: References: Message-ID: On Wed, 14 Oct 2020 08:59:51 GMT, Thomas Stuefe wrote: >> I found @shipilev's analysis in the bug report and understand better what happens now. >> >> I do not think this is a test error, the test is correct and shows a real issue. Just pasting my JBS comment here: >> >> The test verifies that calling MetaspaceGC::inc_capacity_until_GC() repeatedly will not overflow the gc threshold. >> >> 1) MetaspaceGC::ergo_initialize() >> >> MaxMetaspaceSize, if left unspecified, is supposed to be "infinite". In reality it defaults to max_uintx. But it gets >> aligned down by Metaspace::reserve_alignment. That one is either one of os::vm_allocation_granularity or 4 pages, >> whatever is larger. The 4 pages thing I recently added with JDK-8245707 to shake loose misuse of this alignment value >> (exactly cases like this). So on Windows, Metaspace::reserve_alignment has always been os::vm_allocation_granularity >> (64K). On Linux, it was always 4K, until recently it switched to 16K. 
This explains the 16K aligned value we see for >> MaxMetaspaceSize in Alexeys test: 4294950912 (FFFFC000). >> 2) MetaspaceGC::inc_capacity_until_GC() >> >> The increase value causes an overflow - its supposed to do that. Overflow gets handled by: >> >> if (new_value < old_capacity_until_GC) { >> // The addition wrapped around, set new_value to aligned max value. >> new_value = align_down(max_uintx, Metaspace::commit_alignment()); >> } >> >> which sets the new threshold at max_uintx aligned down by Metaspace::commit_alignment(), which is just os::vm_page size. >> >> So the new value is 4294963200, resp. FFFFF000 >> >> So the problem is that one value gets aligned down by Metaspace::reserve_alignment() (os::vm_allocation_granularity()), >> the other by Metaspace::commit_alignment (os::vm_page_size()). Then we compare those values. >> So the test is okay. It showed us a real issue. I am not 100% sure what the correct behavior would be. Maybe increasing >> the gc threshold beyond MaxMetaspaceSize should not be an error at all, but we should just cap out at that value? >> Possible solutions: >> >> A) As stated above, just cap. >> >> b) All that alignment business is actually unnecessary and we could remove it for clarity after JEP387 is out of the >> door. MaxMetaspaceSize does not need to be aligned to anything, neither does the GC threshold. >> C) as a small workaround, at (2) one probably could use Metaspace::reserve_alignment, not commit_alignment. That makes >> no sense semantically but would remove this error. >> (This must have been always a problem on Windows, and only recently on Linux, right?) 
> > A minimal fix - which preserves the current behavior as much as possible - could be: > > --- a/src/hotspot/share/memory/metaspace.cpp > +++ b/src/hotspot/share/memory/metaspace.cpp > @@ -152,7 +152,7 @@ bool MetaspaceGC::inc_capacity_until_GC(size_t v, size_t* new_cap_until_GC, size > > if (new_value < old_capacity_until_GC) { > // The addition wrapped around, set new_value to aligned max value. > - new_value = align_down(max_uintx, Metaspace::commit_alignment()); > + new_value = align_down(max_uintx, Metaspace::reserve_alignment()); > } > > if (new_value > MaxMetaspaceSize) { > > This preserves the current behavior: > - increasing the gc threshold beyond MaxMetaspaceSize should be an error > - on 32bit, increasing the gc threshold beyond the scope of the 32bit counter should be tolerated and result in a value > capped at the end of 32bit; unless MaxMetaspaceSize is set and lower than max, in which case it would be an error. > > Note that I have no possibility to test this since the 32bit build on Windows still seems broken. Thanks @tstuefe for the fix! I will merge it into this PR. > (This must have been always a problem on Windows, and only recently on Linux, right?) I don't know how long this problem has existed, but I saw this on Linux x64 when I removed `vm.bits == 32`. It is caused by new_value exceeding MaxMetaspaceSize. I think the current behavior should be preserved. I added the logic which relates to `_capacity_until_gc` in [JDK-8217432](https://bugs.openjdk.java.net/browse/JDK-8217432). Before this fix, I saw OutOfMemoryError even though memory was still available. And also @shipilev seems to have seen [the problems with Metaspace on Epsilon](https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2019-January/024595.html). BTW, should TestCapacityUntilGCWrapAround.java still be run on 32bit platforms only? I think we should check whether `_capacity_until_gc` exceeds MaxMetaspaceSize on 64bit platforms.
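The mismatch discussed above can be reproduced with plain 32-bit arithmetic. The following is a standalone sketch, not HotSpot code: `align_down` is a local helper, the 16K and 4K alignments are the Linux values quoted from the analysis, and the function and constant names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in for HotSpot's align_down(); alignment must be a power of two.
constexpr uint32_t align_down(uint32_t value, uint32_t alignment) {
  return value & ~(alignment - 1);
}

constexpr uint32_t kMaxUintx32       = UINT32_MAX; // max_uintx on a 32-bit VM
constexpr uint32_t kReserveAlignment = 16 * 1024;  // 4 pages of 4K
constexpr uint32_t kCommitAlignment  = 4 * 1024;   // one 4K page

// Default MaxMetaspaceSize: max_uintx aligned down by the reserve alignment.
constexpr uint32_t default_max_metaspace_size() {
  return align_down(kMaxUintx32, kReserveAlignment);
}

// Mirrors the overflow branch followed by the limit check: true means the
// capped value is above MaxMetaspaceSize, so the increment is rejected.
constexpr bool wrapped_increment_rejected(uint32_t cap_alignment) {
  return align_down(kMaxUintx32, cap_alignment) > default_max_metaspace_size();
}

static_assert(default_max_metaspace_size() == 0xFFFFC000u, "4294950912");
static_assert(align_down(kMaxUintx32, kCommitAlignment) == 0xFFFFF000u, "4294963200");
static_assert(wrapped_increment_rejected(kCommitAlignment), "commit alignment: rejected");
static_assert(!wrapped_increment_rejected(kReserveAlignment), "reserve alignment: passes the check");
```

With the commit alignment the capped value lands 12K above the default limit, so the comparison fails; aligning both values the same way makes them equal, and the limit check passes.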
-------------

PR: https://git.openjdk.java.net/jdk/pull/628

From stefank at openjdk.java.net Wed Oct 14 10:32:16 2020
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Wed, 14 Oct 2020 10:32:16 GMT
Subject: Integrated: 8254668: JVMTI process frames on thread without started processing
In-Reply-To:
References:
Message-ID:

On Tue, 13 Oct 2020 09:25:55 GMT, Stefan Karlsson wrote:

> I hit the following assert in some test runs that I've been doing:
> # Internal Error (/home/stefank/git/alt/open/src/hotspot/share/runtime/stackWatermark.inline.hpp:67), pid=828170, tid=828734
> # assert(processing_started()) failed: Processing should already have started
>
> The stack traces for this have been:
> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V [libjvm.so+0x1626d75] StackWatermarkSet::on_iteration(JavaThread*, frame const&)+0xd5
> V [libjvm.so+0xad791a] frame::sender(RegisterMap*) const+0x7a
> V [libjvm.so+0xacd3f8] frame::real_sender(RegisterMap*) const+0x18
> V [libjvm.so+0x1804c4a] vframe::sender() const+0xea
> V [libjvm.so+0x175f47b] JavaThread::last_java_vframe(RegisterMap*)+0x5b
> V [libjvm.so+0x10e10fc] JvmtiEnvBase::vframeFor(JavaThread*, int)+0x4c
> V [libjvm.so+0x10e6972] JvmtiEnvBase::check_top_frame(Thread*, JavaThread*, jvalue, TosState, Handle*)+0xe2
> V [libjvm.so+0x10e759c] JvmtiEnvBase::force_early_return(JavaThread*, jvalue, TosState)+0x11c
> V [libjvm.so+0x105b8f5] jvmti_ForceEarlyReturnObject+0x215
>
> V [libjvm.so+0x1626d75] StackWatermarkSet::on_iteration(JavaThread*, frame const&)+0xd5
> V [libjvm.so+0xad791a] frame::sender(RegisterMap*) const+0x7a
> V [libjvm.so+0xacd3f8] frame::real_sender(RegisterMap*) const+0x18
> V [libjvm.so+0x1804c4a] vframe::sender() const+0xea
> V [libjvm.so+0x1804d00] vframe::java_sender() const+0x10
> V [libjvm.so+0x10e1115] JvmtiEnvBase::vframeFor(JavaThread*, int)+0x65
> V [libjvm.so+0x10d475f] JvmtiEnv::NotifyFramePop(JavaThread*, int)+0x9f
> V [libjvm.so+0x106b6aa] jvmti_NotifyFramePop+0x23a
>
> The code inspects the top frame of a suspended java thread. However, there's nothing in the code that starts the
> watermark processing of the thread, so the code asserts when sender calls on_iteration.
>
> We only have to call start_processing/on_iteration when oops are being read. The failing code does *not* inspect any
> oops, so I turn off the on_iteration call by setting process_frame to false.
>
> To notify the readers of the code that vframeFor doesn't process the oops, I've renamed the function to
> vframeForNoProcess to give a visual cue.
>
> I found this bug when running this command line:
> makec ../build/fastdebug/ test TEST=test/hotspot/jtreg/vmTestbase/nsk/jvmti
> JTREG="JAVA_OPTIONS=-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=1 -XX:ZFragmentationLimit=0.01"
> JTREG_EXTRA_PROBLEM_LISTS=ProblemList-zgc.txt
>
> Five tests consistently assert with this command line. All tests pass with the proposed fix.
>
> Recommendations of tests to run are welcome. I intend to get this run through tier1-3, but haven't yet.

This pull request has now been integrated.

Changeset: db9dcdf1
Author: Stefan Karlsson
URL: https://git.openjdk.java.net/jdk/commit/db9dcdf1
Stats: 9 lines in 3 files changed: 1 ins; 0 del; 8 mod

8254668: JVMTI process frames on thread without started processing

Reviewed-by: eosterlund, rrich

-------------

PR: https://git.openjdk.java.net/jdk/pull/627

From mdoerr at openjdk.java.net Wed Oct 14 10:33:11 2020
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Wed, 14 Oct 2020 10:33:11 GMT
Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4]
In-Reply-To:
References: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> <7-p-Kc9lQyyuoWdNtmgbXbwkxsgk4oQGKmFSCcMpvnU=.97810c01-3200-4767-bbd4-35d53c2bc5ca@github.com> <6Voyfr_s-ieyRA-8Rtvvpz7tkhhicA8sY2d2KTp3Kmw=.fa256bae-2143-4b43-bfea-5837ad31eb6a@github.com>
Message-ID:

On Wed, 14 Oct 2020 02:04:42 GMT, CoreyAshford wrote:

>>> Hi Corey, thanks for taking some stuff out of the "too short" path. There may be a performance regression when decoding
>>> many short arrays because of the stub call overhead and the usage of the slower part of the Java implementation. We
>>> could do it a little better in many cases to compute the maximum possible iteration count i:
>>>   i = (sl - sp) / block_size
>>>   if (i * block_size > sl - 12) i--
>>>   if (i <= 0) return 0
>>> What do you think?
>>
>> Are you thinking of a case where that produces a higher iteration count? It looks effectively the same to me.
>>
>>> I don't think branch prediction hints are helpful for the "too short" check.
>>
>> My thinking is that most of the time when the intrinsic is called, it will not take the early exit, but I suppose when
>> it is processing a sub-block_size buffer, it will return early every time. I will remove the hints.
>>> And we should better use CCR1 instead of CCR2 which is specified as non-volatile.
>>
>> Ah, I should have checked the calling conventions. I thought all of the CR* regs are volatile. I will fix that.
>>
>>> Did you already find a 2nd reviewer for the PPC64 part?
>>
>> Your original comment said "2nd review", so I thought you meant you need to review it again after the changes. So, no,
>> I haven't looked for or found a second reviewer. Any suggestions? The folks on the team here have been busy with
>> other work. Btw, I'm off today, so I will push commits to the above-mentioned issues tomorrow.
>
>>
>> > Did you already find a 2nd reviewer for the PPC64 part?
>>
>> Your original comment said "2nd review", so I thought you meant you need to review it again after the changes. So, no,
>> I haven't looked for or found a second reviewer. Any suggestions? The folks on the team here have been busy with other
>> work.
>
> I am actively asking for some help here, so maybe within a few days I can get a 2nd reviewer.

Hi Corey,

> Are you thinking of a case where that produces a higher iteration count?

Sorry for the confusion. This is also fine:

  length = sl - sp - 12
  i = length / block_size
  if (i <= 0) return 0

But I still wonder why we should use 2 branches. Why not

  srawi_
  ble(CCR0, return_zero)

?

> Ah, I should have checked the calling conventions. I thought all of the CR* regs are volatile. I will fix that.

Actually, we do save and restore all CRs, so it's not a real problem with the current implementation. But I prefer staying closer to the ELF ABI as long as there's no good reason to do it differently.

> Your original comment said "2nd review", so I thought you meant you need to review it again after the changes.

We usually require at least 2 reviews by different people for all non-trivial changes. And I don't consider the PPC64 part trivial. In addition to that, I'm not familiar with Power 10.

Best regards,
Martin

From: CoreyAshford
Sent: Tuesday, 13 October 2020 22:59
To: openjdk/jdk
Cc: Doerr, Martin; Mention
Subject: Re: [openjdk/jdk] 8248188: Add IntrinsicCandidate and API for Base64 decoding (#293)

> Hi Corey, thanks for taking some stuff out of the "too short" path. There may be a performance regression when decoding many short arrays because of the stub call overhead and the usage of the slower part of the Java implementation. We could do it a little better in many cases to compute the maximum possible iteration count i:
>   i = (sl - sp) / block_size
>   if (i * block_size > sl - 12) i--
>   if (i <= 0) return 0
> What do you think?

Are you thinking of a case where that produces a higher iteration count? It looks effectively the same to me.

> I don't think branch prediction hints are helpful for the "too short" check.

My thinking is that most of the time when the intrinsic is called, it will not take the early exit, but I suppose when it is processing a sub-block_size buffer, it will return early every time. I will remove the hints.

> And we should better use CCR1 instead of CCR2 which is specified as non-volatile.

Ah, I should have checked the calling conventions. I thought all of the CR* regs are volatile. I will fix that.

> Did you already find a 2nd reviewer for the PPC64 part?

Your original comment said "2nd review", so I thought you meant you need to review it again after the changes. So, no, I haven't looked for or found a second reviewer. Any suggestions? The folks on the team here have been busy with other work. Btw, I'm off today, so I will push commits to the above-mentioned issues tomorrow.
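The single-check variant suggested in the reply above (`length = sl - sp - 12`, then one division and one comparison) can be sketched in plain C++ as follows. This is only an illustration of the arithmetic, not the PPC64 stub: the block size of 16 and the function name `max_iterations` are assumptions, and only the 12-byte slack is taken from the mail.

```cpp
#include <cstdint>

// Hypothetical number of source bytes consumed per loop iteration.
constexpr std::int64_t kBlockSize = 16;
// Slack subtracted before dividing, taken from the discussion (12 bytes).
constexpr std::int64_t kTailSlack = 12;

// Returns how many full blocks may be processed for the source range [sp, sl),
// or 0 if the input is too short. One division and one comparison are enough,
// because a negative or too-small length yields a quotient <= 0.
std::int64_t max_iterations(std::int64_t sp, std::int64_t sl) {
  const std::int64_t length = sl - sp - kTailSlack;
  const std::int64_t i = length / kBlockSize;
  return i <= 0 ? 0 : i;
}
```

Under these assumptions, a 27-byte input gives zero iterations and a 28-byte input gives exactly one.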

-------------

PR: https://git.openjdk.java.net/jdk/pull/293

From redestad at openjdk.java.net Wed Oct 14 10:39:20 2020
From: redestad at openjdk.java.net (Claes Redestad)
Date: Wed, 14 Oct 2020 10:39:20 GMT
Subject: RFR: 8254744: Clean-up CodeBlob::align_code_offset
Message-ID:

- Modernize by using align_up
- Move definition of the trivial CodeHeap::header_size() to header to help inlining, which optimizes align_code_offset and others. No big gain, but we're calling CodeHeap::header_size() 1300+ times on Hello World, and when inlined it's just a constant fold.

-------------

Commit messages:
 - Merge branch 'master' into align_code
 - Clean-up align_code_offset

Changes: https://git.openjdk.java.net/jdk/pull/651/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=651&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8254744
Stats: 9 lines in 3 files changed: 0 ins; 6 del; 3 mod
Patch: https://git.openjdk.java.net/jdk/pull/651.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/651/head:pull/651

PR: https://git.openjdk.java.net/jdk/pull/651

From stuefe at openjdk.java.net Wed Oct 14 10:46:11 2020
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Wed, 14 Oct 2020 10:46:11 GMT
Subject: RFR: 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails
In-Reply-To:
References:
Message-ID:

On Wed, 14 Oct 2020 09:31:13 GMT, Yasumasa Suenaga wrote:

>> A minimal fix - which preserves the current behavior as much as possible - could be:
>>
>> --- a/src/hotspot/share/memory/metaspace.cpp
>> +++ b/src/hotspot/share/memory/metaspace.cpp
>> @@ -152,7 +152,7 @@ bool MetaspaceGC::inc_capacity_until_GC(size_t v, size_t* new_cap_until_GC, size
>>
>>    if (new_value < old_capacity_until_GC) {
>>      // The addition wrapped around, set new_value to aligned max value.
>> -    new_value = align_down(max_uintx, Metaspace::commit_alignment());
>> +    new_value = align_down(max_uintx, Metaspace::reserve_alignment());
>>    }
>>
>>    if (new_value > MaxMetaspaceSize) {
>>
>> This preserves the current behavior:
>> - increasing the gc threshold beyond MaxMetaspaceSize should be an error
>> - on 32bit, increasing the gc threshold beyond the scope of the 32bit counter should be tolerated and result in a value
>> capped at the end of 32bit; unless MaxMetaspaceSize is set and lower than max, in which case it would be an error.
>>
>> Note that I have no possibility to test this since the 32bit build on Windows still seems broken.
>
> Thanks @tstuefe for the fix! I will merge it to this PR.
>
>> (This must have been always a problem on Windows, and only recently on Linux, right?)
>
> I don't know how long this problem has been available, but I saw this on Linux x64 when I removed `vm.bits == 32`. It
> is caused by new_value exceeds MaxMetaspaceSize.
> I think we should be current behavior should be preserved. I added the logic which relates to `_capacity_until_gc` in
> [JDK-8217432](https://bugs.openjdk.java.net/browse/JDK-8217432). Before this fix, I saw OutOfMemoryError despite that
> the memory was still available. And also @shipilev seems to had seen [the problems with Metaspace on
> Epsilon](https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2019-January/024595.html). BTW
> TestCapacityUntilGCWrapAround.java still should be run on 32bit platforms only? I think we should check whether
> `_capacity_until_gc` exceeds MaxMetaspaceSize on 64bit platforms.

Hi Yasumasa,

> Thanks @tstuefe for the fix! I will merge it to this PR.
>

No, I think this is the fix. I still think the test is correct and does what it is supposed to do, see below.

> > (This must have been always a problem on Windows, and only recently on Linux, right?)
>
> I don't know how long this problem has been available, but I saw this on Linux x64 when I removed `vm.bits == 32`. It
> is caused by new_value exceeds MaxMetaspaceSize.

You see this on 64bit because the test, as it is coded now, is not meant to run on 64bit.

When running on 64bit, you will first trigger this exception:

    jlong max_size_t = (jlong) ((size_t) -1);
    if (inc > max_size_t) {
      THROW_MSG_0(vmSymbols::java_lang_IllegalArgumentException(),
          err_msg("WB_IncMetaspaceCapacityUntilGC: inc does not fit in size_t: " JLONG_FORMAT, inc));
    }

because it was written with 32bit in mind. Its purpose is to check whether the inc value - itself a jlong - exceeds the range of a 32bit unsigned. It uses a jlong - 64bit signed - as the comparison variable and sets it to (size_t)-1, clearly to get a 32bit SIZE_MAX. On 64bit it is of course a real -1, so the comparison will always be true.

Once you fix that or comment it out, on 64bit you will trigger this test assertion:

    Asserts.assertLTE(after, MAX_UINT,
        "Increasing with MAX_UINT should not cause value larger than MAX_UINT:" + after);

because we just increased the gc threshold beyond the 32bit MAX_UINT value coded in the test.

Granted, one could make this test cleaner and make it work on 64bit. But what for? This overflow is only an issue on 32bit. It theoretically could be an issue for 64bit too, if you call the increase function often enough, but I think that is a bit far-fetched. If you enable the test for 64bit, you have to make sure that you call incMetaspaceCapacityUntilGC() with either a high enough increase or often enough to ensure that you hit the limit of a 64bit gc threshold counter. And then fix the overflow handling. But I seriously think this is not a real-world issue. Alternatively, one could #ifdef the whole whitebox function out for 64bit and just compile it on 32bit.

>
> I think we should be current behavior should be preserved. I added the logic which relates to `_capacity_until_gc` in
> [JDK-8217432](https://bugs.openjdk.java.net/browse/JDK-8217432). Before this fix, I saw OutOfMemoryError despite that
> the memory was still available. And also @shipilev seems to had seen [the problems with Metaspace on
> Epsilon](https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2019-January/024595.html). BTW
> TestCapacityUntilGCWrapAround.java still should be run on 32bit platforms only? I think we should check whether
> `_capacity_until_gc` exceeds MaxMetaspaceSize on 64bit platforms.

Look at it this way:

The tests are called with MaxMetaspaceSize unset, hence "infinite". Therefore the VM should *never* run into a situation where it thinks that the GC threshold is larger than MaxMetaspaceSize, regardless of what input values we feed into the increase function. The test feeds it very large values on purpose. Reducing the inc values defeats the purpose of this test.

The error is caused because, technically, MaxMetaspaceSize="infinite" is in reality implemented by MaxMetaspaceSize="very large value but a bit smaller than SIZE_MAX because of alignment". This gives a small window, in this case of 16K, for values to be larger than MaxMetaspaceSize. And then the overflow handling is broken and causes, as part of its value correction, the gc threshold to fall into this 16K window and be larger than MaxMetaspaceSize. And that exactly is the error.

The test really could be clearer and have comments and such, and the GC metaspace coding too. But the test is not wrong.
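The effect of the `(jlong) ((size_t) -1)` pattern described above can be reproduced in isolation. The sketch below is an illustration rather than the WhiteBox code: `jlong_t` stands in for HotSpot's `jlong`, the helper names are hypothetical, and the width of `size_t` is passed as a template parameter so that both the 32bit and the 64bit case can be observed on one machine. It assumes the usual two's-complement conversion (guaranteed from C++20, implementation-defined but universal in practice before that).

```cpp
#include <cstdint>

// Stand-in for HotSpot's jlong (64-bit signed); the name is an assumption.
using jlong_t = std::int64_t;

// Mirrors the pattern `(jlong) ((size_t) -1)` with a configurable size_t type.
template <typename SizeT>
constexpr jlong_t max_size_as_jlong() {
  return static_cast<jlong_t>(static_cast<SizeT>(-1));
}

// Mirrors the range check `inc > max_size_t`.
template <typename SizeT>
constexpr bool rejected_as_too_large(jlong_t inc) {
  return inc > max_size_as_jlong<SizeT>();
}
```

With a 32-bit `SizeT` the limit is 4294967295 and small increments pass the check. With a 64-bit `SizeT` the limit becomes -1, so every non-negative increment is rejected.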

-------------

PR: https://git.openjdk.java.net/jdk/pull/628

From mdoerr at openjdk.java.net Wed Oct 14 10:48:12 2020
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Wed, 14 Oct 2020 10:48:12 GMT
Subject: RFR: 8254744: Clean-up CodeBlob::align_code_offset
In-Reply-To:
References:
Message-ID: <8JGtLR7_yv2rTdOoz_jM7GJGFt2VFhO3QoILT71jDeI=.269a8e49-f6c0-42c5-b91a-48427b8a5048@github.com>

On Wed, 14 Oct 2020 10:34:32 GMT, Claes Redestad wrote:

> - Modernize by using align_up
> - Move definition of the trivial CodeHeap::header_size() to header to help inlining, which optimizes align_code_offset
> and others. No big gain, but we're calling CodeHeap::header_size() 1300+ times on Hello World, and when inlined it's
> just a constant fold.

Marked as reviewed by mdoerr (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/651

From eosterlund at openjdk.java.net Wed Oct 14 10:48:12 2020
From: eosterlund at openjdk.java.net (Erik Österlund)
Date: Wed, 14 Oct 2020 10:48:12 GMT
Subject: RFR: 8254744: Clean-up CodeBlob::align_code_offset
In-Reply-To:
References:
Message-ID:

On Wed, 14 Oct 2020 10:34:32 GMT, Claes Redestad wrote:

> - Modernize by using align_up
> - Move definition of the trivial CodeHeap::header_size() to header to help inlining, which optimizes align_code_offset
> and others. No big gain, but we're calling CodeHeap::header_size() 1300+ times on Hello World, and when inlined it's
> just a constant fold.

Looks good.

-------------

Marked as reviewed by eosterlund (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/651

From akozlov at openjdk.java.net Wed Oct 14 11:00:15 2020
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Wed, 14 Oct 2020 11:00:15 GMT
Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v5]
In-Reply-To:
References: <6iVRP-20baz0_46SouR-dj9SyspR5QvaL9iJMdeipDE=.92688b4e-ebd3-4681-8e63-a4aee752c407@github.com> <_XaA5cQEInPMn5Q5gj2y7AFCRprFQiYfI6BeUN49FhA=.9f17ae05-b37e-4f40-a83f-fd34aa812575@github.com>
Message-ID:

On Tue, 13 Oct 2020 13:10:13 GMT, Thomas Stuefe wrote:

> Are there any users of executable memory which cannot live with anonymous mapping on whatever address with small pages?
> Does anyone need large pages or a specific wish address?

Nothing jumps out immediately. Recently we've come across CDS problems, which also require executable permissions, but it uses file-based mapping and os::map_memory.

> If not, maybe we really should introduce a (reserve|commit|uncommit|release)_executable_memory() at least temporarily,
> as you suggested. At least that would be clear, and could provide a clear starting point for a new interface.

Then I'll start doing this. I'll create another JBS issue for the interface closer to the point when it is ready.

Thanks!

-------------

PR: https://git.openjdk.java.net/jdk/pull/294

From neliasso at openjdk.java.net Wed Oct 14 12:06:19 2020
From: neliasso at openjdk.java.net (Nils Eliasson)
Date: Wed, 14 Oct 2020 12:06:19 GMT
Subject: RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v6]
In-Reply-To:
References: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com>
Message-ID:

On Mon, 12 Oct 2020 11:17:25 GMT, Jason Tatton wrote:

>> This is an implementation of the indexOf(char) intrinsic for StringLatin1 (1 byte encoded Strings). It is provided for
>> x86 and ARM64. The implementation is greatly inspired by the indexOf(char) intrinsic for StringUTF16. To incorporate it
>> I had to make a small change to StringLatin1.java (refactor of functionality to intrisified private method) as well as
>> code for C2. Submitted to: hotspot-compiler-dev and core-libs-dev as this patch contains a change to hotspot and
>> java/lang/StringLatin1.java https://bugs.openjdk.java.net/browse/JDK-8173585
>>
>> Details of testing:
>> ============
>> I have created a jtreg test "compiler/intrinsics/string/TestStringLatin1IndexOfChar" to cover this new intrinsic. Note
>> that, particularly for the x86 implementation of the intrinsic, the code path taken is dependent upon the length of the
>> input String. Hence the test has been designed to cover all these cases. In summary they are:
>> - A "short" string of < 16 characters.
>> - A SIMD String of 16-31 characters.
>> - A AVX2 SIMD String of 32 characters+.
>>
>> Hardware used for testing:
>> -----------------------------
>>
>> - Intel Xeon CPU E5-2680 (JVM did not recognize this as having AVX2 support)
>> - Intel i7 processor (with AVX2 support).
>> - AWS Graviton 2 (ARM 64 processor).
>>
>> I also ran; "run-test-tier1" and "run-test-tier2" for: x86_64 and aarch64.
>>
>> Possible future enhancements:
>> ====================
>> For the x86 implementation there may be two further improvements we can make in order to improve performance of both
>> the StringUTF16 and StringLatin1 indexOf(char) intrinsics:
>> 1. Make use of AVX-512 instructions.
>> 2. For "short" Strings (see below), I think it may be possible to modify the existing algorithm to still use SSE SIMD
>> instructions instead of a loop.
>> Benchmark results:
>> ============
>> **Without** the new StringLatin1 indexOf(char) intrinsic:
>>
>> | Benchmark | Mode | Cnt | Score | Error | Units |
>> | ------------- | ------------- |------------- |------------- |------------- |------------- |
>> | IndexOfBenchmark.latin1_mixed_char | avgt | 5 | **26,389.129** | ± 182.581 | ns/op |
>> | IndexOfBenchmark.utf16_mixed_char | avgt | 5 | 17,885.383 | ± 435.933 | ns/op |
>>
>>
>> **With** the new StringLatin1 indexOf(char) intrinsic:
>>
>> | Benchmark | Mode | Cnt | Score | Error | Units |
>> | ------------- | ------------- |------------- |------------- |------------- |------------- |
>> | IndexOfBenchmark.latin1_mixed_char | avgt | 5 | **17,875.185** | ± 407.716 | ns/op |
>> | IndexOfBenchmark.utf16_mixed_char | avgt | 5 | 18,292.802 | ± 167.306 | ns/op |
>>
>>
>> The objective of the patch is to bring the performance of StringLatin1 indexOf(char) in line with StringUTF16
>> indexOf(char) for x86 and ARM64. We can see above that this has been achieved. Similar results were obtained when
>> running on ARM.
>
> Jason Tatton has updated the pull request incrementally with one additional commit since the last revision:
>
> Added missing copyright notices

Marked as reviewed by neliasso (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/71

From neliasso at openjdk.java.net Wed Oct 14 12:11:11 2020
From: neliasso at openjdk.java.net (Nils Eliasson)
Date: Wed, 14 Oct 2020 12:11:11 GMT
Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions
In-Reply-To:
References: <_0-zfIDPieC0Xnc17GaSSsS7Sz9EEUrfjRyqWDtphfU=.298bacde-f330-486a-8bea-03ff1523d00c@github.com>
Message-ID:

On Thu, 8 Oct 2020 17:29:27 GMT, Jatin Bhateja wrote:

>>> Can you explain why 32 bytes are such a distinct performance cliff?
>>>
>>> Is there any performance difference between doing a single 64 bytes masked copy or two 32 bytes?
>>
>> Hi Nils,
>> Copy for sizes <= 32 bytes can be done using one YMM register, AVX-512 vector length extension allows masked
>> instructions to operate on YMM and XMM registers. Using newly added flag -XX:ArrayCopyPartialInlineSize=64 one can
>> perform in-lining up to 64 bytes but since it will use a ZMM register CPU will operate at a lower frequency but it
>> could still give better performance depending on the application. A single 64 byte masked copy may have a performance
>> hit if for majority of the application runtime, CPU operates at highest frequency. There is a switchover penalty from
>> higher frequency level to lower frequency level along with some hysteresis which forces subsequent instructions to
>> operate a lower frequency for some cycles. Current implementation has been kept simple to avoid emitting too many
>> instruction at call site considering arraycopy is a very high frequency operation.
>
> Hi @neliasso , @vnkozlov , kindly let me know your review comments.

Hi Jatin,

I'm ready to approve it, but I would like to kick it through some performance testing first.

Best regards,
Nils Eliasson

-------------

PR: https://git.openjdk.java.net/jdk/pull/302

From github.com+70893615+jasontatton-aws at openjdk.java.net Wed Oct 14 13:01:24 2020
From: github.com+70893615+jasontatton-aws at openjdk.java.net (Jason Tatton)
Date: Wed, 14 Oct 2020 13:01:24 GMT
Subject: Integrated: 8173585: Intrinsify StringLatin1.indexOf(char)
In-Reply-To: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com>
References: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com>
Message-ID:

On Tue, 8 Sep 2020 11:59:36 GMT, Jason Tatton wrote:

> This is an implementation of the indexOf(char) intrinsic for StringLatin1 (1 byte encoded Strings). It is provided for
> x86 and ARM64. The implementation is greatly inspired by the indexOf(char) intrinsic for StringUTF16. To incorporate it
> I had to make a small change to StringLatin1.java (refactor of functionality to intrisified private method) as well as
> code for C2. Submitted to: hotspot-compiler-dev and core-libs-dev as this patch contains a change to hotspot and
> java/lang/StringLatin1.java https://bugs.openjdk.java.net/browse/JDK-8173585
>
> Details of testing:
> ============
> I have created a jtreg test "compiler/intrinsics/string/TestStringLatin1IndexOfChar" to cover this new intrinsic. Note
> that, particularly for the x86 implementation of the intrinsic, the code path taken is dependent upon the length of the
> input String. Hence the test has been designed to cover all these cases. In summary they are:
> - A "short" string of < 16 characters.
> - A SIMD String of 16-31 characters.
> - A AVX2 SIMD String of 32 characters+.
>
> Hardware used for testing:
> -----------------------------
>
> - Intel Xeon CPU E5-2680 (JVM did not recognize this as having AVX2 support)
> - Intel i7 processor (with AVX2 support).
> - AWS Graviton 2 (ARM 64 processor).
>
> I also ran; "run-test-tier1" and "run-test-tier2" for: x86_64 and aarch64.
>
> Possible future enhancements:
> ====================
> For the x86 implementation there may be two further improvements we can make in order to improve performance of both
> the StringUTF16 and StringLatin1 indexOf(char) intrinsics:
> 1. Make use of AVX-512 instructions.
> 2. For "short" Strings (see below), I think it may be possible to modify the existing algorithm to still use SSE SIMD
> instructions instead of a loop.
> Benchmark results:
> ============
> **Without** the new StringLatin1 indexOf(char) intrinsic:
>
> | Benchmark | Mode | Cnt | Score | Error | Units |
> | ------------- | ------------- |------------- |------------- |------------- |------------- |
> | IndexOfBenchmark.latin1_mixed_char | avgt | 5 | **26,389.129** | ± 182.581 | ns/op |
> | IndexOfBenchmark.utf16_mixed_char | avgt | 5 | 17,885.383 | ± 435.933 | ns/op |
>
>
> **With** the new StringLatin1 indexOf(char) intrinsic:
>
> | Benchmark | Mode | Cnt | Score | Error | Units |
> | ------------- | ------------- |------------- |------------- |------------- |------------- |
> | IndexOfBenchmark.latin1_mixed_char | avgt | 5 | **17,875.185** | ± 407.716 | ns/op |
> | IndexOfBenchmark.utf16_mixed_char | avgt | 5 | 18,292.802 | ± 167.306 | ns/op |
>
>
> The objective of the patch is to bring the performance of StringLatin1 indexOf(char) in line with StringUTF16
> indexOf(char) for x86 and ARM64. We can see above that this has been achieved. Similar results were obtained when
> running on ARM.

This pull request has now been integrated.

Changeset: f71e8a61
Author: Jason Tatton (AWS)
Committer: Paul Hohensee
URL: https://git.openjdk.java.net/jdk/commit/f71e8a61
Stats: 613 lines in 15 files changed: 597 ins; 0 del; 16 mod

8173585: Intrinsify StringLatin1.indexOf(char)

Reviewed-by: neliasso

-------------

PR: https://git.openjdk.java.net/jdk/pull/71

From ysuenaga at openjdk.java.net Wed Oct 14 13:18:36 2020
From: ysuenaga at openjdk.java.net (Yasumasa Suenaga)
Date: Wed, 14 Oct 2020 13:18:36 GMT
Subject: RFR: 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 14 Oct 2020 10:43:44 GMT, Thomas Stuefe wrote:

>> Thanks @tstuefe for the fix! I will merge it to this PR.
>>
>>> (This must have been always a problem on Windows, and only recently on Linux, right?)
>>
>> I don't know how long this problem has been available, but I saw this on Linux x64 when I removed `vm.bits == 32`. It
>> is caused by new_value exceeds MaxMetaspaceSize.
>> I think we should be current behavior should be preserved. I added the logic which relates to `_capacity_until_gc` in
>> [JDK-8217432](https://bugs.openjdk.java.net/browse/JDK-8217432). Before this fix, I saw OutOfMemoryError despite that
>> the memory was still available.
And also @shipilev seems to had seen [the problems with Metaspace on >> Epsilon](https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2019-January/024595.html). BTW >> TestCapacityUntilGCWrapAround.java still should be run on 32bit platforms only? I think we should check whether >> `_capacity_until_gc` exceeds MaxMetaspaceSize on 64bit platforms. > > Hi Yasumasa, > >> Thanks @tstuefe for the fix! I will merge it to this PR. >> > > No I think this is the fix. I still think the test is correct and does what it is supposed to do, see below. > >> > (This must have been always a problem on Windows, and only recently on Linux, right?) >> >> I don't know how long this problem has been available, but I saw this on Linux x64 when I removed `vm.bits == 32`. It >> is caused by new_value exceeds MaxMetaspaceSize. > > You see this on 64bit because the test, as it is coded now, is not meant to run on 64bit. > > When running on 64bit first you will trigger this exception: > jlong max_size_t = (jlong) ((size_t) -1); > if (inc > max_size_t) { > THROW_MSG_0(vmSymbols::java_lang_IllegalArgumentException(), > err_msg("WB_IncMetaspaceCapacityUntilGC: inc does not fit in size_t: " JLONG_FORMAT, inc)); > } > because it was written with 32bit in mind. Its purpose is to check if the inc value - itself a jlong - exceeds the > range for a 32bit unsigned. It uses a jlong - 64bit signed - as comparison variable and sets it to (size_t)-1, clearly > to get a 32bit SIZE_MAX. On 64bit it is oc a real -1, so the comparison will always be true. Once you fix that or > comment it out, on 64bit you will trigger this test assertion: > Asserts.assertLTE(after, MAX_UINT, > "Increasing with MAX_UINT should not cause value larger than MAX_UINT:" + after); > because we just increased the gc threshold beyond the 32bit MAX_UINT value coded in the test. > > Granted, one could make this test cleaner and make it work on 64bit. But what for? This overflow is only an issue on > 32bit. 
It theoretically could be an issue for 64bit too, if you call the increase function just very very often enough, > but I think that is a bit far fetched. If you enable the test for 64 bit you have to make sure that you call > incMetaspaceCapacityUntilGC() with either a high enough increase or often enough to ensure that you hit the limit of a > 64bit gc threshold counter. And then fix the overflow handling. But I seriously think this is not a real world issue. > Alternatively one could #ifdef the whole whitebox function out for 64bit, just compile it on 32bit. >> >> I think we should be current behavior should be preserved. I added the logic which relates to `_capacity_until_gc` in >> [JDK-8217432](https://bugs.openjdk.java.net/browse/JDK-8217432). Before this fix, I saw OutOfMemoryError despite that >> the memory was still available. And also @shipilev seems to had seen [the problems with Metaspace on >> Epsilon](https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2019-January/024595.html). BTW >> TestCapacityUntilGCWrapAround.java still should be run on 32bit platforms only? I think we should check whether >> `_capacity_until_gc` exceeds MaxMetaspaceSize on 64bit platforms. > > Look at it this way: > > The tests are called with MaxMetaspaceSize unset, hence "infinite". Therefore the VM should *never* run into a > situation where it thinks that the GC threshold is larger than MaxMetaspaceSize. Regardless what input values we feed > into the increase function. The test feeds it very large values on purpose. Reducing the inc values defeats the purpose > of this test. The error is caused because, technically, MaxMetaspaceSize="infinite" is in reality implemented by > MaxMetaspaceSize="very large value but a bit smaller than SIZE_MAX because of alignment". Which gives a small window, > in this case of 16K, for values to be larger than MaxMetaspaceSize. 
And then the overflow handling is broken and > causes, as part of its value correction, the gc threshold to fall into this 16K window and be larger than > MaxMetaspaceSize. And that exactly is the error. The test really could be clearer and have comments and such, and the > GC metaspace coding too. But the test is not wrong. Ok, I will remove the change for 64bit platforms. (I aimed to check the case that MaxMetaspaceSize was specified, but it seems to be unneeded.) And thanks for explanation! I understood why `new_value` exceeds MaxMetaspaceSize in JBS case. I will push new change. @shipilev Could you check again with this change? ------------- PR: https://git.openjdk.java.net/jdk/pull/628 From ysuenaga at openjdk.java.net Wed Oct 14 13:18:36 2020 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 14 Oct 2020 13:18:36 GMT Subject: RFR: 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails [v2] In-Reply-To: References: Message-ID: > Originally filed at AdoptOpenJDK: > https://github.com/AdoptOpenJDK/openjdk-tests/issues/1162 > > The test fails on 32bit windows with: > > java.lang.IllegalStateException: WB_IncMetaspaceCapacityUntilGC: could not increase capacity until GC due to contention > with another thread > at sun.hotspot.WhiteBox.incMetaspaceCapacityUntilGC(Native Method) > at TestCapacityUntilGCWrapAround.main(TestCapacityUntilGCWrapAround.java:51) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) > at java.lang.Thread.run(Thread.java:748) > > `TestCapacityUntilGCWrapAround` passes `4GB - 1` to `incMetaspaceCapacityUntilGC()`. It seems to be too big. 
> And also this code seems to want to check the behavior when `_capacity_until_gc` overflows. The WhiteBox test would > throw ISE when that happens. So we need to handle it correctly. Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: Use reserve_alignment() instead of commit_alignment() in inc_capacity_until_GC ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/628/files - new: https://git.openjdk.java.net/jdk/pull/628/files/da9ee3d5..3b656dd8 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=628&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=628&range=00-01 Stats: 37 lines in 3 files changed: 12 ins; 17 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/628.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/628/head:pull/628 PR: https://git.openjdk.java.net/jdk/pull/628 From rriggs at openjdk.java.net Wed Oct 14 13:21:21 2020 From: rriggs at openjdk.java.net (Roger Riggs) Date: Wed, 14 Oct 2020 13:21:21 GMT Subject: RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v6] In-Reply-To: References: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com> Message-ID: On Wed, 14 Oct 2020 12:03:42 GMT, Nils Eliasson wrote: >> Jason Tatton has updated the pull request incrementally with one additional commit since the last revision: >> >> Added missing copyright notices > > Marked as reviewed by neliasso (Reviewer). Due to the requirement for multiple reviewers, I had been waiting to add my review of the Core-Libs files until the HotSpot reviewers had approved! I see only one reviewer credited in the commit.
------------- PR: https://git.openjdk.java.net/jdk/pull/71 From redestad at openjdk.java.net Wed Oct 14 13:24:10 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 14 Oct 2020 13:24:10 GMT Subject: RFR: 8254744: Clean-up CodeBlob::align_code_offset In-Reply-To: References: Message-ID: On Wed, 14 Oct 2020 10:45:06 GMT, Erik Österlund wrote: >> - Modernize by using align_up >> - Move definition of the trivial CodeHeap::header_size() to header to help inlining, which optimizes align_code_offset >> and others. No big gain, but we're calling CodeHeap::header_size() 1300+ times on Hello World, and when inlined it's >> just a constant fold. > > Looks good. Thanks for reviewing! ------------- PR: https://git.openjdk.java.net/jdk/pull/651 From redestad at openjdk.java.net Wed Oct 14 13:24:11 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 14 Oct 2020 13:24:11 GMT Subject: Integrated: 8254744: Clean-up CodeBlob::align_code_offset In-Reply-To: References: Message-ID: <1l1EZJzTntCUkjfB-_9mQhWCObHMQDYyKWCVWPzWdrQ=.c72ea61b-2828-4b04-8642-6e014cf1d6b7@github.com> On Wed, 14 Oct 2020 10:34:32 GMT, Claes Redestad wrote: > - Modernize by using align_up > - Move definition of the trivial CodeHeap::header_size() to header to help inlining, which optimizes align_code_offset > and others. No big gain, but we're calling CodeHeap::header_size() 1300+ times on Hello World, and when inlined it's > just a constant fold. This pull request has now been integrated.
Changeset: 738effad Author: Claes Redestad URL: https://git.openjdk.java.net/jdk/commit/738effad Stats: 9 lines in 3 files changed: 0 ins; 6 del; 3 mod 8254744: Clean-up CodeBlob::align_code_offset Reviewed-by: mdoerr, eosterlund ------------- PR: https://git.openjdk.java.net/jdk/pull/651 From shade at openjdk.java.net Wed Oct 14 14:18:11 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 14 Oct 2020 14:18:11 GMT Subject: RFR: 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails [v2] In-Reply-To: References: Message-ID: On Wed, 14 Oct 2020 13:15:29 GMT, Yasumasa Suenaga wrote: > @shipilev Could you check again with this change? Yes, it passes `gc/metaspace` tests on both `x86_32` and `x86_64`. ------------- PR: https://git.openjdk.java.net/jdk/pull/628 From stuefe at openjdk.java.net Wed Oct 14 14:22:20 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 14 Oct 2020 14:22:20 GMT Subject: RFR: 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails [v2] In-Reply-To: References: Message-ID: On Wed, 14 Oct 2020 13:18:36 GMT, Yasumasa Suenaga wrote: >> Originally filed at AdoptOpenJDK: >> https://github.com/AdoptOpenJDK/openjdk-tests/issues/1162 >> >> The test fails on 32bit windows with: >> >> java.lang.IllegalStateException: WB_IncMetaspaceCapacityUntilGC: could not increase capacity until GC due to contention >> with another thread >> at sun.hotspot.WhiteBox.incMetaspaceCapacityUntilGC(Native Method) >> at TestCapacityUntilGCWrapAround.main(TestCapacityUntilGCWrapAround.java:51) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> at java.lang.reflect.Method.invoke(Method.java:498) >> at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) >> at 
java.lang.Thread.run(Thread.java:748) >> >> `TestCapacityUntilGCWrapAround` passes `4GB - 1` to `incMetaspaceCapacityUntilGC()`. It seems to be too big. >> And also this code seems to want to check the behavior when `_capacity_until_gc` overflows. The WhiteBox test would >> throw ISE when that happens. So we need to handle it correctly. > > Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: > > Use reserve_alignment() instead of commit_alignment() in inc_capacity_until_GC Thanks for your patience, Yasumasa. LGTM. ------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/628 From stuefe at openjdk.java.net Wed Oct 14 14:32:14 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 14 Oct 2020 14:32:14 GMT Subject: RFR: 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails [v2] In-Reply-To: References: Message-ID: On Wed, 14 Oct 2020 14:19:46 GMT, Thomas Stuefe wrote: >> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: >> >> Use reserve_alignment() instead of commit_alignment() in inc_capacity_until_GC > > Thanks for your patience, Yasumasa. LGTM.
(Created https://bugs.openjdk.java.net/browse/JDK-8254765 to track a rework of this gc threshold handling) ------------- PR: https://git.openjdk.java.net/jdk/pull/628 From rkennke at openjdk.java.net Wed Oct 14 14:32:28 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 14 Oct 2020 14:32:28 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v10] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: <5LY7Ty5Q2S8j9S30uXyeX0EE8AQWQ4BFFd02NYMzVio=.2e423faa-895a-42c9-ad9f-6c85844e1ff8@github.com> > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference > and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). Workloads > that make heavy use of such weak references will therefore potentially cause significant GC pauses. There are 3 main > items that contribute to pause time linear in the number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through the subgraph of objects that are reachable only through FinalReference. Notice that this is > theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly > reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires extending the marking > bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable.
Whenever > marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all > objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it > will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter > of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for > FinalReference, marking stops there, and does not mark through the referent. 2. Concurrent processing is performed > after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and > depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list > (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no > referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers > in Reference.get() intrinsics that return NULL when the referent is not reachable. 
Testing: hotspot_gc_shenandoah > (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2 with -XX:+UseShenandoahGC without > regressions Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Add fallback support for new properties in ObjArrayChunkedTask - Fix 32bit interpreter LRB-native call ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/505/files - new: https://git.openjdk.java.net/jdk/pull/505/files/2c4b1cbd..46dc1b75 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=09 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=08-09 Stats: 20 lines in 3 files changed: 6 ins; 2 del; 12 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From shade at openjdk.java.net Wed Oct 14 14:58:20 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 14 Oct 2020 14:58:20 GMT Subject: RFR: 8253525: Implement getInstanceSize/sizeOf intrinsics Message-ID: This is a fork off the SizeOf JEP, JDK-8249196. There is already an entry point in the JDK that can use the intrinsic: `Instrumentation.getInstanceSize`. Therefore, we can implement the C1/C2 intrinsic now, hook it up to `Instrumentation`, and let the tools use that fast path today. With this patch, JOL gets close to the `deepSizeOf` implementation from the SizeOf JEP. Example performance improvements for sizing up a custom linked list:

Benchmark                     (size)  Mode  Cnt       Score      Error  Units
# Default
LinkedChainBench.linkedChain       1  avgt    5     705.835 ±    8.051  ns/op
LinkedChainBench.linkedChain      10  avgt    5    3148.874 ±   37.856  ns/op
LinkedChainBench.linkedChain     100  avgt    5   28693.256 ±  142.254  ns/op
LinkedChainBench.linkedChain    1000  avgt    5  290161.590 ± 4594.631  ns/op
# Instrumentation attached, no intrinsics
LinkedChainBench.linkedChain       1  avgt    5     159.659 ±   19.238  ns/op
LinkedChainBench.linkedChain      10  avgt    5     717.659 ±   22.540  ns/op
LinkedChainBench.linkedChain     100  avgt    5    7739.394 ±  111.683  ns/op
LinkedChainBench.linkedChain    1000  avgt    5   80724.238 ± 2887.794  ns/op
# Instrumentation attached, new intrinsics
LinkedChainBench.linkedChain       1  avgt    5      95.254 ±    0.808  ns/op
LinkedChainBench.linkedChain      10  avgt    5     261.564 ±    8.524  ns/op
LinkedChainBench.linkedChain     100  avgt    5    3367.192 ±   21.128  ns/op
LinkedChainBench.linkedChain    1000  avgt    5   34148.851 ±  373.080  ns/op

------------- Commit messages: - 8253525: Implement getInstanceSize/sizeOf intrinsics Changes: https://git.openjdk.java.net/jdk/pull/650/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=650&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253525 Stats: 613 lines in 10 files changed: 564 ins; 0 del; 49 mod Patch: https://git.openjdk.java.net/jdk/pull/650.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/650/head:pull/650 PR: https://git.openjdk.java.net/jdk/pull/650 From rriggs at openjdk.java.net Wed Oct 14 15:05:28 2020 From: rriggs at openjdk.java.net (Roger Riggs) Date: Wed, 14 Oct 2020 15:05:28 GMT Subject: RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v6] In-Reply-To: References: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com> Message-ID: On Wed, 14 Oct 2020 13:18:19 GMT, Roger Riggs wrote: >> Marked as reviewed by neliasso (Reviewer). > > Due to the requirement for multiple reviewers, I had been waiting to add my review of the Core-Libs files until the > HotSpot reviewers had approved! I see only one reviewer credited in the commit. This was integrated without testing against a current merge from master, and it has caused two build failures. The first is JDK-8254761: Wrong intrinsic annotation used for StringLatin1.indexOfChar. Also, there is a raw unicode character in the JMH test that causes a compilation error.
== Output from failing command(s) repeated here === [2020-10-14T14:39:09,608Z] * For target support_test_micro_classes__the.BUILD_JDK_MICROBENCHMARK_batch: [2020-10-14T14:39:09,611Z] /opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: error: unmappable character (0xE2) for encoding ascii [2020-10-14T14:39:09,611Z] sb.append(isUtf16?'???':'b'); [2020-10-14T14:39:09,611Z] ^ [2020-10-14T14:39:09,611Z] /opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: error: unmappable character (0x98) for encoding ascii [2020-10-14T14:39:09,611Z] sb.append(isUtf16?'???':'b'); [2020-10-14T14:39:09,611Z] ^ [2020-10-14T14:39:09,611Z] /opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: error: unmappable character (0xBA) for encoding ascii [2020-10-14T14:39:09,611Z] sb.append(isUtf16?'???':'b'); [2020-10-14T14:39:09,611Z] ^ [2020-10-14T14:39:09,611Z] /opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: error: unclosed character literal [2020-10-14T14:39:09,611Z] sb.append(isUtf16?'???':'b'); 
[2020-10-14T14:39:09,611Z] ^ [2020-10-14T14:39:09,611Z] /opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: error: unclosed character literal [2020-10-14T14:39:09,611Z] sb.append(isUtf16?'???':'b'); ------------- PR: https://git.openjdk.java.net/jdk/pull/71 From psandoz at openjdk.java.net Wed Oct 14 15:16:48 2020 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Wed, 14 Oct 2020 15:16:48 GMT Subject: RFR: 8223347: Integration of Vector API (Incubator) [v5] In-Reply-To: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> References: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> Message-ID: > This pull request is for integration of the Vector API. It was previously reviewed under conditions when mercurial was > used for the source code control system. Review threads can be found here (searching for issue number 8223347 in the > title): https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-April/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-May/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-July/thread.html > > If mercurial was still being used the code would be pushed directly, once the CSR is approved. However, in this case a > pull request is required and needs explicit reviewer approval. Between the final review and this pull request no code > has changed, except for that related to merging. Paul Sandoz has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 10 commits: - Merge master - Merge master - Fix related to merge - HotspotIntrinsicCandidate to IntrinsicCandidate - Merge master - Fix permissions - Fix permissions - Merge master - Vector API new files - Integration of Vector API (Incubator) ------------- Changes: https://git.openjdk.java.net/jdk/pull/367/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=367&range=04 Stats: 295107 lines in 336 files changed: 292957 ins; 1062 del; 1088 mod Patch: https://git.openjdk.java.net/jdk/pull/367.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/367/head:pull/367 PR: https://git.openjdk.java.net/jdk/pull/367 From shade at openjdk.java.net Wed Oct 14 15:25:20 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 14 Oct 2020 15:25:20 GMT Subject: RFR: 8254780: EnterInterpOnlyModeClosure::completed() always returns true Message-ID: JDK-8238761 introduced this funky code:

    class EnterInterpOnlyModeClosure : public HandshakeClosure {
     private:
      bool _completed;
     public:
      EnterInterpOnlyModeClosure() : HandshakeClosure("EnterInterpOnlyMode"), _completed(false) { }
      void do_thread(Thread* th) {
        ...
        _completed = true;
      }
      bool completed() {
        return _completed = true;
      }
    };

It seems the flag is there to communicate that the target thread indeed executed the handshake. But `completed()` sets the bool unconditionally and always returns true. And it is used in only one place:

    guarantee(hs.completed(), "Handshake failed: Target thread is not alive?");

...which means that guarantee always passes.
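To make the difference between assigning the flag and reading it concrete, here is a minimal standalone sketch. `ClosureSketch` and its member functions are hypothetical stand-ins written for illustration, not the HotSpot `HandshakeClosure` API, and the "fixed" accessor is only one plausible shape for the fix, not necessarily what the patch does:

```cpp
#include <cassert>

// Hypothetical stand-in for a handshake closure; NOT the HotSpot class.
class ClosureSketch {
 private:
  bool _completed = false;

 public:
  // Runs only if the target thread actually executes the handshake.
  void do_thread() { _completed = true; }

  // Buggy accessor: '=' assigns true and the expression yields true,
  // so the result never depends on whether do_thread() ran.
  bool completed_buggy() { return _completed = true; }

  // One plausible fix (an assumption): simply read the flag.
  bool completed_fixed() const { return _completed; }
};
```

With the buggy accessor, a check in the style of `guarantee(hs.completed(), ...)` cannot fail even when `do_thread()` never ran; with the plain read, the check reflects what actually happened.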
Attention @robehn :) ------------- Commit messages: - 8254780: EnterInterpOnlyModeClosure::completed() always returns true Changes: https://git.openjdk.java.net/jdk/pull/662/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=662&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254780 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/662.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/662/head:pull/662 PR: https://git.openjdk.java.net/jdk/pull/662 From shade at openjdk.java.net Wed Oct 14 15:39:20 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 14 Oct 2020 15:39:20 GMT Subject: RFR: 8254781: Remove unimplemented ClassFieldMap::compute_field_count Message-ID: There is no definition of `ClassFieldMap::compute_field_count` in current tip or any history after the initial load. Can be removed. Testing: - [x] Linux x86_64 build - [x] Text searches for `compute_field_count` in `src/hotspot` ------------- Commit messages: - 8254781: Remove unimplemented ClassFieldMap::compute_field_count Changes: https://git.openjdk.java.net/jdk/pull/663/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=663&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254781 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/663.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/663/head:pull/663 PR: https://git.openjdk.java.net/jdk/pull/663 From phh at openjdk.java.net Wed Oct 14 15:45:17 2020 From: phh at openjdk.java.net (Paul Hohensee) Date: Wed, 14 Oct 2020 15:45:17 GMT Subject: RFR: 8254781: Remove unimplemented ClassFieldMap::compute_field_count In-Reply-To: References: Message-ID: On Wed, 14 Oct 2020 15:27:26 GMT, Aleksey Shipilev wrote: > There is no definition of `ClassFieldMap::compute_field_count` in current tip or any history after the initial load. > Can be removed. 
> Testing: > - [x] Linux x86_64 build > - [x] Text searches for `compute_field_count` in `src/hotspot` Marked as reviewed by phh (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/663 From pchilanomate at openjdk.java.net Wed Oct 14 16:32:24 2020 From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo) Date: Wed, 14 Oct 2020 16:32:24 GMT Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: <35eLsMpWmcCUoiEWhnYdSpZNmvLy4ra56Qtd6eRW574=.4e7c9278-3e0d-457d-9c15-eef45bae9755@github.com> Message-ID: On Wed, 14 Oct 2020 08:19:56 GMT, Alan Hayward wrote: >> Regarding the use of cross_modify_fence(), I filed a bug last week to remove some unneeded uses of them in common code >> (https://bugs.openjdk.java.net/browse/JDK-8254264). Just a heads up before I send the RFR since I see some reference to >> them in the added comments. Thanks, >> Patricio > >> Regarding the use of cross_modify_fence(), I filed a bug last week to remove some unneeded uses of them in common code >> (https://bugs.openjdk.java.net/browse/JDK-8254264). Just a heads up before I send the RFR since I see some reference to >> them in the added comments. > > I'm going to assume your change is just a two line change (removing the cross_modify_fence's), and I'll test that on > top of my patches using the VerifyCrossModifyFence flag - I'll give it a run of everything, which can take a while. > Plus I'll manually look at the code to make sure I'm happy. I think it makes sense that your patch goes in first, > then I can rebase and update code comments too. Let me know your pull request once you've raised it. Yes, the change just removes those extra cross_modify_fence's.
Please check https://github.com/openjdk/jdk/pull/655 ------------- PR: https://git.openjdk.java.net/jdk/pull/428 From psandoz at openjdk.java.net Wed Oct 14 17:10:43 2020 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Wed, 14 Oct 2020 17:10:43 GMT Subject: RFR: 8223347: Integration of Vector API (Incubator) [v6] In-Reply-To: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> References: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> Message-ID: > This pull request is for integration of the Vector API. It was previously reviewed under conditions when mercurial was > used for the source code control system. Review threads can be found here (searching for issue number 8223347 in the > title): https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-April/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-May/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-July/thread.html > > If mercurial was still being used the code would be pushed directly, once the CSR is approved. However, in this case a > pull request is required and needs explicit reviewer approval. Between the final review and this pull request no code > has changed, except for that related to merging. Paul Sandoz has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: - Merge master - Merge master - Merge master - Fix related to merge - HotspotIntrinsicCandidate to IntrinsicCandidate - Merge master - Fix permissions - Fix permissions - Merge master - Vector API new files - ... 
and 1 more: https://git.openjdk.java.net/jdk/compare/96a1f08e...3346d292 ------------- Changes: https://git.openjdk.java.net/jdk/pull/367/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=367&range=05 Stats: 295107 lines in 336 files changed: 292957 ins; 1062 del; 1088 mod Patch: https://git.openjdk.java.net/jdk/pull/367.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/367/head:pull/367 PR: https://git.openjdk.java.net/jdk/pull/367 From sspitsyn at openjdk.java.net Wed Oct 14 17:23:08 2020 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Wed, 14 Oct 2020 17:23:08 GMT Subject: RFR: 8254780: EnterInterpOnlyModeClosure::completed() always returns true In-Reply-To: References: Message-ID: On Wed, 14 Oct 2020 15:16:54 GMT, Aleksey Shipilev wrote: > JDK-8238761 introduced this funky code: > > class EnterInterpOnlyModeClosure : public HandshakeClosure { > private: > bool _completed; > public: > EnterInterpOnlyModeClosure() : HandshakeClosure("EnterInterpOnlyMode"), _completed(false) { } > void do_thread(Thread* th) { > ... > _completed = true; > } > bool completed() { > return _completed = true; > } > }; > > It seems the flag is there to communicate that target thread indeed executed the handshake. But `completed()` sets the > bool unconditionally and always returns true. And it is used in one and only place here: > guarantee(hs.completed(), "Handshake failed: Target thread is not alive?"); > > ...which means that guarantee always passes. > > Attention @robehn :) It looks good. Thank you for catching and fixing it! Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/662 From sspitsyn at openjdk.java.net Wed Oct 14 17:27:20 2020 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Wed, 14 Oct 2020 17:27:20 GMT Subject: RFR: 8254781: Remove unimplemented ClassFieldMap::compute_field_count In-Reply-To: References: Message-ID: <430N-0bDu4Jg7qzWLztx_SZiXp1NkrWEKRiFoIWgPjY=.4fcd1173-e834-471b-9fbc-3419fe91f236@github.com> On Wed, 14 Oct 2020 15:27:26 GMT, Aleksey Shipilev wrote: > There is no definition of `ClassFieldMap::compute_field_count` in current tip or any history after the initial load. > Can be removed. > Testing: > - [x] Linux x86_64 build > - [x] Text searches for `compute_field_count` in `src/hotspot` It looks good and trivial. Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/663 From hoffmann at mountainminds.com Wed Oct 14 17:56:41 2020 From: hoffmann at mountainminds.com (Marc Hoffmann) Date: Wed, 14 Oct 2020 19:56:41 +0200 Subject: arm32 builds continue to fail for me after 8253540 and 8253901 In-Reply-To: References: <56ff08d5-a4e5-788a-1c29-02f76e8755d2@redhat.com> <17F91692-4F3D-4FAA-AB94-361B6C84F982@mountainminds.com> Message-ID: <3509926C-B3D5-41D5-9ECE-6AC957E8ECD8@mountainminds.com> Hi Boris, I'm not familiar with the hotspot codebase at all. But the assertions in InterpreterMacroAssembler::unlock_object (https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/arm/interp_masm_arm.cpp#L989) look contradictory to me: assert(Rlock == R0, "the first argument"); but then we assert that Rlock and R0 are different: assert_different_registers(Robj, Rmark, Rlock, R0, Rtemp); That code was not changed in 8253540. Maybe it was not called at all before the change? Regards, -marc > On 13. Oct 2020, at 08:48, Boris Ulasevich wrote: > > Hi Marc, > > I created JDK-8254661 for the issue. I would love to fix it, but still > can't reproduce the crash (even on Raspberry Pi). > What configuration do you have?
The following sequence works OK for me: > pi@raspberrypi $ git clone https://github.com/openjdk/jdk > pi@raspberrypi $ cd jdk > pi@raspberrypi $ bash configure --with-boot-jdk=/home/pi/jdk-15 > pi@raspberrypi $ make > > Your debug build shows that I did not fix the > assert_different_registers in > InterpreterMacroAssembler::unlock_object() > body (and the function comment by the way!), though with eyeballing I > do not see what is wrong for Rlock=R0: > https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/arm/interp_masm_arm.cpp#L1000 > > regards, > Boris > > On Mon, Oct 12, 2020 at 11:34 PM Marc Hoffmann > wrote: >> >> Hi Aleksey, hi Boris, >> >> for me the crash is always reproducible: Every single build after >> >> 77a0f3999afa322b64643afd4a161164440af975 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function >> >> fails on arm32 (build on ubuntu in docker on a raspberry pi 4). Before this commit I haven't encountered any failures. >> >> Here is the hs_err file with --enable-debug (reproduced with current master c7f00640627eab38b77d23d07876cf0247fa18f3). >> >> Cheers, >> -marc >> >> >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (/workspace/src/hotspot/share/asm/register.hpp:160), pid=14700, tid=14705 >> # assert(a != b && a != c && a != d && a != e && b != c && b != d && b != e && c != d && c != e && d != e) failed: registers must be different: a=0x00000002, b=0x00000003, c=0x00000000, d=0x00000000, e=0x0000000c >> # >> # JRE version: (16.0) (fastdebug build ) >> # Java VM: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace, mixed mode, g1 gc, linux-arm) >> # Problematic frame: >> # V [libjvm.so+0x7571fc] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63 >> # >> # Core dump will be written.
Default location: /workspace/make/core >> # >> # >> >> --------------- S U M M A R Y ------------ >> >> Command Line: -Xms64M -Xmx768M --add-exports=java.base/jdk.internal.module=ALL-UNNAMED build.tools.jigsaw.AddPackagesAttribute /workspace/build/linux-arm-server-fastdebug/jdk >> >> Host: 20431585315d, rev 3 (v7l), 4 cores, 3G, Ubuntu 18.04.3 LTS >> Time: Mon Oct 12 20:22:15 2020 UTC elapsed time: 0.144243 seconds (0d 0h 0m 0s) >> >> --------------- T H R E A D --------------- >> >> Current thread (0xb5b16460): JavaThread "Unknown thread" [_thread_in_vm, id=14705, stack(0xb5c6e000,0xb5cbe000)] >> >> Stack: [0xb5c6e000,0xb5cbe000], sp=0xb5cbc2d0, free space=312k >> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x7571fc] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63 >> >> Registers: >> r0 = 0x00000003 >> r1 = 0x000000a0 >> r2 = 0x00000002 >> r3 = 0x00000000 >> r4 = 0xb5b168b0 >> r5 = 0x0000000c >> r6 = 0x00000000 >> r7 = 0xb5cbc2e8 >> r8 = 0xb6db1fa8 >> r9 = 0xb5cbc760 >> r10 = 0xe3520000 >> fp = 0xb6db1fa8 >> r12 = 0xb6ff8000 >> sp = 0xb5cbc2d0 >> lr = 0x00000058 >> pc = 0xb64961fc >> cpsr = 0x200f0030 >> >> Top of Stack: (sp=0xb5cbc2d0) >> 0xb5cbc2d0: 00000002 00000003 00000000 00000000 >> 0xb5cbc2e0: 0000000c 00000048 0000006e 00000000 >> 0xb5cbc2f0: 00000000 00000000 00000000 0000007c >> 0xb5cbc300: 00000000 00000077 b5cbc378 00000000 >> 0xb5cbc310: b5cbc380 0000000f b6db1fa8 b5cbc340 >> 0xb5cbc320: b5b168b0 b6db1fa8 b5cbc4c4 b6008a2b >> 0xb5cbc330: b5cbc454 b5cbc348 0000000f b61971cf >> 0xb5cbc340: b5cbc380 b5cbc3b0 b5b168b0 b5cbc388 >> >> Instructions: (pc=0xb64961fc) >> 0xb64960fc: 440be9c7 a034f8c7 f5e06139 68ebf7ed >> 0xb649610c: f040689a 46184164 1180f441 68996011 >> 0xb649611c: f5e03104 4b12f781 46284622 1003f858 >> 0xb649612c: f998f2e4 f64f68eb f2ce7210 6899122f >> 0xb649613c: 600a4618 31046899 f76ef5e0 46284631 >> 0xb649614c: f4e6f69f f5e04630 
f507f79b 46bd7786 >> 0xb649615c: 8ff0e8bd 0091bfd0 00006ee4 000059d0 >> 0xb649616c: 00007d1c 4a084b07 b480447b 589baf00 >> 0xb649617c: b91b781b f85d46bd 47707b04 f85d46bd >> 0xb649618c: e7197b04 0091be30 00006a24 bf182900 >> 0xb649619c: f1a1290c e92d0202 b0f34ff0 2301bf08 >> 0xb64961ac: bf18af06 f8df2300 2a018268 461abf8c >> 0xb64961bc: 0201f043 f04f460e 44f833ff 30d0f8c7 >> 0xb64961cc: f8c74604 23003140 333de9c7 30fcf887 >> 0xb64961dc: 3359e9c7 316cf887 4a8eb1da 0e58f04f >> 0xb64961ec: 250c2003 1002f858 f8d12202 21a0c000 >> 0xb64961fc: e000f88c 2000e9cd 6302e9cd 4b874a86 >> 0xb649620c: 447a4887 9504447b f58f4478 f00cfaa3 >> 0xb649621c: 68e2f2d5 4340f44f 33a0f2ce 0a04f04f >> 0xb649622c: 7980f04f 0b00f04f 46106891 600b2501 >> 0xb649623c: 44516891 f6f0f5e0 4a7b497a 3001f858 >> 0xb649624c: 0140f107 a048f8c7 4608643e f8c7681b >> 0xb649625c: 60fb904c f858647b f8c72002 f8c7b054 >> 0xb649626c: 60bab058 e9c73208 653abb19 66fd607a >> 0xb649627c: f732f5e0 689a68e3 4164f040 f4414618 >> 0xb649628c: 60111181 44516899 f6c6f5e0 0110f107 >> 0xb649629c: 68fb687a f8c74608 623aa018 6304e9c7 >> 0xb64962ac: 901cf8c7 bb09e9c7 bb0de9c7 f5e063fd >> 0xb64962bc: 68e3f713 f0406899 46184264 4240f442 >> 0xb64962cc: 6899600a f1074451 f5e00ad0 4b57f6a5 >> 0xb64962dc: 3003f858 2b00781b 8093f040 f04f68bb >> 0xb64962ec: 68f97c80 33082500 c0acf8c7 0b01f04f >> >> >> >> --------------- P R O C E S S --------------- >> >> uid : 0 euid : 0 gid : 0 egid : 0 >> >> umask: 0022 (----w--w-) >> >> Threads class SMR info: >> _java_thread_list=0xb6e56078, length=0, elements={ >> } >> _java_thread_list_alloc_cnt=1, _java_thread_list_free_cnt=0, _java_thread_list_max=0, _nested_thread_list_max=0 >> _delete_lock_wait_cnt=0, _delete_lock_wait_max=0 >> _to_delete_list_cnt=0, _to_delete_list_max=0 >> >> Java Threads: ( => current thread ) >> >> Other Threads: >> 0xb5b73188 GCTaskThread "GC Thread#0" [stack: 0x81d00000,0x81d80000] [id=14706] >> 0xb5b77dc0 ConcurrentGCThread "G1 Main Marker" [stack: 
0x81c7e000,0x81cfe000] [id=14707] >> 0xb5b790c0 ConcurrentGCThread "G1 Conc#0" [stack: 0x81a80000,0x81b00000] [id=14708] >> 0xb5bde230 ConcurrentGCThread "G1 Refine#0" [stack: 0x81780000,0x81800000] [id=14709] >> 0xb5bdf488 ConcurrentGCThread "G1 Service" [stack: 0x81580000,0x81600000] [id=14710] >> >> =>0xb5b16460 (exited) JavaThread "Unknown thread" [_thread_in_vm, id=14705, stack(0xb5c6e000,0xb5cbe000)] >> >> Threads with active compile tasks: >> >> VM state: not at safepoint (not fully initialized) >> >> VM Mutex/Monitor currently owned by a thread: None >> >> GC Precious Log: >> CPUs: 4 total, 4 available >> Memory: 3827M >> Large Page Support: Disabled >> NUMA Support: Disabled >> Compressed Oops: Disabled >> Heap Region Size: 1M >> Heap Min Capacity: 64M >> Heap Initial Capacity: 64M >> Heap Max Capacity: 768M >> Pre-touch: Disabled >> Parallel Workers: 4 >> Concurrent Workers: 1 >> Concurrent Refinement Workers: 4 >> Periodic GC: Disabled >> >> Heap: >> garbage-first heap total 65536K, used 0K [0x83a00000, 0xb3a00000) >> region size 1024K, 1 young (1024K), 0 survivors (0K) >> Metaspace used 944K, capacity 2200K, committed 2200K, reserved 4400K >> >> Heap Regions: E=young(eden), S=young(survivor), O=old, HS=humongous(starts), HC=humongous(continues), CS=collection set, F=free, OA=open archive, CA=closed archive, TAMS=top-at-mark-start (previous, next) >> | 0|0x83a00000, 0x83a00000, 0x83b00000| 0%| F| |TAMS 0x83a00000, 0x83a00000| Untracked >> | 1|0x83b00000, 0x83b00000, 0x83c00000| 0%| F| |TAMS 0x83b00000, 0x83b00000| Untracked >> | 2|0x83c00000, 0x83c00000, 0x83d00000| 0%| F| |TAMS 0x83c00000, 0x83c00000| Untracked >> | 3|0x83d00000, 0x83d00000, 0x83e00000| 0%| F| |TAMS 0x83d00000, 0x83d00000| Untracked >> | 4|0x83e00000, 0x83e00000, 0x83f00000| 0%| F| |TAMS 0x83e00000, 0x83e00000| Untracked >> | 5|0x83f00000, 0x83f00000, 0x84000000| 0%| F| |TAMS 0x83f00000, 0x83f00000| Untracked >> | 6|0x84000000, 0x84000000, 0x84100000| 0%| F| |TAMS 0x84000000, 
0x84000000| Untracked >> | 7|0x84100000, 0x84100000, 0x84200000| 0%| F| |TAMS 0x84100000, 0x84100000| Untracked >> | 8|0x84200000, 0x84200000, 0x84300000| 0%| F| |TAMS 0x84200000, 0x84200000| Untracked >> | 9|0x84300000, 0x84300000, 0x84400000| 0%| F| |TAMS 0x84300000, 0x84300000| Untracked >> | 10|0x84400000, 0x84400000, 0x84500000| 0%| F| |TAMS 0x84400000, 0x84400000| Untracked >> | 11|0x84500000, 0x84500000, 0x84600000| 0%| F| |TAMS 0x84500000, 0x84500000| Untracked >> | 12|0x84600000, 0x84600000, 0x84700000| 0%| F| |TAMS 0x84600000, 0x84600000| Untracked >> | 13|0x84700000, 0x84700000, 0x84800000| 0%| F| |TAMS 0x84700000, 0x84700000| Untracked >> | 14|0x84800000, 0x84800000, 0x84900000| 0%| F| |TAMS 0x84800000, 0x84800000| Untracked >> | 15|0x84900000, 0x84900000, 0x84a00000| 0%| F| |TAMS 0x84900000, 0x84900000| Untracked >> | 16|0x84a00000, 0x84a00000, 0x84b00000| 0%| F| |TAMS 0x84a00000, 0x84a00000| Untracked >> | 17|0x84b00000, 0x84b00000, 0x84c00000| 0%| F| |TAMS 0x84b00000, 0x84b00000| Untracked >> | 18|0x84c00000, 0x84c00000, 0x84d00000| 0%| F| |TAMS 0x84c00000, 0x84c00000| Untracked >> | 19|0x84d00000, 0x84d00000, 0x84e00000| 0%| F| |TAMS 0x84d00000, 0x84d00000| Untracked >> | 20|0x84e00000, 0x84e00000, 0x84f00000| 0%| F| |TAMS 0x84e00000, 0x84e00000| Untracked >> | 21|0x84f00000, 0x84f00000, 0x85000000| 0%| F| |TAMS 0x84f00000, 0x84f00000| Untracked >> | 22|0x85000000, 0x85000000, 0x85100000| 0%| F| |TAMS 0x85000000, 0x85000000| Untracked >> | 23|0x85100000, 0x85100000, 0x85200000| 0%| F| |TAMS 0x85100000, 0x85100000| Untracked >> | 24|0x85200000, 0x85200000, 0x85300000| 0%| F| |TAMS 0x85200000, 0x85200000| Untracked >> | 25|0x85300000, 0x85300000, 0x85400000| 0%| F| |TAMS 0x85300000, 0x85300000| Untracked >> | 26|0x85400000, 0x85400000, 0x85500000| 0%| F| |TAMS 0x85400000, 0x85400000| Untracked >> | 27|0x85500000, 0x85500000, 0x85600000| 0%| F| |TAMS 0x85500000, 0x85500000| Untracked >> | 28|0x85600000, 0x85600000, 0x85700000| 0%| F| |TAMS 0x85600000, 
0x85600000| Untracked >> | 29|0x85700000, 0x85700000, 0x85800000| 0%| F| |TAMS 0x85700000, 0x85700000| Untracked >> | 30|0x85800000, 0x85800000, 0x85900000| 0%| F| |TAMS 0x85800000, 0x85800000| Untracked >> | 31|0x85900000, 0x85900000, 0x85a00000| 0%| F| |TAMS 0x85900000, 0x85900000| Untracked >> | 32|0x85a00000, 0x85a00000, 0x85b00000| 0%| F| |TAMS 0x85a00000, 0x85a00000| Untracked >> | 33|0x85b00000, 0x85b00000, 0x85c00000| 0%| F| |TAMS 0x85b00000, 0x85b00000| Untracked >> | 34|0x85c00000, 0x85c00000, 0x85d00000| 0%| F| |TAMS 0x85c00000, 0x85c00000| Untracked >> | 35|0x85d00000, 0x85d00000, 0x85e00000| 0%| F| |TAMS 0x85d00000, 0x85d00000| Untracked >> | 36|0x85e00000, 0x85e00000, 0x85f00000| 0%| F| |TAMS 0x85e00000, 0x85e00000| Untracked >> | 37|0x85f00000, 0x85f00000, 0x86000000| 0%| F| |TAMS 0x85f00000, 0x85f00000| Untracked >> | 38|0x86000000, 0x86000000, 0x86100000| 0%| F| |TAMS 0x86000000, 0x86000000| Untracked >> | 39|0x86100000, 0x86100000, 0x86200000| 0%| F| |TAMS 0x86100000, 0x86100000| Untracked >> | 40|0x86200000, 0x86200000, 0x86300000| 0%| F| |TAMS 0x86200000, 0x86200000| Untracked >> | 41|0x86300000, 0x86300000, 0x86400000| 0%| F| |TAMS 0x86300000, 0x86300000| Untracked >> | 42|0x86400000, 0x86400000, 0x86500000| 0%| F| |TAMS 0x86400000, 0x86400000| Untracked >> | 43|0x86500000, 0x86500000, 0x86600000| 0%| F| |TAMS 0x86500000, 0x86500000| Untracked >> | 44|0x86600000, 0x86600000, 0x86700000| 0%| F| |TAMS 0x86600000, 0x86600000| Untracked >> | 45|0x86700000, 0x86700000, 0x86800000| 0%| F| |TAMS 0x86700000, 0x86700000| Untracked >> | 46|0x86800000, 0x86800000, 0x86900000| 0%| F| |TAMS 0x86800000, 0x86800000| Untracked >> | 47|0x86900000, 0x86900000, 0x86a00000| 0%| F| |TAMS 0x86900000, 0x86900000| Untracked >> | 48|0x86a00000, 0x86a00000, 0x86b00000| 0%| F| |TAMS 0x86a00000, 0x86a00000| Untracked >> | 49|0x86b00000, 0x86b00000, 0x86c00000| 0%| F| |TAMS 0x86b00000, 0x86b00000| Untracked >> | 50|0x86c00000, 0x86c00000, 0x86d00000| 0%| F| |TAMS 
0x86c00000, 0x86c00000| Untracked >> | 51|0x86d00000, 0x86d00000, 0x86e00000| 0%| F| |TAMS 0x86d00000, 0x86d00000| Untracked >> | 52|0x86e00000, 0x86e00000, 0x86f00000| 0%| F| |TAMS 0x86e00000, 0x86e00000| Untracked >> | 53|0x86f00000, 0x86f00000, 0x87000000| 0%| F| |TAMS 0x86f00000, 0x86f00000| Untracked >> | 54|0x87000000, 0x87000000, 0x87100000| 0%| F| |TAMS 0x87000000, 0x87000000| Untracked >> | 55|0x87100000, 0x87100000, 0x87200000| 0%| F| |TAMS 0x87100000, 0x87100000| Untracked >> | 56|0x87200000, 0x87200000, 0x87300000| 0%| F| |TAMS 0x87200000, 0x87200000| Untracked >> | 57|0x87300000, 0x87300000, 0x87400000| 0%| F| |TAMS 0x87300000, 0x87300000| Untracked >> | 58|0x87400000, 0x87400000, 0x87500000| 0%| F| |TAMS 0x87400000, 0x87400000| Untracked >> | 59|0x87500000, 0x87500000, 0x87600000| 0%| F| |TAMS 0x87500000, 0x87500000| Untracked >> | 60|0x87600000, 0x87600000, 0x87700000| 0%| F| |TAMS 0x87600000, 0x87600000| Untracked >> | 61|0x87700000, 0x87700000, 0x87800000| 0%| F| |TAMS 0x87700000, 0x87700000| Untracked >> | 62|0x87800000, 0x87800000, 0x87900000| 0%| F| |TAMS 0x87800000, 0x87800000| Untracked >> | 63|0x87900000, 0x87942908, 0x87a00000| 26%| E| |TAMS 0x87900000, 0x87900000| Complete >> >> Card table byte_map: [0x83700000,0x83880000] _byte_map_base: 0x832e3000 >> >> Marking Bits (Prev, Next): (CMBitMap*) 0xb5b74324, (CMBitMap*) 0xb5b74344 >> Prev Bits: [0x82980000, 0x83580000) >> Next Bits: [0x81d80000, 0x82980000) >> >> GC Heap History (0 events): >> No events >> >> Deoptimization events (0 events): >> No events >> >> Classes unloaded (0 events): >> No events >> >> Classes redefined (0 events): >> No events >> >> Internal exceptions (0 events): >> No events >> >> Events (20 events): >> Event: 0.113 loading class java/lang/Character >> Event: 0.114 loading class java/lang/Character done >> Event: 0.114 loading class java/lang/Float >> Event: 0.115 loading class java/lang/Number >> Event: 0.115 loading class java/lang/Number done >> Event: 0.115 
loading class java/lang/Float done >> Event: 0.115 loading class java/lang/Double >> Event: 0.116 loading class java/lang/Double done >> Event: 0.116 loading class java/lang/Byte >> Event: 0.116 loading class java/lang/Byte done >> Event: 0.116 loading class java/lang/Short >> Event: 0.117 loading class java/lang/Short done >> Event: 0.117 loading class java/lang/Integer >> Event: 0.118 loading class java/lang/Integer done >> Event: 0.118 loading class java/lang/Long >> Event: 0.119 loading class java/lang/Long done >> Event: 0.119 loading class java/util/Iterator >> Event: 0.119 loading class java/util/Iterator done >> Event: 0.119 loading class java/lang/reflect/RecordComponent >> Event: 0.119 loading class java/lang/reflect/RecordComponent done >> >> >> Dynamic libraries: >> 00410000-00411000 r-xp 00000000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java >> 00420000-00421000 r--p 00000000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java >> 00421000-00422000 rw-p 00001000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java >> 019b6000-019d7000 rw-p 00000000 00:00 0 [heap] >> 809c9000-80e00000 rw-p 00000000 00:00 0 >> 80e00000-80e8e000 rw-p 00000000 00:00 0 >> 80e8e000-80f00000 ---p 00000000 00:00 0 >> 80fb4000-811da000 rw-p 00000000 00:00 0 >> 811da000-81400000 ---p 00000000 00:00 0 >> 81400000-81421000 rw-p 00000000 00:00 0 >> 81421000-81500000 ---p 00000000 00:00 0 >> 8157e000-8157f000 ---p 00000000 00:00 0 >> 8157f000-81600000 rw-p 00000000 00:00 0 >> 81600000-81621000 rw-p 00000000 00:00 0 >> 81621000-81700000 ---p 00000000 00:00 0 >> 8177e000-8177f000 ---p 00000000 00:00 0 >> 8177f000-81800000 rw-p 00000000 00:00 0 >> 81800000-81821000 rw-p 00000000 00:00 0 >> 81821000-81900000 ---p 00000000 00:00 0 >> 81900000-81921000 rw-p 00000000 00:00 0 >> 81921000-81a00000 ---p 00000000 00:00 0 >> 81a7e000-81a7f000 ---p 00000000 00:00 0 >> 81a7f000-81b00000 rw-p 00000000 00:00 0 >> 81b00000-81b21000 rw-p 
00000000 00:00 0 >> 81b21000-81c00000 ---p 00000000 00:00 0 >> 81c21000-81c7c000 rw-p 00000000 00:00 0 >> 81c7c000-81c7d000 ---p 00000000 00:00 0 >> 81c7d000-81cfe000 rw-p 00000000 00:00 0 >> 81cfe000-81cff000 ---p 00000000 00:00 0 >> 81cff000-81e80000 rw-p 00000000 00:00 0 >> 81e80000-82980000 ---p 00000000 00:00 0 >> 82980000-82a80000 rw-p 00000000 00:00 0 >> 82a80000-83580000 ---p 00000000 00:00 0 >> 83580000-835a0000 rw-p 00000000 00:00 0 >> 835a0000-83700000 ---p 00000000 00:00 0 >> 83700000-83720000 rw-p 00000000 00:00 0 >> 83720000-83880000 ---p 00000000 00:00 0 >> 83880000-838a0000 rw-p 00000000 00:00 0 >> 838a0000-83a00000 ---p 00000000 00:00 0 >> 83a00000-87a00000 rw-p 00000000 00:00 0 >> 87a00000-b3a00000 ---p 00000000 00:00 0 >> b3a25000-b3a76000 rw-p 00000000 00:00 0 >> b3a76000-b3ab3000 ---p 00000000 00:00 0 >> b3ab3000-b3c33000 rwxp 00000000 00:00 0 >> b3c33000-b5ab3000 ---p 00000000 00:00 0 >> b5ab3000-b5ac8000 r-xp 00000000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >> b5ac8000-b5ad8000 ---p 00015000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >> b5ad8000-b5ad9000 r--p 00015000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >> b5ad9000-b5ada000 rw-p 00016000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >> b5ada000-b5ae2000 rw-s 00000000 b3:02 2576900 /tmp/hsperfdata_root/14700 >> b5ae2000-b5ae9000 r-xp 00000000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >> b5ae9000-b5af8000 ---p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >> b5af8000-b5af9000 r--p 00006000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >> b5af9000-b5afa000 rw-p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >> b5afa000-b5b00000 rw-p 00000000 00:00 0 >> b5b00000-b5c00000 rw-p 00000000 00:00 0 >> b5c00000-b5c0d000 r-xp 00000000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >> 
b5c0d000-b5c1c000 ---p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >> b5c1c000-b5c1d000 r--p 0000c000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >> b5c1d000-b5c1e000 rw-p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >> b5c1e000-b5c20000 rw-p 00000000 00:00 0 >> b5c20000-b5c27000 r-xp 00000000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >> b5c27000-b5c36000 ---p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >> b5c36000-b5c37000 r--p 00006000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >> b5c37000-b5c38000 rw-p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >> b5c38000-b5c3d000 r-xp 00000000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >> b5c3d000-b5c4c000 ---p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >> b5c4c000-b5c4d000 r--p 00004000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >> b5c4d000-b5c4e000 rw-p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >> b5c4e000-b5c5d000 r-xp 00000000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >> b5c5d000-b5c6c000 ---p 0000f000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >> b5c6c000-b5c6d000 r--p 0000e000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >> b5c6d000-b5c6e000 rw-p 0000f000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >> b5c6e000-b5c71000 ---p 00000000 00:00 0 >> b5c71000-b5cbe000 rw-p 00000000 00:00 0 >> b5cbe000-b5d2d000 r-xp 00000000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >> b5d2d000-b5d3d000 ---p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >> b5d3d000-b5d3e000 r--p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >> b5d3e000-b5d3f000 rw-p 00070000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >> b5d3f000-b6d56000 r-xp 00000000 b3:02 144078 
/workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >> b6d56000-b6d65000 ---p 01017000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >> b6d65000-b6dba000 r--p 01016000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >> b6dba000-b6dd2000 rw-p 0106b000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >> b6dd2000-b6e5e000 rw-p 00000000 00:00 0 >> b6e5e000-b6e6f000 r-xp 00000000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >> b6e6f000-b6e7f000 ---p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >> b6e7f000-b6e80000 r--p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >> b6e80000-b6e81000 rw-p 00012000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >> b6e81000-b6e83000 rw-p 00000000 00:00 0 >> b6e83000-b6e85000 r-xp 00000000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >> b6e85000-b6e94000 ---p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >> b6e94000-b6e95000 r--p 00001000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >> b6e95000-b6e96000 rw-p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >> b6e96000-b6eaf000 r-xp 00000000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >> b6eaf000-b6ebe000 ---p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >> b6ebe000-b6ebf000 r--p 00018000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >> b6ebf000-b6ec0000 rw-p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >> b6ec0000-b6fa2000 r-xp 00000000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >> b6fa2000-b6fb2000 ---p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >> b6fb2000-b6fb4000 r--p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >> b6fb4000-b6fb5000 rw-p 000e4000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >> b6fb5000-b6fb8000 rw-p 00000000 00:00 0 >> b6fb8000-b6fc2000 
r-xp 00000000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >> b6fc2000-b6fd1000 ---p 0000a000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >> b6fd1000-b6fd2000 r--p 00009000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >> b6fd2000-b6fd3000 rw-p 0000a000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >> b6fd3000-b6feb000 r-xp 00000000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so >> b6ff2000-b6ff4000 rw-p 00000000 00:00 0 >> b6ff6000-b6ff7000 ---p 00000000 00:00 0 >> b6ff7000-b6ff8000 r--p 00000000 00:00 0 >> b6ff8000-b6ff9000 rwxp 00000000 00:00 0 >> b6ff9000-b6ffb000 rw-p 00000000 00:00 0 >> b6ffb000-b6ffc000 r--p 00018000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so >> b6ffc000-b6ffd000 rw-p 00019000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so >> bed97000-bedb8000 rw-p 00000000 00:00 0 [stack] >> beeb0000-beeb1000 r-xp 00000000 00:00 0 [sigpage] >> beeb1000-beeb2000 r--p 00000000 00:00 0 [vvar] >> beeb2000-beeb3000 r-xp 00000000 00:00 0 [vdso] >> ffff0000-ffff1000 r-xp 00000000 00:00 0 [vectors] >> >> >> VM Arguments: >> jvm_args: -Xms64M -Xmx768M --add-exports=java.base/jdk.internal.module=ALL-UNNAMED >> java_command: build.tools.jigsaw.AddPackagesAttribute /workspace/build/linux-arm-server-fastdebug/jdk >> java_class_path (initial): /workspace/build/linux-arm-server-fastdebug/buildtools/tools_jigsaw_classes >> Launcher Type: SUN_STANDARD >> >> [Global flags] >> uint ConcGCThreads = 1 {product} {ergonomic} Number of threads concurrent gc will use >> uint G1ConcRefinementThreads = 4 {product} {ergonomic} The number of parallel rem set update threads. Will be set ergonomically by default. >> size_t G1HeapRegionSize = 1048576 {product} {ergonomic} Size of the G1 regions. 
>> uintx GCDrainStackTargetSize = 64 {product} {ergonomic} Number of entries we will try to leave on the stack during parallel gc >> size_t InitialHeapSize = 67108864 {product} {command line} Initial heap size (in bytes); zero means use ergonomics >> size_t MarkStackSize = 32768 {product} {ergonomic} Size of marking stack >> size_t MaxHeapSize = 805306368 {product} {command line} Maximum heap size (in bytes) >> size_t MaxNewSize = 482344960 {product} {ergonomic} Maximum new generation size (in bytes), max_uintx means set ergonomically >> size_t MinHeapDeltaBytes = 1048576 {product} {ergonomic} The minimum change in heap space due to GC (in bytes) >> size_t MinHeapSize = 67108864 {product} {command line} Minimum heap size (in bytes); zero means use ergonomics >> uintx NonProfiledCodeHeapSize = 0 {pd product} {ergonomic} Size of code heap with non-profiled methods (in bytes) >> uintx ProfiledCodeHeapSize = 0 {pd product} {ergonomic} Size of code heap with profiled methods (in bytes) >> size_t SoftMaxHeapSize = 805306368 {manageable} {ergonomic} Soft limit for maximum heap size (in bytes) >> bool UseG1GC = true {product} {ergonomic} Use the Garbage-First garbage collector >> >> Logging: >> Log output configuration: >> #0: stdout all=warning uptime,level,tags >> #1: stderr all=off uptime,level,tags >> >> Environment Variables: >> JAVA_HOME=/opt/java/openjdk >> PATH=/opt/java/openjdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin >> LC_ALL=C >> >> Signal Handlers: >> SIGSEGV: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGBUS: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGFPE: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGPIPE: [libjvm.so+0xc9aa9d], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGXFSZ: [libjvm.so+0xc9aa9d], 
sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGILL: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGUSR2: [libjvm.so+0xc9ad95], sa_mask[0]=00000000000000000000000000000000, sa_flags=SA_RESTART|SA_SIGINFO >> SIGHUP: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >> SIGINT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >> SIGTERM: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >> SIGQUIT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >> >> >> --------------- S Y S T E M --------------- >> >> OS: >> DISTRIB_ID=Ubuntu >> DISTRIB_RELEASE=18.04 >> DISTRIB_CODENAME=bionic >> DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS" >> uname: Linux 20431585315d 5.4.51-v7l+ #1333 SMP Mon Aug 10 16:51:40 BST 2020 armv7l >> OS uptime: 14 days 7:59 hours >> libc: glibc 2.27 NPTL 2.27 >> rlimit (soft/hard): STACK 8192k/infinity , CORE infinity/infinity , NPROC infinity/infinity , NOFILE 1048576/1048576 , AS infinity/infinity , CPU infinity/infinity , DATA infinity/infinity , FSIZE infinity/infinity , MEMLOCK 64k/64k >> load average: 3.37 3.26 3.09 >> >> /proc/meminfo: >> MemTotal: 3919812 kB >> MemFree: 1255688 kB >> MemAvailable: 3518740 kB >> Buffers: 134316 kB >> Cached: 2117828 kB >> SwapCached: 0 kB >> Active: 1266624 kB >> Inactive: 1167412 kB >> Active(anon): 110360 kB >> Inactive(anon): 80744 kB >> Active(file): 1156264 kB >> Inactive(file): 1086668 kB >> Unevictable: 16 kB >> Mlocked: 16 kB >> HighTotal: 3264512 kB >> HighFree: 1038848 kB >> LowTotal: 655300 kB >> LowFree: 216840 kB >> SwapTotal: 102396 kB >> SwapFree: 102396 kB >> Dirty: 24916 kB >> Writeback: 0 kB >> AnonPages: 181884 kB >> Mapped: 125864 kB >> Shmem: 16892 kB >> KReclaimable: 181816 kB >> Slab: 205164 kB >> SReclaimable: 181816 kB >> SUnreclaim: 23348 kB >> KernelStack: 2240 kB >> PageTables: 2684 kB >> NFS_Unstable: 0 kB >> Bounce: 0 kB >> 
WritebackTmp: 0 kB >> CommitLimit: 2062300 kB >> Committed_AS: 1125176 kB >> VmallocTotal: 245760 kB >> VmallocUsed: 5520 kB >> VmallocChunk: 0 kB >> Percpu: 512 kB >> CmaTotal: 262144 kB >> CmaFree: 171244 kB >> >> /sys/kernel/mm/transparent_hugepage/enabled: >> /sys/kernel/mm/transparent_hugepage/defrag (defrag/compaction efforts parameter): >> >> Process Memory: >> Virtual Size: 888828K (peak: 888828K) >> Resident Set Size: 25020K (peak: 25020K) (anon: 11372K, file: 13648K, shmem: 0K) >> Swapped out: 0K >> C-Heap outstanding allocations: 1636K >> >> /proc/sys/kernel/threads-max (system-wide limit on the number of threads): 57119 >> /proc/sys/vm/max_map_count (maximum number of memory map areas a process may have): 65530 >> /proc/sys/kernel/pid_max (system-wide limit on number of process identifiers): 32768 >> >> Steal ticks since vm start: 0 >> Steal ticks percentage since vm start: 0.000 >> >> CPU: total 4 (initial active 4) (ARMv7), vfp, vfp3-32, simd, mp_ext >> /proc/cpuinfo: >> processor : 0 >> model name : ARMv7 Processor rev 3 (v7l) >> BogoMIPS : 270.00 >> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >> CPU implementer : 0x41 >> CPU architecture: 7 >> CPU variant : 0x0 >> CPU part : 0xd08 >> CPU revision : 3 >> >> processor : 1 >> model name : ARMv7 Processor rev 3 (v7l) >> BogoMIPS : 270.00 >> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >> CPU implementer : 0x41 >> CPU architecture: 7 >> CPU variant : 0x0 >> CPU part : 0xd08 >> CPU revision : 3 >> >> processor : 2 >> model name : ARMv7 Processor rev 3 (v7l) >> BogoMIPS : 270.00 >> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >> CPU implementer : 0x41 >> CPU architecture: 7 >> CPU variant : 0x0 >> CPU part : 0xd08 >> CPU revision : 3 >> >> processor : 3 >> model name : ARMv7 Processor rev 3 (v7l) >> BogoMIPS : 270.00 >> Features : half thumb 
fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >> CPU implementer : 0x41 >> CPU architecture: 7 >> CPU variant : 0x0 >> CPU part : 0xd08 >> CPU revision : 3 >> >> Hardware : BCM2711 >> Revision : c03111 >> Serial : 100000001c47254f >> Model : Raspberry Pi 4 Model B Rev 1.1 >> >> Online cpus: 0-3 >> Offline cpus: >> >> Memory: 4k page, physical 3919812k(1255688k free), swap 102396k(102396k free) >> >> vm_info: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace) for linux-arm JRE (16-internal+0-adhoc..workspace), built on Oct 12 2020 19:49:51 by "" with gcc 7.5.0 >> >> END. >> >> >> >> >>> On 12. Oct 2020, at 20:24, Aleksey Shipilev wrote: >>> >>> Hi, >>> >>> On 10/12/20 8:12 PM, Marc Hoffmann wrote: >>>> Please find the build log and the hs_err file for commit fd0cb98ed03c6214c02ccd3503c1e6d77065a428 attached. >>> >>> Please try to build with fastdebug (./configure --enable-debug), so that the JVM asserts meaningfully somewhere? >>> >>>> Is there any additional information I can provide to help get these builds fixed again? >>> >>> I am seeing plenty of weird x86_32 crashes since last week. Pretty sure some of them would manifest on ARM32 as well. This is why building with fastdebug is the next step: it maps out the bug symptoms. >>> >>> -- >>> Thanks, >>> -Aleksey >>> >> From rriggs at openjdk.java.net Wed Oct 14 18:02:21 2020 From: rriggs at openjdk.java.net (Roger Riggs) Date: Wed, 14 Oct 2020 18:02:21 GMT Subject: RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v6] In-Reply-To: References: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com> Message-ID: On Wed, 14 Oct 2020 15:01:42 GMT, Roger Riggs wrote: >> Due to the requirement for multiple reviewers, I had been waiting to add my review of the Core-Libs files until the >> HotSpot reviewers had approved! I see only one reviewer credited in the commit.
> > This integration was made without testing against a current merge from master and has caused two build failures. > > JDK-8254761: Wrong intrinsic annotation used for StringLatin1.indexOfChar > > JDK-8254775: Microbenchmark StringIndexOfChar doesn't compile > > There is a raw unicode character in the JMH test that causes a compilation error. > == Output from failing command(s) repeated here === > [2020-10-14T14:39:09,608Z] * For target support_test_micro_classes__the.BUILD_JDK_MICROBENCHMARK_batch: > [2020-10-14T14:39:09,611Z] > /opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: > error: unmappable character (0xE2) for encoding ascii [2020-10-14T14:39:09,611Z] > sb.append(isUtf16?'???':'b'); [2020-10-14T14:39:09,611Z] ^ [2020-10-14T14:39:09,611Z] > /opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: > error: unmappable character (0x98) for encoding ascii [2020-10-14T14:39:09,611Z] > sb.append(isUtf16?'???':'b'); [2020-10-14T14:39:09,611Z] ^ [2020-10-14T14:39:09,611Z] > /opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: > error: unmappable character (0xBA) for encoding ascii [2020-10-14T14:39:09,611Z] > sb.append(isUtf16?'???':'b'); [2020-10-14T14:39:09,611Z] ^ [2020-10-14T14:39:09,611Z] >
/opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: > error: unclosed character literal [2020-10-14T14:39:09,611Z] sb.append(isUtf16?'???':'b'); > [2020-10-14T14:39:09,611Z] ^ [2020-10-14T14:39:09,611Z] > /opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: > error: unclosed character literal [2020-10-14T14:39:09,611Z] sb.append(isUtf16?'???':'b'); There is also a failed Graal test because of the new intrinsic: JDK-8254785: compiler/graalunit/HotspotTest.java failed with "missing Graal intrinsics for: java/lang/StringLatin1.indexOfChar([BIII)I" @phohensee don't be so quick to type `/sponsor`; there are three separate build and test failures. ------------- PR: https://git.openjdk.java.net/jdk/pull/71 From rehn at openjdk.java.net Wed Oct 14 18:52:09 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Wed, 14 Oct 2020 18:52:09 GMT Subject: RFR: 8254780: EnterInterpOnlyModeClosure::completed() always returns true In-Reply-To: References: Message-ID: On Wed, 14 Oct 2020 15:16:54 GMT, Aleksey Shipilev wrote: > JDK-8238761 introduced this funky code: > > class EnterInterpOnlyModeClosure : public HandshakeClosure { > private: > bool _completed; > public: > EnterInterpOnlyModeClosure() : HandshakeClosure("EnterInterpOnlyMode"), _completed(false) { } > void do_thread(Thread* th) { > ... > _completed = true; > } > bool completed() { > return _completed = true; > } > }; > > It seems the flag is there to communicate that the target thread indeed executed the handshake.
But `completed()` sets the > bool unconditionally and always returns true. And it is used in one and only place here: > guarantee(hs.completed(), "Handshake failed: Target thread is not alive?"); > > ...which means that guarantee always passes. > > Attention @robehn :) My bad, thanks @shipilev ! ------------- Marked as reviewed by rehn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/662 From psandoz at openjdk.java.net Wed Oct 14 20:06:30 2020 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Wed, 14 Oct 2020 20:06:30 GMT Subject: Integrated: 8223347: Integration of Vector API (Incubator) In-Reply-To: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> References: <-PE4TwXgvq2bemAn_8csjn4_j7zoAolnQz6QQt3z0Wk=.eaa9999f-0713-4349-b31d-934717aa37a1@github.com> Message-ID: On Fri, 25 Sep 2020 20:14:29 GMT, Paul Sandoz wrote: > This pull request is for integration of the Vector API. It was previously reviewed under conditions when mercurial was > used for the source code control system. Review threads can be found here (searching for issue number 8223347 in the > title): https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-April/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-May/thread.html > https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-July/thread.html > > If mercurial was still being used the code would be pushed directly, once the CSR is approved. However, in this case a > pull request is required and needs explicit reviewer approval. Between the final review and this pull request no code > has changed, except for that related to merging. This pull request has now been integrated. 
Changeset: 0c99b192 Author: Paul Sandoz URL: https://git.openjdk.java.net/jdk/commit/0c99b192 Stats: 295107 lines in 336 files changed: 292957 ins; 1062 del; 1088 mod 8223347: Integration of Vector API (Incubator) Co-authored-by: Vivek Deshpande Co-authored-by: Qi Feng Co-authored-by: Ian Graves Co-authored-by: Jean-Philippe Halimi Co-authored-by: Vladimir Ivanov Co-authored-by: Ningsheng Jian Co-authored-by: Razvan Lupusoru Co-authored-by: Smita Kamath Co-authored-by: Rahul Kandu Co-authored-by: Kishor Kharbas Co-authored-by: Eric Liu Co-authored-by: Aaloan Miftah Co-authored-by: John R Rose Co-authored-by: Shravya Rukmannagari Co-authored-by: Paul Sandoz Co-authored-by: Sandhya Viswanathan Co-authored-by: Lauren Walkowski Co-authored-by: Yang Zang Co-authored-by: Joshua Zhu Co-authored-by: Wang Zhuo Co-authored-by: Jatin Bhateja Reviewed-by: erikj, chegar, kvn, darcy, forax, briangoetz, aph, epavlova, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/367 From rehn at openjdk.java.net Wed Oct 14 20:46:12 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Wed, 14 Oct 2020 20:46:12 GMT Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: <35eLsMpWmcCUoiEWhnYdSpZNmvLy4ra56Qtd6eRW574=.4e7c9278-3e0d-457d-9c15-eef45bae9755@github.com> Message-ID: On Wed, 14 Oct 2020 16:29:30 GMT, Patricio Chilano Mateo wrote: >>> Regarding the use of cross_modify_fence(), I filed a bug last week to remove some unneeded uses of them in common code >>> (https://bugs.openjdk.java.net/browse/JDK-8254264). Just a heads up before I send the RFR since I see some reference to >>> them in the added comments. >> >> I'm going to assume your change is just a two-line change (removing the cross_modify_fence's), and I'll test that on >> top of my patches using the VerifyCrossModifyFence flag - I'll give it a run of everything, which can take a while. >> Plus I'll manually look at the code to make sure I'm happy.
I think it makes sense that your patch goes in first,
>> then I can rebase and update code comments too. Let me know your pull request once you've raised it.
>
> Yes, the change just removes those extra cross_modify_fence's. Please check https://github.com/openjdk/jdk/pull/655

A question: ISB doesn't flush the I-cache, which I thought was needed? I would have expected something more similar to gcc clear_cache.

-------------

PR: https://git.openjdk.java.net/jdk/pull/428

From rrich at openjdk.java.net Wed Oct 14 20:53:15 2020
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Wed, 14 Oct 2020 20:53:15 GMT
Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v10]
In-Reply-To:
References:
Message-ID:

On Wed, 14 Oct 2020 00:25:14 GMT, Vladimir Kozlov wrote:

>
>
> Good.

Thanks for the review, Vladimir (@vnkozlov)! I'm still (stress) testing adaptations to lazy/concurrent thread stack processing for ZGC.

--Richard.

-------------

PR: https://git.openjdk.java.net/jdk/pull/119

From dcubed at openjdk.java.net Wed Oct 14 20:55:17 2020
From: dcubed at openjdk.java.net (Daniel D.Daugherty)
Date: Wed, 14 Oct 2020 20:55:17 GMT
Subject: RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v6]
In-Reply-To:
References: <_T0873dC5tfUtGn9r1_Y21JkPKRt-za3MM9hPN2GQKQ=.b865fe53-5417-424f-81b6-1566a330640e@github.com>
Message-ID:

On Wed, 14 Oct 2020 17:59:53 GMT, Roger Riggs wrote:

>> This integration went in without testing against a current merge from master and has caused two build failures.
>>
>> JDK-8254761: Wrong intrinsic annotation used for StringLatin1.indexOfChar
>>
>> JDK-8254775: Microbenchmark StringIndexOfChar doesn't compile
>>
>> There is a raw Unicode character in the JMH test that causes a compilation error.
>> == Output from failing command(s) repeated here === >> [2020-10-14T14:39:09,608Z] * For target support_test_micro_classes__the.BUILD_JDK_MICROBENCHMARK_batch: >> [2020-10-14T14:39:09,611Z] >> /opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: >> error: unmappable character (0xE2) for encoding ascii [2020-10-14T14:39:09,611Z] >> sb.append(isUtf16?'???':'b'); [2020-10-14T14:39:09,611Z] ^ [2020-10-14T14:39:09,611Z] >> /opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: >> error: unmappable character (0x98) for encoding ascii [2020-10-14T14:39:09,611Z] >> sb.append(isUtf16?'???':'b'); [2020-10-14T14:39:09,611Z] ^ [2020-10-14T14:39:09,611Z] >> /opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: >> error: unmappable character (0xBA) for encoding ascii [2020-10-14T14:39:09,611Z] >> sb.append(isUtf16?'???':'b'); [2020-10-14T14:39:09,611Z] ^ [2020-10-14T14:39:09,611Z] >> /opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: >> error: unclosed character literal [2020-10-14T14:39:09,611Z] 
sb.append(isUtf16?'???':'b'); >> [2020-10-14T14:39:09,611Z] ^ [2020-10-14T14:39:09,611Z] >> /opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: >> error: unclosed character literal [2020-10-14T14:39:09,611Z] sb.append(isUtf16?'???':'b');``` > > And also a failed Graal test because of the new intrinsic. > > And JDK-8254785: compiler/graalunit/HotspotTest.java failed with "missing Graal intrinsics for: > java/lang/StringLatin1.indexOfChar([BIII)I" > @phohensee don't be so quick to type `/sponsor`; there are three separate build and test failures. @phohensee - @vnkozlov has determined that a new Tier2 test failure is also caused by this fix. See https://bugs.openjdk.java.net/browse/JDK-8254790. ------------- PR: https://git.openjdk.java.net/jdk/pull/71 From shade at openjdk.java.net Wed Oct 14 21:00:17 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 14 Oct 2020 21:00:17 GMT Subject: Integrated: 8254781: Remove unimplemented ClassFieldMap::compute_field_count In-Reply-To: References: Message-ID: On Wed, 14 Oct 2020 15:27:26 GMT, Aleksey Shipilev wrote: > There is no definition of `ClassFieldMap::compute_field_count` in current tip or any history after the initial load. > Can be removed. > Testing: > - [x] Linux x86_64 build > - [x] Text searches for `compute_field_count` in `src/hotspot` This pull request has now been integrated. 
Changeset: 8fb294a2
Author: Aleksey Shipilev
URL: https://git.openjdk.java.net/jdk/commit/8fb294a2
Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod

8254781: Remove unimplemented ClassFieldMap::compute_field_count

Reviewed-by: phh, sspitsyn

-------------

PR: https://git.openjdk.java.net/jdk/pull/663

From shade at openjdk.java.net Wed Oct 14 21:01:17 2020
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Wed, 14 Oct 2020 21:01:17 GMT
Subject: Integrated: 8254780: EnterInterpOnlyModeClosure::completed() always returns true
In-Reply-To:
References:
Message-ID:

On Wed, 14 Oct 2020 15:16:54 GMT, Aleksey Shipilev wrote:

> JDK-8238761 introduced this funky code:
>
> class EnterInterpOnlyModeClosure : public HandshakeClosure {
>  private:
>   bool _completed;
>  public:
>   EnterInterpOnlyModeClosure() : HandshakeClosure("EnterInterpOnlyMode"), _completed(false) { }
>   void do_thread(Thread* th) {
>     ...
>     _completed = true;
>   }
>   bool completed() {
>     return _completed = true;
>   }
> };
>
> It seems the flag is there to communicate that target thread indeed executed the handshake. But `completed()` sets the
> bool unconditionally and always returns true. And it is used in one and only place here:
>   guarantee(hs.completed(), "Handshake failed: Target thread is not alive?");
>
> ...which means that guarantee always passes.
>
> Attention @robehn :)

This pull request has now been integrated.

Changeset: da2f5ab5
Author: Aleksey Shipilev
URL: https://git.openjdk.java.net/jdk/commit/da2f5ab5
Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod

8254780: EnterInterpOnlyModeClosure::completed() always returns true

Reviewed-by: sspitsyn, rehn

-------------

PR: https://git.openjdk.java.net/jdk/pull/662

From hohensee at amazon.com Wed Oct 14 21:28:09 2020
From: hohensee at amazon.com (Hohensee, Paul)
Date: Wed, 14 Oct 2020 21:28:09 +0000
Subject: RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v6]
Message-ID: <838553DC-4961-4681-990E-4254D3316538@amazon.com>

My apologies.
I relied on the other reviewers. I'll do an independent review in the future. Thanks, Paul ?On 10/14/20, 11:02 AM, "core-libs-dev on behalf of Roger Riggs" wrote: On Wed, 14 Oct 2020 15:01:42 GMT, Roger Riggs wrote: >> Due to the requirement for multiple reviewers, I had been waiting to add my review of the Core-Libs files until the >> HotSpot reviewers had approved! I see only one reviewer credited in the commit. > > This integration without testing with a current merge from the master and has caused two build failures. > > JDK-8254761: Wrong intrinsic annotation used for StringLatin1.indexOfChar > > JDK-8254775: Microbenchmark StringIndexOfChar doesn't compile > > There is a raw unicode character in the JMH test that causes a compilation error. > == Output from failing command(s) repeated here === > [2020-10-14T14:39:09,608Z] * For target support_test_micro_classes__the.BUILD_JDK_MICROBENCHMARK_batch: > [2020-10-14T14:39:09,611Z] > /opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: > error: unmappable character (0xE2) for encoding ascii [2020-10-14T14:39:09,611Z] > sb.append(isUtf16?'???':'b'); [2020-10-14T14:39:09,611Z] ^ [2020-10-14T14:39:09,611Z] > /opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: > error: unmappable character (0x98) for encoding ascii [2020-10-14T14:39:09,611Z] > sb.append(isUtf16?'???':'b'); [2020-10-14T14:39:09,611Z] ^ [2020-10-14T14:39:09,611Z] > 
/opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: > error: unmappable character (0xBA) for encoding ascii [2020-10-14T14:39:09,611Z] > sb.append(isUtf16?'???':'b'); [2020-10-14T14:39:09,611Z] ^ [2020-10-14T14:39:09,611Z] > /opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: > error: unclosed character literal [2020-10-14T14:39:09,611Z] sb.append(isUtf16?'???':'b'); > [2020-10-14T14:39:09,611Z] ^ [2020-10-14T14:39:09,611Z] > /opt/mach5/mesos/work_dir/slaves/4076d11c-c6ed-4d07-84c1-4ab8d55cd975-S108796/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/400e1f56-2d49-42d8-8879-97d4fbb6c909/runs/c49da2bc-a8fe-4a5d-8159-57a9b0316fd2/workspace/open/test/micro/org/openjdk/bench/java/lang/StringIndexOfChar.java:71: > error: unclosed character literal [2020-10-14T14:39:09,611Z] sb.append(isUtf16?'???':'b');``` And also a failed Graal test because of the new intrinsic. And JDK-8254785: compiler/graalunit/HotspotTest.java failed with "missing Graal intrinsics for: java/lang/StringLatin1.indexOfChar([BIII)I" @phohensee don't be so quick to type `/sponsor`; there are three separate build and test failures. 
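For reference, the three unmappable bytes reported above (0xE2, 0x98, 0xBA) are the UTF-8 encoding of U+263A, so the benchmark source most likely held that character as a raw literal (an assumption; the log only shows '???'). A minimal sketch of one way to keep such a source file ASCII-clean is a Unicode escape. The class and method names below are hypothetical and are not taken from StringIndexOfChar.java:

```java
import java.nio.charset.StandardCharsets;

public class UnicodeEscapeSketch {
    // Hypothetical helper mirroring the shape of the failing line:
    //   sb.append(isUtf16 ? <non-Latin1 char> : 'b');
    static char fillChar(boolean isUtf16) {
        // The escape below spells a non-Latin1 character using ASCII only,
        // so the file compiles the same way under any -encoding setting.
        return isUtf16 ? '\u263A' : 'b';
    }

    // Renders the UTF-8 bytes of a char as upper-case hex, for comparison
    // with the byte values printed in the javac error messages.
    static String utf8Hex(char c) {
        StringBuilder sb = new StringBuilder();
        for (byte b : String.valueOf(c).getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(utf8Hex(fillChar(true)));  // prints "E298BA"
        System.out.println(fillChar(false));          // prints "b"
    }
}
```

The escape is processed by the compiler before tokenization, so the resulting class file is the same as with the raw character; only the source bytes differ.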
-------------

PR: https://git.openjdk.java.net/jdk/pull/71

From github.com+51754783+coreyashford at openjdk.java.net Wed Oct 14 21:33:15 2020
From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford)
Date: Wed, 14 Oct 2020 21:33:15 GMT
Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4]
In-Reply-To:
References: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> <7-p-Kc9lQyyuoWdNtmgbXbwkxsgk4oQGKmFSCcMpvnU=.97810c01-3200-4767-bbd4-35d53c2bc5ca@github.com> <6Voyfr_s-ieyRA-8Rtvvpz7tkhhicA8sY2d2KTp3Kmw=.fa256bae-2143-4b43-bfea-5837ad31eb6a@github.com>
Message-ID: <7XjzEn5DggliDrvjhrGwXZL5r4lsqeGF9SGLmRr5L84=.a4481a62-4ecf-4e3f-98f3-70e548c67b52@github.com>

On Wed, 14 Oct 2020 10:29:40 GMT, Martin Doerr wrote:

> Hi Corey,
> Are you thinking of a case where that produces a higher iteration count?
> Sorry for the confusion. This is also fine:
>     length = sl - sp - 12
>     i = length / block_size
>     if (i <= 0) return 0
> But I still wonder why we should use 2 branches. Why not srawi_ ble(CCR0, return_zero) ?

You're right! I originally thought that the `srawi.` was setting only the Zero bit, but it sets others as well.

> Ah, I should have checked the calling conventions. I thought all of the CR* regs are volatile. I will fix that.
> Actually, we do save and restore all CRs, so it's not a real problem with the current implementation. But I prefer
> staying closer to the elf ABI as long as there's no good reason to do it differently.

Looks like I don't need that code at all now, but it's good to know for the future; I have an encode intrinsic in the works.

> Your original comment said "2nd review", so I thought you meant you need to review it again after the changes.
> We usually require at least 2 reviews by different people for all non-trivial changes. And I don't consider the PPC64
> part as trivial. In addition to that, I'm not familiar with Power 10.
I received permission to request help from the GNU toolchain team here to review it. Due to family issues and work schedule on my end, it will be at least the middle of next week before I can get a reviewer to have a look.

Thanks for your continued patience and help.

-------------

PR: https://git.openjdk.java.net/jdk/pull/293

From david.holmes at oracle.com Wed Oct 14 21:55:44 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 15 Oct 2020 07:55:44 +1000
Subject: arm32 builds continue to fail for me after 8253540 and 8253901
In-Reply-To: <3509926C-B3D5-41D5-9ECE-6AC957E8ECD8@mountainminds.com>
References: <56ff08d5-a4e5-788a-1c29-02f76e8755d2@redhat.com> <17F91692-4F3D-4FAA-AB94-361B6C84F982@mountainminds.com> <3509926C-B3D5-41D5-9ECE-6AC957E8ECD8@mountainminds.com>
Message-ID:

On 15/10/2020 3:56 am, Marc Hoffmann wrote:
> Hi Boris,
>
> I'm not familiar with the hotspot codebase at all. But the assertions in InterpreterMacroAssembler::unlock_object (https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/arm/interp_masm_arm.cpp#L989) look contradictory to me:
>
>   assert(Rlock == R0, "the first argument");
>
> but then we assert that Rlock and R0 are different:
>
>   assert_different_registers(Robj, Rmark, Rlock, R0, Rtemp);
>
> That code was not changed in 8253540. Maybe it was not called at all before the change?

This change:

https://github.com/openjdk/jdk/commit/fd0cb98ed03c6214c02ccd3503c1e6d77065a428

for 8253901, modified the assert to use R0 instead of R1 and so introduced the problem AFAICS. The comments were not updated either:

  // Argument: R1: Points to BasicObjectLock structure for lock
  // Throw an IllegalMonitorException if object is not locked by current thread
  // Blows volatile registers R0-R3, Rtemp, LR. Calls VM.
  void InterpreterMacroAssembler::unlock_object(Register Rlock) {
    assert(Rlock == R1, "the second argument");

David
-----

> Regards,
> -marc
>
>
>
>> On 13. Oct 2020, at 08:48, Boris Ulasevich wrote:
>>
>> Hi Marc,
>>
>> I created JDK-8254661 for the issue. I would love to fix it, but still
>> can't reproduce the crash (even on Raspberry Pi).
>> What configuration do you have? The following sequence works Ok for me:
>> pi at raspberrypi $ git clone https://github.com/openjdk/jdk
>> pi at raspberrypi $ cd jdk
>> pi at raspberrypi $ bash configure --with-boot-jdk=/home/pi/jdk-15
>> pi at raspberrypi $ make
>>
>> Your debug build shows that I did not fix the
>> assert_different_registers in
>> InterpreterMacroAssembler::unlock_object()
>> body (and the function comment by the way!), though with eyeballing I
>> do not see what is wrong for Rlock=R0:
>> https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/arm/interp_masm_arm.cpp#L1000
>>
>> regards,
>> Boris
>>
>> On Mon, Oct 12, 2020 at 11:34 PM Marc Hoffmann
>> wrote:
>>>
>>> Hi Aleksey, hi Boris,
>>>
>>> for me the crash is always reproducible: Every single build after
>>>
>>> 77a0f3999afa322b64643afd4a161164440af975 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function
>>>
>>> fails on arm32 (build on ubuntu in docker on a raspberry pi 4). Before this commit I haven't encountered any failures.
>>>
>>> Here is the hs_err file with --enable-debug (reproduced with current master c7f00640627eab38b77d23d07876cf0247fa18f3).
>>> >>> Cheers, >>> -marc >>> >>> >>> # >>> # A fatal error has been detected by the Java Runtime Environment: >>> # >>> # Internal Error (/workspace/src/hotspot/share/asm/register.hpp:160), pid=14700, tid=14705 >>> # assert(a != b && a != c && a != d && a != e && b != c && b != d && b != e && c != d && c != e && d != e) failed: registers must be different: a=0x00000002, b=0x00000003, c=0x00000000, d=0x00000000, e=0x0000000c >>> # >>> # JRE version: (16.0) (fastdebug build ) >>> # Java VM: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace, mixed mode, g1 gc, linux-arm) >>> # Problematic frame: >>> # V [libjvm.so+0x7571fc] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63 >>> # >>> # Core dump will be written. Default location: /workspace/make/core >>> # >>> # >>> >>> --------------- S U M M A R Y ------------ >>> >>> Command Line: -Xms64M -Xmx768M --add-exports=java.base/jdk.internal.module=ALL-UNNAMED build.tools.jigsaw.AddPackagesAttribute /workspace/build/linux-arm-server-fastdebug/jdk >>> >>> Host: 20431585315d, rev 3 (v7l), 4 cores, 3G, Ubuntu 18.04.3 LTS >>> Time: Mon Oct 12 20:22:15 2020 UTC elapsed time: 0.144243 seconds (0d 0h 0m 0s) >>> >>> --------------- T H R E A D --------------- >>> >>> Current thread (0xb5b16460): JavaThread "Unknown thread" [_thread_in_vm, id=14705, stack(0xb5c6e000,0xb5cbe000)] >>> >>> Stack: [0xb5c6e000,0xb5cbe000], sp=0xb5cbc2d0, free space=312k >>> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) >>> V [libjvm.so+0x7571fc] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63 >>> >>> Registers: >>> r0 = 0x00000003 >>> r1 = 0x000000a0 >>> r2 = 0x00000002 >>> r3 = 0x00000000 >>> r4 = 0xb5b168b0 >>> r5 = 0x0000000c >>> r6 = 0x00000000 >>> r7 = 0xb5cbc2e8 >>> r8 = 0xb6db1fa8 >>> r9 = 0xb5cbc760 >>> r10 = 0xe3520000 >>> fp = 0xb6db1fa8 >>> r12 = 0xb6ff8000 >>> sp = 0xb5cbc2d0 >>> lr = 0x00000058 >>> pc = 0xb64961fc 
>>> cpsr = 0x200f0030 >>> >>> Top of Stack: (sp=0xb5cbc2d0) >>> 0xb5cbc2d0: 00000002 00000003 00000000 00000000 >>> 0xb5cbc2e0: 0000000c 00000048 0000006e 00000000 >>> 0xb5cbc2f0: 00000000 00000000 00000000 0000007c >>> 0xb5cbc300: 00000000 00000077 b5cbc378 00000000 >>> 0xb5cbc310: b5cbc380 0000000f b6db1fa8 b5cbc340 >>> 0xb5cbc320: b5b168b0 b6db1fa8 b5cbc4c4 b6008a2b >>> 0xb5cbc330: b5cbc454 b5cbc348 0000000f b61971cf >>> 0xb5cbc340: b5cbc380 b5cbc3b0 b5b168b0 b5cbc388 >>> >>> Instructions: (pc=0xb64961fc) >>> 0xb64960fc: 440be9c7 a034f8c7 f5e06139 68ebf7ed >>> 0xb649610c: f040689a 46184164 1180f441 68996011 >>> 0xb649611c: f5e03104 4b12f781 46284622 1003f858 >>> 0xb649612c: f998f2e4 f64f68eb f2ce7210 6899122f >>> 0xb649613c: 600a4618 31046899 f76ef5e0 46284631 >>> 0xb649614c: f4e6f69f f5e04630 f507f79b 46bd7786 >>> 0xb649615c: 8ff0e8bd 0091bfd0 00006ee4 000059d0 >>> 0xb649616c: 00007d1c 4a084b07 b480447b 589baf00 >>> 0xb649617c: b91b781b f85d46bd 47707b04 f85d46bd >>> 0xb649618c: e7197b04 0091be30 00006a24 bf182900 >>> 0xb649619c: f1a1290c e92d0202 b0f34ff0 2301bf08 >>> 0xb64961ac: bf18af06 f8df2300 2a018268 461abf8c >>> 0xb64961bc: 0201f043 f04f460e 44f833ff 30d0f8c7 >>> 0xb64961cc: f8c74604 23003140 333de9c7 30fcf887 >>> 0xb64961dc: 3359e9c7 316cf887 4a8eb1da 0e58f04f >>> 0xb64961ec: 250c2003 1002f858 f8d12202 21a0c000 >>> 0xb64961fc: e000f88c 2000e9cd 6302e9cd 4b874a86 >>> 0xb649620c: 447a4887 9504447b f58f4478 f00cfaa3 >>> 0xb649621c: 68e2f2d5 4340f44f 33a0f2ce 0a04f04f >>> 0xb649622c: 7980f04f 0b00f04f 46106891 600b2501 >>> 0xb649623c: 44516891 f6f0f5e0 4a7b497a 3001f858 >>> 0xb649624c: 0140f107 a048f8c7 4608643e f8c7681b >>> 0xb649625c: 60fb904c f858647b f8c72002 f8c7b054 >>> 0xb649626c: 60bab058 e9c73208 653abb19 66fd607a >>> 0xb649627c: f732f5e0 689a68e3 4164f040 f4414618 >>> 0xb649628c: 60111181 44516899 f6c6f5e0 0110f107 >>> 0xb649629c: 68fb687a f8c74608 623aa018 6304e9c7 >>> 0xb64962ac: 901cf8c7 bb09e9c7 bb0de9c7 f5e063fd >>> 0xb64962bc: 68e3f713 
f0406899 46184264 4240f442 >>> 0xb64962cc: 6899600a f1074451 f5e00ad0 4b57f6a5 >>> 0xb64962dc: 3003f858 2b00781b 8093f040 f04f68bb >>> 0xb64962ec: 68f97c80 33082500 c0acf8c7 0b01f04f >>> >>> >>> >>> --------------- P R O C E S S --------------- >>> >>> uid : 0 euid : 0 gid : 0 egid : 0 >>> >>> umask: 0022 (----w--w-) >>> >>> Threads class SMR info: >>> _java_thread_list=0xb6e56078, length=0, elements={ >>> } >>> _java_thread_list_alloc_cnt=1, _java_thread_list_free_cnt=0, _java_thread_list_max=0, _nested_thread_list_max=0 >>> _delete_lock_wait_cnt=0, _delete_lock_wait_max=0 >>> _to_delete_list_cnt=0, _to_delete_list_max=0 >>> >>> Java Threads: ( => current thread ) >>> >>> Other Threads: >>> 0xb5b73188 GCTaskThread "GC Thread#0" [stack: 0x81d00000,0x81d80000] [id=14706] >>> 0xb5b77dc0 ConcurrentGCThread "G1 Main Marker" [stack: 0x81c7e000,0x81cfe000] [id=14707] >>> 0xb5b790c0 ConcurrentGCThread "G1 Conc#0" [stack: 0x81a80000,0x81b00000] [id=14708] >>> 0xb5bde230 ConcurrentGCThread "G1 Refine#0" [stack: 0x81780000,0x81800000] [id=14709] >>> 0xb5bdf488 ConcurrentGCThread "G1 Service" [stack: 0x81580000,0x81600000] [id=14710] >>> >>> =>0xb5b16460 (exited) JavaThread "Unknown thread" [_thread_in_vm, id=14705, stack(0xb5c6e000,0xb5cbe000)] >>> >>> Threads with active compile tasks: >>> >>> VM state: not at safepoint (not fully initialized) >>> >>> VM Mutex/Monitor currently owned by a thread: None >>> >>> GC Precious Log: >>> CPUs: 4 total, 4 available >>> Memory: 3827M >>> Large Page Support: Disabled >>> NUMA Support: Disabled >>> Compressed Oops: Disabled >>> Heap Region Size: 1M >>> Heap Min Capacity: 64M >>> Heap Initial Capacity: 64M >>> Heap Max Capacity: 768M >>> Pre-touch: Disabled >>> Parallel Workers: 4 >>> Concurrent Workers: 1 >>> Concurrent Refinement Workers: 4 >>> Periodic GC: Disabled >>> >>> Heap: >>> garbage-first heap total 65536K, used 0K [0x83a00000, 0xb3a00000) >>> region size 1024K, 1 young (1024K), 0 survivors (0K) >>> Metaspace used 944K, 
capacity 2200K, committed 2200K, reserved 4400K >>> >>> Heap Regions: E=young(eden), S=young(survivor), O=old, HS=humongous(starts), HC=humongous(continues), CS=collection set, F=free, OA=open archive, CA=closed archive, TAMS=top-at-mark-start (previous, next) >>> | 0|0x83a00000, 0x83a00000, 0x83b00000| 0%| F| |TAMS 0x83a00000, 0x83a00000| Untracked >>> | 1|0x83b00000, 0x83b00000, 0x83c00000| 0%| F| |TAMS 0x83b00000, 0x83b00000| Untracked >>> | 2|0x83c00000, 0x83c00000, 0x83d00000| 0%| F| |TAMS 0x83c00000, 0x83c00000| Untracked >>> | 3|0x83d00000, 0x83d00000, 0x83e00000| 0%| F| |TAMS 0x83d00000, 0x83d00000| Untracked >>> | 4|0x83e00000, 0x83e00000, 0x83f00000| 0%| F| |TAMS 0x83e00000, 0x83e00000| Untracked >>> | 5|0x83f00000, 0x83f00000, 0x84000000| 0%| F| |TAMS 0x83f00000, 0x83f00000| Untracked >>> | 6|0x84000000, 0x84000000, 0x84100000| 0%| F| |TAMS 0x84000000, 0x84000000| Untracked >>> | 7|0x84100000, 0x84100000, 0x84200000| 0%| F| |TAMS 0x84100000, 0x84100000| Untracked >>> | 8|0x84200000, 0x84200000, 0x84300000| 0%| F| |TAMS 0x84200000, 0x84200000| Untracked >>> | 9|0x84300000, 0x84300000, 0x84400000| 0%| F| |TAMS 0x84300000, 0x84300000| Untracked >>> | 10|0x84400000, 0x84400000, 0x84500000| 0%| F| |TAMS 0x84400000, 0x84400000| Untracked >>> | 11|0x84500000, 0x84500000, 0x84600000| 0%| F| |TAMS 0x84500000, 0x84500000| Untracked >>> | 12|0x84600000, 0x84600000, 0x84700000| 0%| F| |TAMS 0x84600000, 0x84600000| Untracked >>> | 13|0x84700000, 0x84700000, 0x84800000| 0%| F| |TAMS 0x84700000, 0x84700000| Untracked >>> | 14|0x84800000, 0x84800000, 0x84900000| 0%| F| |TAMS 0x84800000, 0x84800000| Untracked >>> | 15|0x84900000, 0x84900000, 0x84a00000| 0%| F| |TAMS 0x84900000, 0x84900000| Untracked >>> | 16|0x84a00000, 0x84a00000, 0x84b00000| 0%| F| |TAMS 0x84a00000, 0x84a00000| Untracked >>> | 17|0x84b00000, 0x84b00000, 0x84c00000| 0%| F| |TAMS 0x84b00000, 0x84b00000| Untracked >>> | 18|0x84c00000, 0x84c00000, 0x84d00000| 0%| F| |TAMS 0x84c00000, 0x84c00000| Untracked 
>>> | 19|0x84d00000, 0x84d00000, 0x84e00000| 0%| F| |TAMS 0x84d00000, 0x84d00000| Untracked >>> | 20|0x84e00000, 0x84e00000, 0x84f00000| 0%| F| |TAMS 0x84e00000, 0x84e00000| Untracked >>> | 21|0x84f00000, 0x84f00000, 0x85000000| 0%| F| |TAMS 0x84f00000, 0x84f00000| Untracked >>> | 22|0x85000000, 0x85000000, 0x85100000| 0%| F| |TAMS 0x85000000, 0x85000000| Untracked >>> | 23|0x85100000, 0x85100000, 0x85200000| 0%| F| |TAMS 0x85100000, 0x85100000| Untracked >>> | 24|0x85200000, 0x85200000, 0x85300000| 0%| F| |TAMS 0x85200000, 0x85200000| Untracked >>> | 25|0x85300000, 0x85300000, 0x85400000| 0%| F| |TAMS 0x85300000, 0x85300000| Untracked >>> | 26|0x85400000, 0x85400000, 0x85500000| 0%| F| |TAMS 0x85400000, 0x85400000| Untracked >>> | 27|0x85500000, 0x85500000, 0x85600000| 0%| F| |TAMS 0x85500000, 0x85500000| Untracked >>> | 28|0x85600000, 0x85600000, 0x85700000| 0%| F| |TAMS 0x85600000, 0x85600000| Untracked >>> | 29|0x85700000, 0x85700000, 0x85800000| 0%| F| |TAMS 0x85700000, 0x85700000| Untracked >>> | 30|0x85800000, 0x85800000, 0x85900000| 0%| F| |TAMS 0x85800000, 0x85800000| Untracked >>> | 31|0x85900000, 0x85900000, 0x85a00000| 0%| F| |TAMS 0x85900000, 0x85900000| Untracked >>> | 32|0x85a00000, 0x85a00000, 0x85b00000| 0%| F| |TAMS 0x85a00000, 0x85a00000| Untracked >>> | 33|0x85b00000, 0x85b00000, 0x85c00000| 0%| F| |TAMS 0x85b00000, 0x85b00000| Untracked >>> | 34|0x85c00000, 0x85c00000, 0x85d00000| 0%| F| |TAMS 0x85c00000, 0x85c00000| Untracked >>> | 35|0x85d00000, 0x85d00000, 0x85e00000| 0%| F| |TAMS 0x85d00000, 0x85d00000| Untracked >>> | 36|0x85e00000, 0x85e00000, 0x85f00000| 0%| F| |TAMS 0x85e00000, 0x85e00000| Untracked >>> | 37|0x85f00000, 0x85f00000, 0x86000000| 0%| F| |TAMS 0x85f00000, 0x85f00000| Untracked >>> | 38|0x86000000, 0x86000000, 0x86100000| 0%| F| |TAMS 0x86000000, 0x86000000| Untracked >>> | 39|0x86100000, 0x86100000, 0x86200000| 0%| F| |TAMS 0x86100000, 0x86100000| Untracked >>> | 40|0x86200000, 0x86200000, 0x86300000| 0%| F| |TAMS 
0x86200000, 0x86200000| Untracked >>> | 41|0x86300000, 0x86300000, 0x86400000| 0%| F| |TAMS 0x86300000, 0x86300000| Untracked >>> | 42|0x86400000, 0x86400000, 0x86500000| 0%| F| |TAMS 0x86400000, 0x86400000| Untracked >>> | 43|0x86500000, 0x86500000, 0x86600000| 0%| F| |TAMS 0x86500000, 0x86500000| Untracked >>> | 44|0x86600000, 0x86600000, 0x86700000| 0%| F| |TAMS 0x86600000, 0x86600000| Untracked >>> | 45|0x86700000, 0x86700000, 0x86800000| 0%| F| |TAMS 0x86700000, 0x86700000| Untracked >>> | 46|0x86800000, 0x86800000, 0x86900000| 0%| F| |TAMS 0x86800000, 0x86800000| Untracked >>> | 47|0x86900000, 0x86900000, 0x86a00000| 0%| F| |TAMS 0x86900000, 0x86900000| Untracked >>> | 48|0x86a00000, 0x86a00000, 0x86b00000| 0%| F| |TAMS 0x86a00000, 0x86a00000| Untracked >>> | 49|0x86b00000, 0x86b00000, 0x86c00000| 0%| F| |TAMS 0x86b00000, 0x86b00000| Untracked >>> | 50|0x86c00000, 0x86c00000, 0x86d00000| 0%| F| |TAMS 0x86c00000, 0x86c00000| Untracked >>> | 51|0x86d00000, 0x86d00000, 0x86e00000| 0%| F| |TAMS 0x86d00000, 0x86d00000| Untracked >>> | 52|0x86e00000, 0x86e00000, 0x86f00000| 0%| F| |TAMS 0x86e00000, 0x86e00000| Untracked >>> | 53|0x86f00000, 0x86f00000, 0x87000000| 0%| F| |TAMS 0x86f00000, 0x86f00000| Untracked >>> | 54|0x87000000, 0x87000000, 0x87100000| 0%| F| |TAMS 0x87000000, 0x87000000| Untracked >>> | 55|0x87100000, 0x87100000, 0x87200000| 0%| F| |TAMS 0x87100000, 0x87100000| Untracked >>> | 56|0x87200000, 0x87200000, 0x87300000| 0%| F| |TAMS 0x87200000, 0x87200000| Untracked >>> | 57|0x87300000, 0x87300000, 0x87400000| 0%| F| |TAMS 0x87300000, 0x87300000| Untracked >>> | 58|0x87400000, 0x87400000, 0x87500000| 0%| F| |TAMS 0x87400000, 0x87400000| Untracked >>> | 59|0x87500000, 0x87500000, 0x87600000| 0%| F| |TAMS 0x87500000, 0x87500000| Untracked >>> | 60|0x87600000, 0x87600000, 0x87700000| 0%| F| |TAMS 0x87600000, 0x87600000| Untracked >>> | 61|0x87700000, 0x87700000, 0x87800000| 0%| F| |TAMS 0x87700000, 0x87700000| Untracked >>> | 62|0x87800000, 0x87800000, 
0x87900000| 0%| F| |TAMS 0x87800000, 0x87800000| Untracked >>> | 63|0x87900000, 0x87942908, 0x87a00000| 26%| E| |TAMS 0x87900000, 0x87900000| Complete >>> >>> Card table byte_map: [0x83700000,0x83880000] _byte_map_base: 0x832e3000 >>> >>> Marking Bits (Prev, Next): (CMBitMap*) 0xb5b74324, (CMBitMap*) 0xb5b74344 >>> Prev Bits: [0x82980000, 0x83580000) >>> Next Bits: [0x81d80000, 0x82980000) >>> >>> GC Heap History (0 events): >>> No events >>> >>> Deoptimization events (0 events): >>> No events >>> >>> Classes unloaded (0 events): >>> No events >>> >>> Classes redefined (0 events): >>> No events >>> >>> Internal exceptions (0 events): >>> No events >>> >>> Events (20 events): >>> Event: 0.113 loading class java/lang/Character >>> Event: 0.114 loading class java/lang/Character done >>> Event: 0.114 loading class java/lang/Float >>> Event: 0.115 loading class java/lang/Number >>> Event: 0.115 loading class java/lang/Number done >>> Event: 0.115 loading class java/lang/Float done >>> Event: 0.115 loading class java/lang/Double >>> Event: 0.116 loading class java/lang/Double done >>> Event: 0.116 loading class java/lang/Byte >>> Event: 0.116 loading class java/lang/Byte done >>> Event: 0.116 loading class java/lang/Short >>> Event: 0.117 loading class java/lang/Short done >>> Event: 0.117 loading class java/lang/Integer >>> Event: 0.118 loading class java/lang/Integer done >>> Event: 0.118 loading class java/lang/Long >>> Event: 0.119 loading class java/lang/Long done >>> Event: 0.119 loading class java/util/Iterator >>> Event: 0.119 loading class java/util/Iterator done >>> Event: 0.119 loading class java/lang/reflect/RecordComponent >>> Event: 0.119 loading class java/lang/reflect/RecordComponent done >>> >>> >>> Dynamic libraries: >>> 00410000-00411000 r-xp 00000000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java >>> 00420000-00421000 r--p 00000000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java >>> 00421000-00422000 rw-p 
00001000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java >>> 019b6000-019d7000 rw-p 00000000 00:00 0 [heap] >>> 809c9000-80e00000 rw-p 00000000 00:00 0 >>> 80e00000-80e8e000 rw-p 00000000 00:00 0 >>> 80e8e000-80f00000 ---p 00000000 00:00 0 >>> 80fb4000-811da000 rw-p 00000000 00:00 0 >>> 811da000-81400000 ---p 00000000 00:00 0 >>> 81400000-81421000 rw-p 00000000 00:00 0 >>> 81421000-81500000 ---p 00000000 00:00 0 >>> 8157e000-8157f000 ---p 00000000 00:00 0 >>> 8157f000-81600000 rw-p 00000000 00:00 0 >>> 81600000-81621000 rw-p 00000000 00:00 0 >>> 81621000-81700000 ---p 00000000 00:00 0 >>> 8177e000-8177f000 ---p 00000000 00:00 0 >>> 8177f000-81800000 rw-p 00000000 00:00 0 >>> 81800000-81821000 rw-p 00000000 00:00 0 >>> 81821000-81900000 ---p 00000000 00:00 0 >>> 81900000-81921000 rw-p 00000000 00:00 0 >>> 81921000-81a00000 ---p 00000000 00:00 0 >>> 81a7e000-81a7f000 ---p 00000000 00:00 0 >>> 81a7f000-81b00000 rw-p 00000000 00:00 0 >>> 81b00000-81b21000 rw-p 00000000 00:00 0 >>> 81b21000-81c00000 ---p 00000000 00:00 0 >>> 81c21000-81c7c000 rw-p 00000000 00:00 0 >>> 81c7c000-81c7d000 ---p 00000000 00:00 0 >>> 81c7d000-81cfe000 rw-p 00000000 00:00 0 >>> 81cfe000-81cff000 ---p 00000000 00:00 0 >>> 81cff000-81e80000 rw-p 00000000 00:00 0 >>> 81e80000-82980000 ---p 00000000 00:00 0 >>> 82980000-82a80000 rw-p 00000000 00:00 0 >>> 82a80000-83580000 ---p 00000000 00:00 0 >>> 83580000-835a0000 rw-p 00000000 00:00 0 >>> 835a0000-83700000 ---p 00000000 00:00 0 >>> 83700000-83720000 rw-p 00000000 00:00 0 >>> 83720000-83880000 ---p 00000000 00:00 0 >>> 83880000-838a0000 rw-p 00000000 00:00 0 >>> 838a0000-83a00000 ---p 00000000 00:00 0 >>> 83a00000-87a00000 rw-p 00000000 00:00 0 >>> 87a00000-b3a00000 ---p 00000000 00:00 0 >>> b3a25000-b3a76000 rw-p 00000000 00:00 0 >>> b3a76000-b3ab3000 ---p 00000000 00:00 0 >>> b3ab3000-b3c33000 rwxp 00000000 00:00 0 >>> b3c33000-b5ab3000 ---p 00000000 00:00 0 >>> b5ab3000-b5ac8000 r-xp 00000000 b3:02 144091 
/workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >>> b5ac8000-b5ad8000 ---p 00015000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >>> b5ad8000-b5ad9000 r--p 00015000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >>> b5ad9000-b5ada000 rw-p 00016000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >>> b5ada000-b5ae2000 rw-s 00000000 b3:02 2576900 /tmp/hsperfdata_root/14700 >>> b5ae2000-b5ae9000 r-xp 00000000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >>> b5ae9000-b5af8000 ---p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >>> b5af8000-b5af9000 r--p 00006000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >>> b5af9000-b5afa000 rw-p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >>> b5afa000-b5b00000 rw-p 00000000 00:00 0 >>> b5b00000-b5c00000 rw-p 00000000 00:00 0 >>> b5c00000-b5c0d000 r-xp 00000000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >>> b5c0d000-b5c1c000 ---p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >>> b5c1c000-b5c1d000 r--p 0000c000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >>> b5c1d000-b5c1e000 rw-p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >>> b5c1e000-b5c20000 rw-p 00000000 00:00 0 >>> b5c20000-b5c27000 r-xp 00000000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >>> b5c27000-b5c36000 ---p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >>> b5c36000-b5c37000 r--p 00006000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >>> b5c37000-b5c38000 rw-p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >>> b5c38000-b5c3d000 r-xp 00000000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >>> b5c3d000-b5c4c000 ---p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >>> b5c4c000-b5c4d000 r--p 00004000 b3:02 2708511 
/lib/arm-linux-gnueabihf/libnss_compat-2.27.so >>> b5c4d000-b5c4e000 rw-p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >>> b5c4e000-b5c5d000 r-xp 00000000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >>> b5c5d000-b5c6c000 ---p 0000f000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >>> b5c6c000-b5c6d000 r--p 0000e000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >>> b5c6d000-b5c6e000 rw-p 0000f000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >>> b5c6e000-b5c71000 ---p 00000000 00:00 0 >>> b5c71000-b5cbe000 rw-p 00000000 00:00 0 >>> b5cbe000-b5d2d000 r-xp 00000000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >>> b5d2d000-b5d3d000 ---p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >>> b5d3d000-b5d3e000 r--p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >>> b5d3e000-b5d3f000 rw-p 00070000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >>> b5d3f000-b6d56000 r-xp 00000000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >>> b6d56000-b6d65000 ---p 01017000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >>> b6d65000-b6dba000 r--p 01016000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >>> b6dba000-b6dd2000 rw-p 0106b000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >>> b6dd2000-b6e5e000 rw-p 00000000 00:00 0 >>> b6e5e000-b6e6f000 r-xp 00000000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >>> b6e6f000-b6e7f000 ---p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >>> b6e7f000-b6e80000 r--p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >>> b6e80000-b6e81000 rw-p 00012000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >>> b6e81000-b6e83000 rw-p 00000000 00:00 0 >>> b6e83000-b6e85000 
r-xp 00000000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >>> b6e85000-b6e94000 ---p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >>> b6e94000-b6e95000 r--p 00001000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >>> b6e95000-b6e96000 rw-p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >>> b6e96000-b6eaf000 r-xp 00000000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >>> b6eaf000-b6ebe000 ---p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >>> b6ebe000-b6ebf000 r--p 00018000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >>> b6ebf000-b6ec0000 rw-p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >>> b6ec0000-b6fa2000 r-xp 00000000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >>> b6fa2000-b6fb2000 ---p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >>> b6fb2000-b6fb4000 r--p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >>> b6fb4000-b6fb5000 rw-p 000e4000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >>> b6fb5000-b6fb8000 rw-p 00000000 00:00 0 >>> b6fb8000-b6fc2000 r-xp 00000000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >>> b6fc2000-b6fd1000 ---p 0000a000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >>> b6fd1000-b6fd2000 r--p 00009000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >>> b6fd2000-b6fd3000 rw-p 0000a000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >>> b6fd3000-b6feb000 r-xp 00000000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so >>> b6ff2000-b6ff4000 rw-p 00000000 00:00 0 >>> b6ff6000-b6ff7000 ---p 00000000 00:00 0 >>> b6ff7000-b6ff8000 r--p 00000000 00:00 0 >>> b6ff8000-b6ff9000 rwxp 00000000 00:00 0 >>> b6ff9000-b6ffb000 rw-p 00000000 00:00 0 >>> b6ffb000-b6ffc000 r--p 00018000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so >>> b6ffc000-b6ffd000 rw-p 00019000 b3:02 2708477 
/lib/arm-linux-gnueabihf/ld-2.27.so >>> bed97000-bedb8000 rw-p 00000000 00:00 0 [stack] >>> beeb0000-beeb1000 r-xp 00000000 00:00 0 [sigpage] >>> beeb1000-beeb2000 r--p 00000000 00:00 0 [vvar] >>> beeb2000-beeb3000 r-xp 00000000 00:00 0 [vdso] >>> ffff0000-ffff1000 r-xp 00000000 00:00 0 [vectors] >>> >>> >>> VM Arguments: >>> jvm_args: -Xms64M -Xmx768M --add-exports=java.base/jdk.internal.module=ALL-UNNAMED >>> java_command: build.tools.jigsaw.AddPackagesAttribute /workspace/build/linux-arm-server-fastdebug/jdk >>> java_class_path (initial): /workspace/build/linux-arm-server-fastdebug/buildtools/tools_jigsaw_classes >>> Launcher Type: SUN_STANDARD >>> >>> [Global flags] >>> uint ConcGCThreads = 1 {product} {ergonomic} Number of threads concurrent gc will use >>> uint G1ConcRefinementThreads = 4 {product} {ergonomic} The number of parallel rem set update threads. Will be set ergonomically by default. >>> size_t G1HeapRegionSize = 1048576 {product} {ergonomic} Size of the G1 regions. >>> uintx GCDrainStackTargetSize = 64 {product} {ergonomic} Number of entries we will try to leave on the stack during parallel gc >>> size_t InitialHeapSize = 67108864 {product} {command line} Initial heap size (in bytes); zero means use ergonomics >>> size_t MarkStackSize = 32768 {product} {ergonomic} Size of marking stack >>> size_t MaxHeapSize = 805306368 {product} {command line} Maximum heap size (in bytes) >>> size_t MaxNewSize = 482344960 {product} {ergonomic} Maximum new generation size (in bytes), max_uintx means set ergonomically >>> size_t MinHeapDeltaBytes = 1048576 {product} {ergonomic} The minimum change in heap space due to GC (in bytes) >>> size_t MinHeapSize = 67108864 {product} {command line} Minimum heap size (in bytes); zero means use ergonomics >>> uintx NonProfiledCodeHeapSize = 0 {pd product} {ergonomic} Size of code heap with non-profiled methods (in bytes) >>> uintx ProfiledCodeHeapSize = 0 {pd product} {ergonomic} Size of code heap with profiled methods (in 
bytes) >>> size_t SoftMaxHeapSize = 805306368 {manageable} {ergonomic} Soft limit for maximum heap size (in bytes) >>> bool UseG1GC = true {product} {ergonomic} Use the Garbage-First garbage collector >>> >>> Logging: >>> Log output configuration: >>> #0: stdout all=warning uptime,level,tags >>> #1: stderr all=off uptime,level,tags >>> >>> Environment Variables: >>> JAVA_HOME=/opt/java/openjdk >>> PATH=/opt/java/openjdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin >>> LC_ALL=C >>> >>> Signal Handlers: >>> SIGSEGV: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >>> SIGBUS: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >>> SIGFPE: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >>> SIGPIPE: [libjvm.so+0xc9aa9d], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >>> SIGXFSZ: [libjvm.so+0xc9aa9d], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >>> SIGILL: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >>> SIGUSR2: [libjvm.so+0xc9ad95], sa_mask[0]=00000000000000000000000000000000, sa_flags=SA_RESTART|SA_SIGINFO >>> SIGHUP: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >>> SIGINT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >>> SIGTERM: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >>> SIGQUIT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >>> >>> >>> --------------- S Y S T E M --------------- >>> >>> OS: >>> DISTRIB_ID=Ubuntu >>> DISTRIB_RELEASE=18.04 >>> DISTRIB_CODENAME=bionic >>> DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS" >>> uname: Linux 20431585315d 5.4.51-v7l+ #1333 SMP Mon Aug 10 16:51:40 BST 2020 armv7l >>> OS uptime: 14 days 7:59 hours >>> libc: glibc 2.27 NPTL 2.27 >>> rlimit (soft/hard): STACK 
8192k/infinity , CORE infinity/infinity , NPROC infinity/infinity , NOFILE 1048576/1048576 , AS infinity/infinity , CPU infinity/infinity , DATA infinity/infinity , FSIZE infinity/infinity , MEMLOCK 64k/64k >>> load average: 3.37 3.26 3.09 >>> >>> /proc/meminfo: >>> MemTotal: 3919812 kB >>> MemFree: 1255688 kB >>> MemAvailable: 3518740 kB >>> Buffers: 134316 kB >>> Cached: 2117828 kB >>> SwapCached: 0 kB >>> Active: 1266624 kB >>> Inactive: 1167412 kB >>> Active(anon): 110360 kB >>> Inactive(anon): 80744 kB >>> Active(file): 1156264 kB >>> Inactive(file): 1086668 kB >>> Unevictable: 16 kB >>> Mlocked: 16 kB >>> HighTotal: 3264512 kB >>> HighFree: 1038848 kB >>> LowTotal: 655300 kB >>> LowFree: 216840 kB >>> SwapTotal: 102396 kB >>> SwapFree: 102396 kB >>> Dirty: 24916 kB >>> Writeback: 0 kB >>> AnonPages: 181884 kB >>> Mapped: 125864 kB >>> Shmem: 16892 kB >>> KReclaimable: 181816 kB >>> Slab: 205164 kB >>> SReclaimable: 181816 kB >>> SUnreclaim: 23348 kB >>> KernelStack: 2240 kB >>> PageTables: 2684 kB >>> NFS_Unstable: 0 kB >>> Bounce: 0 kB >>> WritebackTmp: 0 kB >>> CommitLimit: 2062300 kB >>> Committed_AS: 1125176 kB >>> VmallocTotal: 245760 kB >>> VmallocUsed: 5520 kB >>> VmallocChunk: 0 kB >>> Percpu: 512 kB >>> CmaTotal: 262144 kB >>> CmaFree: 171244 kB >>> >>> /sys/kernel/mm/transparent_hugepage/enabled: >>> /sys/kernel/mm/transparent_hugepage/defrag (defrag/compaction efforts parameter): >>> >>> Process Memory: >>> Virtual Size: 888828K (peak: 888828K) >>> Resident Set Size: 25020K (peak: 25020K) (anon: 11372K, file: 13648K, shmem: 0K) >>> Swapped out: 0K >>> C-Heap outstanding allocations: 1636K >>> >>> /proc/sys/kernel/threads-max (system-wide limit on the number of threads): 57119 >>> /proc/sys/vm/max_map_count (maximum number of memory map areas a process may have): 65530 >>> /proc/sys/kernel/pid_max (system-wide limit on number of process identifiers): 32768 >>> >>> Steal ticks since vm start: 0 >>> Steal ticks percentage since vm start: 0.000 >>> >>> 
CPU: total 4 (initial active 4) (ARMv7), vfp, vfp3-32, simd, mp_ext >>> /proc/cpuinfo: >>> processor : 0 >>> model name : ARMv7 Processor rev 3 (v7l) >>> BogoMIPS : 270.00 >>> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >>> CPU implementer : 0x41 >>> CPU architecture: 7 >>> CPU variant : 0x0 >>> CPU part : 0xd08 >>> CPU revision : 3 >>> >>> processor : 1 >>> model name : ARMv7 Processor rev 3 (v7l) >>> BogoMIPS : 270.00 >>> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >>> CPU implementer : 0x41 >>> CPU architecture: 7 >>> CPU variant : 0x0 >>> CPU part : 0xd08 >>> CPU revision : 3 >>> >>> processor : 2 >>> model name : ARMv7 Processor rev 3 (v7l) >>> BogoMIPS : 270.00 >>> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >>> CPU implementer : 0x41 >>> CPU architecture: 7 >>> CPU variant : 0x0 >>> CPU part : 0xd08 >>> CPU revision : 3 >>> >>> processor : 3 >>> model name : ARMv7 Processor rev 3 (v7l) >>> BogoMIPS : 270.00 >>> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >>> CPU implementer : 0x41 >>> CPU architecture: 7 >>> CPU variant : 0x0 >>> CPU part : 0xd08 >>> CPU revision : 3 >>> >>> Hardware : BCM2711 >>> Revision : c03111 >>> Serial : 100000001c47254f >>> Model : Raspberry Pi 4 Model B Rev 1.1 >>> >>> Online cpus: 0-3 >>> Offline cpus: >>> >>> Memory: 4k page, physical 3919812k(1255688k free), swap 102396k(102396k free) >>> >>> vm_info: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace) for linux-arm JRE (16-internal+0-adhoc..workspace), built on Oct 12 2020 19:49:51 by "" with gcc 7.5.0 >>> >>> END. >>> >>> >>> >>> >>>> On 12. 
Oct 2020, at 20:24, Aleksey Shipilev wrote: >>>> >>>> Hi, >>>> >>>> On 10/12/20 8:12 PM, Marc Hoffmann wrote: >>>>> Please find the build log and the hs_err file for commit fd0cb98ed03c6214c02ccd3503c1e6d77065a428 attached. >>>> >>>> Please try to build with fastdebug (./configure --enable-debug), so that JVM asserts meaningfully somewhere? >>>> >>>>> Is there any additional information I can provide to help getting these builds fixed again? >>>> >>>> I am seeing plenty of weird x86_32 crashes since last week. Pretty sure some of them would manifest on ARM32 as well. This is why building with fastdebug is the next step: it maps out the bug symptoms. >>>> >>>> -- >>>> Thanks, >>>> -Aleksey >>>> >>> > From dnsimon at openjdk.java.net Wed Oct 14 22:09:15 2020 From: dnsimon at openjdk.java.net (Doug Simon) Date: Wed, 14 Oct 2020 22:09:15 GMT Subject: RFR: 8254793: [JVMCI] improve speculation encoding Message-ID: This PR changes the encoding of a `jdk.vm.ci.hotspot.HotSpotSpeculationLog.HotSpotSpeculation` from a long to an int. The `Thread::_pending_failed_speculation` field remains as a `jlong` since it is already exposed to JVMCI Java code via VMStructs and this PR does not update its usage in Graal.
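The message above does not say how the narrower value is laid out. Purely as an illustration, and assuming a speculation can be described by an index plus a length, here is one way such a pair could be packed into a 32-bit signed value. The bit split (`kLengthBits = 5`) and all function names are hypothetical and are not taken from the PR:

```cpp
#include <cassert>
#include <cstdint>

// Assumed split, for illustration only: the low 5 bits hold a length,
// the bits above them hold an index, and the sign bit stays clear.
constexpr int kLengthBits = 5;
constexpr std::uint32_t kLengthMask = (1u << kLengthBits) - 1;
constexpr std::uint32_t kMaxIndex = (1u << (31 - kLengthBits)) - 1;

// Pack (index, length) into one non-negative 32-bit value.
inline std::int32_t encode_index_and_length(std::uint32_t index, std::uint32_t length) {
  assert(index <= kMaxIndex && "index does not fit in the assumed layout");
  assert(length <= kLengthMask && "length does not fit in the assumed layout");
  return static_cast<std::int32_t>((index << kLengthBits) | length);
}

// Recover the index part of an encoded value.
inline std::uint32_t decode_index(std::int32_t encoded) {
  return static_cast<std::uint32_t>(encoded) >> kLengthBits;
}

// Recover the length part of an encoded value.
inline std::uint32_t decode_length(std::int32_t encoded) {
  return static_cast<std::uint32_t>(encoded) & kLengthMask;
}
```

With this assumed layout, any wider field that still carries the value (such as a `jlong`) can hold the 32-bit encoding unchanged.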
------------- Commit messages: - 8254793: encode a HotSpotSpeculation in an int instead of a long Changes: https://git.openjdk.java.net/jdk/pull/667/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=667&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254793 Stats: 80 lines in 11 files changed: 54 ins; 2 del; 24 mod Patch: https://git.openjdk.java.net/jdk/pull/667.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/667/head:pull/667 PR: https://git.openjdk.java.net/jdk/pull/667 From ysuenaga at openjdk.java.net Thu Oct 15 00:24:18 2020 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Thu, 15 Oct 2020 00:24:18 GMT Subject: Integrated: 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails In-Reply-To: References: Message-ID: On Tue, 13 Oct 2020 10:00:37 GMT, Yasumasa Suenaga wrote: > Originally filed at AdoptOpenJDK: > https://github.com/AdoptOpenJDK/openjdk-tests/issues/1162 > > The test fails on 32bit windows with: > > java.lang.IllegalStateException: WB_IncMetaspaceCapacityUntilGC: could not increase capacity until GC due to contention > with another thread > at sun.hotspot.WhiteBox.incMetaspaceCapacityUntilGC(Native Method) > at TestCapacityUntilGCWrapAround.main(TestCapacityUntilGCWrapAround.java:51) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) > at java.lang.Thread.run(Thread.java:748) > > `TestCapacityUntilGCWrapAround` passes `4GB - 1` to `incMetaspaceCapacityUntilGC()`, which seems to be too big. > This code also seems to want to check the behavior when `_capacity_until_gc` overflows. The white box test would > throw an ISE when that happens.
So we need to handle it correctly. This pull request has now been integrated. Changeset: 038f58d4 Author: Yasumasa Suenaga URL: https://git.openjdk.java.net/jdk/commit/038f58d4 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails Reviewed-by: stuefe ------------- PR: https://git.openjdk.java.net/jdk/pull/628 From iklam at openjdk.java.net Thu Oct 15 04:20:24 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 15 Oct 2020 04:20:24 GMT Subject: RFR: 8253402: Convert vmSymbols::SID to enum class [v7] In-Reply-To: References: Message-ID: > Convert `vmSymbols::SID` to an `enum class` to provide better type safety. > > - The original enum type `vmSymbols::SID` cannot be forward-declared. I moved it out of the `vmSymbols` class and > renamed, so now it can be forward-declared as `enum class vmSymbolID : int;`, without including the large vmSymbols.hpp > file. > - This also breaks the mutual dependency between the `vmSymbols` and `vmIntrinsics` classes. Now the declaration of > `vmIntrinsics` can be moved from vmSymbols.hpp to vmIntrinsics.hpp, where it naturally belongs. > - Type-safe enumeration (contributed by Kim Barrett) > for (vmSymbolsIterator it = vmSymbolsRange.begin(); it != vmSymbolsRange.end(); ++it) { > vmSymbolID index = *it; .... > } > - I moved `vmSymbols::_symbols[]` to `Symbol::_vm_symbols[]`, and made it accessible via `Symbol::vm_symbol_at()`. This > way, header files (e.g. fieldInfo.hpp) that need to convert from `vmSymbolID` to `Symbol*` don't need to include the > large vmSymbols.hpp file. > - I changed the `VM_SYMBOL_ENUM_NAME` macro so that the users don't need to explicitly add the `vmSymbolID::` scope. > - I removed many unnecessary casts between `int` and `vmSymbolID`. > - The remaining casts are done via `vmSymbol::as_int()` and `vmSymbols::as_SID()` with range checks. > > ----- > If this is successful, I will do the same for `vmIntrinsics::ID`. 
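A minimal sketch of the ideas listed in the description above: forward-declaring a scoped enum with a fixed underlying type, and funnelling the remaining `int` conversions through range-checked helpers. All names here (`SymbolID`, `as_symbol_id`, `symbol_name`, the enumerators) are hypothetical stand-ins and not the real HotSpot declarations:

```cpp
#include <cassert>

// Legal forward declaration: the underlying type is fixed, so headers can
// mention SymbolID without seeing the (potentially huge) enumerator list.
enum class SymbolID : int;

// A lightweight header could declare this with only the line above in scope.
const char* symbol_name(SymbolID id);

// Full definition, normally confined to one large header.
enum class SymbolID : int {
  NO_SID = 0,            // sentinel; maps to a null table entry
  FIRST_SID = 1,
  object_name = FIRST_SID,
  string_name,
  SID_LIMIT
};

// Range-checked conversion to int (no implicit conversion exists for enum class).
inline int as_int(SymbolID id) {
  int value = static_cast<int>(id);
  assert(value >= 0 && value < static_cast<int>(SymbolID::SID_LIMIT));
  return value;
}

// Range-checked conversion from int.
inline SymbolID as_symbol_id(int value) {
  assert(value >= 0 && value < static_cast<int>(SymbolID::SID_LIMIT));
  return static_cast<SymbolID>(value);
}

// Table lookup indexed by the enum; entry 0 is deliberately null.
const char* symbol_name(SymbolID id) {
  static const char* const names[] = { nullptr, "java/lang/Object", "java/lang/String" };
  return names[as_int(id)];
}

// Walk every real symbol, skipping the NO_SID sentinel.
inline int count_real_symbols() {
  int count = 0;
  for (int i = as_int(SymbolID::FIRST_SID); i < static_cast<int>(SymbolID::SID_LIMIT); ++i) {
    if (symbol_name(as_symbol_id(i)) != nullptr) {
      ++count;
    }
  }
  return count;
}
```

Because a scoped enum never converts to `int` implicitly, every place that still needs the raw value has to go through a helper like `as_int()`, which is where the range check lives.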
Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits: - Merge branch 'master' into 8253402-convert-vmsymbols-sid-to-enum-class - Merge branch 'master' into 8253402-convert-vmsymbols-sid-to-enum-class - revert NO_SID to 0 due to assert(Symbol::_vm_symbols[NO_SID] == NULL) - Addressed review comments by Kim Barrett - added missing #include from enumIterator.hpp - Use 2-style EnumIterator - Merge master into 8253402-convert-vmsymbols-sid-to-enum-class - more vmEnums.hpp fixes; fixed minimal VM build - Merge branch 'master' into 8253402-convert-vmsymbols-sid-to-enum-class - Moved forward declaration of vmSymbolID to vmEnums.hpp - ... and 3 more: https://git.openjdk.java.net/jdk/compare/038f58d4...5e939ca7 ------------- Changes: https://git.openjdk.java.net/jdk/pull/276/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=276&range=06 Stats: 798 lines in 29 files changed: 478 ins; 144 del; 176 mod Patch: https://git.openjdk.java.net/jdk/pull/276.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/276/head:pull/276 PR: https://git.openjdk.java.net/jdk/pull/276 From iklam at openjdk.java.net Thu Oct 15 05:56:19 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 15 Oct 2020 05:56:19 GMT Subject: Integrated: 8253402: Convert vmSymbols::SID to enum class In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 07:07:25 GMT, Ioi Lam wrote: > Convert `vmSymbols::SID` to an `enum class` to provide better type safety. > > - The original enum type `vmSymbols::SID` cannot be forward-declared. I moved it out of the `vmSymbols` class and > renamed, so now it can be forward-declared as `enum class vmSymbolID : int;`, without including the large vmSymbols.hpp > file. > - This also breaks the mutual dependency between the `vmSymbols` and `vmIntrinsics` classes. Now the declaration of > `vmIntrinsics` can be moved from vmSymbols.hpp to vmIntrinsics.hpp, where it naturally belongs. 
> - Type-safe enumeration (contributed by Kim Barrett) > for (vmSymbolsIterator it = vmSymbolsRange.begin(); it != vmSymbolsRange.end(); ++it) { > vmSymbolID index = *it; .... > } > - I moved `vmSymbols::_symbols[]` to `Symbol::_vm_symbols[]`, and made it accessible via `Symbol::vm_symbol_at()`. This > way, header files (e.g. fieldInfo.hpp) that need to convert from `vmSymbolID` to `Symbol*` don't need to include the > large vmSymbols.hpp file. > - I changed the `VM_SYMBOL_ENUM_NAME` macro so that the users don't need to explicitly add the `vmSymbolID::` scope. > - I removed many unnecessary casts between `int` and `vmSymbolID`. > - The remaining casts are done via `vmSymbol::as_int()` and `vmSymbols::as_SID()` with range checks. > > ----- > If this is successful, I will do the same for `vmIntrinsics::ID`. This pull request has now been integrated. Changeset: 7e5eb493 Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/7e5eb493 Stats: 798 lines in 29 files changed: 478 ins; 144 del; 176 mod 8253402: Convert vmSymbols::SID to enum class Reviewed-by: kvn, coleenp, kbarrett, iveresov ------------- PR: https://git.openjdk.java.net/jdk/pull/276 From burban at openjdk.java.net Thu Oct 15 08:57:27 2020 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Thu, 15 Oct 2020 08:57:27 GMT Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build [v3] In-Reply-To: References: Message-ID: > I organized this PR so that each commit contains the warning emitted by MSVC as commit message and its relevant fix. > > Verified on > * Linux+ARM64: `{hotspot,jdk,langtools}:tier1`, no failures. > * Windows+ARM64: `{hotspot,jdk,langtools}:tier1`, no (new) failures. > * internal macOS+ARM64 port: build without `--disable-warnings-as-errors` still works. Just mentioning this here, because > it's yet another toolchain (Xcode / clang) that needs to be kept happy [going > forward](https://openjdk.java.net/jeps/391). 
Bernhard Urban-Forster has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 20 commits: - disable warning only for hotspot - Merge remote-tracking branch 'upstream/master' into 8254072-fix-windows-arm64-warnings - Merge remote-tracking branch 'upstream/master' into 8254072-fix-windows-arm64-warnings - ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1446): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1654): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data - Revert changes for "warning C4146: unary minus operator applied to unsigned type, result still unsigned" - msvc: disable unary minus warning for unsigned types - ./src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp(1123): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data ./src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp(1123): warning C4267: 'initializing': conversion from 'size_t' to 'const int', possible loss of data - ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1312): warning C4267: 'argument': conversion from 'size_t' to 'unsigned int', possible loss of data ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1370): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): warning C4146: unary minus operator applied to unsigned type, result still unsigned ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data - ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp(2472): warning C4312: 'type cast': conversion from 'unsigned int' to 'address' 
of greater size - ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp(1527): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data - ... and 10 more: https://git.openjdk.java.net/jdk/compare/9359ff03...32e922da ------------- Changes: https://git.openjdk.java.net/jdk/pull/530/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=530&range=02 Stats: 22 lines in 8 files changed: 1 ins; 0 del; 21 mod Patch: https://git.openjdk.java.net/jdk/pull/530.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/530/head:pull/530 PR: https://git.openjdk.java.net/jdk/pull/530 From burban at openjdk.java.net Thu Oct 15 08:57:28 2020 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Thu, 15 Oct 2020 08:57:28 GMT Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build [v2] In-Reply-To: References: Message-ID: On Mon, 12 Oct 2020 10:29:11 GMT, Magnus Ihse Bursie wrote: >> Bernhard Urban-Forster has updated the pull request with a new target base due to a merge or a rebase. 
The pull request >> now contains 18 commits: >> - Merge remote-tracking branch 'upstream/master' into 8254072-fix-windows-arm64-warnings >> - ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): warning C4267: 'argument': conversion from 'size_t' to >> 'int', possible loss of data >> ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1446): warning C4267: 'argument': conversion from 'size_t' to >> 'int', possible loss of data ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1654): warning C4267: 'argument': >> conversion from 'size_t' to 'int', possible loss of data >> - Revert changes for "warning C4146: unary minus operator applied to unsigned type, result still unsigned" >> - msvc: disable unary minus warning for unsigned types >> - ./src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp(1123): warning C4267: 'initializing': conversion >> from 'size_t' to 'int', possible loss of data >> ./src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp(1123): warning C4267: 'initializing': conversion >> from 'size_t' to 'const int', possible loss of data >> - ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1312): warning C4267: 'argument': conversion from 'size_t' to >> 'unsigned int', possible loss of data >> ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1370): warning C4267: 'argument': conversion from 'size_t' to >> 'int', possible loss of data ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): warning C4146: unary minus >> operator applied to unsigned type, result still unsigned ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): >> warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data >> - ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp(2472): warning C4312: 'type cast': conversion from 'unsigned int' >> to 'address' of greater size >> - ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp(1527): warning C4267: 'argument': conversion from 'size_t' to >> 'int', possible loss of 
data >> - ./src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp(2901): warning C4267: 'initializing': conversion from 'size_t' to >> 'int', possible loss of data >> ./src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp(2901): warning C4267: 'initializing': conversion from 'size_t' to >> 'const int', possible loss of data >> - ./src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp(2756): warning C4146: unary minus operator applied to unsigned >> type, result still unsigned >> - ... and 8 more: https://git.openjdk.java.net/jdk/compare/5351ba6c...a081dfb4 > > make/autoconf/flags-cflags.m4 line 137: > >> 135: WARNINGS_ENABLE_ALL="-W3" >> 136: DISABLED_WARNINGS="4800" >> 137: DISABLED_WARNINGS+=" 4146" # unary minus operator applied to unsigned type, result still unsigned > > This change will affect *all* JDK code. I'm not sure this was intended? > > If it was intended, I think you need to motivate this more explicitly. > > If you only wanted to disable the warning for hotspot, the proper solution would be to add it to > DISABLED_WARNINGS_microsoft in make/hotspot/lib/CompileJvm.gmk. Thank you @magicus! It was indeed meant only for the hotspot part. ------------- PR: https://git.openjdk.java.net/jdk/pull/530 From dnsimon at openjdk.java.net Thu Oct 15 09:01:28 2020 From: dnsimon at openjdk.java.net (Doug Simon) Date: Thu, 15 Oct 2020 09:01:28 GMT Subject: RFR: 8254793: [JVMCI] improve speculation encoding [v2] In-Reply-To: References: Message-ID: > This PR changes the encoding of a `jdk.vm.ci.hotspot.HotSpotSpeculationLog.HotSpotSpeculation` from a long to an int. > The `Thread::_pending_failed_speculation` field remains as a `jlong` since it is already exposed to JVMCI Java code > already via VMStructs and this PR does not update its usage in Graal. Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since the last revision: 8254793: encode a HotSpotSpeculation in an int instead of a long ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/667/files - new: https://git.openjdk.java.net/jdk/pull/667/files/175c179d..2fb4c99e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=667&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=667&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/667.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/667/head:pull/667 PR: https://git.openjdk.java.net/jdk/pull/667 From burban at openjdk.java.net Thu Oct 15 09:05:25 2020 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Thu, 15 Oct 2020 09:05:25 GMT Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build [v2] In-Reply-To: References: Message-ID: <5UFJWe28-kzXCq0aP2gC4AA_JTLety2CjKFDLI-rtkA=.72f68b0f-4347-486c-9039-534ba569f34c@github.com> On Mon, 12 Oct 2020 10:29:23 GMT, Magnus Ihse Bursie wrote: >> Bernhard Urban-Forster has updated the pull request with a new target base due to a merge or a rebase. 
The pull request >> now contains 18 commits: >> - Merge remote-tracking branch 'upstream/master' into 8254072-fix-windows-arm64-warnings >> - ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): warning C4267: 'argument': conversion from 'size_t' to >> 'int', possible loss of data >> ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1446): warning C4267: 'argument': conversion from 'size_t' to >> 'int', possible loss of data ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1654): warning C4267: 'argument': >> conversion from 'size_t' to 'int', possible loss of data >> - Revert changes for "warning C4146: unary minus operator applied to unsigned type, result still unsigned" >> - msvc: disable unary minus warning for unsigned types >> - ./src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp(1123): warning C4267: 'initializing': conversion >> from 'size_t' to 'int', possible loss of data >> ./src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp(1123): warning C4267: 'initializing': conversion >> from 'size_t' to 'const int', possible loss of data >> - ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1312): warning C4267: 'argument': conversion from 'size_t' to >> 'unsigned int', possible loss of data >> ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1370): warning C4267: 'argument': conversion from 'size_t' to >> 'int', possible loss of data ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): warning C4146: unary minus >> operator applied to unsigned type, result still unsigned ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): >> warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data >> - ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp(2472): warning C4312: 'type cast': conversion from 'unsigned int' >> to 'address' of greater size >> - ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp(1527): warning C4267: 'argument': conversion from 'size_t' to >> 'int', possible loss of 
data >> - ./src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp(2901): warning C4267: 'initializing': conversion from 'size_t' to >> 'int', possible loss of data >> ./src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp(2901): warning C4267: 'initializing': conversion from 'size_t' to >> 'const int', possible loss of data >> - ./src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp(2756): warning C4146: unary minus operator applied to unsigned >> type, result still unsigned >> - ... and 8 more: https://git.openjdk.java.net/jdk/compare/5351ba6c...a081dfb4 > > Changes requested by ihse (Reviewer). @theRealAph I prototyped changing the argument of `bang_stack_with_offset()` from `int` to `size_t` here: https://github.com/lewurm/openjdk/commit/85a8f655aa0cb69ef13a2de44dd64c60caf19852. In that approach casting is essentially pushed down to `bang_stack_with_offset` because the assembler instruction of most (all) architectures that is eventually consuming that offset needs a signed integer anyway. Doesn't seem like a win to me to be honest. I would rather prefer to go with what we have in this patch (similar to what x86 is doing today): --- a/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp +++ b/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp @@ -1524,7 +1524,7 @@ nmethod* SharedRuntime::generate_native_wrapper(MacroAssembler* masm, // Generate stack overflow check if (UseStackBanging) { - __ bang_stack_with_offset(JavaThread::stack_shadow_zone_size()); + __ bang_stack_with_offset((int)JavaThread::stack_shadow_zone_size()); } else { Unimplemented(); } and leave it with that. What do you think? 
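For illustration only, a minimal standalone sketch of what a bare `(int)` cast on a `size_t` leaves unchecked, and how a range check can guard the narrowing. The helper names `fits_in_int` and `narrow_to_int` are hypothetical and are not HotSpot functions; the sketch uses the standard `assert` rather than any HotSpot assertion macro.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical helper (not part of HotSpot): reports whether a size_t value
// can be converted to int without changing its value.
inline bool fits_in_int(std::size_t value) {
  return value <= static_cast<std::size_t>(INT_MAX);
}

// Hypothetical helper: narrows only after checking the range, so an
// out-of-range value trips the assert instead of being silently truncated.
inline int narrow_to_int(std::size_t value) {
  assert(fits_in_int(value) && "size_t value does not fit in int");
  return static_cast<int>(value);
}
```

A call such as `narrow_to_int(some_size)` then documents the intent of the conversion at the call site, while keeping a debug-time check that the value really is small enough.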
------------- PR: https://git.openjdk.java.net/jdk/pull/530 From dnsimon at openjdk.java.net Thu Oct 15 09:30:21 2020 From: dnsimon at openjdk.java.net (Doug Simon) Date: Thu, 15 Oct 2020 09:30:21 GMT Subject: RFR: 8254793: [JVMCI] improve speculation encoding [v3] In-Reply-To: References: Message-ID: <1I4nYb122752fKhe92W8XvGpo3BtTJ3LxUoK-oH2hus=.82161ac3-f1b9-4997-a2a3-5517eda94a45@github.com> > This PR changes the encoding of a `jdk.vm.ci.hotspot.HotSpotSpeculationLog.HotSpotSpeculation` from a long to an int. > The `Thread::_pending_failed_speculation` field remains as a `jlong` since it is already exposed to JVMCI Java code > already via VMStructs and this PR does not update its usage in Graal. Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: 8254793: encode a HotSpotSpeculation in an int instead of a long ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/667/files - new: https://git.openjdk.java.net/jdk/pull/667/files/2fb4c99e..2e4e4521 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=667&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=667&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/667.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/667/head:pull/667 PR: https://git.openjdk.java.net/jdk/pull/667 From aph at openjdk.java.net Thu Oct 15 10:00:11 2020 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 15 Oct 2020 10:00:11 GMT Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build [v2] In-Reply-To: <5UFJWe28-kzXCq0aP2gC4AA_JTLety2CjKFDLI-rtkA=.72f68b0f-4347-486c-9039-534ba569f34c@github.com> References: 
<5UFJWe28-kzXCq0aP2gC4AA_JTLety2CjKFDLI-rtkA=.72f68b0f-4347-486c-9039-534ba569f34c@github.com>
Message-ID:

On Thu, 15 Oct 2020 09:02:35 GMT, Bernhard Urban-Forster wrote:

>> Changes requested by ihse (Reviewer).
>
> @theRealAph I prototyped changing the argument of `bang_stack_with_offset()` from `int` to `size_t` here:
> https://github.com/lewurm/openjdk/commit/85a8f655aa0cb69ef13a2de44dd64c60caf19852. In that approach casting is
> essentially pushed down to `bang_stack_with_offset` because the assembler instruction of most (all) architectures that
> is eventually consuming that offset needs a signed integer anyway. Doesn't seem like a win to me to be honest. I would
> rather prefer to go with what we have in this patch (similar to what x86 is doing today):
> --- a/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp
> +++ b/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp
> @@ -1524,7 +1524,7 @@ nmethod* SharedRuntime::generate_native_wrapper(MacroAssembler* masm,
>
>    // Generate stack overflow check
>    if (UseStackBanging) {
> -    __ bang_stack_with_offset(JavaThread::stack_shadow_zone_size());
> +    __ bang_stack_with_offset((int)JavaThread::stack_shadow_zone_size());
>    } else {
>      Unimplemented();
>    }
> and leave it with that. What do you think?

Fine, but please assert `JavaThread::stack_shadow_zone_size() == (int)JavaThread::stack_shadow_zone_size()`.

If all this sounds a bit paranoid, that's because I am. Adding casts to shut up compilers is a very risky business,
because often (if not in this case) the programmer doesn't understand the code well, and just sprinkles casts
everywhere. But those casts disable compile-time type checking, and this leads to risks for future maintainability.

I wonder if we should fix this in a better way, and use this in the future:

template <typename T1, typename T2>
T1 checked_cast(T2 thing) {
  guarantee(static_cast<T1>(thing) == thing, "must be");
  return static_cast<T1>(thing);
}

Then I promise we'll never need to have this conversation again.

-------------

PR: https://git.openjdk.java.net/jdk/pull/530

From github.com+4146708+a74nh at openjdk.java.net Thu Oct 15 11:10:10 2020
From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward)
Date: Thu, 15 Oct 2020 11:10:10 GMT
Subject: RFR: 8221554: aarch64 cross-modifying code
In-Reply-To:
References: <35eLsMpWmcCUoiEWhnYdSpZNmvLy4ra56Qtd6eRW574=.4e7c9278-3e0d-457d-9c15-eef45bae9755@github.com>
Message-ID:

On Wed, 14 Oct 2020 20:43:53 GMT, Robbin Ehn wrote:

> A question, ISB don't flush the I-cache which I thought was needed?
> I would have expected something more similar to gcc clear_cache.

(Possible I've missed something in your question...)
Any cache flushing would be performed by the thread that modifies the code. The cross_modify_fence is for the other threads after the code has been modified. The only thing they need to do is an isb. ------------- PR: https://git.openjdk.java.net/jdk/pull/428 From neliasso at openjdk.java.net Thu Oct 15 15:07:13 2020 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Thu, 15 Oct 2020 15:07:13 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v3] In-Reply-To: <94qadtiTzSkdsJAc_8IWrLxpBvmfiBXMf_W9Z965P80=.9a59a5db-2209-4007-94bb-16ccd8ff0b77@github.com> References: <94qadtiTzSkdsJAc_8IWrLxpBvmfiBXMf_W9Z965P80=.9a59a5db-2209-4007-94bb-16ccd8ff0b77@github.com> Message-ID: <9lPNMo1V33tQD6qp-1l78dII5Hfle8Ea5VWwuY1l_qA=.2e420c11-6e70-41f8-80b4-5992dcdd02eb@github.com> On Tue, 13 Oct 2020 18:03:27 GMT, Jatin Bhateja wrote: >> Summary: >> >> 1) Partial in-lining technique avoids call overhead penalty for small array copy operations with size less than 32 >> bytes. 2) At runtime, a conditional check based on copy length either calls an array-copy stub or executes an optimized >> instruction sequence using AVX-512 masked instructions emitted at the call site. 3) New runtime flag >> ArrayCopyPartialInlineSize=0/32(default)/64 bytes determines the maximum size for partial in-lining. 4) Based on the >> perf results seen in benchmarks currently partial in-lining is performed only for arraycopy involving sub-word types >> (bool/byte/char/short). Once PR-61 gets integrated we can extend this patch to cover all the primitive types. 
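The length-based dispatch described in point 2 of the summary can be pictured with a scalar sketch. This is only an analogy in plain C++ and not the C2 implementation: the byte loop stands in for the AVX-512 masked instruction sequence, `std::memcpy` stands in for the array-copy stub, and `kPartialInlineSize` and `copy_bytes` are hypothetical names. Whether the real threshold comparison is inclusive is an assumption here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical constant mirroring the default ArrayCopyPartialInlineSize of 32 bytes.
constexpr std::size_t kPartialInlineSize = 32;

// Copies len bytes from src to dst. Returns true if the short "inline" path
// was taken, false if the copy was delegated to the out-of-line routine.
inline bool copy_bytes(unsigned char* dst, const unsigned char* src, std::size_t len) {
  if (len <= kPartialInlineSize) {  // assumption: the threshold is inclusive
    // Short copy handled at the call site; a plain loop stands in for the
    // masked vector load/store of the real optimization.
    for (std::size_t i = 0; i < len; ++i) {
      dst[i] = src[i];
    }
    return true;
  }
  // Longer copy: fall back to the general routine (the "stub" in this analogy).
  std::memcpy(dst, src, len);
  return false;
}
```

The point of the split is that short copies avoid the call overhead entirely, while longer copies behave exactly as before.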
>> Performance Results:
>> System : CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz
>> Micros : test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java
>> ArrayCopyPartialInlineSize : 32
>>
>> JMH | Block Size | Baseline (ns/op) | Partial Inlining (ns/op) | Gain (Baseline / Partial Inlining)
>> -- | -- | -- | -- | --
>> ArrayCopyAligned.testByte | 1 | 5.417 | 2.696 | 2.009272997
>> ArrayCopyAligned.testByte | 3 | 5.494 | 2.702 | 2.03330866
>> ArrayCopyAligned.testByte | 5 | 5.417 | 2.637 | 2.05422829
>> ArrayCopyAligned.testByte | 10 | 5.343 | 2.703 | 1.976692564
>> ArrayCopyAligned.testByte | 20 | 5.837 | 2.636 | 2.214339909
>> ArrayCopyAligned.testByte | 70 | 5.86 | 6 | 0.976666667
>> ArrayCopyAligned.testByte | 150 | 6.766 | 6.906 | 0.979727773
>> ArrayCopyAligned.testByte | 300 | 7.605 | 7.952 | 0.956363179
>> ArrayCopyAligned.testByte | 600 | 11.989 | 12.007 | 0.998500874
>> ArrayCopyAligned.testByte | 1200 | 16.447 | 16.585 | 0.991679228
>> ArrayCopyAligned.testChar | 1 | 5.02 | 2.828 | 1.775106082
>> ArrayCopyAligned.testChar | 3 | 5.129 | 2.762 | 1.85698769
>> ArrayCopyAligned.testChar | 5 | 5.041 | 2.762 | 1.82512672
>> ArrayCopyAligned.testChar | 10 | 5.716 | 2.762 | 2.069514844
>> ArrayCopyAligned.testChar | 20 | 5.111 | 5.399 | 0.946656788
>> ArrayCopyAligned.testChar | 70 | 6.271 | 6.242 | 1.004645947
>> ArrayCopyAligned.testChar | 150 | 7.45 | 7.599 | 0.980392157
>> ArrayCopyAligned.testChar | 300 | 9.904 | 10.112 | 0.97943038
>> ArrayCopyAligned.testChar | 600 | 17.131 | 17.167 | 0.997902953
>> ArrayCopyAligned.testChar | 1200 | 29.556 | 29.851 | 0.990117584
>> ArrayCopyUnalignedBoth.testByte | 1 | 5.419 | 2.702 | 2.005551443
>> ArrayCopyUnalignedBoth.testByte | 3 | 5.558 | 2.636 | 2.108497724
>> ArrayCopyUnalignedBoth.testByte | 5 | 5.43 | 2.636 | 2.059939302
>> ArrayCopyUnalignedBoth.testByte | 10 | 5.378 | 2.637 | 2.039438756
>> ArrayCopyUnalignedBoth.testByte | 20 | 5.914 | 2.636 | 2.243550836
>> ArrayCopyUnalignedBoth.testByte | 70 | 5.882 | 5.954 | 0.987907289
>> ArrayCopyUnalignedBoth.testByte | 150 | 6.784 | 6.88 | 0.986046512
>> ArrayCopyUnalignedBoth.testByte | 300 | 7.635 | 7.968 | 0.958207831
>> ArrayCopyUnalignedBoth.testByte | 600 | 12.226 | 12.129 | 1.007997362
>> ArrayCopyUnalignedBoth.testByte | 1200 | 16.992 | 20.717 | 0.820195974
>> ArrayCopyUnalignedBoth.testChar | 1 | 5.019 | 2.828 | 1.774752475
>> ArrayCopyUnalignedBoth.testChar | 3 | 5.163 | 2.763 | 1.868621064
>> ArrayCopyUnalignedBoth.testChar | 5 | 5.042 | 2.827 | 1.783516095
>> ArrayCopyUnalignedBoth.testChar | 10 | 5.718 | 2.828 | 2.021923621
>> ArrayCopyUnalignedBoth.testChar | 20 | 5.111 | 5.404 | 0.945780903
>> ArrayCopyUnalignedBoth.testChar | 70 | 6.367 | 6.235 | 1.02117081
>> ArrayCopyUnalignedBoth.testChar | 150 | 7.367 | 8.269 | 0.890917886
>> ArrayCopyUnalignedBoth.testChar | 300 | 10.358 | 10.642 | 0.973313287
>> ArrayCopyUnalignedBoth.testChar | 600 | 20.84 | 17.522 | 1.189361945
>> ArrayCopyUnalignedBoth.testChar | 1200 | 31.895 | 31.892 | 1.000094067
>> ArrayCopyUnalignedDst.testByte | 1 | 5.455 | 2.637 | 2.068638604
>> ArrayCopyUnalignedDst.testByte | 3 | 5.562 | 2.702 | 2.058475204
>> ArrayCopyUnalignedDst.testByte | 5 | 5.427 | 2.702 | 2.008512213
>> ArrayCopyUnalignedDst.testByte | 10 | 5.367 | 2.696 | 1.990727003
>> ArrayCopyUnalignedDst.testByte | 20 | 5.839 | 2.637 | 2.214258627
>> ArrayCopyUnalignedDst.testByte | 70 | 5.888 | 5.968 | 0.986595174
>> ArrayCopyUnalignedDst.testByte | 150 | 6.785 | 6.773 | 1.001771741
>> ArrayCopyUnalignedDst.testByte | 300 | 7.606 | 7.972 | 0.954089313
>> ArrayCopyUnalignedDst.testByte | 600 | 11.986 | 21.195 | 0.565510734
>> ArrayCopyUnalignedDst.testByte | 1200 | 16.54 | 16.784 | 0.985462345
>> ArrayCopyUnalignedDst.testChar | 1 | 5.02 | 2.827 | 1.775733994
>> ArrayCopyUnalignedDst.testChar | 3 | 5.131 | 2.762 | 1.857711803
>> ArrayCopyUnalignedDst.testChar | 5 | 5.038 | 2.762 | 1.82404055
>> ArrayCopyUnalignedDst.testChar | 10 | 5.718 | 2.762 | 2.070238957
>> ArrayCopyUnalignedDst.testChar | 20 | 5.113 | 5.401 | 0.946676541
>> ArrayCopyUnalignedDst.testChar | 70 | 6.222 | 6.214 | 1.001287416
>> ArrayCopyUnalignedDst.testChar | 150 | 7.367 | 8.125 | 0.906707692
>> ArrayCopyUnalignedDst.testChar | 300 | 10.204 | 10.082 | 1.012100774
>> ArrayCopyUnalignedDst.testChar | 600 | 16.978 | 17.135 | 0.990837467
>> ArrayCopyUnalignedDst.testChar | 1200 | 32.351 | 31.996 | 1.011095137
>> ArrayCopyUnalignedSrc.testByte | 1 | 5.414 | 2.696 | 2.008160237
>> ArrayCopyUnalignedSrc.testByte | 3 | 5.494 | 2.637 | 2.083428138
>> ArrayCopyUnalignedSrc.testByte | 5 | 5.431 | 2.637 | 2.059537353
>> ArrayCopyUnalignedSrc.testByte | 10 | 5.344 | 2.703 | 1.977062523
>> ArrayCopyUnalignedSrc.testByte | 20 | 5.834 | 2.696 | 2.163946588
>> ArrayCopyUnalignedSrc.testByte | 70 | 5.883 | 6.009 | 0.979031453
>> ArrayCopyUnalignedSrc.testByte | 150 | 6.729 | 6.87 | 0.979475983
>> ArrayCopyUnalignedSrc.testByte | 300 | 7.603 | 7.97 | 0.953952321
>> ArrayCopyUnalignedSrc.testByte | 600 | 12.004 | 12.16 | 0.987171053
>> ArrayCopyUnalignedSrc.testByte | 1200 | 16.534 | 16.643 | 0.9934507
>> ArrayCopyUnalignedSrc.testChar | 1 | 5.021 | 2.762 | 1.81788559
>> ArrayCopyUnalignedSrc.testChar | 3 | 5.13 | 2.762 | 1.857349747
>> ArrayCopyUnalignedSrc.testChar | 5 | 5.042 | 2.827 | 1.783516095
>> ArrayCopyUnalignedSrc.testChar | 10 | 5.726 | 2.761 | 2.073886273
>> ArrayCopyUnalignedSrc.testChar | 20 | 5.112 | 5.401 | 0.94649139
>> ArrayCopyUnalignedSrc.testChar | 70 | 6.113 | 6.227 | 0.981692629
>> ArrayCopyUnalignedSrc.testChar | 150 | 7.493 | 7.888 | 0.949923935
>> ArrayCopyUnalignedSrc.testChar | 300 | 10.234 | 10.501 | 0.97457385
>> ArrayCopyUnalignedSrc.testChar | 600 | 17.175 | 17.142 | 1.001925096
>> ArrayCopyUnalignedSrc.testChar | 1200 | 31.926 | 31.987 | 0.998092975
>>
>> Detailed Reports:
>> Baseline :
>> [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt)
>> WithOpt : >> [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt) > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Replacing explicit type checks with existing type checking routines Changes requested by neliasso (Reviewer). src/hotspot/share/opto/macroArrayCopy.cpp line 859: > 857: #ifdef ASSERT > 858: const TypeOopPtr* dest_t = _igvn.type(dest)->is_oopptr(); > 859: if (dest_t->is_known_instance() && false == is_partial_array_copy) { "false == is_partial_array_copy" change to (!is_partial_array_copy) src/hotspot/share/opto/memnode.hpp line 1188: > 1186: TrailingLoadStore, > 1187: LeadingLoadStore, > 1188: AfterPartialArrayCopy Change to keep consistent with the other names: AfterPartialArrayCopy -> TrailingPartialArrayCopy Why is a special kind needed for partial array copy? ------------- PR: https://git.openjdk.java.net/jdk/pull/302 From psandoz at openjdk.java.net Thu Oct 15 16:02:20 2020 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Thu, 15 Oct 2020 16:02:20 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4] In-Reply-To: <7XjzEn5DggliDrvjhrGwXZL5r4lsqeGF9SGLmRr5L84=.a4481a62-4ecf-4e3f-98f3-70e548c67b52@github.com> References: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> <7-p-Kc9lQyyuoWdNtmgbXbwkxsgk4oQGKmFSCcMpvnU=.97810c01-3200-4767-bbd4-35d53c2bc5ca@github.com> <6Voyfr_s-ieyRA-8Rtvvpz7tkhhicA8sY2d2KTp3Kmw=.fa256bae-2143-4b43-bfea-5837ad31eb6a@github.com> <7XjzEn5DggliDrvjhrGwXZL5r4lsqeGF9SGLmRr5L84=.a4481a62-4ecf-4e3f-98f3-70e548c67b52@github.com> Message-ID: <9y5m4zfsDZMdIZ6CT38BzO0tpFMuFxUswAbpjfDny-w=.44c7ada7-bf77-45f1-b5a6-b542731d6685@github.com> On Wed, 14 Oct 2020 21:29:55 GMT, CoreyAshford wrote: >> Hi Corey, >> >>> Are you thinking of a case where that produces a higher iteration count? 
>> Sorry for the confusion. This is also fine:
>> length = sl - sp - 12
>> i = length / block_size
>> if (i <= 0) return 0
>>
>> But I still wonder why we should use 2 branches. Why not
>> srawi_
>> ble(CCR0, return_zero)
>> ?
>>
>>> Ah, I should have checked the calling conventions. I thought all of the CR* regs are volatile. I will fix that.
>> Actually, we do save and restore all CRs, so it's not a real problem with the current implementation. But I prefer
>> staying closer to the elf ABI as long as there's no good reason to do it differently.
>>> Your original comment said "2nd review", so I thought you meant you need to review it again after the changes.
>> We usually require at least 2 reviews by different people for all non-trivial changes. And I don't consider the PPC64
>> part as trivial. In addition to that, I'm not familiar with Power 10.
>> Best regards,
>> Martin
>>
>>
>> From: CoreyAshford
>> Sent: Tuesday, 13 October 2020 22:59
>> To: openjdk/jdk
>> Cc: Doerr, Martin ; Mention
>> Subject: Re: [openjdk/jdk] 8248188: Add IntrinsicCandidate and API for Base64 decoding (#293)
>>
>>
>> Hi Corey, thanks for taking some stuff out of the "too short" path. There may be a performance regression when decoding
>> many short arrays because of the stub call overhead and the usage of the slower part of the Java implementation. We
>> could do it a little better in many cases to compute the maximum possible iteration count i:
>> i = (sl - sp) / block_size
>> if (i * block_size > sl - 12) i--
>> if (i <= 0) return 0
>> What do you think?
>> Are you thinking of a case where that produces a higher iteration count? It looks effectively the same to me.
>> I don't think branch prediction hints are helpful for the "too short" check.
>> My thinking is that most of the time when the intrinsic is called, it will not take the early exit, but I suppose when
>> it is processing a sub-block_size buffer, it will return early every time. I will remove the hints.
>> And we should better use CCR1 instead of CCR2 which is specified as non-volatile.
>>
>> Ah, I should have checked the calling conventions. I thought all of the CR* regs are volatile. I will fix that.
>>
>> Did you already find a 2nd reviewer for the PPC64 part?
>>
>> Your original comment said "2nd review", so I thought you meant you need to review it again after the changes. So, no,
>> I haven't looked for or found a second reviewer. Any suggestions? The folks on the team here have been busy with other
>> work. Btw, I'm off today, so I will push commits to the above-mentioned issues tomorrow.
>
>> Hi Corey,
>> Are you thinking of a case where that produces a higher iteration count?
>> Sorry for the confusion. This is also fine:
>> length = sl - sp - 12
>> i = length / block_size
>> if (i <= 0) return 0
>> But I still wonder why we should use 2 branches. Why not
>> srawi_
>> ble(CCR0, return_zero)
>> ?
>
> You're right! I originally thought that the `srawi.` was setting only the Zero bit, but it sets others as well.
>
>> Ah, I should have checked the calling conventions. I thought all of the CR* regs are volatile. I will fix that.
>> Actually, we do save and restore all CRs, so it's not a real problem with the current implementation. But I prefer
>> staying closer to the elf ABI as long as there's no good reason to do it differently.
>
> Looks like I don't need that code at all now, but it's good to know for the future; I have an encode intrinsic in the
> works.
>> Your original comment said "2nd review", so I thought you meant you need to review it again after the changes.
>> We usually require at least 2 reviews by different people for all non-trivial changes. And I don't consider the PPC64
>> part as trivial. In addition to that, I'm not familiar with Power 10.
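The early-exit computation from the quoted pseudocode (`length = sl - sp - 12`, `i = length / block_size`, `if (i <= 0) return 0`) can be written as a small standalone function. This is a sketch in C++ rather than the PPC64 stub code; the name `decode_iterations` is hypothetical, the constant 12 is taken as-is from the quoted text, and `block_size` is assumed to be positive.

```cpp
#include <cassert>

// Hypothetical standalone version of the quoted early-exit computation.
// sp and sl are the source position and source limit (byte offsets), and
// block_size is the number of bytes consumed per loop iteration (must be > 0).
// Returns the number of full iterations to run, or 0 if the input is too short.
inline int decode_iterations(int sp, int sl, int block_size) {
  int length = sl - sp - 12;    // constant 12 taken from the quoted pseudocode
  int i = length / block_size;  // C++ integer division truncates toward zero
  return i <= 0 ? 0 : i;        // a single "<= 0" test covers negative and zero
}
```

The single comparison at the end corresponds to the one-branch variant discussed in the thread: a non-positive quotient, whatever its exact value, means the buffer is too short for the loop.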
> > I received permission to request help from the GNU toolchain team here to review it. Due to family issues and work > schedule on my end, it will be at least the middle of next week before I can get a reviewer to have a look. > Thanks for your continued patience and help. Please update [compiler/graalunit/HotspotTest.java](https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/graalunit/HotspotTest.java), and add the intrinsic signature. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From rehn at openjdk.java.net Thu Oct 15 16:13:14 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 15 Oct 2020 16:13:14 GMT Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: <35eLsMpWmcCUoiEWhnYdSpZNmvLy4ra56Qtd6eRW574=.4e7c9278-3e0d-457d-9c15-eef45bae9755@github.com> Message-ID: On Thu, 15 Oct 2020 11:07:20 GMT, Alan Hayward wrote: > > A question, ISB don't flush the I-cache which I thought was needed? > > I would have expected something more similar to gcc clear_cache. > > (Possible I've missed something in your question...) > Any cache flushing would be performed by the thread that modifies the code. > The cross_modify_fence is for the other threads after the code has been modified. The only thing they need to do is an > isb. Correct me if I'm wrong: We write the updated instructions to D-cache which is not coherent with the I-cache. To synchronize the I-cache to what we wrote in the D-cache, we first need make sure the D-cache is committed to memory. Then each processor must invalidate it's I-cache, thus fetching the new instructions from memory. So yes the thread that modifies the code should make sure the D-cache is consistent with main memory. Can the I-cache invalidation be done by the modifying thread in a distributed way ? When we do "fix_oop_relocations" we end up calling: MacroAssembler::patch_oop(..) in aarch64 case. 
To me this indicate that we just changed an oop in a code segment, but I see no signs of above. Maybe I got something wrong about how this I-cache/D-cache works or missed something in our code? I think @theRealAph suggested having a indirect to the oops and thus move them of out code and avoiding I-cache. ------------- PR: https://git.openjdk.java.net/jdk/pull/428 From mcimadamore at openjdk.java.net Thu Oct 15 16:20:25 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 15 Oct 2020 16:20:25 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v2] In-Reply-To: References: Message-ID: <8CkvfiM_PR0iDEo-tgUNRXWlG-qDwEDN2Qd15pbgQ0E=.353d8045-3593-4c72-9d81-c290b43ca89c@github.com> > This patch contains the changes associated with the first incubation round of the foreign linker access API incubation > (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and > associated pull request [3]). > The main goal of this API is to provide a way to call native functions from Java code without the need of intermediate > JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. I suggest reading the > writeup [4] I put together few weeks ago, which illustrates what the foreign linker support is, and how it should be > used by clients. Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews. For this reasons, > I'm attaching a webrev which contains only the differences between this PR and the memory access PR. I will be > periodically uploading new webrevs, as new iterations come out, to try and make the life of reviewers as simple as > possible. A big thank to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the hotspot changes you > see here, and without their help, the foreign linker support wouldn't be what it is today. 
> As usual, a big thanks to Paul Sandoz, who provided many insights (often by trying the bits first hand).
>
> Thanks
> Maurizio
>
> Webrev:
> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev
>
> Javadoc:
> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html
>
> Specdiff (relative to [3]):
> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html
>
> CSR:
> https://bugs.openjdk.java.net/browse/JDK-8254232
>
> ### API Changes
>
> The API changes are actually rather slim:
>
> * `LibraryLookup`
>   * This class allows clients to look up symbols in native libraries; the interface is fairly simple: you can load a
>     library by name, or by absolute path, and then look up symbols in that library.
> * `FunctionDescriptor`
>   * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory
>     layouts for the function arguments/return type. A function descriptor is used to describe the signature of a native
>     function.
> * `CLinker`
>   * This is the real star of the show. A `CLinker` has two main methods: `downcallHandle` and `upcallStub`; the first
>     takes a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a
>     `MethodHandle` instance which can be used to call the target native symbol. The second takes an existing method
>     handle and a `FunctionDescriptor` and returns a new `MemorySegment` corresponding to a code stub allocated by the
>     VM which acts as a trampoline from native code to the user-provided method handle. This is very useful for
>     implementing upcalls.
>   * This class also contains the various layout constants that should be used by clients when describing native
>     signatures (e.g. `C_LONG` and friends); these layouts contain additional ABI classification information (in the
>     form of layout attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the
>     native call to take place.
>   * Finally, this class provides some helper functions, e.g. so that clients can convert Java strings into C strings
>     and back.
> * `NativeScope`
>   * This is a helper class which allows clients to group together logically related allocations; that is, rather than
>     allocating separate memory segments using separate *try-with-resource* constructs, a `NativeScope` allows clients
>     to use a _single_ block, and allocate all the required segments there. This is not only a usability boost, but
>     also a performance boost, since not all allocation requests will be turned into `malloc` calls.
> * `MemorySegment`
>   * Only one method added here, namely `handoff(NativeScope)`, which allows a segment to be transferred onto an
>     existing native scope.
>
> ### Safety
>
> The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle. For
> instance, the description of the native signature might be wrong (e.g. have too many arguments), and the runtime has,
> in the general case, no way to detect such mismatches. For these reasons, obtaining a `CLinker` instance is
> a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as
> is the case for other restricted methods in the foreign memory API).
>
> ### Implementation changes
>
> The Java changes associated with `LibraryLookup` are relatively straightforward; the only interesting thing to note
> here is that library loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same
> restrictions which apply to JNI library loading (e.g. same library cannot be loaded by different classloaders).
>
> As for `NativeScope` the changes are again relatively straightforward; it is an API which sits neatly on top of the
> foreign memory access API, providing some kind of allocation service which shares the same underlying memory
> segment(s), and turns an allocation request into a segment slice, which is a much less expensive operation.
> `NativeScope` comes in two variants: there are native scopes for which the allocation size is known a priori, and
> native scopes which can grow; these two schemes are implemented by two separate subclasses of
> `AbstractNativeScopeImpl`.
>
> Of course the bulk of the changes are to support the `CLinker` downcall/upcall routines. These changes cut pretty
> deep into the JVM; I'll briefly summarize the goal of some of these changes. For further details, Jorn has put
> together a detailed writeup which explains the rationale behind the VM support, with some references to the code [5].
>
> The main idea behind foreign linker is to infer, given a Java method type (expressed as a `MethodType` instance) and
> the description of the signature of a native function (expressed as a `FunctionDescriptor` instance) a _recipe_ that
> can be used to turn a Java call into the corresponding native call targeting the requested native function.
>
> This inference scheme can be defined in a pretty straightforward fashion by looking at the various ABI specifications
> (for instance, see [6] for the SysV ABI, which is the one used on Linux/Mac). The various `CallArranger` classes, of
> which we have a flavor for each supported platform, do exactly that kind of inference.
>
> For the inference process to work, we need to attach extra information to memory layouts; it is no longer sufficient
> to know e.g. that a layout is 32/64 bits: we need to know whether it is meant to represent a floating point value, or
> an integral value; this knowledge is required because floating point values are passed in different registers by most
> ABIs.
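To make the classification point concrete, here is a simplified sketch of how the System V x86-64 convention assigns scalar arguments to locations: up to six integer arguments go in general-purpose registers, up to eight floating-point arguments go in `xmm0` to `xmm7`, and anything left over goes on the stack. This is an illustration in C++, not the `CallArranger` code; `ArgKind` and `assign_locations` are hypothetical names, and structs, varargs and return values are deliberately ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical two-way classification of scalar arguments.
enum class ArgKind { Integer, Float };

// Returns, for each argument, the register name or stack slot it would occupy
// under the System V x86-64 convention for scalar arguments (simplified).
inline std::vector<std::string> assign_locations(const std::vector<ArgKind>& args) {
  static const char* const kIntRegs[] = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"};
  std::vector<std::string> locations;
  std::size_t next_int = 0;    // next free general-purpose argument register
  std::size_t next_float = 0;  // next free xmm argument register
  std::size_t next_stack = 0;  // next free 8-byte stack slot
  for (ArgKind kind : args) {
    if (kind == ArgKind::Integer && next_int < 6) {
      locations.push_back(kIntRegs[next_int++]);
    } else if (kind == ArgKind::Float && next_float < 8) {
      locations.push_back("xmm" + std::to_string(next_float++));
    } else {
      // Out of registers for this class: spill to the stack argument area.
      locations.push_back("stack+" + std::to_string(8 * next_stack++));
    }
  }
  return locations;
}
```

For example, classifying `(long, double, long)` yields `rdi`, `xmm0`, `rsi`: the `double` does not consume an integer register, which is why a layout's size alone is not enough to place an argument.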
> For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which contain the required
> classification attributes (e.g. a `CLinker.TypeKind` enum value). The runtime extracts this attribute, and performs
> classification accordingly.
>
> A native call is decomposed into a sequence of basic, primitive operations, called `Binding` (see the great javadoc
> on the `Binding.java` class for more info). There are many such bindings; for instance the `Move` binding is used to
> move a value into a specific machine register/stack slot. So, the main job of the various `CallArranger` classes is
> to determine, given a Java `MethodType` and `FunctionDescriptor`, what is the set of bindings associated with the
> downcall/upcall.
>
> At the heart of the foreign linker support is the `ProgrammableInvoker` class. This class effectively generates a
> `MethodHandle` which follows the steps described by the various bindings obtained by `CallArranger`. There are
> actually various strategies to interpret these bindings, listed below:
>
> * basic interpreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see
>   `BindingInterpreter`), except for the `Move` bindings. For these bindings, the move is implemented by allocating
>   a *buffer* (whose size is ABI specific) and by moving all the lowered values into positions within this buffer. The
>   buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them in
>   their expected registers/stack slots (note that each position in the buffer corresponds to a different register).
>   This is the most general invocation mode, the more "customizable" one, but also the slowest, since for every call
>   there is some extra allocation which takes place.
>
> * specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based
>   interpreter, we generate a method handle chain which effectively interprets all the bindings (again, except `Move`
>   ones).
>
> * intrinsified mode; this is typically used in combination with the specialized interpreted mode described above
>   (although it can also be used with the Java-based binding interpreter). The goal here is to remove the buffer
>   allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (e.g. the set of
>   bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field),
>   then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go for an
>   intermediate buffer. This gives us back performance that is on par with JNI.
>
> For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there. We plan to
> add support for intrinsified modes there as well, which should considerably boost performance (probably well beyond
> what JNI can offer at the moment, since the upcall support in JNI is not very well optimized).
>
> Again, for further reading on the internals of the foreign linker support, please refer to [5].
>
> #### Test changes
>
> Many new tests have been added to validate the foreign linker support; we have high level tests (see `StdLibTest`)
> which aim at testing the linker from the perspective of code that clients could write. But we also have deeper
> combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI
> implementation. There are also some great tests (see the `callarranger` folder) which test the various
> `CallArranger`s for all the possible platforms; these tests adopt more of a white-box approach: instead of treating
> the linker machinery as a black box and verifying that the support works by checking that the native call returned
> the results we expected, these tests aim at checking that the set of bindings generated by the call arranger is
> correct. This also means that we can test the classification logic for Windows, Mac and Linux regardless of the
> platform we're executing on.
>
> Some additional microbenchmarks have been added to compare the performance of downcall/upcall with JNI.
>
> [1] - https://openjdk.java.net/jeps/389
> [2] - https://openjdk.java.net/jeps/393
> [3] - https://git.openjdk.java.net/jdk/pull/548
> [4] - https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md
> [5] - http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html

Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. The pull request
now contains 19 commits:

 - Merge branch 'master' into 8254231_linker
 - Fix tests
 - Fix more whitespaces
 - Fix whitespaces
 - Remove rejected file
 - More updates
 - Add new files
 - Merge with master
 - Merge branch 'master' into 8254162
 - Remove spurious check on MemoryScope::confineTo
   Added tests to make sure no spurious exception is thrown when:
   * handing off a segment from A to A
   * sharing an already shared segment
 - Merge branch 'master' into 8254162
 - ...
and 9 more: https://git.openjdk.java.net/jdk/compare/3c2f5e08...ad8bee12 ------------- Changes: https://git.openjdk.java.net/jdk/pull/634/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=01 Stats: 84781 lines in 279 files changed: 72658 ins; 10861 del; 1262 mod Patch: https://git.openjdk.java.net/jdk/pull/634.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/634/head:pull/634 PR: https://git.openjdk.java.net/jdk/pull/634 From mcimadamore at openjdk.java.net Thu Oct 15 16:32:03 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 15 Oct 2020 16:32:03 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v3] In-Reply-To: References: Message-ID: <4w2EB316f1EZC9b-zMmHIa1xqQL6Jw0-Vif_CXzkDS4=.16eaa7d6-f59b-4622-8b34-b9df57d63f45@github.com> > This patch contains the changes associated with the first incubation round of the foreign linker access API incubation > (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and > associated pull request [3]). > The main goal of this API is to provide a way to call native functions from Java code without the need of intermediate > JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. I suggest reading the > writeup [4] I put together few weeks ago, which illustrates what the foreign linker support is, and how it should be > used by clients. Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews. For this reasons, > I'm attaching a webrev which contains only the differences between this PR and the memory access PR. I will be > periodically uploading new webrevs, as new iterations come out, to try and make the life of reviewers as simple as > possible. 
A big thank to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the hotspot changes you > see here, and without their help, the foreign linker support wouldn't be what it is today. As usual, a big thank to > Paul Sandoz, who provided many insights (often by trying the bits first hand). Thanks Maurizio > Webrev: > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev > > Javadoc: > > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html > > Specdiff (relative to [3]): > > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254232 > > > > ### API Changes > > The API changes are actually rather slim: > > * `LibraryLookup` > * This class allows clients to lookup symbols in native libraries; the interface is fairly simple; you can load a library > by name, or absolute path, and then lookup symbols on that library. > * `FunctionDescriptor` > * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory > layouts for the function arguments/return type. A function descriptor is used to describe the signature of a native > function. > * `CLinker` > * This is the real star of the show. A `CLinker` has two main methods: `downcallHandle` and `upcallStub`; the first takes > a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a > `MethodHandle` instance which can be used to call the target native symbol. The second takes an existing method handle, > and a `FunctionDescriptor` and returns a new `MemorySegment` corresponding to a code stub allocated by the VM which > acts as a trampoline from native code to the user-provided method handle. This is very useful for implementing upcalls. 
> * This class also contains the various layout constants that should be used by clients when describing native signatures > (e.g. `C_LONG` and friends); these layouts contain additional ABI classfication information (in the form of layout > attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the native call to take > place. > * Finally, this class provides some helper functions e.g. so that clients can convert Java strings into C strings and > back. > * `NativeScope` > * This is an helper class which allows clients to group together logically related allocations; that is, rather than > allocating separate memory segments using separate *try-with-resource* constructs, a `NativeScope` allows clients to > use a _single_ block, and allocate all the required segments there. This is not only an usability boost, but also a > performance boost, since not all allocation requests will be turned into `malloc` calls. > * `MemorySegment` > * Only one method added here - namely `handoff(NativeScope)` which allows a segment to be transferred onto an existing > native scope. > > ### Safety > > The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle. For > instance, the description of the native signature might be wrong (e.g. have too many arguments) - and the runtime has, > in the general case, no way to detect such mismatches. For these reasons, obtaining a `CLinker` instance is > a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as > it's the case for other restricted method in the foreign memory API). ### Implementation changes The Java changes > associated with `LibraryLookup` are relative straightforward; the only interesting thing to note here is that library > loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same restrictions which apply to > JNI library loading (e.g. 
same library cannot be loaded by different classloaders). As for `NativeScope` the changes > are again relatively straightforward; it is an API which sits neatly on top of the foreign meory access API, providing > some kind of allocation service which shares the same underlying memory segment(s), and turns an allocation request > into a segment slice, which is a much less expensive operation. `NativeScope` comes in two variants: there are native > scopes for which the allocation size is known a priori, and native scopes which can grow - these two schemes are > implemented by two separate subclasses of `AbstractNativeScopeImpl`. Of course the bulk of the changes are to support > the `CLinker` downcall/upcall routines. These changes cut pretty deep into the JVM; I'll briefly summarize the goal of > some of this changes - for further details, Jorn has put together a detailed writeup which explains the rationale > behind the VM support, with some references to the code [5]. The main idea behind foreign linker is to infer, given a > Java method type (expressed as a `MethodType` instance) and the description of the signature of a native function > (expressed as a `FunctionDescriptor` instance) a _recipe_ that can be used to turn a Java call into the corresponding > native call targeting the requested native function. This inference scheme can be defined in a pretty straightforward > fashion by looking at the various ABI specifications (for instance, see [6] for the SysV ABI, which is the one used on > Linux/Mac). The various `CallArranger` classes, of which we have a flavor for each supported platform, do exactly that > kind of inference. For the inference process to work, we need to attach extra information to memory layouts; it is no > longer sufficient to know e.g. 
that a layout is 32/64 bits - we need to know whether it is meant to represent a > floating point value, or an integral value; this knowledge is required because floating points are passed in different > registers by most ABIs. For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which > contain the required classification attributes (e.g. a `Clinker.TypeKind` enum value). The runtime extracts this > attribute, and performs classification accordingly. A native call is decomposed into a sequence of basic, primitive > operations, called `Binding` (see the great javadoc on the `Binding.java` class for more info). There are many such > bindings - for instance the `Move` binding is used to move a value into a specific machine register/stack slot. So, the > main job of the various `CallingArranger` classes is to determine, given a Java `MethodType` and `FunctionDescriptor` > what is the set of bindings associated with the downcall/upcall. At the heart of the foreign linker support is the > `ProgrammableInvoker` class. This class effectively generates a `MethodHandle` which follows the steps described by the > various bindings obtained by `CallArranger`. There are actually various strategies to interpret these bindings - listed > below: > * basic intepreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see > `BindingInterpreter`), except for the `Move` bindings. For these bindings, the move is implemented by allocating > a *buffer* (whose size is ABI specific) and by moving all the lowered values into positions within this buffer. The > buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them in > their expected registers/stack slots (note that each position in the buffer corresponds to a different register). 
This > is the most general invocation mode, the more "customizable" one, but also the slowest - since for every call there is > some extra allocation which takes place. > > * specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based interpreter, > we generate a method handle chain which effectively interprets all the bindings (again, except `Move` ones). > > * intrinsified mode; this is typically used in combination with the specialized interpreted mode described above > (although it can also be used with the Java-based binding interpreter). The goal here is to remove the buffer > allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (e.g. the set of > bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field), > then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go for an > intermediate buffer. This gives us back performances that are on par with JNI. > > For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there. We plan to > add support for intrinsified modes there as well, which should considerably boost perfomances (probably well beyond > what JNI can offer at the moment, since the upcall support in JNI is not very well optimized). Again, for more > readings on the internals of the foreign linker support, please refer to [5]. > #### Test changes > > Many new tests have been added to validate the foreign linker support; we have high level tests (see `StdLibTest`) > which aim at testing the linker from the perspective of code that clients could write. But we also have deeper > combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI > implementation. 
There are also some great tests (see the `callarranger` folder) which test the various `CallArranger`s > for all the possible platforms; these tests adopt more of a white-box approach - that is, instead of treating the > linker machinery as a black box and verify that the support works by checking that the native call returned the results > we expected, these tests aims at checking that the set of bindings generated by the call arranger is correct. This also > mean that we can test the classification logic for Windows, Mac and Linux regardless of the platform we're executing > on. Some additional microbenchmarks have been added to compare the performances of downcall/upcall with JNI. [1] - > https://openjdk.java.net/jeps/389 [2] - https://openjdk.java.net/jeps/393 [3] - > https://git.openjdk.java.net/jdk/pull/548 [4] - > https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md [5] - > http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Re-add erroneously removed files ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/634/files - new: https://git.openjdk.java.net/jdk/pull/634/files/ad8bee12..2184831e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=01-02 Stats: 9218 lines in 14 files changed: 9218 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/634.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/634/head:pull/634 PR: https://git.openjdk.java.net/jdk/pull/634 From aph at openjdk.java.net Thu Oct 15 17:01:16 2020 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 15 Oct 2020 17:01:16 GMT Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: References: 
<35eLsMpWmcCUoiEWhnYdSpZNmvLy4ra56Qtd6eRW574=.4e7c9278-3e0d-457d-9c15-eef45bae9755@github.com> Message-ID: <2b2X1iZw_GYV7GXESupACrX-WP976FrhMnr7ETJV_rM=.7a0b1359-1377-4c49-b737-f374caca58cd@github.com> On Thu, 15 Oct 2020 16:10:22 GMT, Robbin Ehn wrote: >>> A question, ISB don't flush the I-cache which I thought was needed? >>> I would have expected something more similar to gcc clear_cache. >> >> (Possible I've missed something in your question...) >> Any cache flushing would be performed by the thread that modifies the code. >> The cross_modify_fence is for the other threads after the code has been modified. The only thing they need to do is an >> isb. > >> > A question, ISB don't flush the I-cache which I thought was needed? >> > I would have expected something more similar to gcc clear_cache. >> >> (Possible I've missed something in your question...) >> Any cache flushing would be performed by the thread that modifies the code. >> The cross_modify_fence is for the other threads after the code has been modified. The only thing they need to do is an >> isb. > > Correct me if I'm wrong: > We write the updated instructions to D-cache which is not coherent with the I-cache. > To synchronize the I-cache to what we wrote in the D-cache, we first need make sure the D-cache is committed to memory. > Then each processor must invalidate it's I-cache, thus fetching the new instructions from memory. > So yes the thread that modifies the code should make sure the D-cache is consistent with main memory. > Can the I-cache invalidation be done by the modifying thread in a distributed way ? > > When we do "fix_oop_relocations" we end up calling: MacroAssembler::patch_oop(..) in aarch64 case. > To me this indicate that we just changed an oop in a code segment, but I see no signs of above. > > Maybe I got something wrong about how this I-cache/D-cache works or missed something in our code? 
>
> I think @theRealAph suggested having an indirection to the oops, thus moving them out of the code and avoiding the I-cache.

On 15/10/2020 17:10, Robbin Ehn wrote:
>
> Correct me if I'm wrong:
> We write the updated instructions to D-cache which is not coherent with the I-cache.
> To synchronize the I-cache to what we wrote in the D-cache, we first need to make sure the D-cache is committed to memory.

Yes.

> Then each processor must invalidate its I-cache, thus fetching the new instructions from memory.
> So yes the thread that modifies the code should make sure the D-cache is consistent with main memory.
> Can the I-cache invalidation be done by the modifying thread in a distributed way?

Yes, it is. In a multiprocessor system, the IC IVAU instruction is broadcast to all cores in the same memory domain.

> When we do "fix_oop_relocations" we end up calling MacroAssembler::patch_oop(..) in the aarch64 case.
> To me this indicates that we just changed an oop in a code segment, but I see no signs of the above.

The call stack looks like this:

#0 ICache::invalidate_range /home/aph/jdk-jdk/src/hotspot/os_cpu/linux_aarch64/icache_linux_aarch64.hpp:40
#1 Relocation::pd_set_data_value /home/aph/jdk-jdk/src/hotspot/cpu/aarch64/relocInfo_aarch64.cpp:58
#2 DataRelocation::set_value /home/aph/jdk-jdk/src/hotspot/share/code/relocInfo.hpp:849
#3 DataRelocation::set_value /home/aph/jdk-jdk/src/hotspot/share/code/relocInfo.hpp:844
#4 oop_Relocation::fix_oop_relocation /home/aph/jdk-jdk/src/hotspot/share/code/relocInfo.cpp:554
#5 nmethod::fix_oop_relocations /home/aph/jdk-jdk/src/hotspot/share/code/nmethod.cpp:1051

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

-------------

PR: https://git.openjdk.java.net/jdk/pull/428

From mcimadamore at openjdk.java.net Thu Oct 15 17:08:28 2020
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Thu, 15 Oct 2020 17:08:28 GMT
Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v4]
In-Reply-To:
References:
Message-ID: <0Zh0H5gSXzvHSstQ2w8NBM-P8yERRPouvhZJDNGvu4A=.6cde913f-7499-4c45-bc63-b717502b661e@github.com>

> This patch contains the changes associated with the first incubation round of the foreign linker API
> (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and
> associated pull request [3]).
> The main goal of this API is to provide a way to call native functions from Java code without the need for intermediate
> JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. I suggest reading the
> writeup [4] I put together a few weeks ago, which illustrates what the foreign linker support is, and how it should be
> used by clients. Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews. For this reason,
> I'm attaching a webrev which contains only the differences between this PR and the memory access PR. I will be
> periodically uploading new webrevs, as new iterations come out, to try and make the life of reviewers as simple as
> possible. A big thanks to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the hotspot changes you
> see here, and without their help, the foreign linker support wouldn't be what it is today. As usual, a big thanks to
> Paul Sandoz, who provided many insights (often by trying the bits first hand).
> Thanks
> Maurizio
>
> Webrev:
> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev
>
> Javadoc:
>
> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html
>
> Specdiff (relative to [3]):
>
> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html
>
> CSR:
>
> https://bugs.openjdk.java.net/browse/JDK-8254232
>
>
>
> ### API Changes
>
> The API changes are actually rather slim:
>
> * `LibraryLookup`
>   * This class allows clients to look up symbols in native libraries; the interface is fairly simple; you can load a library
> by name, or absolute path, and then look up symbols on that library.
> * `FunctionDescriptor`
>   * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory
> layouts for the function arguments/return type. A function descriptor is used to describe the signature of a native
> function.
> * `CLinker`
>   * This is the real star of the show. A `CLinker` has two main methods: `downcallHandle` and `upcallStub`; the first takes
> a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a
> `MethodHandle` instance which can be used to call the target native symbol. The second takes an existing method handle,
> and a `FunctionDescriptor` and returns a new `MemorySegment` corresponding to a code stub allocated by the VM which
> acts as a trampoline from native code to the user-provided method handle. This is very useful for implementing upcalls.
>   * This class also contains the various layout constants that should be used by clients when describing native signatures
> (e.g. `C_LONG` and friends); these layouts contain additional ABI classification information (in the form of layout
> attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the native call to take
> place.
>   * Finally, this class provides some helper functions e.g.
> so that clients can convert Java strings into C strings and
> back.
> * `NativeScope`
>   * This is a helper class which allows clients to group together logically related allocations; that is, rather than
> allocating separate memory segments using separate *try-with-resources* constructs, a `NativeScope` allows clients to
> use a _single_ block, and allocate all the required segments there. This is not only a usability boost, but also a
> performance boost, since not all allocation requests will be turned into `malloc` calls.
> * `MemorySegment`
>   * Only one method added here - namely `handoff(NativeScope)` which allows a segment to be transferred onto an existing
> native scope.
>
> ### Safety
>
> The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle. For
> instance, the description of the native signature might be wrong (e.g. have too many arguments) - and the runtime has,
> in the general case, no way to detect such mismatches. For these reasons, obtaining a `CLinker` instance is
> a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as
> is the case for other restricted methods in the foreign memory API).
>
> ### Implementation changes
>
> The Java changes
> associated with `LibraryLookup` are relatively straightforward; the only interesting thing to note here is that library
> loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same restrictions which apply to
> JNI library loading (e.g. the same library cannot be loaded by different classloaders). As for `NativeScope` the changes
> are again relatively straightforward; it is an API which sits neatly on top of the foreign memory access API, providing
> some kind of allocation service which shares the same underlying memory segment(s), and turns an allocation request
> into a segment slice, which is a much less expensive operation.
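
To make the slicing idea concrete, here is a minimal sketch of a bounded arena whose allocations are slices of one backing buffer. It is not the `AbstractNativeScopeImpl` code: the names `SliceAllocatorSketch`, `BoundedArena`, `allocate` and `used` are hypothetical, and a plain `java.nio.ByteBuffer` stands in for a memory segment. Only the general technique (bump an offset, hand out a view) is meant to carry over.

```java
import java.nio.ByteBuffer;

public class SliceAllocatorSketch {

    /** Hypothetical stand-in for a bounded native scope: one backing buffer, many slices. */
    static final class BoundedArena {
        private final ByteBuffer backing;
        private int offset = 0; // next free byte in the backing buffer

        BoundedArena(int capacity) {
            // The only real allocation; every later request is served from this block.
            this.backing = ByteBuffer.allocateDirect(capacity);
        }

        /** Returns a slice of the backing buffer; alignment must be a power of two. */
        ByteBuffer allocate(int bytes, int alignment) {
            int start = (offset + alignment - 1) & -alignment; // round up to the alignment
            if (start + bytes > backing.capacity()) {
                throw new IllegalStateException("arena exhausted");
            }
            offset = start + bytes;
            ByteBuffer view = backing.duplicate(); // shares memory, own position/limit
            view.position(start);
            view.limit(start + bytes);
            return view.slice();
        }

        int used() {
            return offset;
        }
    }

    public static void main(String[] args) {
        BoundedArena arena = new BoundedArena(64);
        ByteBuffer a = arena.allocate(4, 4); // occupies bytes [0, 4)
        ByteBuffer b = arena.allocate(8, 8); // aligned up, occupies bytes [8, 16)
        System.out.println(a.capacity() + " " + b.capacity() + " " + arena.used());
    }
}
```

A growable variant would, under the same assumptions, allocate a fresh backing block when the current one is exhausted instead of throwing.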
> `NativeScope` comes in two variants: there are native
> scopes for which the allocation size is known a priori, and native scopes which can grow - these two schemes are
> implemented by two separate subclasses of `AbstractNativeScopeImpl`. Of course the bulk of the changes are to support
> the `CLinker` downcall/upcall routines. These changes cut pretty deep into the JVM; I'll briefly summarize the goal of
> some of these changes - for further details, Jorn has put together a detailed writeup which explains the rationale
> behind the VM support, with some references to the code [5]. The main idea behind the foreign linker is to infer, given a
> Java method type (expressed as a `MethodType` instance) and the description of the signature of a native function
> (expressed as a `FunctionDescriptor` instance), a _recipe_ that can be used to turn a Java call into the corresponding
> native call targeting the requested native function. This inference scheme can be defined in a pretty straightforward
> fashion by looking at the various ABI specifications (for instance, see [6] for the SysV ABI, which is the one used on
> Linux/Mac). The various `CallArranger` classes, of which we have a flavor for each supported platform, do exactly that
> kind of inference. For the inference process to work, we need to attach extra information to memory layouts; it is no
> longer sufficient to know e.g. that a layout is 32/64 bits - we need to know whether it is meant to represent a
> floating point value, or an integral value; this knowledge is required because floating point values are passed in different
> registers by most ABIs. For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which
> contain the required classification attributes (e.g. a `CLinker.TypeKind` enum value). The runtime extracts this
> attribute, and performs classification accordingly.
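
As a rough illustration of why the classification attribute matters, the sketch below assigns storage to scalar arguments the way the SysV x86-64 convention does for the simplest case: integer-class arguments take the next free general-purpose register, floating-point arguments take the next free `xmm` register, and anything left over goes to the stack. This is a deliberately simplified model, not the `CallArranger` code: the `Kind` enum and the `assign` method are hypothetical, and structs, varargs and return values are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ClassifierSketch {

    /** Hypothetical classification attribute; not the real CLinker.TypeKind. */
    enum Kind { INTEGER, FLOAT }

    // Argument registers of the SysV x86-64 calling convention.
    private static final String[] INT_REGS = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"};
    private static final String[] FLOAT_REGS =
            {"xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7"};

    /** Maps each argument kind to a register name, or to a stack slot once registers run out. */
    static List<String> assign(List<Kind> args) {
        List<String> storage = new ArrayList<>();
        int nextInt = 0;
        int nextFloat = 0;
        int stackSlot = 0;
        for (Kind kind : args) {
            if (kind == Kind.INTEGER && nextInt < INT_REGS.length) {
                storage.add(INT_REGS[nextInt++]);
            } else if (kind == Kind.FLOAT && nextFloat < FLOAT_REGS.length) {
                storage.add(FLOAT_REGS[nextFloat++]);
            } else {
                // Simplification: one 8-byte slot per spilled scalar argument.
                storage.add("stack+" + (8 * stackSlot++));
            }
        }
        return storage;
    }

    public static void main(String[] args) {
        // Two values of the same size land in different register files depending on their kind.
        System.out.println(assign(Arrays.asList(Kind.INTEGER, Kind.FLOAT, Kind.INTEGER)));
    }
}
```

Without the kind, a 64-bit layout could be either a `long` or a `double`, and the two branches above could not be told apart.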
> A native call is decomposed into a sequence of basic, primitive
> operations, called `Binding` (see the great javadoc on the `Binding.java` class for more info). There are many such
> bindings - for instance the `Move` binding is used to move a value into a specific machine register/stack slot. So, the
> main job of the various `CallArranger` classes is to determine, given a Java `MethodType` and `FunctionDescriptor`,
> what is the set of bindings associated with the downcall/upcall. At the heart of the foreign linker support is the
> `ProgrammableInvoker` class. This class effectively generates a `MethodHandle` which follows the steps described by the
> various bindings obtained by `CallArranger`. There are actually various strategies to interpret these bindings - listed
> below:
> * basic interpreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see
> `BindingInterpreter`), except for the `Move` bindings. For these bindings, the move is implemented by allocating
> a *buffer* (whose size is ABI specific) and by moving all the lowered values into positions within this buffer. The
> buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them into
> their expected registers/stack slots (note that each position in the buffer corresponds to a different register). This
> is the most general invocation mode, the most "customizable" one, but also the slowest - since for every call there is
> some extra allocation which takes place.
>
> * specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based interpreter,
> we generate a method handle chain which effectively interprets all the bindings (again, except `Move` ones).
>
> * intrinsified mode; this is typically used in combination with the specialized interpreted mode described above
> (although it can also be used with the Java-based binding interpreter).
> The goal here is to remove the buffer
> allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (i.e. the set of
> bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field),
> then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go through an
> intermediate buffer. This gives us back performance that is on par with JNI.
>
> For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there. We plan to
> add support for intrinsified modes there as well, which should considerably boost performance (probably well beyond
> what JNI can offer at the moment, since the upcall support in JNI is not very well optimized). Again, for further
> reading on the internals of the foreign linker support, please refer to [5].
>
> #### Test changes
>
> Many new tests have been added to validate the foreign linker support; we have high level tests (see `StdLibTest`)
> which aim at testing the linker from the perspective of code that clients could write. But we also have deeper
> combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI
> implementation. There are also some great tests (see the `callarranger` folder) which test the various `CallArranger`s
> for all the possible platforms; these tests adopt more of a white-box approach - that is, instead of treating the
> linker machinery as a black box and verifying that the support works by checking that the native call returned the results
> we expected, these tests aim at checking that the set of bindings generated by the call arranger is correct. This also
> means that we can test the classification logic for Windows, Mac and Linux regardless of the platform we're executing
> on. Some additional microbenchmarks have been added to compare the performance of downcall/upcall with JNI.
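
Since both the interpreted modes and the white-box tests revolve around "the set of bindings", a toy model may help picture what such a recipe looks like and how a stack-based interpreter walks it. The sketch below is an assumption-laden illustration, not the `BindingInterpreter` or `Binding` code from the patch: the `Binding` interface, the `dup`, `loadField` and `move` factories, and the use of a `long[]` as a two-field struct are all hypothetical. The buffer is modelled as a map keyed by register name, mirroring the remark that each buffer position corresponds to a different register.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BindingInterpreterSketch {

    /** Hypothetical binding: one primitive step over an operand stack and a register buffer. */
    interface Binding {
        void interpret(Deque<Object> stack, Map<String, Object> buffer);
    }

    /** Duplicates the value on top of the operand stack. */
    static Binding dup() {
        return (stack, buffer) -> stack.push(stack.peek());
    }

    /** Pops a struct (modelled as long[]) and pushes one of its fields. */
    static Binding loadField(int index) {
        return (stack, buffer) -> {
            long[] struct = (long[]) stack.pop();
            stack.push(struct[index]);
        };
    }

    /** Pops a value and stores it in the buffer position reserved for the given register. */
    static Binding move(String storage) {
        return (stack, buffer) -> buffer.put(storage, stack.pop());
    }

    /** Runs the recipe for one argument and returns the filled buffer. */
    static Map<String, Object> interpret(Object argument, List<Binding> bindings) {
        Deque<Object> stack = new ArrayDeque<>();
        stack.push(argument);
        Map<String, Object> buffer = new LinkedHashMap<>();
        for (Binding binding : bindings) {
            binding.interpret(stack, buffer);
        }
        return buffer;
    }

    public static void main(String[] args) {
        // Recipe for passing a struct of two longs in two integer registers.
        List<Binding> recipe = Arrays.asList(
                dup(), loadField(0), move("rdi"), loadField(1), move("rsi"));
        System.out.println(interpret(new long[] {1L, 2L}, recipe));
    }
}
```

In this model a white-box test would compare the `recipe` list itself against an expected list, whereas a black-box test would only look at the result of the call.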
[1] - > https://openjdk.java.net/jeps/389 [2] - https://openjdk.java.net/jeps/393 [3] - > https://git.openjdk.java.net/jdk/pull/548 [4] - > https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md [5] - > http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Re-add file erroneously deleted (detected as rename) ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/634/files - new: https://git.openjdk.java.net/jdk/pull/634/files/2184831e..830c5cea Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=02-03 Stats: 35 lines in 1 file changed: 35 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/634.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/634/head:pull/634 PR: https://git.openjdk.java.net/jdk/pull/634 From smonteith at openjdk.java.net Thu Oct 15 17:28:23 2020 From: smonteith at openjdk.java.net (Stuart Monteith) Date: Thu, 15 Oct 2020 17:28:23 GMT Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build [v3] In-Reply-To: References: Message-ID: On Thu, 15 Oct 2020 08:57:27 GMT, Bernhard Urban-Forster wrote: >> I organized this PR so that each commit contains the warning emitted by MSVC as commit message and its relevant fix. >> >> Verified on >> * Linux+ARM64: `{hotspot,jdk,langtools}:tier1`, no failures. >> * Windows+ARM64: `{hotspot,jdk,langtools}:tier1`, no (new) failures. >> * internal macOS+ARM64 port: build without `--disable-warnings-as-errors` still works. Just mentioning this here, because >> it's yet another toolchain (Xcode / clang) that needs to be kept happy [going >> forward](https://openjdk.java.net/jeps/391). 
> > Bernhard Urban-Forster has updated the pull request with a new target base due to a merge or a rebase. The pull request > now contains 20 commits: > - disable warning only for hotspot > - Merge remote-tracking branch 'upstream/master' into 8254072-fix-windows-arm64-warnings > - Merge remote-tracking branch 'upstream/master' into 8254072-fix-windows-arm64-warnings > - ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): warning C4267: 'argument': conversion from 'size_t' to > 'int', possible loss of data > ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1446): warning C4267: 'argument': conversion from 'size_t' to > 'int', possible loss of data ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1654): warning C4267: 'argument': > conversion from 'size_t' to 'int', possible loss of data > - Revert changes for "warning C4146: unary minus operator applied to unsigned type, result still unsigned" > - msvc: disable unary minus warning for unsigned types > - ./src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp(1123): warning C4267: 'initializing': conversion > from 'size_t' to 'int', possible loss of data > ./src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp(1123): warning C4267: 'initializing': conversion > from 'size_t' to 'const int', possible loss of data > - ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1312): warning C4267: 'argument': conversion from 'size_t' to > 'unsigned int', possible loss of data > ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1370): warning C4267: 'argument': conversion from 'size_t' to > 'int', possible loss of data ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): warning C4146: unary minus > operator applied to unsigned type, result still unsigned ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): > warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data > - ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp(2472): warning C4312: 'type 
cast': conversion from 'unsigned int'
> to 'address' of greater size
> - ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp(1527): warning C4267: 'argument': conversion from 'size_t' to
> 'int', possible loss of data
> - ... and 10 more: https://git.openjdk.java.net/jdk/compare/9359ff03...32e922da

src/hotspot/cpu/aarch64/nativeInst_aarch64.hpp line 658:

> 656: }
> 657: }
> 658: size_t size_in_bytes() { return 1ull << size(); }

Capital ULL - I find that easier to search for and it is more consistent.

-------------

PR: https://git.openjdk.java.net/jdk/pull/530

From github.com+51754783+coreyashford at openjdk.java.net Thu Oct 15 18:01:18 2020
From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford)
Date: Thu, 15 Oct 2020 18:01:18 GMT
Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4]
In-Reply-To: <9y5m4zfsDZMdIZ6CT38BzO0tpFMuFxUswAbpjfDny-w=.44c7ada7-bf77-45f1-b5a6-b542731d6685@github.com>
References: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> <7-p-Kc9lQyyuoWdNtmgbXbwkxsgk4oQGKmFSCcMpvnU=.97810c01-3200-4767-bbd4-35d53c2bc5ca@github.com> <6Voyfr_s-ieyRA-8Rtvvpz7tkhhicA8sY2d2KTp3Kmw=.fa256bae-2143-4b43-bfea-5837ad31eb6a@github.com> <7XjzEn5DggliDrvjhrGwXZL5r4lsqeGF9SGLmRr5L84=.a4481a62-4ecf-4e3f-98f3-70e548c67b52@github.com> <9y5m4zfsDZMdIZ6CT38BzO0tpFMuFxUswAbpjfDny-w=.44c7ada7-bf77-45f1-b5a6-b542731d6685@github.com>
Message-ID:

On Thu, 15 Oct 2020 15:59:23 GMT, Paul Sandoz wrote:

> Please update
> [compiler/graalunit/HotspotTest.java](https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/graalunit/HotspotTest.java),
> and add the intrinsic signature.

It looks like that is auto-generated, but I will figure out what to modify so that the signature is added.
------------- PR: https://git.openjdk.java.net/jdk/pull/293 From jvernee at openjdk.java.net Thu Oct 15 18:24:15 2020 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Thu, 15 Oct 2020 18:24:15 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) In-Reply-To: References: Message-ID: <9f-iShxyaRbjn74wlAqBPH3ycRWisVwlWuB81Kv8Q6A=.be903417-dda9-434a-96a4-d9ea6a1e1737@github.com> On Tue, 13 Oct 2020 13:08:14 GMT, Maurizio Cimadamore wrote: > This patch contains the changes associated with the first incubation round of the foreign linker access API incubation > (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and > associated pull request [3]). > The main goal of this API is to provide a way to call native functions from Java code without the need of intermediate > JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. I suggest reading the > writeup [4] I put together few weeks ago, which illustrates what the foreign linker support is, and how it should be > used by clients. Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews. For this reasons, > I'm attaching a webrev which contains only the differences between this PR and the memory access PR. I will be > periodically uploading new webrevs, as new iterations come out, to try and make the life of reviewers as simple as > possible. A big thank to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the hotspot changes you > see here, and without their help, the foreign linker support wouldn't be what it is today. As usual, a big thank to > Paul Sandoz, who provided many insights (often by trying the bits first hand). 
Thanks Maurizio > Webrev: > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev > > Javadoc: > > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html > > Specdiff (relative to [3]): > > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254232 > > > > ### API Changes > > The API changes are actually rather slim: > > * `LibraryLookup` > * This class allows clients to lookup symbols in native libraries; the interface is fairly simple; you can load a library > by name, or absolute path, and then lookup symbols on that library. > * `FunctionDescriptor` > * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory > layouts for the function arguments/return type. A function descriptor is used to describe the signature of a native > function. > * `CLinker` > * This is the real star of the show. A `CLinker` has two main methods: `downcallHandle` and `upcallStub`; the first takes > a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a > `MethodHandle` instance which can be used to call the target native symbol. The second takes an existing method handle, > and a `FunctionDescriptor` and returns a new `MemorySegment` corresponding to a code stub allocated by the VM which > acts as a trampoline from native code to the user-provided method handle. This is very useful for implementing upcalls. > * This class also contains the various layout constants that should be used by clients when describing native signatures > (e.g. `C_LONG` and friends); these layouts contain additional ABI classfication information (in the form of layout > attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the native call to take > place. > * Finally, this class provides some helper functions e.g. 
so that clients can convert Java strings into C strings and > back. > * `NativeScope` > * This is an helper class which allows clients to group together logically related allocations; that is, rather than > allocating separate memory segments using separate *try-with-resource* constructs, a `NativeScope` allows clients to > use a _single_ block, and allocate all the required segments there. This is not only an usability boost, but also a > performance boost, since not all allocation requests will be turned into `malloc` calls. > * `MemorySegment` > * Only one method added here - namely `handoff(NativeScope)` which allows a segment to be transferred onto an existing > native scope. > > ### Safety > > The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle. For > instance, the description of the native signature might be wrong (e.g. have too many arguments) - and the runtime has, > in the general case, no way to detect such mismatches. For these reasons, obtaining a `CLinker` instance is > a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as > it's the case for other restricted method in the foreign memory API). ### Implementation changes The Java changes > associated with `LibraryLookup` are relative straightforward; the only interesting thing to note here is that library > loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same restrictions which apply to > JNI library loading (e.g. same library cannot be loaded by different classloaders). As for `NativeScope` the changes > are again relatively straightforward; it is an API which sits neatly on top of the foreign meory access API, providing > some kind of allocation service which shares the same underlying memory segment(s), and turns an allocation request > into a segment slice, which is a much less expensive operation. 
`NativeScope` comes in two variants: there are native > scopes for which the allocation size is known a priori, and native scopes which can grow - these two schemes are > implemented by two separate subclasses of `AbstractNativeScopeImpl`. Of course the bulk of the changes are to support > the `CLinker` downcall/upcall routines. These changes cut pretty deep into the JVM; I'll briefly summarize the goal of > some of this changes - for further details, Jorn has put together a detailed writeup which explains the rationale > behind the VM support, with some references to the code [5]. The main idea behind foreign linker is to infer, given a > Java method type (expressed as a `MethodType` instance) and the description of the signature of a native function > (expressed as a `FunctionDescriptor` instance) a _recipe_ that can be used to turn a Java call into the corresponding > native call targeting the requested native function. This inference scheme can be defined in a pretty straightforward > fashion by looking at the various ABI specifications (for instance, see [6] for the SysV ABI, which is the one used on > Linux/Mac). The various `CallArranger` classes, of which we have a flavor for each supported platform, do exactly that > kind of inference. For the inference process to work, we need to attach extra information to memory layouts; it is no > longer sufficient to know e.g. that a layout is 32/64 bits - we need to know whether it is meant to represent a > floating point value, or an integral value; this knowledge is required because floating points are passed in different > registers by most ABIs. For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which > contain the required classification attributes (e.g. a `Clinker.TypeKind` enum value). The runtime extracts this > attribute, and performs classification accordingly. 
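As a rough illustration of why the classification attribute matters, here is a minimal C++ sketch of scalar argument assignment under the System V x86-64 convention, where integer and pointer arguments go to `rdi`, `rsi`, `rdx`, `rcx`, `r8`, `r9` and floating-point arguments go to `xmm0` through `xmm7`. The `TypeKind` enum and `assign_registers` function are hypothetical names invented for this sketch; they are not the linker's `CallArranger` code, and structs, varargs and other ABI details are deliberately ignored.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a layout's classification attribute.
enum class TypeKind { Integer, Float };

// Assigns each scalar argument to a register name, or to "stack" once the
// registers of its class are exhausted (System V x86-64 rule for scalars).
std::vector<std::string> assign_registers(const std::vector<TypeKind>& args) {
  static const char* const int_regs[] = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"};
  const std::size_t num_int_regs = 6;
  const std::size_t num_sse_regs = 8;
  std::size_t next_int = 0;
  std::size_t next_sse = 0;
  std::vector<std::string> assignment;
  for (TypeKind kind : args) {
    if (kind == TypeKind::Integer && next_int < num_int_regs) {
      assignment.push_back(int_regs[next_int++]);
    } else if (kind == TypeKind::Float && next_sse < num_sse_regs) {
      assignment.push_back("xmm" + std::to_string(next_sse++));
    } else {
      assignment.push_back("stack");
    }
  }
  return assignment;
}
```

Two arguments of the same bit width can therefore end up in different register files, which is why a layout's size alone is not enough to drive the inference.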
A native call is decomposed into a sequence of basic, primitive > operations, called `Binding` (see the great javadoc on the `Binding.java` class for more info). There are many such > bindings - for instance the `Move` binding is used to move a value into a specific machine register/stack slot. So, the > main job of the various `CallingArranger` classes is to determine, given a Java `MethodType` and `FunctionDescriptor` > what is the set of bindings associated with the downcall/upcall. At the heart of the foreign linker support is the > `ProgrammableInvoker` class. This class effectively generates a `MethodHandle` which follows the steps described by the > various bindings obtained by `CallArranger`. There are actually various strategies to interpret these bindings - listed > below: > * basic intepreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see > `BindingInterpreter`), except for the `Move` bindings. For these bindings, the move is implemented by allocating > a *buffer* (whose size is ABI specific) and by moving all the lowered values into positions within this buffer. The > buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them in > their expected registers/stack slots (note that each position in the buffer corresponds to a different register). This > is the most general invocation mode, the more "customizable" one, but also the slowest - since for every call there is > some extra allocation which takes place. > > * specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based interpreter, > we generate a method handle chain which effectively interprets all the bindings (again, except `Move` ones). > > * intrinsified mode; this is typically used in combination with the specialized interpreted mode described above > (although it can also be used with the Java-based binding interpreter). 
The goal here is to remove the buffer
> allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (e.g. the set of
> bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field),
> then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go for an
> intermediate buffer. This gives us back performance that is on par with JNI.
>
> For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there. We plan to
> add support for intrinsified modes there as well, which should considerably boost performance (probably well beyond
> what JNI can offer at the moment, since the upcall support in JNI is not very well optimized). Again, for more
> reading on the internals of the foreign linker support, please refer to [5].
>
> #### Test changes
>
> Many new tests have been added to validate the foreign linker support; we have high level tests (see `StdLibTest`)
> which aim at testing the linker from the perspective of code that clients could write. But we also have deeper
> combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI
> implementation. There are also some great tests (see the `callarranger` folder) which test the various `CallArranger`s
> for all the possible platforms; these tests adopt more of a white-box approach - that is, instead of treating the
> linker machinery as a black box and verifying that the support works by checking that the native call returned the results
> we expected, these tests aim at checking that the set of bindings generated by the call arranger is correct. This also
> means that we can test the classification logic for Windows, Mac and Linux regardless of the platform we're executing
> on. Some additional microbenchmarks have been added to compare the performance of downcall/upcall with JNI.
[1] - > https://openjdk.java.net/jeps/389 [2] - https://openjdk.java.net/jeps/393 [3] - > https://git.openjdk.java.net/jdk/pull/548 [4] - > https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md [5] - > http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html Here is the updated partial webrev against https://github.com/openjdk/jdk/pull/548 : http://cr.openjdk.java.net/~jvernee/linker_rfr/v2/ ------------- PR: https://git.openjdk.java.net/jdk/pull/634 From burban at openjdk.java.net Thu Oct 15 18:35:30 2020 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Thu, 15 Oct 2020 18:35:30 GMT Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build [v4] In-Reply-To: References: Message-ID: > I organized this PR so that each commit contains the warning emitted by MSVC as commit message and its relevant fix. > > Verified on > * Linux+ARM64: `{hotspot,jdk,langtools}:tier1`, no failures. > * Windows+ARM64: `{hotspot,jdk,langtools}:tier1`, no (new) failures. > * internal macOS+ARM64 port: build without `--disable-warnings-as-errors` still works. Just mentioning this here, because > it's yet another toolchain (Xcode / clang) that needs to be kept happy [going > forward](https://openjdk.java.net/jeps/391). 
Bernhard Urban-Forster has updated the pull request incrementally with two additional commits since the last revision: - uppercase suffix - add assert ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/530/files - new: https://git.openjdk.java.net/jdk/pull/530/files/32e922da..901bbd48 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=530&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=530&range=02-03 Stats: 2 lines in 2 files changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/530.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/530/head:pull/530 PR: https://git.openjdk.java.net/jdk/pull/530 From burban at openjdk.java.net Thu Oct 15 18:42:16 2020 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Thu, 15 Oct 2020 18:42:16 GMT Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build [v3] In-Reply-To: References: Message-ID: <5rYFvJf-Fyz8v_n1_91FKnd_dnMxP3ta7GqYAHA8QGI=.61870dce-015f-4940-b540-35bcd0890f81@github.com> On Thu, 15 Oct 2020 17:24:56 GMT, Stuart Monteith wrote: >> Bernhard Urban-Forster has updated the pull request with a new target base due to a merge or a rebase. 
The pull request >> now contains 20 commits: >> - disable warning only for hotspot >> - Merge remote-tracking branch 'upstream/master' into 8254072-fix-windows-arm64-warnings >> - Merge remote-tracking branch 'upstream/master' into 8254072-fix-windows-arm64-warnings >> - ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): warning C4267: 'argument': conversion from 'size_t' to >> 'int', possible loss of data >> ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1446): warning C4267: 'argument': conversion from 'size_t' to >> 'int', possible loss of data ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1654): warning C4267: 'argument': >> conversion from 'size_t' to 'int', possible loss of data >> - Revert changes for "warning C4146: unary minus operator applied to unsigned type, result still unsigned" >> - msvc: disable unary minus warning for unsigned types >> - ./src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp(1123): warning C4267: 'initializing': conversion >> from 'size_t' to 'int', possible loss of data >> ./src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp(1123): warning C4267: 'initializing': conversion >> from 'size_t' to 'const int', possible loss of data >> - ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1312): warning C4267: 'argument': conversion from 'size_t' to >> 'unsigned int', possible loss of data >> ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1370): warning C4267: 'argument': conversion from 'size_t' to >> 'int', possible loss of data ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): warning C4146: unary minus >> operator applied to unsigned type, result still unsigned ./src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp(1441): >> warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data >> - ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp(2472): warning C4312: 'type cast': conversion from 'unsigned int' >> to 'address' of greater size >> - 
./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp(1527): warning C4267: 'argument': conversion from 'size_t' to
>> 'int', possible loss of data
>> - ... and 10 more: https://git.openjdk.java.net/jdk/compare/9359ff03...32e922da
>
> src/hotspot/cpu/aarch64/nativeInst_aarch64.hpp line 658:
>
>> 656: }
>> 657: }
>> 658: size_t size_in_bytes() { return 1ull << size(); }
>
> Capital ULL - I find that easier to search for and it is more consistent.

Thank you! Fixed.

-------------

PR: https://git.openjdk.java.net/jdk/pull/530

From burban at openjdk.java.net Thu Oct 15 18:42:10 2020
From: burban at openjdk.java.net (Bernhard Urban-Forster)
Date: Thu, 15 Oct 2020 18:42:10 GMT
Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build [v2]
In-Reply-To:
References: <5UFJWe28-kzXCq0aP2gC4AA_JTLety2CjKFDLI-rtkA=.72f68b0f-4347-486c-9039-534ba569f34c@github.com>
Message-ID:

On Thu, 15 Oct 2020 09:57:14 GMT, Andrew Haley wrote:

> Fine, but please assert JavaThread::stack_shadow_zone_size() == (int)JavaThread::stack_shadow_zone_size().

Done.

> Adding casts to shut up compilers is a very risky business, because often (if not in this case) the programmer doesn't
> understand the code well, and sprinkles casts everywhere. But casts disable compile-time type checking, and this leads
> to risks for future maintainability.

Full ACK and I appreciate your comments on this!

> I wonder if we should fix it in a better way, and use something like
> this in the future for all compiler warnings:
>
> ```
> template <typename T1, typename T2>
> T1 checked_cast(T2 thing) {
>   T1 result = static_cast<T1>(thing);
>   guarantee(static_cast<T2>(result) == thing, "must be");
>   return result;
> }
> ```
>
> I know this is additional work, but I promise we'll never need to have this conversation again.

This sounds like a great idea to me. I assume it doesn't fit into the scope of this PR, therefore I've created [JDK-8254856](https://bugs.openjdk.java.net/browse/JDK-8254856) to track it.
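For readers who want to try the `checked_cast` idea suggested above outside the VM, here is a minimal stand-alone sketch. It assumes the two template parameters are `T1` (target type) and `T2` (source type). HotSpot's `guarantee()` is not available outside the VM, so this version throws `std::range_error` instead; that substitution, and the `fits_in_int` helper, are assumptions made for illustration only.

```cpp
#include <cstdint>
#include <stdexcept>

// Sketch of a narrowing cast that verifies the value survives the round trip.
// Throwing replaces HotSpot's guarantee(), which only exists inside the VM.
template <typename T1, typename T2>
T1 checked_cast(T2 thing) {
  T1 result = static_cast<T1>(thing);
  // If converting back does not reproduce the input, bits were lost.
  if (static_cast<T2>(result) != thing) {
    throw std::range_error("checked_cast: value does not fit in target type");
  }
  return result;
}

// Hypothetical helper: reports whether a 64-bit unsigned value fits in an int.
inline bool fits_in_int(std::uint64_t value) {
  try {
    (void)checked_cast<int>(value);
    return true;
  } catch (const std::range_error&) {
    return false;
  }
}
```

One caveat with a pure round-trip check: it detects lost bits but not every sign change. For example, the maximum 64-bit unsigned value converts to `-1` as an `int` and converts back to the same unsigned value, so a production version would probably want an additional sign or range check.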
-------------

PR: https://git.openjdk.java.net/jdk/pull/530

From kvn at openjdk.java.net Thu Oct 15 18:59:11 2020
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Thu, 15 Oct 2020 18:59:11 GMT
Subject: RFR: 8254793: [JVMCI] improve speculation encoding [v3]
In-Reply-To: <1I4nYb122752fKhe92W8XvGpo3BtTJ3LxUoK-oH2hus=.82161ac3-f1b9-4997-a2a3-5517eda94a45@github.com>
References: <1I4nYb122752fKhe92W8XvGpo3BtTJ3LxUoK-oH2hus=.82161ac3-f1b9-4997-a2a3-5517eda94a45@github.com>
Message-ID:

On Thu, 15 Oct 2020 09:30:21 GMT, Doug Simon wrote:

>> This PR changes the encoding of a `jdk.vm.ci.hotspot.HotSpotSpeculationLog.HotSpotSpeculation` from a long to an int.
>> The `Thread::_pending_failed_speculation` field remains as a `jlong` since it is already exposed to JVMCI Java code
>> via VMStructs and this PR does not update its usage in Graal.
>
> Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental
> views will show differences compared to the previous content of the PR.

src/hotspot/share/runtime/thread.hpp line 1173:

> 1171: // JVMCI as a jlong, it needs to be kept as a long to maintain backwards compatibility
> 1172: // with JVMCI based compilers that emit code to update the field directly.
> 1173: jlong _pending_failed_speculation;

I am confused about the backward compatibility comment. It says that old Graal (linked in the current JDK) generates code which writes 64 bits into this word. Will it use the [32:32] index:length format, or will it use the new [0:27:5] format? I don't see changes to Graal in this PR.
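To make the two notations in the question concrete, here is a small C++ sketch of one possible reading of them. It assumes "[32:32] index:length" means a 64-bit word with a 32-bit index in the high half and a 32-bit length in the low half, and "[0:27:5]" means a 32-bit value with a 27-bit index and a 5-bit length whose upper 32 bits are zero when stored in a 64-bit field. These layouts and all function names are assumptions inferred only from the notation in this thread; the actual JVMCI encoding may differ.

```cpp
#include <cstdint>

// Assumed width of the length field in the compact form (the "5" in [0:27:5]).
constexpr unsigned kLengthBits = 5;
constexpr std::uint32_t kLengthMask = (1u << kLengthBits) - 1;

// Assumed wide form: index in the high 32 bits, length in the low 32 bits.
inline std::uint64_t encode_wide(std::uint32_t index, std::uint32_t length) {
  return (static_cast<std::uint64_t>(index) << 32) | length;
}

// Assumed compact form: index in the upper 27 bits, length in the low 5 bits.
// The caller must ensure index < 2^27 and length < 2^5.
inline std::uint32_t encode_compact(std::uint32_t index, std::uint32_t length) {
  return (index << kLengthBits) | (length & kLengthMask);
}

inline std::uint32_t compact_index(std::uint32_t encoded) {
  return encoded >> kLengthBits;
}

inline std::uint32_t compact_length(std::uint32_t encoded) {
  return encoded & kLengthMask;
}
```

Under these assumptions, a compact value widened into a 64-bit field always has zero upper bits, whereas a wide value with a non-zero index does not, so a reader of the field would need to know which form the writer used.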
------------- PR: https://git.openjdk.java.net/jdk/pull/667 From coleenp at openjdk.java.net Thu Oct 15 23:18:14 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 15 Oct 2020 23:18:14 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v4] In-Reply-To: <0Zh0H5gSXzvHSstQ2w8NBM-P8yERRPouvhZJDNGvu4A=.6cde913f-7499-4c45-bc63-b717502b661e@github.com> References: <0Zh0H5gSXzvHSstQ2w8NBM-P8yERRPouvhZJDNGvu4A=.6cde913f-7499-4c45-bc63-b717502b661e@github.com> Message-ID: <2moJ2056gzwWoleYccv21TpFYQHw5h9bA-IZCImplhs=.763198bf-06b0-4589-b01e-217ba84af94a@github.com> On Thu, 15 Oct 2020 17:08:28 GMT, Maurizio Cimadamore wrote: >> This patch contains the changes associated with the first incubation round of the foreign linker access API incubation >> (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and >> associated pull request [3]). >> The main goal of this API is to provide a way to call native functions from Java code without the need of intermediate >> JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. I suggest reading the >> writeup [4] I put together few weeks ago, which illustrates what the foreign linker support is, and how it should be >> used by clients. Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews. For this reasons, >> I'm attaching a webrev which contains only the differences between this PR and the memory access PR. I will be >> periodically uploading new webrevs, as new iterations come out, to try and make the life of reviewers as simple as >> possible. A big thank to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the hotspot changes you >> see here, and without their help, the foreign linker support wouldn't be what it is today. As usual, a big thank to >> Paul Sandoz, who provided many insights (often by trying the bits first hand). 
Thanks Maurizio >> Webrev: >> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev >> >> Javadoc: >> >> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html >> >> Specdiff (relative to [3]): >> >> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html >> >> CSR: >> >> https://bugs.openjdk.java.net/browse/JDK-8254232 >> >> >> >> ### API Changes >> >> The API changes are actually rather slim: >> >> * `LibraryLookup` >> * This class allows clients to lookup symbols in native libraries; the interface is fairly simple; you can load a library >> by name, or absolute path, and then lookup symbols on that library. >> * `FunctionDescriptor` >> * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory >> layouts for the function arguments/return type. A function descriptor is used to describe the signature of a native >> function. >> * `CLinker` >> * This is the real star of the show. A `CLinker` has two main methods: `downcallHandle` and `upcallStub`; the first takes >> a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a >> `MethodHandle` instance which can be used to call the target native symbol. The second takes an existing method handle, >> and a `FunctionDescriptor` and returns a new `MemorySegment` corresponding to a code stub allocated by the VM which >> acts as a trampoline from native code to the user-provided method handle. This is very useful for implementing upcalls. >> * This class also contains the various layout constants that should be used by clients when describing native signatures >> (e.g. `C_LONG` and friends); these layouts contain additional ABI classfication information (in the form of layout >> attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the native call to take >> place. 
>> * Finally, this class provides some helper functions e.g. so that clients can convert Java strings into C strings and >> back. >> * `NativeScope` >> * This is an helper class which allows clients to group together logically related allocations; that is, rather than >> allocating separate memory segments using separate *try-with-resource* constructs, a `NativeScope` allows clients to >> use a _single_ block, and allocate all the required segments there. This is not only an usability boost, but also a >> performance boost, since not all allocation requests will be turned into `malloc` calls. >> * `MemorySegment` >> * Only one method added here - namely `handoff(NativeScope)` which allows a segment to be transferred onto an existing >> native scope. >> >> ### Safety >> >> The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle. For >> instance, the description of the native signature might be wrong (e.g. have too many arguments) - and the runtime has, >> in the general case, no way to detect such mismatches. For these reasons, obtaining a `CLinker` instance is >> a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as >> it's the case for other restricted method in the foreign memory API). ### Implementation changes The Java changes >> associated with `LibraryLookup` are relative straightforward; the only interesting thing to note here is that library >> loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same restrictions which apply to >> JNI library loading (e.g. same library cannot be loaded by different classloaders). 
As for `NativeScope` the changes >> are again relatively straightforward; it is an API which sits neatly on top of the foreign meory access API, providing >> some kind of allocation service which shares the same underlying memory segment(s), and turns an allocation request >> into a segment slice, which is a much less expensive operation. `NativeScope` comes in two variants: there are native >> scopes for which the allocation size is known a priori, and native scopes which can grow - these two schemes are >> implemented by two separate subclasses of `AbstractNativeScopeImpl`. Of course the bulk of the changes are to support >> the `CLinker` downcall/upcall routines. These changes cut pretty deep into the JVM; I'll briefly summarize the goal of >> some of this changes - for further details, Jorn has put together a detailed writeup which explains the rationale >> behind the VM support, with some references to the code [5]. The main idea behind foreign linker is to infer, given a >> Java method type (expressed as a `MethodType` instance) and the description of the signature of a native function >> (expressed as a `FunctionDescriptor` instance) a _recipe_ that can be used to turn a Java call into the corresponding >> native call targeting the requested native function. This inference scheme can be defined in a pretty straightforward >> fashion by looking at the various ABI specifications (for instance, see [6] for the SysV ABI, which is the one used on >> Linux/Mac). The various `CallArranger` classes, of which we have a flavor for each supported platform, do exactly that >> kind of inference. For the inference process to work, we need to attach extra information to memory layouts; it is no >> longer sufficient to know e.g. that a layout is 32/64 bits - we need to know whether it is meant to represent a >> floating point value, or an integral value; this knowledge is required because floating points are passed in different >> registers by most ABIs. 
For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which >> contain the required classification attributes (e.g. a `Clinker.TypeKind` enum value). The runtime extracts this >> attribute, and performs classification accordingly. A native call is decomposed into a sequence of basic, primitive >> operations, called `Binding` (see the great javadoc on the `Binding.java` class for more info). There are many such >> bindings - for instance the `Move` binding is used to move a value into a specific machine register/stack slot. So, the >> main job of the various `CallingArranger` classes is to determine, given a Java `MethodType` and `FunctionDescriptor` >> what is the set of bindings associated with the downcall/upcall. At the heart of the foreign linker support is the >> `ProgrammableInvoker` class. This class effectively generates a `MethodHandle` which follows the steps described by the >> various bindings obtained by `CallArranger`. There are actually various strategies to interpret these bindings - listed >> below: >> * basic intepreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see >> `BindingInterpreter`), except for the `Move` bindings. For these bindings, the move is implemented by allocating >> a *buffer* (whose size is ABI specific) and by moving all the lowered values into positions within this buffer. The >> buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them in >> their expected registers/stack slots (note that each position in the buffer corresponds to a different register). This >> is the most general invocation mode, the more "customizable" one, but also the slowest - since for every call there is >> some extra allocation which takes place. 
>> >> * specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based interpreter, >> we generate a method handle chain which effectively interprets all the bindings (again, except `Move` ones). >> >> * intrinsified mode; this is typically used in combination with the specialized interpreted mode described above >> (although it can also be used with the Java-based binding interpreter). The goal here is to remove the buffer >> allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (e.g. the set of >> bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field), >> then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go for an >> intermediate buffer. This gives us back performance that is on par with JNI. >> >> For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there. We plan to >> add support for intrinsified modes there as well, which should considerably boost performance (probably well beyond >> what JNI can offer at the moment, since the upcall support in JNI is not very well optimized). Again, for further >> reading on the internals of the foreign linker support, please refer to [5]. >> #### Test changes >> >> Many new tests have been added to validate the foreign linker support; we have high level tests (see `StdLibTest`) >> which aim at testing the linker from the perspective of code that clients could write. But we also have deeper >> combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI >> implementation.
There are also some great tests (see the `callarranger` folder) which test the various `CallArranger`s >> for all the possible platforms; these tests adopt more of a white-box approach - that is, instead of treating the >> linker machinery as a black box and verifying that the support works by checking that the native call returned the results >> we expected, these tests aim at checking that the set of bindings generated by the call arranger is correct. This also >> means that we can test the classification logic for Windows, Mac and Linux regardless of the platform we're executing >> on. Some additional microbenchmarks have been added to compare the performance of downcall/upcall with JNI. [1] - >> https://openjdk.java.net/jeps/389 [2] - https://openjdk.java.net/jeps/393 [3] - >> https://git.openjdk.java.net/jdk/pull/548 [4] - >> https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md [5] - >> http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html > > Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: > > Re-add file erroneously deleted (detected as rename) I looked through some Hotspot runtime code and that looks ok. I saw a couple of strange things on my way through the code. See comments. src/hotspot/cpu/x86/foreign_globals_x86.cpp line 2: > 1: /* > 2: * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved. Copyright should be 2020. All the new files should have 2020 as the copyright, a bunch don't. src/hotspot/cpu/x86/foreign_globals_x86.cpp line 56: > 54: } > 55: > 56: const ABIDescriptor parseABIDescriptor(JNIEnv* env, jobject jabi) { I don't know if you care about performance but of these env->calls transition into the VM and back out again. You should prefix all the code that comes from java to native with JNI_ENTRY and just use native JVM code to implement these.
src/hotspot/cpu/x86/foreign_globals_x86.hpp line 32: > 30: #define __ _masm-> > 31: > 32: struct VectorRegister { Why are these structs and not classes? src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp line 3885: > 3883: > 3884: __ flush(); > 3885: } I think as a future RFE we should refactor this function and generate_native_wrapper since they're similar (this is nicer to read). If I can remove is_critical_native code they will be more similar. ------------- Changes requested by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/634 From rehn at openjdk.java.net Fri Oct 16 06:51:14 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Fri, 16 Oct 2020 06:51:14 GMT Subject: RFR: 8221554: aarch64 cross-modifying code In-Reply-To: <2b2X1iZw_GYV7GXESupACrX-WP976FrhMnr7ETJV_rM=.7a0b1359-1377-4c49-b737-f374caca58cd@github.com> References: <35eLsMpWmcCUoiEWhnYdSpZNmvLy4ra56Qtd6eRW574=.4e7c9278-3e0d-457d-9c15-eef45bae9755@github.com> <2b2X1iZw_GYV7GXESupACrX-WP976FrhMnr7ETJV_rM=.7a0b1359-1377-4c49-b737-f374caca58cd@github.com> Message-ID: On Thu, 15 Oct 2020 16:58:33 GMT, Andrew Haley wrote: > #0 ICache::invalidate_range Thanks for the pointer! ------------- PR: https://git.openjdk.java.net/jdk/pull/428 From pliden at openjdk.java.net Fri Oct 16 09:33:18 2020 From: pliden at openjdk.java.net (Per Liden) Date: Fri, 16 Oct 2020 09:33:18 GMT Subject: RFR: 8254878: Move last piece of ZArray to GrowableArray Message-ID: ZArray used to be a separate implementation of a dynamically allocated/growable array. It now instead inherits from GrowableCHeapArray, and extends it with a transfer() function. I propose we rename this function to swap() and move it to GrowableArrayWithAllocator, since this function could be generally useful. It would also mean that ZArray could be just a typedef/using of GrowableCHeapArray. 
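[Editorial illustration, not part of the original message.] The proposed `swap()` can be pictured with a minimal, self-contained sketch. `TinyGrowableArray` below is a hypothetical stand-in written for this note; it is not HotSpot's `GrowableArrayWithAllocator`, and its member names are assumptions. The point is only that exchanging the backing pointer, length and capacity moves all elements between two arrays in constant time, without copying them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <type_traits>
#include <utility>

// Hypothetical C-heap-backed growable array (NOT the HotSpot class).
// Restricted to trivially copyable element types because storage comes
// from malloc and elements are assigned without construction.
template <typename E>
class TinyGrowableArray {
  static_assert(std::is_trivially_copyable<E>::value,
                "sketch only supports trivially copyable elements");

  E*     _data = nullptr;  // backing storage on the C heap
  size_t _len  = 0;        // number of elements in use
  size_t _max  = 0;        // capacity of _data

  // Double the capacity (starting at 4) and copy existing elements over.
  void grow() {
    size_t new_max = (_max == 0) ? 4 : _max * 2;
    E* new_data = static_cast<E*>(std::malloc(new_max * sizeof(E)));
    assert(new_data != nullptr);
    for (size_t i = 0; i < _len; i++) {
      new_data[i] = _data[i];
    }
    std::free(_data);
    _data = new_data;
    _max  = new_max;
  }

public:
  TinyGrowableArray() = default;
  TinyGrowableArray(const TinyGrowableArray&) = delete;
  TinyGrowableArray& operator=(const TinyGrowableArray&) = delete;
  ~TinyGrowableArray() { std::free(_data); }

  void append(const E& e) {
    if (_len == _max) {
      grow();
    }
    _data[_len++] = e;
  }

  size_t length() const { return _len; }

  const E& at(size_t i) const {
    assert(i < _len);
    return _data[i];
  }

  // Exchange storage, length and capacity with 'other' in O(1).
  void swap(TinyGrowableArray& other) {
    std::swap(_data, other._data);
    std::swap(_len,  other._len);
    std::swap(_max,  other._max);
  }
};
```

With such a member available on the base class, a "transfer" of all elements into an empty array is just a swap with that empty array, which is presumably why the more general name was suggested.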
------------- Commit messages: - 8254878: Move last piece of ZArray to GrowableArray Changes: https://git.openjdk.java.net/jdk/pull/694/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=694&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254878 Stats: 32 lines in 4 files changed: 6 ins; 23 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/694.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/694/head:pull/694 PR: https://git.openjdk.java.net/jdk/pull/694 From jvernee at openjdk.java.net Fri Oct 16 10:01:11 2020 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Fri, 16 Oct 2020 10:01:11 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v4] In-Reply-To: <2moJ2056gzwWoleYccv21TpFYQHw5h9bA-IZCImplhs=.763198bf-06b0-4589-b01e-217ba84af94a@github.com> References: <0Zh0H5gSXzvHSstQ2w8NBM-P8yERRPouvhZJDNGvu4A=.6cde913f-7499-4c45-bc63-b717502b661e@github.com> <2moJ2056gzwWoleYccv21TpFYQHw5h9bA-IZCImplhs=.763198bf-06b0-4589-b01e-217ba84af94a@github.com> Message-ID: On Thu, 15 Oct 2020 22:39:50 GMT, Coleen Phillimore wrote: >> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: >> >> Re-add file erroneously deleted (detected as rename) > > src/hotspot/cpu/x86/foreign_globals_x86.cpp line 2: > >> 1: /* >> 2: * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved. > > Copyright should be 2020. All the new files should have 2020 as the copyright, a bunch don't. Ok, will go and check them. FWIW, this file was added back in 2018 in the panama repo. But, I suppose it is considered new here? 
------------- PR: https://git.openjdk.java.net/jdk/pull/634 From mcimadamore at openjdk.java.net Fri Oct 16 10:54:21 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Fri, 16 Oct 2020 10:54:21 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v12] In-Reply-To: References: Message-ID: > This patch contains the changes associated with the third incubation round of the foreign memory access API incubation > (see JEP 393 [1]). This iteration focuses on improving the usability of the API in 3 main ways: > * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from > multiple threads > * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee > that the memory will be deallocated, eventually > * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class > has been added, which defines several useful dereference routines; these are really just thin wrappers around memory > access var handles, but they make the barrier of entry for using this API somewhat lower. > > A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not > the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link > to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit > of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which > wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as > dereferencing memory is concerned. You cannot dereference memory if you don't have a segment.
This improves usability > in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; > secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can > use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done > by calling `MemoryAddress::asSegmentRestricted`). A list of the API, implementation and test changes is provided > below. If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be > happy to point at existing discussions, and/or to provide the feedback required. A big thank you to Erik Osterlund, > Vladimir Ivanov and David Holmes, without whom the work on shared memory segment would not have been possible; also I'd > like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. Thanks Maurizio > Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html > Specdiff: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254163 > > > > ### API Changes > > * `MemorySegment` > * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) > * added a no-arg factory for a native restricted segment representing entire native heap > * rename `withOwnerThread` to `handoff` > * add new `share` method, to create shared segments > * add new `registerCleaner` method, to register a segment against a cleaner > * add more helpers to create arrays from a segment e.g.
`toIntArray` > * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) > * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) > * `MemoryAddress` > * drop `segment` accessor > * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative > to a given segment > * `MemoryAccess` > * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a > carrier is one of the usual suspects (a Java primitive, minus `boolean`); the access mode can be simple (e.g. access > base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte > offset, or a high level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. > `getByteAtOffset` vs `getByteAtIndex`). > * `MemoryHandles` > * drop `withOffset` combinator > * drop `withStride` combinator > * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which > it is easy to derive all the other handles using plain var handle combinators. > * `Addressable` > * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both > `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients > can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. > * `MemoryLayouts` > * A new layout, for machine addresses, has been added to the mix. > > > > ### Implementation changes > > There are two main things to discuss here: support for shared segments, and the general simplification of the memory > access var handle support. > #### Shared segments > > The support for shared segments cuts in pretty deep in the VM.
Support for shared segments is notoriously hard to > achieve, at least in a way that guarantees optimal access performance. This is caused by the fact that, if a segment > is shared, it would be possible for a thread to close it while another is accessing it. After considering several > options (see [3]), we zeroed in on an approach which is inspired by a happy idea that Andrew Haley had (and that he > reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world > (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently with a > close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and > the access itself (otherwise it would be possible for the accessing thread to stop just right before an unsafe call). > It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. Sadly, none of > these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, > we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check > whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of > stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). The question is, then, > once we detect that a thread is accessing the very segment we're about to close, what should happen? We first > experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it > fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the > machinery for async exceptions is a bit fragile (e.g.
not all the code in hotspot checks for async exceptions); to > minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread > is accessing the segment being closed. As written in the javadoc, this doesn't mean that clients should just catch and > try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should > be treated as such. In terms of gritty implementation, we needed to centralize memory access routines in a single > place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in > addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` > annotation, which tells the VM that something important is going on. To achieve this, we created a new (autogenerated) > class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, > like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is > tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during > access (which is important when registering segments against cleaners). Of course, to make memory access safe, memory > access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead > of unsafe, so that a liveness check can be triggered (in case a scope is present). `ScopedMemoryAccess` has a > `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed > successfully. 
The implementation of `MemoryScope` (now significantly simplified from what we had before), has two > implementations, one for confined segments and one for shared segments; the main difference between the two is what > happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared > segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or > `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` > state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. #### > Memory access var handles overhaul The key realization here was that if all memory access var handles took a > coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle > form. This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var > handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that > e.g. additional offset is injected into a base memory access var handle. This also helped in simplifying the > implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level > access on the innards of the memory access var handle. All that code is now gone. #### Test changes Not much to see > here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, > since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` > functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the > microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared > segment case. 
[1] - https://openjdk.java.net/jeps/393 [2] - https://openjdk.java.net/jeps/389 [3] - > https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html [4] - https://openjdk.java.net/jeps/312 Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: - Back-port of TestByteBuffer fix - Merge branch 'master' into 8254162 - Merge branch 'master' into 8254162 - Merge branch 'master' into 8254162 - Remove spurious check on MemoryScope::confineTo Added tests to make sure no spurious exception is thrown when: * handing off a segment from A to A * sharing an already shared segment - Merge branch 'master' into 8254162 - Simplify example in the toplevel javadoc - Tweak support for mapped memory segments - Tweak referenced to MemoryAddressProxy in Utils.java - Fix performance issue with "small" segment mismatch - ... and 5 more: https://git.openjdk.java.net/jdk/compare/1742c44a...6091ed0f ------------- Changes: https://git.openjdk.java.net/jdk/pull/548/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=11 Stats: 8110 lines in 79 files changed: 5403 ins; 1530 del; 1177 mod Patch: https://git.openjdk.java.net/jdk/pull/548.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548 PR: https://git.openjdk.java.net/jdk/pull/548 From jvernee at openjdk.java.net Fri Oct 16 10:57:12 2020 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Fri, 16 Oct 2020 10:57:12 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v4] In-Reply-To: <2moJ2056gzwWoleYccv21TpFYQHw5h9bA-IZCImplhs=.763198bf-06b0-4589-b01e-217ba84af94a@github.com> References: <0Zh0H5gSXzvHSstQ2w8NBM-P8yERRPouvhZJDNGvu4A=.6cde913f-7499-4c45-bc63-b717502b661e@github.com> <2moJ2056gzwWoleYccv21TpFYQHw5h9bA-IZCImplhs=.763198bf-06b0-4589-b01e-217ba84af94a@github.com> Message-ID: <_goqdHgVLP0vMwrSHTTdfsHui0Etdcl5rBGB_8ksII8=.8f2fdc3f-7f67-4434-9096-69b9a64b50d9@github.com> 
On Thu, 15 Oct 2020 22:44:54 GMT, Coleen Phillimore wrote: >> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: >> >> Re-add file erroneously deleted (detected as rename) > > src/hotspot/cpu/x86/foreign_globals_x86.hpp line 32: > >> 30: #define __ _masm-> >> 31: >> 32: struct VectorRegister { > > Why are these structs and not classes? The fields are meant to be accessed directly, so I went with `struct` since the members default to public. ------------- PR: https://git.openjdk.java.net/jdk/pull/634 From jvernee at openjdk.java.net Fri Oct 16 11:01:14 2020 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Fri, 16 Oct 2020 11:01:14 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v4] In-Reply-To: <2moJ2056gzwWoleYccv21TpFYQHw5h9bA-IZCImplhs=.763198bf-06b0-4589-b01e-217ba84af94a@github.com> References: <0Zh0H5gSXzvHSstQ2w8NBM-P8yERRPouvhZJDNGvu4A=.6cde913f-7499-4c45-bc63-b717502b661e@github.com> <2moJ2056gzwWoleYccv21TpFYQHw5h9bA-IZCImplhs=.763198bf-06b0-4589-b01e-217ba84af94a@github.com> Message-ID: On Thu, 15 Oct 2020 22:52:14 GMT, Coleen Phillimore wrote: >> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: >> >> Re-add file erroneously deleted (detected as rename) > > src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp line 3885: > >> 3883: >> 3884: __ flush(); >> 3885: } > > I think as a future RFE we should refactor this function and generate_native_wrapper since they're similar (this is > nicer to read). If I can remove is_critical_native code they will be more similar. Yes, I've had similar thoughts as well. This is meant to be temporary code anyway.
------------- PR: https://git.openjdk.java.net/jdk/pull/634 From jvernee at openjdk.java.net Fri Oct 16 11:15:11 2020 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Fri, 16 Oct 2020 11:15:11 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v4] In-Reply-To: <2moJ2056gzwWoleYccv21TpFYQHw5h9bA-IZCImplhs=.763198bf-06b0-4589-b01e-217ba84af94a@github.com> References: <0Zh0H5gSXzvHSstQ2w8NBM-P8yERRPouvhZJDNGvu4A=.6cde913f-7499-4c45-bc63-b717502b661e@github.com> <2moJ2056gzwWoleYccv21TpFYQHw5h9bA-IZCImplhs=.763198bf-06b0-4589-b01e-217ba84af94a@github.com> Message-ID: On Thu, 15 Oct 2020 22:42:49 GMT, Coleen Phillimore wrote: >> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: >> >> Re-add file erroneously deleted (detected as rename) > > src/hotspot/cpu/x86/foreign_globals_x86.cpp line 56: > >> 54: } >> 55: >> 56: const ABIDescriptor parseABIDescriptor(JNIEnv* env, jobject jabi) { > > I don't know if you care about performance but of these env->calls transition into the VM and back out again. You > should prefix all the code that comes from java to native with JNI_ENTRY and just use native JVM code to implement > these. Currently this is prefixed with `JVM_ENTRY` e.g. like:
JVM_ENTRY(jlong, PI_generateAdapter(JNIEnv* env, jclass _unused, jobject abi, jobject layout))
{
  ThreadToNativeFromVM ttnfvm(thread);
  return ProgrammableInvoker::generate_adapter(env, abi, layout);
}
JVM_END
(where `generate_adapter` ends up calling `parseABIDescriptor`) JVM_ENTRY seems to be mostly the same except for JNI_ENTRY having a `WeakPreserveExceptionMark` as well. Do we need to switch these? Also, I guess if we want to use VM code directly, we should get rid of the `ThreadToNativeFromVM` RAII handle.
------------- PR: https://git.openjdk.java.net/jdk/pull/634 From rrich at openjdk.java.net Fri Oct 16 12:08:27 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Fri, 16 Oct 2020 12:08:27 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v11] In-Reply-To: References: Message-ID: > Hi, > > this is the continuation of the review of the implementation for: > > https://bugs.openjdk.java.net/browse/JDK-8227745 > https://bugs.openjdk.java.net/browse/JDK-8233915 > > It allows for JIT optimizations based on escape analysis even if JVMTI agents acquire capabilities to access references > to objects that are subject to such optimizations, e.g. scalar replacement. The implementation reverts such > optimizations just before access very much as when switching from JIT compiled execution to the interpreter, aka > "deoptimization". Webrev.8 was the last one before the transition to Git/GitHub: > > http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ > > Thanks, Richard. Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 26 commits: - Removed unused parameter from EscapeBarrierSuspendHandshake. - Adaptations to JDK-8254263: Remove special_runtime_exit_condition() check from ~ThreadInVMForHandshake() With JDK-8254263 the special_runtime_exit_condition() check was removed from ~ThreadInVMForHandshake() because now a thread never becomes unsafe when processing its own handshakes. EscapeBarrier uses handshakes to sync with the target thread for object deoptimization so we add a check for object deoptimization to ThreadSafepointState::handle_polling_page_exception(). In JavaThread::wait_for_object_deoptimization() we must check is_obj_deopt_suspend() again after handshake/safepoint processing because a handshake for obj. deopt suspend could have been processed.
- Adaptions to lazy/concurrent thread stack processing for ZGC (JEP 376) - EATests.java improvements - Merge branch 'master' into JDK-8227745 - The constructor of StackFrameStream takes more parameters after JDK-8253180 - Merge branch 'master' into JDK-8227745 - Merge branch 'master' into JDK-8227745 - Merge branch 'master' into JDK-8227745 - Factorized fragment out of EscapeBarrier::deoptimize_objects_internal into new method in compiledVFrame. - ... and 16 more: https://git.openjdk.java.net/jdk/compare/9359ff03...f02f07b6 ------------- Changes: https://git.openjdk.java.net/jdk/pull/119/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=119&range=10 Stats: 5863 lines in 53 files changed: 5645 ins; 116 del; 102 mod Patch: https://git.openjdk.java.net/jdk/pull/119.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/119/head:pull/119 PR: https://git.openjdk.java.net/jdk/pull/119 From coleen.phillimore at oracle.com Fri Oct 16 12:09:10 2020 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Fri, 16 Oct 2020 08:09:10 -0400 Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v4] In-Reply-To: References: <0Zh0H5gSXzvHSstQ2w8NBM-P8yERRPouvhZJDNGvu4A=.6cde913f-7499-4c45-bc63-b717502b661e@github.com> <2moJ2056gzwWoleYccv21TpFYQHw5h9bA-IZCImplhs=.763198bf-06b0-4589-b01e-217ba84af94a@github.com> Message-ID: On 10/16/20 7:15 AM, Jorn Vernee wrote: > On Thu, 15 Oct 2020 22:42:49 GMT, Coleen Phillimore wrote: > >>> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: >>> >>> Re-add file erroneously deleted (detected as rename) >> src/hotspot/cpu/x86/foreign_globals_x86.cpp line 56: >> >>> 54: } >>> 55: >>> 56: const ABIDescriptor parseABIDescriptor(JNIEnv* env, jobject jabi) { >> I don't know if you care about performance but of these env->calls transition into the VM and back out again. 
You >> should prefix all the code that comes from java to native with JNI_ENTRY and just use native JVM code to implement >> these. > Currently this is prefixed with `JVM_ENTRY` e.g. like: > JVM_ENTRY(jlong, PI_generateAdapter(JNIEnv* env, jclass _unused, jobject abi, jobject layout)) > { > ThreadToNativeFromVM ttnfvm(thread); > return ProgrammableInvoker::generate_adapter(env, abi, layout); > } > JVM_END > (where `generate_adapter` ends up calling `parseABIDescriptor`) > > JVM_ENTRY seems to be mostly the same except for JNI_ENTRY having a `WeakPreserveExceptionMark` as well. Do we need > to switch these? Also, I guess if we want to use VM code directly, we should get rid of the `ThreadToNativeFromVM` RAII > handle. Yes, that would be so much nicer. Coleen > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/634 From mbaesken at openjdk.java.net Fri Oct 16 12:34:15 2020 From: mbaesken at openjdk.java.net (Matthias Baesken) Date: Fri, 16 Oct 2020 12:34:15 GMT Subject: RFR: JDK-8254889: name_and_sig_as_C_string usages in frame coding without ResourceMark Message-ID: Hello, seems we have some usages of name_and_sig_as_C_string() in frame related HS coding without using a ResourceMark. Please review.
Thanks, Matthias ------------- Commit messages: - JDK-8254889 Changes: https://git.openjdk.java.net/jdk/pull/698/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=698&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254889 Stats: 2 lines in 2 files changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/698.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/698/head:pull/698 PR: https://git.openjdk.java.net/jdk/pull/698 From stuefe at openjdk.java.net Fri Oct 16 12:53:11 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 16 Oct 2020 12:53:11 GMT Subject: RFR: JDK-8254889: name_and_sig_as_C_string usages in frame coding without ResourceMark In-Reply-To: References: Message-ID: On Fri, 16 Oct 2020 12:24:43 GMT, Matthias Baesken wrote: > Hello, seems we have some usages of name_and_sig_as_C_string() in frame related HS coding without using a ResourceMark. > Please review. > Thanks, Matthias Changes requested by stuefe (Reviewer). src/hotspot/share/runtime/frame.cpp line 1145: > 1143: #ifndef PRODUCT > 1144: void frame::describe(FrameValues& values, int frame_no) { > 1145: ResourceMark rm; Not sure this works. 
RA allocated memory escapes the function in FrameValues::describe() (via FrameValue::description):
void FrameValues::describe(int owner, intptr_t* location, const char* description, int priority) {
  FrameValue fv;
  fv.location = location;
  fv.owner = owner;
  fv.priority = priority;
  fv.description = NEW_RESOURCE_ARRAY(char, strlen(description) + 1);
  strcpy(fv.description, description);
  _values.append(fv);
}
------------- PR: https://git.openjdk.java.net/jdk/pull/698 From mbaesken at openjdk.java.net Fri Oct 16 13:18:09 2020 From: mbaesken at openjdk.java.net (Matthias Baesken) Date: Fri, 16 Oct 2020 13:18:09 GMT Subject: RFR: JDK-8254889: name_and_sig_as_C_string usages in frame coding without ResourceMark In-Reply-To: References: Message-ID: On Fri, 16 Oct 2020 12:46:32 GMT, Thomas Stuefe wrote: >> Hello, seems we have some usages of name_and_sig_as_C_string() in frame related HS coding without using a ResourceMark. >> Please review. >> Thanks, Matthias > > src/hotspot/share/runtime/frame.cpp line 1145: > >> 1143: #ifndef PRODUCT >> 1144: void frame::describe(FrameValues& values, int frame_no) { >> 1145: ResourceMark rm; > > Not sure this works. RA allocated memory escapes the function in FrameValues::describe() (via FrameValue::description): > > void FrameValues::describe(int owner, intptr_t* location, const char* description, int priority) { > FrameValue fv; > fv.location = location; > fv.owner = owner; > fv.priority = priority; > fv.description = NEW_RESOURCE_ARRAY(char, strlen(description) + 1); > strcpy(fv.description, description); > _values.append(fv); > } Okay, thanks, we do indeed have to take escaping memory into consideration. I think I should omit the frame_zero.cpp change. In frame.cpp I could use a local buffer (any idea how large?) and the name_and_sig_as_C_string version taking a pre-allocated buffer: name_and_sig_as_C_string(buffer, buf_len). Then there is no need to add the ResourceMark, correct?
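[Editorial illustration, not part of the original message.] The hazard discussed above, and the local-buffer alternative, can be sketched with a toy model. Everything below is hypothetical: `ToyArena`, `ToyMark` and `ToyFrameValues` are simplified stand-ins invented for this note, not HotSpot's resource area, `ResourceMark` or `FrameValues`. The sketch only assumes that a mark rolls the arena back to its saved position on scope exit, so a copy made under an inner mark is reclaimed while the collection still points at it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Toy bump allocator standing in for a thread's resource area (hypothetical).
struct ToyArena {
  char   storage[256];
  size_t top = 0;

  char* alloc(size_t n) {
    assert(top + n <= sizeof(storage));
    char* p = storage + top;
    top += n;
    return p;
  }
};

// Toy counterpart of a mark: remembers the arena top and rolls back on exit.
struct ToyMark {
  ToyArena& arena;
  size_t    saved_top;
  explicit ToyMark(ToyArena& a) : arena(a), saved_top(a.top) {}
  ~ToyMark() { arena.top = saved_top; }
};

// Copy a string into the arena.
const char* arena_strdup(ToyArena& arena, const char* s) {
  char* copy = arena.alloc(std::strlen(s) + 1);
  std::strcpy(copy, s);
  return copy;
}

// Toy collection that keeps an arena-allocated copy of each description.
struct ToyFrameValues {
  const char* description = nullptr;
  void describe(ToyArena& arena, const char* d) {
    description = arena_strdup(arena, d);
  }
};

// Hazard: both the temporary name and the stored copy are allocated under
// the inner mark, so the stored copy is reclaimed when this function returns.
void describe_with_inner_mark(ToyArena& arena, ToyFrameValues& values) {
  ToyMark mark(arena);
  const char* name = arena_strdup(arena, "Foo.bar(I)V");  // temporary name
  values.describe(arena, name);
}

// Alternative: build the temporary in a caller-owned stack buffer. No inner
// mark is needed, and the stored copy lives as long as the caller's scope.
void describe_with_local_buffer(ToyArena& arena, ToyFrameValues& values) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%s", "Foo.bar(I)V");
  values.describe(arena, buf);
}
```

In the first variant a later arena allocation can overwrite the bytes that `description` still points to; in the second, only the long-lived copy ever touches the arena, which matches the idea of using the buffer-taking overload without adding a mark.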
------------- PR: https://git.openjdk.java.net/jdk/pull/698 From mbaesken at openjdk.java.net Fri Oct 16 13:52:24 2020 From: mbaesken at openjdk.java.net (Matthias Baesken) Date: Fri, 16 Oct 2020 13:52:24 GMT Subject: RFR: JDK-8254889: name_and_sig_as_C_string usages in frame coding without ResourceMark [v2] In-Reply-To: References: Message-ID: > Hello, seems we have some usages of name_and_sig_as_C_string() in frame related HS coding without using a ResourceMark. > Please review. > Thanks, Matthias Matthias Baesken has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: JDK-8254889 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/698/files - new: https://git.openjdk.java.net/jdk/pull/698/files/a7b27146..6a13e8e2 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=698&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=698&range=00-01 Stats: 8 lines in 2 files changed: 3 ins; 1 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/698.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/698/head:pull/698 PR: https://git.openjdk.java.net/jdk/pull/698 From neliasso at openjdk.java.net Fri Oct 16 15:02:17 2020 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Fri, 16 Oct 2020 15:02:17 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v3] In-Reply-To: <94qadtiTzSkdsJAc_8IWrLxpBvmfiBXMf_W9Z965P80=.9a59a5db-2209-4007-94bb-16ccd8ff0b77@github.com> References: <94qadtiTzSkdsJAc_8IWrLxpBvmfiBXMf_W9Z965P80=.9a59a5db-2209-4007-94bb-16ccd8ff0b77@github.com> Message-ID: On Tue, 13 Oct 2020 18:03:27 GMT, Jatin Bhateja wrote: >> Summary: >> >> 1) Partial in-lining technique avoids call overhead penalty for small array copy operations with size less than 32 >> 
bytes. 2) At runtime, a conditional check based on copy length either calls an array-copy stub or executes an optimized >> instruction sequence using AVX-512 masked instructions emitted at the call site. 3) New runtime flag >> ArrayCopyPartialInlineSize=0/32(default)/64 bytes determines the maximum size for partial in-lining. 4) Based on the >> perf results seen in benchmarks currently partial in-lining is performed only for arraycopy involving sub-word types >> (bool/byte/char/short). Once PR-61 gets integrated we can extend this patch to cover all the primitive types. >> Performance Results: >> System : CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz >> Micros : test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java >> ArrayCopyPartialInlineSize : 32 >> >> JMH | Block Size | Baseline (ns/op) | Partial Inling (ns/op) | Gain >> -- | -- | -- | -- | -- >> ArrayCopyAligned.testByte | 1 | 5.417 | 2.696 | 2.009272997 >> ArrayCopyAligned.testByte | 3 | 5.494 | 2.702 | 2.03330866 >> ArrayCopyAligned.testByte | 5 | 5.417 | 2.637 | 2.05422829 >> ArrayCopyAligned.testByte | 10 | 5.343 | 2.703 | 1.976692564 >> ArrayCopyAligned.testByte | 20 | 5.837 | 2.636 | 2.214339909 >> ArrayCopyAligned.testByte | 70 | 5.86 | 6 | 0.976666667 >> ArrayCopyAligned.testByte | 150 | 6.766 | 6.906 | 0.979727773 >> ArrayCopyAligned.testByte | 300 | 7.605 | 7.952 | 0.956363179 >> ArrayCopyAligned.testByte | 600 | 11.989 | 12.007 | 0.998500874 >> ArrayCopyAligned.testByte | 1200 | 16.447 | 16.585 | 0.991679228 >> ArrayCopyAligned.testChar | 1 | 5.02 | 2.828 | 1.775106082 >> ArrayCopyAligned.testChar | 3 | 5.129 | 2.762 | 1.85698769 >> ArrayCopyAligned.testChar | 5 | 5.041 | 2.762 | 1.82512672 >> ArrayCopyAligned.testChar | 10 | 5.716 | 2.762 | 2.069514844 >> ArrayCopyAligned.testChar | 20 | 5.111 | 5.399 | 0.946656788 >> ArrayCopyAligned.testChar | 70 | 6.271 | 6.242 | 1.004645947 >> ArrayCopyAligned.testChar | 150 | 7.45 | 7.599 | 0.980392157 >> ArrayCopyAligned.testChar | 300 | 
9.904 | 10.112 | 0.97943038 >> ArrayCopyAligned.testChar | 600 | 17.131 | 17.167 | 0.997902953 >> ArrayCopyAligned.testChar | 1200 | 29.556 | 29.851 | 0.990117584 >> ArrayCopyUnalignedBoth.testByte | 1 | 5.419 | 2.702 | 2.005551443 >> ArrayCopyUnalignedBoth.testByte | 3 | 5.558 | 2.636 | 2.108497724 >> ArrayCopyUnalignedBoth.testByte | 5 | 5.43 | 2.636 | 2.059939302 >> ArrayCopyUnalignedBoth.testByte | 10 | 5.378 | 2.637 | 2.039438756 >> ArrayCopyUnalignedBoth.testByte | 20 | 5.914 | 2.636 | 2.243550836 >> ArrayCopyUnalignedBoth.testByte | 70 | 5.882 | 5.954 | 0.987907289 >> ArrayCopyUnalignedBoth.testByte | 150 | 6.784 | 6.88 | 0.986046512 >> ArrayCopyUnalignedBoth.testByte | 300 | 7.635 | 7.968 | 0.958207831 >> ArrayCopyUnalignedBoth.testByte | 600 | 12.226 | 12.129 | 1.007997362 >> ArrayCopyUnalignedBoth.testByte | 1200 | 16.992 | 20.717 | 0.820195974 >> ArrayCopyUnalignedBoth.testChar | 1 | 5.019 | 2.828 | 1.774752475 >> ArrayCopyUnalignedBoth.testChar | 3 | 5.163 | 2.763 | 1.868621064 >> ArrayCopyUnalignedBoth.testChar | 5 | 5.042 | 2.827 | 1.783516095 >> ArrayCopyUnalignedBoth.testChar | 10 | 5.718 | 2.828 | 2.021923621 >> ArrayCopyUnalignedBoth.testChar | 20 | 5.111 | 5.404 | 0.945780903 >> ArrayCopyUnalignedBoth.testChar | 70 | 6.367 | 6.235 | 1.02117081 >> ArrayCopyUnalignedBoth.testChar | 150 | 7.367 | 8.269 | 0.890917886 >> ArrayCopyUnalignedBoth.testChar | 300 | 10.358 | 10.642 | 0.973313287 >> ArrayCopyUnalignedBoth.testChar | 600 | 20.84 | 17.522 | 1.189361945 >> ArrayCopyUnalignedBoth.testChar | 1200 | 31.895 | 31.892 | 1.000094067 >> ArrayCopyUnalignedDst.testByte | 1 | 5.455 | 2.637 | 2.068638604 >> ArrayCopyUnalignedDst.testByte | 3 | 5.562 | 2.702 | 2.058475204 >> ArrayCopyUnalignedDst.testByte | 5 | 5.427 | 2.702 | 2.008512213 >> ArrayCopyUnalignedDst.testByte | 10 | 5.367 | 2.696 | 1.990727003 >> ArrayCopyUnalignedDst.testByte | 20 | 5.839 | 2.637 | 2.214258627 >> ArrayCopyUnalignedDst.testByte | 70 | 5.888 | 5.968 | 0.986595174 >> 
ArrayCopyUnalignedDst.testByte | 150 | 6.785 | 6.773 | 1.001771741 >> ArrayCopyUnalignedDst.testByte | 300 | 7.606 | 7.972 | 0.954089313 >> ArrayCopyUnalignedDst.testByte | 600 | 11.986 | 21.195 | 0.565510734 >> ArrayCopyUnalignedDst.testByte | 1200 | 16.54 | 16.784 | 0.985462345 >> ArrayCopyUnalignedDst.testChar | 1 | 5.02 | 2.827 | 1.775733994 >> ArrayCopyUnalignedDst.testChar | 3 | 5.131 | 2.762 | 1.857711803 >> ArrayCopyUnalignedDst.testChar | 5 | 5.038 | 2.762 | 1.82404055 >> ArrayCopyUnalignedDst.testChar | 10 | 5.718 | 2.762 | 2.070238957 >> ArrayCopyUnalignedDst.testChar | 20 | 5.113 | 5.401 | 0.946676541 >> ArrayCopyUnalignedDst.testChar | 70 | 6.222 | 6.214 | 1.001287416 >> ArrayCopyUnalignedDst.testChar | 150 | 7.367 | 8.125 | 0.906707692 >> ArrayCopyUnalignedDst.testChar | 300 | 10.204 | 10.082 | 1.012100774 >> ArrayCopyUnalignedDst.testChar | 600 | 16.978 | 17.135 | 0.990837467 >> ArrayCopyUnalignedDst.testChar | 1200 | 32.351 | 31.996 | 1.011095137 >> ArrayCopyUnalignedSrc.testByte | 1 | 5.414 | 2.696 | 2.008160237 >> ArrayCopyUnalignedSrc.testByte | 3 | 5.494 | 2.637 | 2.083428138 >> ArrayCopyUnalignedSrc.testByte | 5 | 5.431 | 2.637 | 2.059537353 >> ArrayCopyUnalignedSrc.testByte | 10 | 5.344 | 2.703 | 1.977062523 >> ArrayCopyUnalignedSrc.testByte | 20 | 5.834 | 2.696 | 2.163946588 >> ArrayCopyUnalignedSrc.testByte | 70 | 5.883 | 6.009 | 0.979031453 >> ArrayCopyUnalignedSrc.testByte | 150 | 6.729 | 6.87 | 0.979475983 >> ArrayCopyUnalignedSrc.testByte | 300 | 7.603 | 7.97 | 0.953952321 >> ArrayCopyUnalignedSrc.testByte | 600 | 12.004 | 12.16 | 0.987171053 >> ArrayCopyUnalignedSrc.testByte | 1200 | 16.534 | 16.643 | 0.9934507 >> ArrayCopyUnalignedSrc.testChar | 1 | 5.021 | 2.762 | 1.81788559 >> ArrayCopyUnalignedSrc.testChar | 3 | 5.13 | 2.762 | 1.857349747 >> ArrayCopyUnalignedSrc.testChar | 5 | 5.042 | 2.827 | 1.783516095 >> ArrayCopyUnalignedSrc.testChar | 10 | 5.726 | 2.761 | 2.073886273 >> ArrayCopyUnalignedSrc.testChar | 20 | 5.112 | 5.401 | 
0.94649139 >> ArrayCopyUnalignedSrc.testChar | 70 | 6.113 | 6.227 | 0.981692629 >> ArrayCopyUnalignedSrc.testChar | 150 | 7.493 | 7.888 | 0.949923935 >> ArrayCopyUnalignedSrc.testChar | 300 | 10.234 | 10.501 | 0.97457385 >> ArrayCopyUnalignedSrc.testChar | 600 | 17.175 | 17.142 | 1.001925096 >> ArrayCopyUnalignedSrc.testChar | 1200 | 31.926 | 31.987 | 0.998092975 >> >> Detailed Reports: >> Baseline : >> [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt) >> WithOpt : >> [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt) > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Replacing explicit type checks with existing type checking routines Changes requested by neliasso (Reviewer). src/hotspot/cpu/x86/x86.ad line 5473: > 5471: BasicType elmType = this->bottom_type()->is_vect()->element_basic_type(); > 5472: int vector_len = vector_length_encoding(this); > 5473: //TODO: KRegister to be made valid "bound" operand to promote sharing. Remove todo - create a RFE instead. src/hotspot/cpu/x86/x86.ad line 5506: > 5504: BasicType elmType = src_node->bottom_type()->is_vect()->element_basic_type(); > 5505: int vector_len = vector_length_encoding(src_node); > 5506: //TODO: KRegister to be made valid "bound" operand to promote sharing. Remove todo - create a RFE instead. src/hotspot/share/adlc/forms.cpp line 271: > 269: if( strcmp(opType,"LoadS")==0 ) return Form::idealS; > 270: if( strcmp(opType,"LoadVector")==0 ) return Form::idealV; > 271: if( strcmp(opType,"VectorMaskedLoad")==0 ) return Form::idealV; More of a bike shedding question: The patterns is LoadRange, LoadS, LoadVector - why not name it in the same style - LoadVectorMasked? 
src/hotspot/share/adlc/forms.cpp line 288: > 286: if( strcmp(opType,"StoreNKlass")==0) return Form::idealNKlass; > 287: if( strcmp(opType,"StoreVector")==0 ) return Form::idealV; > 288: if( strcmp(opType,"VectorMaskedStore")==0 ) return Form::idealV; Same comment but for store - what do you think about naming it StoreVectorMasked? src/hotspot/share/opto/cfgnode.cpp line 423: > 421: // If a two input non-loop region has dead input > 422: // edge[s] degenerate any phi node contained within it. > 423: bool RegionNode::try_phi_disintegration(PhaseGVN *phase) { RegionNode::try_phi_disintegration - is it a requirement for this enhancement? or a separate issue? Also - I know we already remove phis that only have one input. If the input is set to top - PhiNode::Ideal should reduce the phi. If you have found a case where this doesn't happen - we should investigate and fix. src/hotspot/share/opto/cfgnode.cpp line 396: > 394: } > 395: > 396: bool RegionNode::is_self_loop(Node* n) { A bit expensive to DFS the entire graph to find a self loop. You don't need to visit nodes outside the loop. But you might not need to do this at all - see my comments further down. src/hotspot/share/opto/cfgnode.cpp line 436: > 434: Node* rep_node = NULL; > 435: PhaseIterGVN *igvn = phase->is_IterGVN(); > 436: if (in(1)->is_top() && !in(2)->is_top()) { The Phi-nodes for loops are always normalized - in(1) will be loop-entry and in(2) is the backedge. So if in(1) is top - in(2) will be a self loop. 
-------------

PR: https://git.openjdk.java.net/jdk/pull/302

From dnsimon at openjdk.java.net Fri Oct 16 16:33:21 2020
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Fri, 16 Oct 2020 16:33:21 GMT
Subject: RFR: 8254793: [JVMCI] improve speculation encoding [v3]
In-Reply-To:
References: <1I4nYb122752fKhe92W8XvGpo3BtTJ3LxUoK-oH2hus=.82161ac3-f1b9-4997-a2a3-5517eda94a45@github.com>
Message-ID: <7RVgaYfbROo7kkzllVhIcq7L6jUk7kNEg0Cd72FfT7o=.4f95a36d-767f-4d12-8544-8c0d38bc2626@github.com>

On Thu, 15 Oct 2020 18:56:38 GMT, Vladimir Kozlov wrote:

>> Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental
>> views will show differences compared to the previous content of the PR.
>
> src/hotspot/share/runtime/thread.hpp line 1171:
>
>> 1169: // uniquely identify the speculative optimization guarded by the uncommon trap.
>> 1170: // The id value is only 32-bits but since this field is exposed via VMStructs to
>> 1171: // JVMCI as a jlong, it needs to be kept as a long to maintain backwards compatibility
>
> I am confusing about backword compatibility comment. It said that old Graal (link in current JDK) generate code which
> writes 64 bits into this word. Will it use [32:32] index:length format or it will use new [0:27:5] format? I don't see
> changes to Graal in this PR.

The version of Graal in the JDK does not change. It is agnostic about the encoding format. All it does is write a value to `Thread::_pending_failed_speculation`, where said value is provided by JVMCI. The width of the write is determined by the width of the value. You can follow the code that does this [here](https://github.com/openjdk/jdk/blob/6c3bc71079bd9f4de005d005ded5a7cc3b7e373a/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.amd64/src/org/graalvm/compiler/hotspot/amd64/AMD64HotSpotLIRGenerator.java#L529).
-------------

PR: https://git.openjdk.java.net/jdk/pull/667

From dnsimon at openjdk.java.net Fri Oct 16 16:37:09 2020
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Fri, 16 Oct 2020 16:37:09 GMT
Subject: RFR: 8254793: [JVMCI] improve speculation encoding
In-Reply-To:
References:
Message-ID:

On Wed, 14 Oct 2020 22:01:19 GMT, Doug Simon wrote:

> This PR changes the encoding of a `jdk.vm.ci.hotspot.HotSpotSpeculationLog.HotSpotSpeculation` from a long to an int.
> The `Thread::_pending_failed_speculation` field remains as a `jlong` since it is already exposed to JVMCI Java code
> already via VMStructs and this PR does not update its usage in Graal.

I've updated this PR such that an encoded speculation value is always stored/transported in a long. The encoding still only uses 31 bits, which means the instruction sequence emitted by Graal can still be optimized to a single store (e.g. on x86 a MOVSLQ can write a 32-bit value, sign-extended to a long, into a long memory location).

-------------

PR: https://git.openjdk.java.net/jdk/pull/667

From dnsimon at openjdk.java.net Fri Oct 16 16:33:20 2020
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Fri, 16 Oct 2020 16:33:20 GMT
Subject: RFR: 8254793: [JVMCI] improve speculation encoding [v4]
In-Reply-To:
References:
Message-ID:

> This PR changes the encoding of a `jdk.vm.ci.hotspot.HotSpotSpeculationLog.HotSpotSpeculation` from a long to an int.
> The `Thread::_pending_failed_speculation` field remains as a `jlong` since it is already exposed to JVMCI Java code
> already via VMStructs and this PR does not update its usage in Graal.

Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.
The pull request contains one new commit since the last revision: 8254793: encode a HotSpotSpeculation in 31 bits ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/667/files - new: https://git.openjdk.java.net/jdk/pull/667/files/2e4e4521..b0d93bdc Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=667&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=667&range=02-03 Stats: 62 lines in 11 files changed: 15 ins; 14 del; 33 mod Patch: https://git.openjdk.java.net/jdk/pull/667.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/667/head:pull/667 PR: https://git.openjdk.java.net/jdk/pull/667 From jbhateja at openjdk.java.net Fri Oct 16 17:24:13 2020 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Fri, 16 Oct 2020 17:24:13 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v3] In-Reply-To: <9lPNMo1V33tQD6qp-1l78dII5Hfle8Ea5VWwuY1l_qA=.2e420c11-6e70-41f8-80b4-5992dcdd02eb@github.com> References: <94qadtiTzSkdsJAc_8IWrLxpBvmfiBXMf_W9Z965P80=.9a59a5db-2209-4007-94bb-16ccd8ff0b77@github.com> <9lPNMo1V33tQD6qp-1l78dII5Hfle8Ea5VWwuY1l_qA=.2e420c11-6e70-41f8-80b4-5992dcdd02eb@github.com> Message-ID: On Thu, 15 Oct 2020 14:54:26 GMT, Nils Eliasson wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Replacing explicit type checks with existing type checking routines > > src/hotspot/share/opto/memnode.hpp line 1188: > >> 1186: TrailingLoadStore, >> 1187: LeadingLoadStore, >> 1188: AfterPartialArrayCopy > > Change to keep consistent with the other names: > AfterPartialArrayCopy -> TrailingPartialArrayCopy > > Why is a special kind needed for partial array copy? Idea here is to prevent bypassing arraycopy operation post expansion during memory chain discovery [and] optimization. 
Currently a memory barrier is inserted after array copy macro expansion into a stub call, and this pattern is checked during memory chain discovery. With partial in-lining we create additional control structure to select between the slow path (stub call) and the fast path (partially in-lined code). To avoid increasing the complexity of the pattern matching, I introduced a flag in the MemBarrier node which is set only if partial in-lining takes place.

-------------

PR: https://git.openjdk.java.net/jdk/pull/302

From kvn at openjdk.java.net Fri Oct 16 17:35:13 2020
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Fri, 16 Oct 2020 17:35:13 GMT
Subject: RFR: 8254793: [JVMCI] improve speculation encoding [v4]
In-Reply-To:
References:
Message-ID:

On Fri, 16 Oct 2020 16:33:20 GMT, Doug Simon wrote:

>> This PR changes the encoding of a `jdk.vm.ci.hotspot.HotSpotSpeculationLog.HotSpotSpeculation` from a long to an int.
>> The `Thread::_pending_failed_speculation` field remains as a `jlong` since it is already exposed to JVMCI Java code
>> already via VMStructs and this PR does not update its usage in Graal.
>
> Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental
> views will show differences compared to the previous content of the PR. The pull request contains one new commit since
> the last revision:
> 8254793: encode a HotSpotSpeculation in 31 bits

Good.

-------------

Marked as reviewed by kvn (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/667 From kvn at openjdk.java.net Fri Oct 16 17:35:14 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 16 Oct 2020 17:35:14 GMT Subject: RFR: 8254793: [JVMCI] improve speculation encoding [v3] In-Reply-To: <7RVgaYfbROo7kkzllVhIcq7L6jUk7kNEg0Cd72FfT7o=.4f95a36d-767f-4d12-8544-8c0d38bc2626@github.com> References: <1I4nYb122752fKhe92W8XvGpo3BtTJ3LxUoK-oH2hus=.82161ac3-f1b9-4997-a2a3-5517eda94a45@github.com> <7RVgaYfbROo7kkzllVhIcq7L6jUk7kNEg0Cd72FfT7o=.4f95a36d-767f-4d12-8544-8c0d38bc2626@github.com> Message-ID: On Fri, 16 Oct 2020 16:30:43 GMT, Doug Simon wrote: >> src/hotspot/share/runtime/thread.hpp line 1171: >> >>> 1169: // uniquely identify the speculative optimization guarded by the uncommon trap. >>> 1170: // The id value is only 32-bits but since this field is exposed via VMStructs to >>> 1171: // JVMCI as a jlong, it needs to be kept as a long to maintain backwards compatibility >> >> I am confusing about backword compatibility comment. It said that old Graal (link in current JDK) generate code which >> writes 64 bits into this word. Will it use [32:32] index:length format or it will use new [0:27:5] format? I don't see >> changes to Graal in this PR. > > The version of Graal in the JDK does not change. It is agnostic about the encoding format. All is does is write a value > to `Thread::_pending_failed_speculation` where said value is provided by JVMCI. The width of the write is determined by > the width of the value. You can follow the code that does this > [here](https://github.com/openjdk/jdk/blob/6c3bc71079bd9f4de005d005ded5a7cc3b7e373a/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.amd64/src/org/graalvm/compiler/hotspot/amd64/AMD64HotSpotLIRGenerator.java#L529). Okay, that it what I looked for - the value encoding is provided by JVMCI. Good. 
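
As an illustration of the 31-bit encoding discussed in this thread, the following C++ sketch packs an index and a length into a non-negative value that fits in an int but is carried in a 64-bit long. The 26/5 split follows the jvmciRuntime.hpp comment quoted in the review of this PR; the function and constant names are hypothetical, and this is not the JVMCI implementation.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout (from the comment quoted in the review): index in the
// high 26 bits, length in the low 5 bits, 31 bits in total.
constexpr int kLengthBits = 5;
constexpr int kIndexBits = 26;
constexpr std::int64_t kLengthMask = (std::int64_t{1} << kLengthBits) - 1;
constexpr std::int64_t kIndexMask = (std::int64_t{1} << kIndexBits) - 1;

// Packs index and length into a 64-bit value that uses only 31 bits.
// Returns -1 (a hypothetical error convention) if either part does not fit.
std::int64_t encode_speculation(std::int64_t index, std::int64_t length) {
  if (index < 0 || index > kIndexMask || length < 0 || length > kLengthMask) {
    return -1;
  }
  return (index << kLengthBits) | length;
}

// Extracts the index part of an encoded value.
std::int64_t decode_index(std::int64_t id) {
  return (id >> kLengthBits) & kIndexMask;
}

// Extracts the length part of an encoded value.
std::int64_t decode_length(std::int64_t id) {
  return id & kLengthMask;
}

// True if the value can be written as a sign-extended 32-bit immediate
// without changing its meaning, i.e. it is a non-negative int.
bool fits_in_non_negative_int32(std::int64_t id) {
  return id >= 0 && id <= INT32_MAX;
}
```

Because every valid encoding stays within the non-negative int range, a consumer that stores the value into a 64-bit field can use a 32-bit sign-extended immediate, which is the single-store case mentioned earlier in the thread.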
-------------

PR: https://git.openjdk.java.net/jdk/pull/667

From jbhateja at openjdk.java.net Fri Oct 16 18:09:11 2020
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Fri, 16 Oct 2020 18:09:11 GMT
Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v3]
In-Reply-To:
References: <94qadtiTzSkdsJAc_8IWrLxpBvmfiBXMf_W9Z965P80=.9a59a5db-2209-4007-94bb-16ccd8ff0b77@github.com>
Message-ID:

On Fri, 16 Oct 2020 14:35:11 GMT, Nils Eliasson wrote:

>> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Replacing explicit type checks with existing type checking routines
>
> src/hotspot/share/opto/cfgnode.cpp line 423:
>
>> 421: // If a two input non-loop region has dead input
>> 422: // edge[s] degenerate any phi node contained within it.
>> 423: bool RegionNode::try_phi_disintegration(PhaseGVN *phase) {
>
> RegionNode::try_phi_disintegration - is it a requirement for this enhancement? or a separate issue?
>
> Also - I know we already remove phis that only have one input. If the input is set to top - PhiNode::Ideal should
> reduce the phi. If you have found a case where this doesn't happen - we should investigate and fix.

This transformation is done during RegionNode idealization. A phi may be intact (have both valid inputs), but if its parent region has one control edge connected to top(), the phi input corresponding to that top() edge is removed and the phi is disintegrated. Currently, for dead loops, all phi nodes are replaced by top(): https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/cfgnode.cpp#L575

For partial in-lining I introduced a mem phi node at the convergence of the fast and slow paths during node expansion, which was getting replaced by top() during RegionNode idealization in the case of a dead loop. This mem_phi had a user outside the loop.

    while ( ) { // dead loop detection
      if ( len < 32 )
        fast_path
      else
        slow_path
      mem_Phi = Memory(fast_path, slow_path)
    }
    memory = MemMerge(mem_Phi);

-------------

PR: https://git.openjdk.java.net/jdk/pull/302

From dlong at openjdk.java.net Fri Oct 16 19:48:10 2020
From: dlong at openjdk.java.net (Dean Long)
Date: Fri, 16 Oct 2020 19:48:10 GMT
Subject: RFR: 8254793: [JVMCI] improve speculation encoding [v4]
In-Reply-To:
References:
Message-ID:

On Fri, 16 Oct 2020 16:33:20 GMT, Doug Simon wrote:

>> This PR changes the encoding of a `jdk.vm.ci.hotspot.HotSpotSpeculationLog.HotSpotSpeculation` from a long to an int.
>> The `Thread::_pending_failed_speculation` field remains as a `jlong` since it is already exposed to JVMCI Java code
>> already via VMStructs and this PR does not update its usage in Graal.
>
> Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental
> views will show differences compared to the previous content of the PR.

Marked as reviewed by dlong (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/667

From never at openjdk.java.net Fri Oct 16 20:03:21 2020
From: never at openjdk.java.net (Tom Rodriguez)
Date: Fri, 16 Oct 2020 20:03:21 GMT
Subject: RFR: 8254793: [JVMCI] improve speculation encoding [v3]
In-Reply-To: <7RVgaYfbROo7kkzllVhIcq7L6jUk7kNEg0Cd72FfT7o=.4f95a36d-767f-4d12-8544-8c0d38bc2626@github.com>
References: <1I4nYb122752fKhe92W8XvGpo3BtTJ3LxUoK-oH2hus=.82161ac3-f1b9-4997-a2a3-5517eda94a45@github.com> <7RVgaYfbROo7kkzllVhIcq7L6jUk7kNEg0Cd72FfT7o=.4f95a36d-767f-4d12-8544-8c0d38bc2626@github.com>
Message-ID: <357OJNV4rqx_47PRDeWkIgxieMfHjdP5gF_d3bY8lvU=.a072be50-1301-4e8b-9a13-9372b6286f7c@github.com>

On Fri, 16 Oct 2020 16:30:43 GMT, Doug Simon wrote:

>> src/hotspot/share/runtime/thread.hpp line 1171:
>>
>>> 1169: // uniquely identify the speculative optimization guarded by the uncommon trap.
>>> 1170: // The id value is only 32-bits but since this field is exposed via VMStructs to >>> 1171: // JVMCI as a jlong, it needs to be kept as a long to maintain backwards compatibility >> >> I am confusing about backword compatibility comment. It said that old Graal (link in current JDK) generate code which >> writes 64 bits into this word. Will it use [32:32] index:length format or it will use new [0:27:5] format? I don't see >> changes to Graal in this PR. > > The version of Graal in the JDK does not change. It is agnostic about the encoding format. All is does is write a value > to `Thread::_pending_failed_speculation` where said value is provided by JVMCI. The width of the write is determined by > the width of the value. You can follow the code that does this > [here](https://github.com/openjdk/jdk/blob/6c3bc71079bd9f4de005d005ded5a7cc3b7e373a/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.amd64/src/org/graalvm/compiler/hotspot/amd64/AMD64HotSpotLIRGenerator.java#L529). Based on some of my comments from elsewhere we've undone some of the original changes so it just produces int friendly long constants. Changing the actual encoding size poses some compatibility problems because we weren't careful enough to be completely size agnostic in Graal code. ------------- PR: https://git.openjdk.java.net/jdk/pull/667 From never at openjdk.java.net Fri Oct 16 20:03:19 2020 From: never at openjdk.java.net (Tom Rodriguez) Date: Fri, 16 Oct 2020 20:03:19 GMT Subject: RFR: 8254793: [JVMCI] improve speculation encoding [v4] In-Reply-To: References: Message-ID: <-3aG1VbRHkxdSp12UScaUxwGve0nAW_ok4hQw5FKnW8=.a01f4bef-7578-4d3b-a6bf-6cab461259d2@github.com> On Fri, 16 Oct 2020 16:33:20 GMT, Doug Simon wrote: >> This PR changes the encoding of a `jdk.vm.ci.hotspot.HotSpotSpeculationLog.HotSpotSpeculation` from a long to an int. 
>> The `Thread::_pending_failed_speculation` field remains as a `jlong` since it is already exposed to JVMCI Java code >> already via VMStructs and this PR does not update its usage in Graal. > > Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental > views will show differences compared to the previous content of the PR. Marked as reviewed by never (Reviewer). src/hotspot/share/jvmci/jvmciRuntime.hpp line 55: > 53: FailedSpeculation** _failed_speculations; > 54: > 55: // A speculation id is an index (high 26 bits) and a length (low 5 bits). We don't really have to enforce that it fits in an int any more. I think it would be more natural to allow to use all the remaining bits even though we'll never actually use that space in practice. Doing so makes the code look a little odd I think since there's no obvious reason for limit. We just want an encoding that's int friendly for the normal case. ------------- PR: https://git.openjdk.java.net/jdk/pull/667 From dholmes at openjdk.java.net Fri Oct 16 21:39:14 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 16 Oct 2020 21:39:14 GMT Subject: RFR: JDK-8254889: name_and_sig_as_C_string usages in frame coding without ResourceMark [v2] In-Reply-To: References: Message-ID: <2zlELXVF1ZB4uK35AHD6VrwfhHms2JdygTsv1qHmwVQ=.68dc2250-1329-457f-9964-8998d3e3c94f@github.com> On Fri, 16 Oct 2020 12:50:14 GMT, Thomas Stuefe wrote: >> Matthias Baesken has refreshed the contents of this pull request, and previous commits have been removed. The >> incremental views will show differences compared to the previous content of the PR. > > Changes requested by stuefe (Reviewer). @MBaesken Please do not force-push commits to an open PR as it break the commit history. You can just make your changes and push as a normal commit. Github will then show the differences between each commit and allow the history to be seen clearly and consistently for reviewers. 
The skara tooling will flatten things into a single simple coherent commit when integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/698 From dholmes at openjdk.java.net Fri Oct 16 21:39:15 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 16 Oct 2020 21:39:15 GMT Subject: RFR: JDK-8254889: name_and_sig_as_C_string usages in frame coding without ResourceMark [v2] In-Reply-To: <2zlELXVF1ZB4uK35AHD6VrwfhHms2JdygTsv1qHmwVQ=.68dc2250-1329-457f-9964-8998d3e3c94f@github.com> References: <2zlELXVF1ZB4uK35AHD6VrwfhHms2JdygTsv1qHmwVQ=.68dc2250-1329-457f-9964-8998d3e3c94f@github.com> Message-ID: On Fri, 16 Oct 2020 21:34:06 GMT, David Holmes wrote: >> Changes requested by stuefe (Reviewer). > > @MBaesken Please do not force-push commits to an open PR as it break the commit history. You can just make your changes > and push as a normal commit. Github will then show the differences between each commit and allow the history to be seen > clearly and consistently for reviewers. The skara tooling will flatten things into a single simple coherent commit when > integrated. As long as there is a ResourceMark in the caller there is no issue here - though the code should be documented in that case. ------------- PR: https://git.openjdk.java.net/jdk/pull/698 From rrich at openjdk.java.net Sun Oct 18 06:20:25 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Sun, 18 Oct 2020 06:20:25 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v12] In-Reply-To: References: Message-ID: > Hi, > > this is the continuation of the review of the implementation for: > > https://bugs.openjdk.java.net/browse/JDK-8227745 > https://bugs.openjdk.java.net/browse/JDK-8233915 > > It allows for JIT optimizations based on escape analysis even if JVMTI agents acquire capabilities to access references > to objects that are subject to such optimizations, e.g. scalar replacement. 
The implementation reverts such > optimizations just before access very much as when switching from JIT compiled execution to the interpreter, aka > "deoptimization". Webrev.8 was the last one before before the transition to Git/Github: > > http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ > > Thanks, Richard. Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: handle_special_runtime_exit_condition(): wait (blocked) for obj. deoptimization _before_ async ex. check. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/119/files - new: https://git.openjdk.java.net/jdk/pull/119/files/f02f07b6..272fb025 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=119&range=11 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=119&range=10-11 Stats: 10 lines in 1 file changed: 5 ins; 5 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/119.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/119/head:pull/119 PR: https://git.openjdk.java.net/jdk/pull/119 From ihse at openjdk.java.net Sun Oct 18 09:10:12 2020 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Sun, 18 Oct 2020 09:10:12 GMT Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build [v4] In-Reply-To: References: Message-ID: <0kVzFMOZKbbvPLUlyE-VbpYSC5omD-nZoOqxBRt4s8s=.450fc58e-3aa5-401a-bce8-953a52892b87@github.com> On Thu, 15 Oct 2020 18:35:30 GMT, Bernhard Urban-Forster wrote: >> I organized this PR so that each commit contains the warning emitted by MSVC as commit message and its relevant fix. >> >> Verified on >> * Linux+ARM64: `{hotspot,jdk,langtools}:tier1`, no failures. >> * Windows+ARM64: `{hotspot,jdk,langtools}:tier1`, no (new) failures. >> * internal macOS+ARM64 port: build without `--disable-warnings-as-errors` still works. 
Just mentioning this here, because >> it's yet another toolchain (Xcode / clang) that needs to be kept happy [going >> forward](https://openjdk.java.net/jeps/391). > > Bernhard Urban-Forster has updated the pull request incrementally with two additional commits since the last revision: > > - uppercase suffix > - add assert Build changes look fine now. ------------- Marked as reviewed by ihse (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/530 From pliden at openjdk.java.net Sun Oct 18 13:24:28 2020 From: pliden at openjdk.java.net (Per Liden) Date: Sun, 18 Oct 2020 13:24:28 GMT Subject: RFR: 8254878: Move last piece of ZArray to GrowableArray [v2] In-Reply-To: References: Message-ID: > ZArray used to be a separate implementation of a dynamically allocated/growable array. It now instead inherits from > GrowableCHeapArray, and extends it with a transfer() function. I propose we rename this function to swap() and move it > to GrowableArrayWithAllocator, since this function could be generally useful. It would also mean that ZArray could be > just a typedef/using of GrowableCHeapArray. Per Liden has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains one additional commit since the last revision: 8254878: Move last piece of ZArray to GrowableArray ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/694/files - new: https://git.openjdk.java.net/jdk/pull/694/files/1d9807cc..2667d7dd Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=694&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=694&range=00-01 Stats: 528 lines in 25 files changed: 156 ins; 307 del; 65 mod Patch: https://git.openjdk.java.net/jdk/pull/694.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/694/head:pull/694 PR: https://git.openjdk.java.net/jdk/pull/694 From pliden at openjdk.java.net Sun Oct 18 13:24:28 2020 From: pliden at openjdk.java.net (Per Liden) Date: Sun, 18 Oct 2020 13:24:28 GMT Subject: RFR: 8254878: Move last piece of ZArray to GrowableArray In-Reply-To: References: Message-ID: On Fri, 16 Oct 2020 09:22:22 GMT, Per Liden wrote: > ZArray used to be a separate implementation of a dynamically allocated/growable array. It now instead inherits from > GrowableCHeapArray, and extends it with a transfer() function. I propose we rename this function to swap() and move it > to GrowableArrayWithAllocator, since this function could be generally useful. It would also mean that ZArray could be > just a typedef/using of GrowableCHeapArray. Rebased and updated a unit test. ------------- PR: https://git.openjdk.java.net/jdk/pull/694 From jbhateja at openjdk.java.net Sun Oct 18 18:39:18 2020 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Sun, 18 Oct 2020 18:39:18 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v4] In-Reply-To: References: Message-ID: > Summary: > > 1) Partial in-lining technique avoids call overhead penalty for small array copy operations with size less than 32 > bytes. 
2) At runtime, a conditional check based on copy length either calls an array-copy stub or executes an optimized > instruction sequence using AVX-512 masked instructions emitted at the call site. 3) New runtime flag > ArrayCopyPartialInlineSize=0/32(default)/64 bytes determines the maximum size for partial in-lining. 4) Based on the > perf results seen in benchmarks, partial in-lining is currently performed only for arraycopy involving sub-word types > (bool/byte/char/short). Once PR-61 gets integrated we can extend this patch to cover all the primitive types. > Performance Results: > System : CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz > Micros : test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java > ArrayCopyPartialInlineSize : 32 > > JMH | Block Size | Baseline (ns/op) | Partial Inlining (ns/op) | Gain (Baseline / Partial Inlining) > -- | -- | -- | -- | -- > ArrayCopyAligned.testByte | 1 | 5.417 | 2.696 | 2.009272997 > ArrayCopyAligned.testByte | 3 | 5.494 | 2.702 | 2.03330866 > ArrayCopyAligned.testByte | 5 | 5.417 | 2.637 | 2.05422829 > ArrayCopyAligned.testByte | 10 | 5.343 | 2.703 | 1.976692564 > ArrayCopyAligned.testByte | 20 | 5.837 | 2.636 | 2.214339909 > ArrayCopyAligned.testByte | 70 | 5.86 | 6 | 0.976666667 > ArrayCopyAligned.testByte | 150 | 6.766 | 6.906 | 0.979727773 > ArrayCopyAligned.testByte | 300 | 7.605 | 7.952 | 0.956363179 > ArrayCopyAligned.testByte | 600 | 11.989 | 12.007 | 0.998500874 > ArrayCopyAligned.testByte | 1200 | 16.447 | 16.585 | 0.991679228 > ArrayCopyAligned.testChar | 1 | 5.02 | 2.828 | 1.775106082 > ArrayCopyAligned.testChar | 3 | 5.129 | 2.762 | 1.85698769 > ArrayCopyAligned.testChar | 5 | 5.041 | 2.762 | 1.82512672 > ArrayCopyAligned.testChar | 10 | 5.716 | 2.762 | 2.069514844 > ArrayCopyAligned.testChar | 20 | 5.111 | 5.399 | 0.946656788 > ArrayCopyAligned.testChar | 70 | 6.271 | 6.242 | 1.004645947 > ArrayCopyAligned.testChar | 150 | 7.45 | 7.599 | 0.980392157 > ArrayCopyAligned.testChar | 300 | 9.904 | 10.112 | 0.97943038 >
ArrayCopyAligned.testChar | 600 | 17.131 | 17.167 | 0.997902953 > ArrayCopyAligned.testChar | 1200 | 29.556 | 29.851 | 0.990117584 > ArrayCopyUnalignedBoth.testByte | 1 | 5.419 | 2.702 | 2.005551443 > ArrayCopyUnalignedBoth.testByte | 3 | 5.558 | 2.636 | 2.108497724 > ArrayCopyUnalignedBoth.testByte | 5 | 5.43 | 2.636 | 2.059939302 > ArrayCopyUnalignedBoth.testByte | 10 | 5.378 | 2.637 | 2.039438756 > ArrayCopyUnalignedBoth.testByte | 20 | 5.914 | 2.636 | 2.243550836 > ArrayCopyUnalignedBoth.testByte | 70 | 5.882 | 5.954 | 0.987907289 > ArrayCopyUnalignedBoth.testByte | 150 | 6.784 | 6.88 | 0.986046512 > ArrayCopyUnalignedBoth.testByte | 300 | 7.635 | 7.968 | 0.958207831 > ArrayCopyUnalignedBoth.testByte | 600 | 12.226 | 12.129 | 1.007997362 > ArrayCopyUnalignedBoth.testByte | 1200 | 16.992 | 20.717 | 0.820195974 > ArrayCopyUnalignedBoth.testChar | 1 | 5.019 | 2.828 | 1.774752475 > ArrayCopyUnalignedBoth.testChar | 3 | 5.163 | 2.763 | 1.868621064 > ArrayCopyUnalignedBoth.testChar | 5 | 5.042 | 2.827 | 1.783516095 > ArrayCopyUnalignedBoth.testChar | 10 | 5.718 | 2.828 | 2.021923621 > ArrayCopyUnalignedBoth.testChar | 20 | 5.111 | 5.404 | 0.945780903 > ArrayCopyUnalignedBoth.testChar | 70 | 6.367 | 6.235 | 1.02117081 > ArrayCopyUnalignedBoth.testChar | 150 | 7.367 | 8.269 | 0.890917886 > ArrayCopyUnalignedBoth.testChar | 300 | 10.358 | 10.642 | 0.973313287 > ArrayCopyUnalignedBoth.testChar | 600 | 20.84 | 17.522 | 1.189361945 > ArrayCopyUnalignedBoth.testChar | 1200 | 31.895 | 31.892 | 1.000094067 > ArrayCopyUnalignedDst.testByte | 1 | 5.455 | 2.637 | 2.068638604 > ArrayCopyUnalignedDst.testByte | 3 | 5.562 | 2.702 | 2.058475204 > ArrayCopyUnalignedDst.testByte | 5 | 5.427 | 2.702 | 2.008512213 > ArrayCopyUnalignedDst.testByte | 10 | 5.367 | 2.696 | 1.990727003 > ArrayCopyUnalignedDst.testByte | 20 | 5.839 | 2.637 | 2.214258627 > ArrayCopyUnalignedDst.testByte | 70 | 5.888 | 5.968 | 0.986595174 > ArrayCopyUnalignedDst.testByte | 150 | 6.785 | 6.773 | 1.001771741 > 
ArrayCopyUnalignedDst.testByte | 300 | 7.606 | 7.972 | 0.954089313 > ArrayCopyUnalignedDst.testByte | 600 | 11.986 | 21.195 | 0.565510734 > ArrayCopyUnalignedDst.testByte | 1200 | 16.54 | 16.784 | 0.985462345 > ArrayCopyUnalignedDst.testChar | 1 | 5.02 | 2.827 | 1.775733994 > ArrayCopyUnalignedDst.testChar | 3 | 5.131 | 2.762 | 1.857711803 > ArrayCopyUnalignedDst.testChar | 5 | 5.038 | 2.762 | 1.82404055 > ArrayCopyUnalignedDst.testChar | 10 | 5.718 | 2.762 | 2.070238957 > ArrayCopyUnalignedDst.testChar | 20 | 5.113 | 5.401 | 0.946676541 > ArrayCopyUnalignedDst.testChar | 70 | 6.222 | 6.214 | 1.001287416 > ArrayCopyUnalignedDst.testChar | 150 | 7.367 | 8.125 | 0.906707692 > ArrayCopyUnalignedDst.testChar | 300 | 10.204 | 10.082 | 1.012100774 > ArrayCopyUnalignedDst.testChar | 600 | 16.978 | 17.135 | 0.990837467 > ArrayCopyUnalignedDst.testChar | 1200 | 32.351 | 31.996 | 1.011095137 > ArrayCopyUnalignedSrc.testByte | 1 | 5.414 | 2.696 | 2.008160237 > ArrayCopyUnalignedSrc.testByte | 3 | 5.494 | 2.637 | 2.083428138 > ArrayCopyUnalignedSrc.testByte | 5 | 5.431 | 2.637 | 2.059537353 > ArrayCopyUnalignedSrc.testByte | 10 | 5.344 | 2.703 | 1.977062523 > ArrayCopyUnalignedSrc.testByte | 20 | 5.834 | 2.696 | 2.163946588 > ArrayCopyUnalignedSrc.testByte | 70 | 5.883 | 6.009 | 0.979031453 > ArrayCopyUnalignedSrc.testByte | 150 | 6.729 | 6.87 | 0.979475983 > ArrayCopyUnalignedSrc.testByte | 300 | 7.603 | 7.97 | 0.953952321 > ArrayCopyUnalignedSrc.testByte | 600 | 12.004 | 12.16 | 0.987171053 > ArrayCopyUnalignedSrc.testByte | 1200 | 16.534 | 16.643 | 0.9934507 > ArrayCopyUnalignedSrc.testChar | 1 | 5.021 | 2.762 | 1.81788559 > ArrayCopyUnalignedSrc.testChar | 3 | 5.13 | 2.762 | 1.857349747 > ArrayCopyUnalignedSrc.testChar | 5 | 5.042 | 2.827 | 1.783516095 > ArrayCopyUnalignedSrc.testChar | 10 | 5.726 | 2.761 | 2.073886273 > ArrayCopyUnalignedSrc.testChar | 20 | 5.112 | 5.401 | 0.94649139 > ArrayCopyUnalignedSrc.testChar | 70 | 6.113 | 6.227 | 0.981692629 > 
ArrayCopyUnalignedSrc.testChar | 150 | 7.493 | 7.888 | 0.949923935 > ArrayCopyUnalignedSrc.testChar | 300 | 10.234 | 10.501 | 0.97457385 > ArrayCopyUnalignedSrc.testChar | 600 | 17.175 | 17.142 | 1.001925096 > ArrayCopyUnalignedSrc.testChar | 1200 | 31.926 | 31.987 | 0.998092975 > > Detailed Reports: > Baseline : > [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt) > WithOpt : > [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt) Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 - Replacing explicit type checks with existing type checking routines - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 - 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions. ------------- Changes: https://git.openjdk.java.net/jdk/pull/302/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=302&range=03 Stats: 518 lines in 23 files changed: 494 ins; 0 del; 24 mod Patch: https://git.openjdk.java.net/jdk/pull/302.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/302/head:pull/302 PR: https://git.openjdk.java.net/jdk/pull/302 From dnsimon at openjdk.java.net Sun Oct 18 21:51:23 2020 From: dnsimon at openjdk.java.net (Doug Simon) Date: Sun, 18 Oct 2020 21:51:23 GMT Subject: RFR: 8254793: [JVMCI] improve speculation encoding [v5] In-Reply-To: References: Message-ID: > This PR changes the encoding of a `jdk.vm.ci.hotspot.HotSpotSpeculationLog.HotSpotSpeculation` from a long to an int. 
> The `Thread::_pending_failed_speculation` field remains as a `jlong` since it is already exposed to JVMCI Java code > via VMStructs and this PR does not update its usage in Graal. Doug Simon has updated the pull request incrementally with two additional commits since the last revision: - require SHA1 algorithm to be present - simplified changes such that only the length component of an encoded speculation is reduced (to 5 bits) ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/667/files - new: https://git.openjdk.java.net/jdk/pull/667/files/b0d93bdc..c34e25de Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=667&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=667&range=03-04 Stats: 60 lines in 5 files changed: 17 ins; 21 del; 22 mod Patch: https://git.openjdk.java.net/jdk/pull/667.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/667/head:pull/667 PR: https://git.openjdk.java.net/jdk/pull/667 From dnsimon at openjdk.java.net Sun Oct 18 21:51:24 2020 From: dnsimon at openjdk.java.net (Doug Simon) Date: Sun, 18 Oct 2020 21:51:24 GMT Subject: RFR: 8254793: [JVMCI] improve speculation encoding [v4] In-Reply-To: <-3aG1VbRHkxdSp12UScaUxwGve0nAW_ok4hQw5FKnW8=.a01f4bef-7578-4d3b-a6bf-6cab461259d2@github.com> References: <-3aG1VbRHkxdSp12UScaUxwGve0nAW_ok4hQw5FKnW8=.a01f4bef-7578-4d3b-a6bf-6cab461259d2@github.com> Message-ID: On Fri, 16 Oct 2020 16:56:01 GMT, Tom Rodriguez wrote: >> Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental >> views will show differences compared to the previous content of the PR. > > src/hotspot/share/jvmci/jvmciRuntime.hpp line 55: > >> 53: FailedSpeculation** _failed_speculations; >> 54: >> 55: // A speculation id is an index (high 26 bits) and a length (low 5 bits). > > We don't really have to enforce that it fits in an int any more.
I think it would be more natural to allow to use all > the remaining bits even though we'll never actually use that space in practice. Doing so makes the code look a little > odd I think since there's no obvious reason for limit. We just want an encoding that's int friendly for the normal > case. All good points. Please confirm that 2 most recent commits to this PR align with your suggestions. ------------- PR: https://git.openjdk.java.net/jdk/pull/667 From david.holmes at oracle.com Sun Oct 18 22:09:06 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 19 Oct 2020 08:09:06 +1000 Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v4] In-Reply-To: References: <0Zh0H5gSXzvHSstQ2w8NBM-P8yERRPouvhZJDNGvu4A=.6cde913f-7499-4c45-bc63-b717502b661e@github.com> <2moJ2056gzwWoleYccv21TpFYQHw5h9bA-IZCImplhs=.763198bf-06b0-4589-b01e-217ba84af94a@github.com> Message-ID: Hi Jorn, I'm not reviewing this but this exchange caught my attention ... On 16/10/2020 9:15 pm, Jorn Vernee wrote: > On Thu, 15 Oct 2020 22:42:49 GMT, Coleen Phillimore wrote: > >>> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: >>> >>> Re-add file erroneously deleted (detected as rename) >> >> src/hotspot/cpu/x86/foreign_globals_x86.cpp line 56: >> >>> 54: } >>> 55: >>> 56: const ABIDescriptor parseABIDescriptor(JNIEnv* env, jobject jabi) { >> >> I don't know if you care about performance but of these env->calls transition into the VM and back out again. You >> should prefix all the code that comes from java to native with JNI_ENTRY and just use native JVM code to implement >> these. > > Currently this is prefixed with `JVM_ENTRY` e.g. 
like: > JVM_ENTRY(jlong, PI_generateAdapter(JNIEnv* env, jclass _unused, jobject abi, jobject layout)) > { > ThreadToNativeFromVM ttnfvm(thread); > return ProgrammableInvoker::generate_adapter(env, abi, layout); > } > JVM_END > (where `generate_adapter` ends up calling `parseABIDescriptor`) > > JVM_ENTRY seems to be mostly the same except for JNI_ENTRY having a `WeakPreserveExceptionMark` as well. Do we need > to switch these? Also, I guess if we want to use VM code directly, we should get rid of the `ThreadToNativeFromVM` RAII > handle. Why are you going from native to VM to native again with this code? You would use a JNI/JVM_ENTRY because you have to execute VM runtime code. But your code immediately switches back to native and doesn't execute any VM runtime code (other than that involved in the transition logic itself). ?? Cheers, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/634 > From ysuenaga at openjdk.java.net Mon Oct 19 01:35:13 2020 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Mon, 19 Oct 2020 01:35:13 GMT Subject: RFR: 8252657: JVMTI agent is not unloaded when Agent_OnAttach is failed In-Reply-To: <80LJDTCsT_y-KlThryd5Bxu5RRyrjmKfs5p9vJUn61E=.68b594a0-fe58-4f4d-a49c-eec2e90f9373@github.com> References: <1H1wUQdxCLU2qddqEIYSx2iOhIKL3b5etUmjsS6NBlU=.0bf1fe0c-8dcf-4ca0-bd57-b8794d5f2810@github.com> <80LJDTCsT_y-KlThryd5Bxu5RRyrjmKfs5p9vJUn61E=.68b594a0-fe58-4f4d-a49c-eec2e90f9373@github.com> Message-ID: <-Xrp6c94000-jE1p6NvzjsxFUW5ILrH_F1eT1i7esw8=.9d609f81-1b61-4ebf-9afd-73b834c1b18c@github.com> On Fri, 2 Oct 2020 07:27:43 GMT, Yasumasa Suenaga wrote: >> I was initially wrong in supporting this, and now I share David's concerns about unclear semantics of this. The >> questions are: >> - Q1: Is it necessary to call the Agent_OnUnload()? >> - Q2: Would it be a JVMTI spec violation to call the Agent_OnAttach() multiple times? (It seems to be the case to me.) >> - Q3: What has to be done for statically linked agent?
>> - Q4: Should the agent be correctly loadable in the first place? What were the reasons for its loading to fail? >> >> Yes, at least, a CSR is needed for this. > >> * Q1: Is it necessary to call the Agent_OnUnload()? > > [JVMTI spec of Agent_OnUnload()](https://docs.oracle.com/en/java/javase/15/docs/specs/jvmti.html#onunload) says this > function will be called when the agent library is unloaded by a platform-specific mechanism. OTOH it also says > `Agent_OnUnload()` will be called both at VM termination and **by other reasons**. The spec doesn't say what happens if > `Agent_OnAttach()` fails. IMHO `Agent_OnUnload()` should be called because this PR would unload the library if > `Agent_OnAttach()` failed. >> * Q2: Would it be a JVMTI spec violation to call the Agent_OnAttach() multiple times? (It seems to be the case to me.) > > `Agent_OnAttach()` should be called only once per attach request, but the VM should accept multiple attach requests for the same > agent library. > For example, we can add multiple `-agentlib` and `-agentpath` requests as below. A JVMTI agent might change its behavior depending > on arguments or a configuration file. > -agentlib:test=profile=A -agentlib:test=profile=B -agentpath:/path/to/libtest=profile=C > > Agent developers should be responsible for the behavior when more than one agent is loaded at a time. > >> * Q3: What has to be done for statically linked agent? > > The JVMTI spec says "unless it is statically linked into the executable", so I think we can ignore Agent_OnUnload_L() > in this PR. >> * Q4: Should the agent be correctly loadable in the first place? What were the reasons for its loading to fail? > > The agent (`Agent_OnAttach()`) might fail due to an error in the agent logic. For example, some agents load a configuration file at > initialization. If the user gives a wrong value, it will fail. >> Yes, at least, a CSR is needed for this. > > I will file a CSR for this PR after this discussion.
If we can change the spec so that the agent library is not unloaded when `Agent_OnAttach()` fails, we can make a change like [webrev.00](https://cr.openjdk.java.net/~ysuenaga/JDK-8252657/webrev.00/). It is simple, and behaves similarly to `Agent_OnLoad()`. It might be preferable for JVMTI agent developers. ------------- PR: https://git.openjdk.java.net/jdk/pull/19 From rrich at openjdk.java.net Mon Oct 19 05:18:31 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Mon, 19 Oct 2020 05:18:31 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v13] In-Reply-To: References: Message-ID: > Hi, > > this is the continuation of the review of the implementation for: > > https://bugs.openjdk.java.net/browse/JDK-8227745 > https://bugs.openjdk.java.net/browse/JDK-8233915 > > It allows for JIT optimizations based on escape analysis even if JVMTI agents acquire capabilities to access references > to objects that are subject to such optimizations, e.g. scalar replacement. The implementation reverts such > optimizations just before access, very much as when switching from JIT compiled execution to the interpreter, aka > "deoptimization". Webrev.8 was the last one before the transition to Git/GitHub: > > http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ > > Thanks, Richard. Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: Removed cross_modify_fence from JT::wait_for_object_deoptimization(). See JDK-8254264.
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/119/files - new: https://git.openjdk.java.net/jdk/pull/119/files/272fb025..2ca09188 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=119&range=12 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=119&range=11-12 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/119.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/119/head:pull/119 PR: https://git.openjdk.java.net/jdk/pull/119 From shade at openjdk.java.net Mon Oct 19 06:38:16 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 19 Oct 2020 06:38:16 GMT Subject: RFR: 8253525: Implement getInstanceSize/sizeOf intrinsics [v2] In-Reply-To: References: Message-ID: <4PVuHaDhnUx61L4tg9slmrd6KDHuwjABffj2bVce1RM=.c574f639-2c21-44cb-bc78-3d69d54e68de@github.com> > This is a fork off the SizeOf JEP, JDK-8249196. There is already an entry point in the JDK that can use the intrinsic, like > this: `Instrumentation.getInstanceSize`. Therefore, we can implement the C1/C2 intrinsic now, hook it up to > `Instrumentation`, and let the tools use that fast path today. With this patch, JOL is able to get close to the > `deepSizeOf` implementation from the SizeOf JEP. > Example performance improvements for sizing up a custom linked list: > > Benchmark (size) Mode Cnt Score Error Units > > # Default > LinkedChainBench.linkedChain 1 avgt 5 705.835 ± 8.051 ns/op > LinkedChainBench.linkedChain 10 avgt 5 3148.874 ± 37.856 ns/op > LinkedChainBench.linkedChain 100 avgt 5 28693.256 ± 142.254 ns/op > LinkedChainBench.linkedChain 1000 avgt 5 290161.590 ± 4594.631 ns/op > > # Instrumentation attached, no intrinsics > LinkedChainBench.linkedChain 1 avgt 5 159.659 ± 19.238 ns/op > LinkedChainBench.linkedChain 10 avgt 5 717.659 ± 22.540 ns/op > LinkedChainBench.linkedChain 100 avgt 5 7739.394 ± 111.683 ns/op > LinkedChainBench.linkedChain 1000 avgt 5 80724.238 ±
2887.794 ns/op > > # Instrumentation attached, new intrinsics > LinkedChainBench.linkedChain 1 avgt 5 95.254 ± 0.808 ns/op > LinkedChainBench.linkedChain 10 avgt 5 261.564 ± 8.524 ns/op > LinkedChainBench.linkedChain 100 avgt 5 3367.192 ± 21.128 ns/op > LinkedChainBench.linkedChain 1000 avgt 5 34148.851 ± 373.080 ns/op Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit: 8253525: Implement getInstanceSize/sizeOf intrinsics ------------- Changes: https://git.openjdk.java.net/jdk/pull/650/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=650&range=01 Stats: 612 lines in 10 files changed: 563 ins; 0 del; 49 mod Patch: https://git.openjdk.java.net/jdk/pull/650.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/650/head:pull/650 PR: https://git.openjdk.java.net/jdk/pull/650 From shade at openjdk.java.net Mon Oct 19 06:57:24 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 19 Oct 2020 06:57:24 GMT Subject: RFR: 8253525: Implement getInstanceSize/sizeOf intrinsics [v3] In-Reply-To: References: Message-ID: > This is a fork off the SizeOf JEP, JDK-8249196. There is already an entry point in the JDK that can use the intrinsic, like > this: `Instrumentation.getInstanceSize`. Therefore, we can implement the C1/C2 intrinsic now, hook it up to > `Instrumentation`, and let the tools use that fast path today. With this patch, JOL is able to get close to the > `deepSizeOf` implementation from the SizeOf JEP. > Example performance improvements for sizing up a custom linked list: > > Benchmark (size) Mode Cnt Score Error Units > > # Default > LinkedChainBench.linkedChain 1 avgt 5 705.835 ± 8.051 ns/op > LinkedChainBench.linkedChain 10 avgt 5 3148.874 ± 37.856 ns/op > LinkedChainBench.linkedChain 100 avgt 5 28693.256 ± 142.254 ns/op > LinkedChainBench.linkedChain 1000 avgt 5 290161.590 ±
4594.631 ns/op > > # Instrumentation attached, no intrinsics > LinkedChainBench.linkedChain 1 avgt 5 159.659 ± 19.238 ns/op > LinkedChainBench.linkedChain 10 avgt 5 717.659 ± 22.540 ns/op > LinkedChainBench.linkedChain 100 avgt 5 7739.394 ± 111.683 ns/op > LinkedChainBench.linkedChain 1000 avgt 5 80724.238 ± 2887.794 ns/op > > # Instrumentation attached, new intrinsics > LinkedChainBench.linkedChain 1 avgt 5 95.254 ± 0.808 ns/op > LinkedChainBench.linkedChain 10 avgt 5 261.564 ± 8.524 ns/op > LinkedChainBench.linkedChain 100 avgt 5 3367.192 ± 21.128 ns/op > LinkedChainBench.linkedChain 1000 avgt 5 34148.851 ± 373.080 ns/op Aleksey Shipilev has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: 8253525: Implement getInstanceSize/sizeOf intrinsics ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/650/files - new: https://git.openjdk.java.net/jdk/pull/650/files/d744a913..6160f6a8 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=650&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=650&range=01-02 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/650.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/650/head:pull/650 PR: https://git.openjdk.java.net/jdk/pull/650 From rehn at openjdk.java.net Mon Oct 19 08:23:14 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 19 Oct 2020 08:23:14 GMT Subject: RFR: 8221554: aarch64 cross-modifying code [v4] In-Reply-To: References: Message-ID: <0qlO5eLJPQgvI1Lg3pO8YpUeoC2zGexNt3DgHptiQiA=.e7ce4706-a702-4e32-a976-eac4adc24771@github.com> On Tue, 13 Oct 2020 08:49:59 GMT, Alan Hayward wrote: >> The AArch64 port uses maybe_isb in places where an ISB might be required >> because the code may have safepointed.
These maybe_isbs are very conservative >> and are used in many places where a safepoint has not happened. >> >> cross_modify_fence was added in common code to place a barrier in all the >> places after a safepoint has occurred. All the uses of it are in common code, >> yet it remains unimplemented on AArch64. >> >> This set of patches implements cross_modify_fence for AArch64 and reconsiders >> every use of maybe_isb, discarding many of them. In addition, it introduces >> a new diagnostic option, which when enabled on AArch64 tests the correct >> usage of the barriers. >> >> The advantage of this patch is threefold: >> * Reducing the number of ISBs - giving a theoretical performance improvement. >> * Use of common code instead of backend specific code. >> * Additional test diagnostic options >> >> Patch 1: Split cross_modify_fence >> ================================= >> This is simply refactoring work split out to simplify the other two patches. >> >> instruction_fence() is provided by each target and simply places >> a fence for the instruction stream. >> >> cross_modify_fence() is now a member of JavaThread and just calls >> instruction_fence. This function will be extended in Patch 3. >> >> Patch 2: Use cross_modify_fence instead of maybe_isb >> ==================================================== >> >> The [n] references refer to the comments for cross_modify_fence in >> thread.hpp. >> >> These are all the existing uses of maybe_isb in the AArch64 target: >> >> 1) Instances of Java code calling a VM function >> * This encapsulates the changes to: >> ** MacroAssembler::call_VM_leaf_base() >> ** generate_fast_get_int_field0() >> ** stubGenerator_aarch64 generate_throw_exception() >> ** sharedRuntime_aarch64 generate_handler_blob() >> ** SharedRuntime::generate_resolve_blob() >> ** C1 LIR_Assembler::rt_call >> ** C1 StubAssembler::call_RT(): used by generate_exception_throw, >> generate_handle_exception, generate_code_for.
>> ** OptoRuntime::generate_exception_blob() >> * Any changes will be caught due to calls to [2] or [3] by the VM function. >> * Any calls that do not call [2] or [3] do not require an ISB. >> * This patch is more optimal for these cases. >> >> 2) Instances of Java code calling a JNI function >> * This encapsulates the changes to: >> ** SharedRuntime::generate_native_wrapper() >> ** TemplateInterpreterGenerator::generate_native_entry() >> * A safepoint still in progress after the call will be caught by [4]. >> * An ISB is still required for the case where there was a safepoint >> but it completed during the call. This happens if the code doesn't >> branch on safepoint_in_progress. >> * In the SharedRuntime version, the two possible calls to >> reguard_yellow_pages and complete_monitor_unlocking_C are after the thread >> goes back into its original state, so are covered by [2] and [3], the >> same as a normal VM call. >> * This patch is only more optimal for the two post-JNI calls. >> >> 3) Patching functions >> * This encapsulates the changes to: >> ** patch_callers_callsite() (called by gen_c2i_adapter()) >> * This results in code being patched, but does not safepoint >> * Therefore an ISB is required. >> * This patch introduces no change here. >> >> 4) C1 MacroAssembler::emit_static_call_stub() >> * Calls ISB (not maybe_isb) >> * By design, the patching doesn't require the up-to-date >> destination for proper functioning. >> * However, the ISB makes it most likely that the new destination will >> be picked up. >> * This patch introduces no change here. >> >> Patch 3: Add cross modify fence verification >> ============================================ >> >> The VerifyCrossModifyFence diagnostic flag enables verification of the correct >> usage of instruction barriers. It can safely be enabled on any Java run. >> >> Enabling it will cause the following: >> >> * Once all threads have been brought to a safepoint, each thread will be >> marked.
>> >> * On a cross_modify_fence and safepoint_fence the mark for that thread >> will be cleared. >> >> * On entry to a method and in a safepoint poll, the thread is checked. >> If it is marked, then the code will error. > > Alan Hayward has updated the pull request incrementally with one additional commit since the last revision: > > Remove inlasm_isb define > > Change-Id: I2d0ef8a78292dac875f3f65d2253981cdb7a497a Seems fine to me; I mostly looked at the shared code part. ------------- Marked as reviewed by rehn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/428 From stefank at openjdk.java.net Mon Oct 19 08:28:15 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 19 Oct 2020 08:28:15 GMT Subject: RFR: 8254878: Move last piece of ZArray to GrowableArray [v2] In-Reply-To: References: Message-ID: On Sun, 18 Oct 2020 13:24:28 GMT, Per Liden wrote: >> ZArray used to be a separate implementation of a dynamically allocated/growable array. It now instead inherits from >> GrowableCHeapArray, and extends it with a transfer() function. I propose we rename this function to swap() and move it >> to GrowableArrayWithAllocator, since this function could be generally useful. It would also mean that ZArray could be >> just a typedef/using of GrowableCHeapArray. > > Per Liden has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev > excludes the unrelated changes brought in by the merge/rebase. The pull request contains one additional commit since > the last revision: > 8254878: Move last piece of ZArray to GrowableArray Marked as reviewed by stefank (Reviewer). src/hotspot/share/utilities/growableArray.hpp line 696: > 694: > 695: public: > 696: GrowableArrayCHeap(int initial_max = 0) : Just noting that GrowableArray defaults to 2 and here we default to 0.
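For readers following the thread, the idea of a swap() on a growable array, together with the point about the default initial capacity, can be sketched roughly as below. This is a toy stand-in and not the HotSpot code: `ToyGrowableArray` and all of its members are hypothetical names chosen for illustration, and it allocates with plain `new[]` rather than HotSpot's C-heap allocation.

```cpp
#include <cassert>
#include <utility>

// Hypothetical illustration only; not HotSpot's GrowableArray.
template <typename T>
class ToyGrowableArray {
  T*  _data;  // backing storage, nullptr while capacity is 0
  int _len;   // number of elements in use
  int _max;   // current capacity

  // Grow the backing storage (doubling) until it can hold min_capacity elements.
  void grow(int min_capacity) {
    int new_max = (_max == 0) ? 1 : _max;
    while (new_max < min_capacity) new_max *= 2;
    T* new_data = new T[new_max];
    for (int i = 0; i < _len; i++) new_data[i] = _data[i];
    delete[] _data;
    _data = new_data;
    _max = new_max;
  }

 public:
  // With initial_max == 0 nothing is allocated until the first append.
  explicit ToyGrowableArray(int initial_max = 0)
      : _data(initial_max > 0 ? new T[initial_max] : nullptr),
        _len(0),
        _max(initial_max) {}

  ~ToyGrowableArray() { delete[] _data; }

  ToyGrowableArray(const ToyGrowableArray&) = delete;
  ToyGrowableArray& operator=(const ToyGrowableArray&) = delete;

  void append(const T& v) {
    if (_len == _max) grow(_len + 1);
    _data[_len++] = v;
  }

  int length() const { return _len; }
  int capacity() const { return _max; }

  const T& at(int i) const {
    assert(0 <= i && i < _len);
    return _data[i];
  }

  // Exchange the contents of two arrays in O(1) by swapping the
  // storage pointer, length and capacity; no elements are copied.
  void swap(ToyGrowableArray& other) {
    std::swap(_data, other._data);
    std::swap(_len, other._len);
    std::swap(_max, other._max);
  }
};
```

In this sketch, a default-constructed array starts with capacity 0 while `ToyGrowableArray<int> b(2)` starts with capacity 2, and `a.swap(b)` hands over the whole backing store in one step, which is the kind of use a transfer()-style function serves.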
------------- PR: https://git.openjdk.java.net/jdk/pull/694 From shade at openjdk.java.net Mon Oct 19 10:30:19 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 19 Oct 2020 10:30:19 GMT Subject: RFR: 8254995: [x86] ControlWord::print(), rc/pc variables might not be initialized Message-ID: Static analyzers complain that in `ControlWord::print()`, `rc`/`pc` variables might not be initialized. This never happens in practice, because `rounding_control()` and `precision_control()` return the good values. We can make it cleaner to silence the compiler. Testing: - [x] Linux x86_64 tier1 ------------- Commit messages: - 8254995: [x86] ControlWord::print(), rc/pc variables might not be initialized Changes: https://git.openjdk.java.net/jdk/pull/731/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=731&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254995 Stats: 6 lines in 1 file changed: 6 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/731.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/731/head:pull/731 PR: https://git.openjdk.java.net/jdk/pull/731 From mcimadamore at openjdk.java.net Mon Oct 19 10:34:32 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Mon, 19 Oct 2020 10:34:32 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v13] In-Reply-To: References: Message-ID: > This patch contains the changes associated with the third incubation round of the foreign memory access API incubation > (see JEP 393 [1]). 
This iteration focus on improving the usability of the API in 3 main ways: > * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from > multiple threads > * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee > that the memory will be deallocated, eventually > * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class > has been added, which defines several useful dereference routines; these are really just thin wrappers around memory > access var handles, but they make the barrier of entry for using this API somewhat lower. > > A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not > the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link > to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit > of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which > wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as > dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability > in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; > secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can > use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done > by calling `MemoryAddress::asSegmentRestricted`). A list of the API, implementation and test changes is provided > below. 
If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be > happy to point at existing discussions, and/or to provide the feedback required. A big thank to Erik Osterlund, > Vladimir Ivanov and David Holmes, without whom the work on shared memory segment would not have been possible; also I'd > like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. Thanks Maurizio > Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html > Specdiff: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254163 > > > > ### API Changes > > * `MemorySegment` > * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) > * added a no-arg factory for a native restricted segment representing entire native heap > * rename `withOwnerThread` to `handoff` > * add new `share` method, to create shared segments > * add new `registerCleaner` method, to register a segment against a cleaner > * add more helpers to create arrays from a segment e.g. `toIntArray` > * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) > * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) > * `MemoryAddress` > * drop `segment` accessor > * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative > to a given segment > * `MemoryAccess` > * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a > carrier is one of the usual suspect (a Java primitive, minus `boolean`); the access mode can be simple (e.g. 
access > base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte > offset,or a high level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. > `getByteAtOffset` vs `getByteAtIndex`). > * `MemoryHandles` > * drop `withOffset` combinator > * drop `withStride` combinator > * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which > it is easy to derive all the other handles using plain var handle combinators. > * `Addressable` > * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both > `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients > can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. > * `MemoryLayouts` > * A new layout, for machine addresses, has been added to the mix. > > > > ### Implementation changes > > There are two main things to discuss here: support for shared segments, and the general simplification of the memory > access var handle support. > #### Shared segments > > The support for shared segments cuts in pretty deep in the VM. Support for shared segments is notoriously hard to > achieve, at least in a way that guarantees optimal access performances. This is caused by the fact that, if a segment > is shared, it would be possible for a thread to close it while another is accessing it. After considering several > options (see [3]), we zeroed onto an approach which is inspired by an happy idea that Andrew Haley had (and that he > reminded me of at this year OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world > (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently to a > close operation. 
For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and > the access itself (otherwise it would be possible for the accessing thread to stop right before an unsafe call). > It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. Sadly, none of > these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, > we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check > whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of > stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). The question is, then, > once we detect that a thread is accessing the very segment we're about to close, what should happen? We first > experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it > fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the > machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to > minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread > is accessing the segment being closed. As written in the javadoc, this doesn't mean that clients should just catch and > try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should > be treated as such. In terms of gritty implementation, we needed to centralize memory access routines in a single > place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in > addition, also provided a liveness check.
This way we could mark all these routines with the special `@Scoped` > annotation, which tells the VM that something important is going on. To achieve this, we created a new (autogenerated) > class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, > like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is > tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during > access (which is important when registering segments against cleaners). Of course, to make memory access safe, memory > access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead > of unsafe, so that a liveness check can be triggered (in case a scope is present). `ScopedMemoryAccess` has a > `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed > successfully. The implementation of `MemoryScope` (now significantly simplified from what we had before), has two > implementations, one for confined segments and one for shared segments; the main difference between the two is what > happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared > segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or > `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` > state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. #### > Memory access var handles overhaul The key realization here was that if all memory access var handles took a > coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle > form. 
This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var > handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that > e.g. additional offset is injected into a base memory access var handle. This also helped in simplifying the > implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level > access on the innards of the memory access var handle. All that code is now gone. #### Test changes Not much to see > here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, > since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` > functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the > microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared > segment case. 
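The shared-segment close protocol described under "Shared segments" above can be sketched as a small state machine. This is only an illustrative model, not the code in the patch: the class name `SharedScopeSketch`, the `checkValidState` method name and the `BooleanSupplier` standing in for the handshake are assumptions made for the example. In the real implementation the handshake is performed by the VM through `ScopedMemoryAccess`.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical model of the shared scope life cycle:
// ALIVE -> CLOSING -> CLOSED (handshake succeeded) or back to ALIVE (handshake failed).
final class SharedScopeSketch {
    static final int ALIVE = 0, CLOSING = 1, CLOSED = 2;

    private final AtomicInteger state = new AtomicInteger(ALIVE);

    // As in the description, this still reports true while a close is in flight.
    boolean isAlive() {
        return state.get() != CLOSED;
    }

    // Liveness check performed before a memory access: fails once a close has started.
    void checkValidState() {
        if (state.get() != ALIVE) {
            throw new IllegalStateException("Scope is closed or being closed");
        }
    }

    // 'handshake' is a stand-in for the VM thread-local handshake; it returns true
    // when no thread was found in the middle of a scoped access to this scope.
    void close(BooleanSupplier handshake) {
        if (!state.compareAndSet(ALIVE, CLOSING)) {
            throw new IllegalStateException("Scope is closed or being closed");
        }
        boolean success = handshake.getAsBoolean();
        state.set(success ? CLOSED : ALIVE);
        if (!success) {
            throw new IllegalStateException("Cannot close: segment is being accessed by another thread");
        }
    }
}
```

In this model a failed `close` leaves the scope usable again, which matches the point made above that an exception on `close` signals a synchronization bug in the caller rather than a condition to retry on.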
[1] - https://openjdk.java.net/jeps/393 [2] - https://openjdk.java.net/jeps/389 [3] - > https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html [4] - https://openjdk.java.net/jeps/312 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Address CSR comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/548/files - new: https://git.openjdk.java.net/jdk/pull/548/files/6091ed0f..31674311 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=12 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=11-12 Stats: 1224 lines in 2 files changed: 135 ins; 737 del; 352 mod Patch: https://git.openjdk.java.net/jdk/pull/548.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548 PR: https://git.openjdk.java.net/jdk/pull/548 From mcimadamore at openjdk.java.net Mon Oct 19 10:39:14 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Mon, 19 Oct 2020 10:39:14 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v6] In-Reply-To: References: Message-ID: On Mon, 12 Oct 2020 18:06:55 GMT, Maurizio Cimadamore wrote: >> Build changes look good. > > I've just uploaded a biggie update to the foreign memory access support. While doing performance evaluation, we have > realized that mixing a multi-level hierarchy (`MappedMemorySegment extends MemorySegments`) with exact invoke semantics > of `VarHandle` and `MethodHandle` is not a good match and can lead to great performance degradation for seemingly > "correct" code. While some of this can be attributed to the `VarHandle` API, or to the fact that the so called > "generic" invocation path should not be that slow in case where the parameters are clearly related, it seems smelly > that a primitive API such as `MemorySegment` should give raise to such issues. 
We have therefore decided to drop the > `MappedMemorySegment` - this means that there's only one memory segment type users can deal with: `MemorySegment` - and > no chance for mistakes. Of course `MappedMemorySegment` was primarily introduced to allow for operations which > were previously possible on `MappedByteBuffer` such as `force`. To support these use cases, a separate class has been > introduced, namely `MappedMemorySegments` (note the trailing `S`). This class contains a bunch of static methods which > can be used to achieve the desired effects, without polluting the `MemorySegment` API. A new method has been added on > `MemorySegment` which returns an optional file descriptor; this might be useful for clients which want to guess whether > a segment is in fact a mapped segment, or if they need (e.g. in Windows) the file descriptor to do some other kind of > low level op. I think this approach is more true to the goals and spirit of the Foreign Memory Access API, and it also > offers some ways to improve over the existing API: for instance, the only reason why the `MemorySegment::spliterator` > method was a static method was that we needed inference, so that we could return either a `Spliterator<MemorySegment>` > or a `Spliterator<MappedMemorySegment>`. All of that is gone now, so the method can return to be what it morally always > has been: an instance method on `MemorySegment`. Updated javadoc: > http://cr.openjdk.java.net/~mcimadamore/8254162_v2/javadoc/jdk/incubator/foreign/package-summary.html Updated > specdiff: http://cr.openjdk.java.net/~mcimadamore/8254162_v2/specdiff/overview-summary.html The latest iteration addresses a comment raised during CSR review; more specifically, the `MemoryAccess` class has several variants of dereference methods - e.g. `getInt`, `getInt_LE`, `getInt_BE`, to support different endianness. The comment was to just have two overloads, e.g. `getInt` and `getInt(ByteOrder)` instead of three.
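As a rough illustration of the two-overload shape, here is a sketch built on `java.nio.ByteBuffer` rather than on the incubating API. The class name `IntAccessSketch` and its parameters are hypothetical, and the choice of native byte order for the variant without a `ByteOrder` argument is an assumption of this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical accessor class; only the overload shape follows the CSR suggestion.
final class IntAccessSketch {
    // Variant without an explicit order: assumed here to use the platform's native byte order.
    static int getInt(ByteBuffer buf, int offset) {
        return getInt(buf, offset, ByteOrder.nativeOrder());
    }

    // Variant with an explicit order: replaces separate _LE and _BE methods.
    static int getInt(ByteBuffer buf, int offset, ByteOrder order) {
        // duplicate() shares the content but has its own byte order setting,
        // so the caller's buffer is left untouched.
        return buf.duplicate().order(order).getInt(offset);
    }
}
```

With the bytes `{1, 0, 0, 0}`, the little-endian read yields `1` and the big-endian read yields `0x01000000`, so one parameter covers what previously needed two extra method names per carrier.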
I've implemented the suggestion in this new iteration, as I think it makes the API a bit more compact. ------------- PR: https://git.openjdk.java.net/jdk/pull/548 From rehn at openjdk.java.net Mon Oct 19 10:41:15 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 19 Oct 2020 10:41:15 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended Message-ID: The main point of this change-set is to make it easier to implement S/R on top of handshakes. Which is a prerequisite for removing _suspend_flag (which duplicates the handshake functionality). But we also remove some complicated S/R methods. We basically just put in everything in the handshake closure, so the diff just looks much worse than what it is. TraceSuspendDebugBits have an ifdef, but in both cases it now just returns. But I was unsure if I should remove now or when is_ext_suspend_completed() is removed. Passes multiple t1-5 runs, locally it passes many jck:vm/nsk_jvmti/nsk_jdi/jdk-jdi runs. ------------- Commit messages: - Utilize handshakes instead of is_thread_fully_suspended Changes: https://git.openjdk.java.net/jdk/pull/729/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=729&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8223312 Stats: 480 lines in 6 files changed: 158 ins; 266 del; 56 mod Patch: https://git.openjdk.java.net/jdk/pull/729.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/729/head:pull/729 PR: https://git.openjdk.java.net/jdk/pull/729 From fyang at openjdk.java.net Mon Oct 19 11:14:22 2020 From: fyang at openjdk.java.net (Fei Yang) Date: Mon, 19 Oct 2020 11:14:22 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v9] In-Reply-To: References: Message-ID: > Contributed-by: ard.biesheuvel at linaro.org, dongbo4 at huawei.com > > This added an intrinsic for SHA3 using aarch64 v8.2 SHA3 Crypto Extensions. 
> Reference implementation for core SHA-3 transform using ARMv8.2 Crypto Extensions: > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/crypto/sha3-ce-core.S?h=v5.4.52 > > Trivial adaptation in SHA3. implCompress is needed for the purpose of adding the intrinsic. > For SHA3, we need to pass one extra parameter "digestLength" to the stub for the calculation of block size. > "digestLength" is also used in for the EOR loop before keccak to differentiate different SHA3 variants. > > We added jtreg tests for SHA3 and used QEMU system emulator which supports SHA3 instructions to test the functionality. > Patch passed jtreg tier1-3 tests with QEMU system emulator. > Also verified with jtreg tier1-3 tests without SHA3 instructions on aarch64-linux-gnu and x86_64-linux-gnu, to make > sure that there's no regression. > We used one existing JMH test for performance test: test/micro/org/openjdk/bench/java/security/MessageDigests.java > We measured the performance benefit with an aarch64 cycle-accurate simulator. > Patch delivers 20% - 40% performance improvement depending on specific SHA3 digest length and size of the message. > > For now, this feature will not be enabled automatically for aarch64. We can auto-enable this when it is fully tested on > real hardware. But for the above testing purposes, this is auto-enabled when the corresponding hardware feature is > detected. Fei Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: - Merge master - Remove unnecessary code changes in vm_version_aarch64.cpp - Merge master - Merge master - Merge master - Merge master - Add sha3 instructions to cpu/aarch64/aarch64-asmtest.py and regenerate the test in assembler_aarch64.cpp:asm_check - Rebase - Merge master - Fix trailing whitespace issue - ... 
and 1 more: https://git.openjdk.java.net/jdk/compare/e9be2db7...05551701 ------------- Changes: https://git.openjdk.java.net/jdk/pull/207/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=207&range=08 Stats: 1262 lines in 36 files changed: 1007 ins; 22 del; 233 mod Patch: https://git.openjdk.java.net/jdk/pull/207.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/207/head:pull/207 PR: https://git.openjdk.java.net/jdk/pull/207 From jvernee at openjdk.java.net Mon Oct 19 11:29:23 2020 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Mon, 19 Oct 2020 11:29:23 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v4] In-Reply-To: References: <0Zh0H5gSXzvHSstQ2w8NBM-P8yERRPouvhZJDNGvu4A=.6cde913f-7499-4c45-bc63-b717502b661e@github.com> <2moJ2056gzwWoleYccv21TpFYQHw5h9bA-IZCImplhs=.763198bf-06b0-4589-b01e-217ba84af94a@github.com> Message-ID: On Fri, 16 Oct 2020 11:12:01 GMT, Jorn Vernee wrote: >> src/hotspot/cpu/x86/foreign_globals_x86.cpp line 56: >> >>> 54: } >>> 55: >>> 56: const ABIDescriptor parseABIDescriptor(JNIEnv* env, jobject jabi) { >> >> I don't know if you care about performance but all of these env-> calls transition into the VM and back out again. You >> should prefix all the code that comes from java to native with JNI_ENTRY and just use native JVM code to implement >> these. > > Currently this is prefixed with `JVM_ENTRY` e.g. like: > JVM_ENTRY(jlong, PI_generateAdapter(JNIEnv* env, jclass _unused, jobject abi, jobject layout)) > { > ThreadToNativeFromVM ttnfvm(thread); > return ProgrammableInvoker::generate_adapter(env, abi, layout); > } > JVM_END > (where `generate_adapter` ends up calling `parseABIDescriptor`) > > JVM_ENTRY seems to be mostly the same except for JNI_ENTRY having a `WeakPreserveExceptionMark` as well. Do we need > to switch these? Also, I guess if we want to use VM code directly, we should get rid of the `ThreadToNativeFromVM` RAII > handle.
re-wrote this code to use the VM internal APIs instead of JNI, changes are isolated in a sub-pr here: https://github.com/mcimadamore/jdk/pull/1 Could you take a look? ------------- PR: https://git.openjdk.java.net/jdk/pull/634 From jvernee at openjdk.java.net Mon Oct 19 11:29:22 2020 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Mon, 19 Oct 2020 11:29:22 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v4] In-Reply-To: <2moJ2056gzwWoleYccv21TpFYQHw5h9bA-IZCImplhs=.763198bf-06b0-4589-b01e-217ba84af94a@github.com> References: <0Zh0H5gSXzvHSstQ2w8NBM-P8yERRPouvhZJDNGvu4A=.6cde913f-7499-4c45-bc63-b717502b661e@github.com> <2moJ2056gzwWoleYccv21TpFYQHw5h9bA-IZCImplhs=.763198bf-06b0-4589-b01e-217ba84af94a@github.com> Message-ID: <1qSzjGTeTsGkvOvOIiXjY8JP944k3uLaq6KhkUP1vHE=.068f0409-ef1d-486a-8d74-c587619893e9@github.com> On Thu, 15 Oct 2020 23:15:07 GMT, Coleen Phillimore wrote: >> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: >> >> Re-add file erroneously deleted (detected as rename) > > I looked through some Hotspot runtime code and that looks ok. I saw a couple of strange things on my way through the > code. See comments. Hi David, this code somewhat predates me, so I initially kept the JVM_ENTRY since that was what was already in place. IIRC the thread state transition was added later to be able to call JNI code, which checks that the thread state is native in some asserts. I've re-written this code, per @coleenp 's suggestion, to use VM code directly to replace what we were doing with JNI, so the thread state transition is also gone. I've looked at some of the *_ENTRY macros and the only one that seems to avoid the thread state transition is JVM_LEAF. I've switched the RegisterNatives functions we use to JVM_LEAF to avoid the redundant transitions. 
I also tried changing `PI_invokeNative` to JVM_LEAF, but since we can call back into Java from that, I ran into a missing handle mark assert for some of the tests, so I've left that one as JVM_ENTRY (but removed some redundant braces). I've created a separate sub-pr against this PR's branch to make it easier to see what I've changed: https://github.com/mcimadamore/jdk/pull/1 (feel free to take a look). Thanks for the comments. ------------- PR: https://git.openjdk.java.net/jdk/pull/634 From dnsimon at openjdk.java.net Mon Oct 19 12:09:10 2020 From: dnsimon at openjdk.java.net (Doug Simon) Date: Mon, 19 Oct 2020 12:09:10 GMT Subject: RFR: 8254793: [JVMCI] improve speculation encoding [v4] In-Reply-To: References: Message-ID: On Fri, 16 Oct 2020 19:45:51 GMT, Dean Long wrote: >> Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental >> views will show differences compared to the previous content of the PR. The pull request contains one new commit since >> the last revision: >> 8254793: encode a HotSpotSpeculation in 31 bits > > Marked as reviewed by dlong (Reviewer). @dean-long @vnkozlov I've made a few new changes based on @tkrodriguez 's comments. Please let me know if it still looks good. I've also rerun mach5 testing on the new changes and everything passes. ------------- PR: https://git.openjdk.java.net/jdk/pull/667 From mgronlun at openjdk.java.net Mon Oct 19 12:20:17 2020 From: mgronlun at openjdk.java.net (Markus Grönlund) Date: Mon, 19 Oct 2020 12:20:17 GMT Subject: RFR: 8249675: x86: frequency extraction from cpu brand string is incomplete Message-ID: <4cLDGQwOXZqws-dr9QulUFnxYJ646Y_ObUzfjEfSfD4=.4f10936d-286c-406e-b72d-4849e3d4a5a6@github.com> Greetings, `VM_Version_Ext::max_qualified_cpu_freq_from_brand_string()` attempts to extract the CPU frequency, by inspecting the CPU brand string (as per the document "Intel® Processor Identification and the CPUID Instruction.
Application note 845 May 2012"). There is a bug with the current implementation, because it is naive in using the following construct: const char* Hz_location = strchr(brand_string, 'H'); This likely works for most CPU models / brands, but not when the brand string is for example of the following form: "Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz" The 'H' in "9850H" will be matched, but since there is no 'z' in the next position, the code will fall through and report a frequency of 0. Testing: - [x] jdk_jfr - [x] debug verification for brand string "Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz" Comment: The doc link is stale and therefore removed. No stable links for the Application note 845 versions could be located, hence the doc is here referenced by name instead. Thanks Markus ------------- Commit messages: - 8249675: x86: frequency extraction from cpu brand string is incomplete Changes: https://git.openjdk.java.net/jdk/pull/736/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=736&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8249675 Stats: 50 lines in 2 files changed: 6 ins; 12 del; 32 mod Patch: https://git.openjdk.java.net/jdk/pull/736.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/736/head:pull/736 PR: https://git.openjdk.java.net/jdk/pull/736 From pliden at openjdk.java.net Mon Oct 19 12:25:14 2020 From: pliden at openjdk.java.net (Per Liden) Date: Mon, 19 Oct 2020 12:25:14 GMT Subject: RFR: 8254878: Move last piece of ZArray to GrowableArray [v2] In-Reply-To: References: Message-ID: On Mon, 19 Oct 2020 08:22:47 GMT, Stefan Karlsson wrote: >> Per Liden has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev >> excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains one additional commit since >> the last revision: >> 8254878: Move last piece of ZArray to GrowableArray > > src/hotspot/share/utilities/growableArray.hpp line 696: > >> 694: >> 695: public: >> 696: GrowableArrayCHeap(int initial_max = 0) : > > Just noting that GrowableArray defaults to 2 and here we default to 0. Thanks for reviewing. For a CHeap allocated array I think an initial_max of 0 makes more sense, since the cost of doing an allocation is typically higher, and lazily allocating the array means we will avoid the allocation cost altogether in cases where an array was created but elements were never added. ------------- PR: https://git.openjdk.java.net/jdk/pull/694 From egahlin at openjdk.java.net Mon Oct 19 12:51:16 2020 From: egahlin at openjdk.java.net (Erik Gahlin) Date: Mon, 19 Oct 2020 12:51:16 GMT Subject: RFR: 8249675: x86: frequency extraction from cpu brand string is incomplete In-Reply-To: <4cLDGQwOXZqws-dr9QulUFnxYJ646Y_ObUzfjEfSfD4=.4f10936d-286c-406e-b72d-4849e3d4a5a6@github.com> References: <4cLDGQwOXZqws-dr9QulUFnxYJ646Y_ObUzfjEfSfD4=.4f10936d-286c-406e-b72d-4849e3d4a5a6@github.com> Message-ID: On Mon, 19 Oct 2020 12:12:51 GMT, Markus Grönlund wrote: > Greetings, > > `VM_Version_Ext::max_qualified_cpu_freq_from_brand_string()` attempts to extract the CPU frequency, by inspecting the > CPU brand string (as per the document "Intel® Processor Identification and the CPUID Instruction. Application note 845 > May 2012"). There is a bug with the current implementation, because it is naive in using the following construct: > > const char* Hz_location = strchr(brand_string, 'H'); > > This likely works for most CPU models / brands, but not when the brand string is for example of the following form: > > "Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz" > > The 'H' in "9850H" will be matched, but since there is no 'z' in the next position, the code will fall through and > report a frequency of 0.
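The failure mode quoted above can be modelled in a few lines. The sketch below is a Java model for illustration only: the HotSpot code is C++, and the class `BrandStringSketch` with its methods is hypothetical. Searching for the two-character suffix "Hz" is one possible way to avoid the false match, and the actual patch may do it differently.

```java
// Illustrative Java model of the two search strategies; not the HotSpot code.
final class BrandStringSketch {
    // Naive approach: inspect only the first 'H', as strchr(brand_string, 'H') does.
    // Returns the index of "Hz", or -1 if the first 'H' is not followed by 'z'.
    static int naiveHzIndex(String brand) {
        int h = brand.indexOf('H');
        boolean followedByZ = h >= 0 && h + 1 < brand.length() && brand.charAt(h + 1) == 'z';
        return followedByZ ? h : -1;
    }

    // More robust: search for the unit suffix "Hz" as a whole.
    static int hzIndex(String brand) {
        return brand.indexOf("Hz");
    }

    // Returns the frequency in Hz, or 0 when it cannot be determined.
    static long frequencyHz(String brand) {
        int hz = hzIndex(brand);
        if (hz < 1) {
            return 0;
        }
        long multiplier;
        switch (brand.charAt(hz - 1)) {
            case 'M': multiplier = 1_000_000L; break;
            case 'G': multiplier = 1_000_000_000L; break;
            case 'T': multiplier = 1_000_000_000_000L; break;
            default: return 0;
        }
        // Walk backwards over digits and '.' to find where the number starts.
        int end = hz - 1;
        int start = end;
        while (start > 0 && (Character.isDigit(brand.charAt(start - 1)) || brand.charAt(start - 1) == '.')) {
            start--;
        }
        if (start == end) {
            return 0;
        }
        try {
            return Math.round(Double.parseDouble(brand.substring(start, end)) * multiplier);
        } catch (NumberFormatException e) {
            return 0; // malformed number such as "1.2.3"
        }
    }
}
```

For the brand string "Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz", `naiveHzIndex` gives up at the 'H' in "9850H", while `hzIndex` finds the suffix at the end of the string and `frequencyHz` can parse the "2.60G" that precedes it.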
> Testing: > - [x] jdk_jfr > - [x] debug verification for brand string "Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz" > > Comment: The doc link is stale and therefore removed. No stable links for the Application note 845 versions could be > located, hence the doc is here referenced by name instead. > Thanks > Markus Marked as reviewed by egahlin (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/736 From mcimadamore at openjdk.java.net Mon Oct 19 13:13:27 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Mon, 19 Oct 2020 13:13:27 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v5] In-Reply-To: References: Message-ID: > This patch contains the changes associated with the first incubation round of the foreign linker access API incubation > (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and > associated pull request [3]). > The main goal of this API is to provide a way to call native functions from Java code without the need of intermediate > JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. I suggest reading the > writeup [4] I put together a few weeks ago, which illustrates what the foreign linker support is, and how it should be > used by clients. Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews. For this reason, > I'm attaching a webrev which contains only the differences between this PR and the memory access PR. I will be > periodically uploading new webrevs, as new iterations come out, to try and make the life of reviewers as simple as > possible. A big thank you to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the hotspot changes you > see here, and without their help, the foreign linker support wouldn't be what it is today. As usual, a big thank you to > Paul Sandoz, who provided many insights (often by trying the bits first hand).
Thanks Maurizio > Webrev: > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev > > Javadoc: > > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html > > Specdiff (relative to [3]): > > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254232 > > > > ### API Changes > > The API changes are actually rather slim: > > * `LibraryLookup` > * This class allows clients to lookup symbols in native libraries; the interface is fairly simple; you can load a library > by name, or absolute path, and then lookup symbols on that library. > * `FunctionDescriptor` > * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory > layouts for the function arguments/return type. A function descriptor is used to describe the signature of a native > function. > * `CLinker` > * This is the real star of the show. A `CLinker` has two main methods: `downcallHandle` and `upcallStub`; the first takes > a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a > `MethodHandle` instance which can be used to call the target native symbol. The second takes an existing method handle, > and a `FunctionDescriptor` and returns a new `MemorySegment` corresponding to a code stub allocated by the VM which > acts as a trampoline from native code to the user-provided method handle. This is very useful for implementing upcalls. > * This class also contains the various layout constants that should be used by clients when describing native signatures > (e.g. `C_LONG` and friends); these layouts contain additional ABI classfication information (in the form of layout > attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the native call to take > place. > * Finally, this class provides some helper functions e.g. 
so that clients can convert Java strings into C strings and > back. > * `NativeScope` > * This is a helper class which allows clients to group together logically related allocations; that is, rather than > allocating separate memory segments using separate *try-with-resources* constructs, a `NativeScope` allows clients to > use a _single_ block, and allocate all the required segments there. This is not only a usability boost, but also a > performance boost, since not all allocation requests will be turned into `malloc` calls. > * `MemorySegment` > * Only one method added here - namely `handoff(NativeScope)` which allows a segment to be transferred onto an existing > native scope. > > ### Safety > > The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle. For > instance, the description of the native signature might be wrong (e.g. have too many arguments) - and the runtime has, > in the general case, no way to detect such mismatches. For these reasons, obtaining a `CLinker` instance is > a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as > it's the case for other restricted methods in the foreign memory API). ### Implementation changes The Java changes > associated with `LibraryLookup` are relatively straightforward; the only interesting thing to note here is that library > loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same restrictions which apply to > JNI library loading (e.g. same library cannot be loaded by different classloaders). As for `NativeScope` the changes > are again relatively straightforward; it is an API which sits neatly on top of the foreign memory access API, providing > some kind of allocation service which shares the same underlying memory segment(s), and turns an allocation request > into a segment slice, which is a much less expensive operation.
`NativeScope` comes in two variants: there are native > scopes for which the allocation size is known a priori, and native scopes which can grow - these two schemes are > implemented by two separate subclasses of `AbstractNativeScopeImpl`. Of course the bulk of the changes are to support > the `CLinker` downcall/upcall routines. These changes cut pretty deep into the JVM; I'll briefly summarize the goal of > some of these changes - for further details, Jorn has put together a detailed writeup which explains the rationale > behind the VM support, with some references to the code [5]. The main idea behind foreign linker is to infer, given a > Java method type (expressed as a `MethodType` instance) and the description of the signature of a native function > (expressed as a `FunctionDescriptor` instance) a _recipe_ that can be used to turn a Java call into the corresponding > native call targeting the requested native function. This inference scheme can be defined in a pretty straightforward > fashion by looking at the various ABI specifications (for instance, see [6] for the SysV ABI, which is the one used on > Linux/Mac). The various `CallArranger` classes, of which we have a flavor for each supported platform, do exactly that > kind of inference. For the inference process to work, we need to attach extra information to memory layouts; it is no > longer sufficient to know e.g. that a layout is 32/64 bits - we need to know whether it is meant to represent a > floating point value, or an integral value; this knowledge is required because floating point values are passed in different > registers by most ABIs. For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which > contain the required classification attributes (e.g. a `CLinker.TypeKind` enum value). The runtime extracts this > attribute, and performs classification accordingly.
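As a rough illustration of why the kind attribute matters, the sketch below classifies arguments the way a SysV x86-64 style convention would at its simplest: integral and pointer values take one of six integer registers, floating point values take one of eight vector registers, and anything left over goes to the stack. The names `ArgClassifier`, `Kind` and `Storage` are hypothetical and are not taken from the patch; structs, varargs and return values are ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the real CallArranger classes handle far more cases.
final class ArgClassifier {
    enum Kind { INT, LONG, POINTER, FLOAT, DOUBLE }
    enum Storage { INTEGER_REG, VECTOR_REG, STACK }

    static final int MAX_INTEGER_REGS = 6; // rdi, rsi, rdx, rcx, r8, r9
    static final int MAX_VECTOR_REGS = 8;  // xmm0 .. xmm7

    // Decides, argument by argument, which class of storage it lands in.
    static List<Storage> classify(List<Kind> args) {
        List<Storage> result = new ArrayList<>();
        int intRegs = 0;
        int vecRegs = 0;
        for (Kind k : args) {
            boolean fp = (k == Kind.FLOAT || k == Kind.DOUBLE);
            if (fp && vecRegs < MAX_VECTOR_REGS) {
                vecRegs++;
                result.add(Storage.VECTOR_REG);
            } else if (!fp && intRegs < MAX_INTEGER_REGS) {
                intRegs++;
                result.add(Storage.INTEGER_REG);
            } else {
                result.add(Storage.STACK); // registers of that class are used up
            }
        }
        return result;
    }
}
```

Note that a `long` and a `double` have the same size, so size alone could not drive this decision; only the kind tells the two apart.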
A native call is decomposed into a sequence of basic, primitive > operations, called `Binding` (see the great javadoc on the `Binding.java` class for more info). There are many such > bindings - for instance the `Move` binding is used to move a value into a specific machine register/stack slot. So, the > main job of the various `CallArranger` classes is to determine, given a Java `MethodType` and `FunctionDescriptor` > what is the set of bindings associated with the downcall/upcall. At the heart of the foreign linker support is the > `ProgrammableInvoker` class. This class effectively generates a `MethodHandle` which follows the steps described by the > various bindings obtained by `CallArranger`. There are actually various strategies to interpret these bindings - listed > below: > * basic interpreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see > `BindingInterpreter`), except for the `Move` bindings. For these bindings, the move is implemented by allocating > a *buffer* (whose size is ABI specific) and by moving all the lowered values into positions within this buffer. The > buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them into > their expected registers/stack slots (note that each position in the buffer corresponds to a different register). This > is the most general invocation mode, the most "customizable" one, but also the slowest - since for every call there is > some extra allocation which takes place. > > * specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based interpreter, > we generate a method handle chain which effectively interprets all the bindings (again, except `Move` ones). > > * intrinsified mode; this is typically used in combination with the specialized interpreted mode described above > (although it can also be used with the Java-based binding interpreter).
The goal here is to remove the buffer > allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (e.g. the set of > bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field), > then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go for an > intermediate buffer. This gives us back performance that is on par with JNI. > > For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there. We plan to > add support for intrinsified modes there as well, which should considerably boost performance (probably well beyond > what JNI can offer at the moment, since the upcall support in JNI is not very well optimized). Again, for further > reading on the internals of the foreign linker support, please refer to [5]. > #### Test changes > > Many new tests have been added to validate the foreign linker support; we have high level tests (see `StdLibTest`) > which aim at testing the linker from the perspective of code that clients could write. But we also have deeper > combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI > implementation. There are also some great tests (see the `callarranger` folder) which test the various `CallArranger`s > for all the possible platforms; these tests adopt more of a white-box approach - that is, instead of treating the > linker machinery as a black box and verifying that the support works by checking that the native call returned the results > we expected, these tests aim at checking that the set of bindings generated by the call arranger is correct. This also > means that we can test the classification logic for Windows, Mac and Linux regardless of the platform we're executing > on. Some additional microbenchmarks have been added to compare the performance of downcall/upcall with JNI.
[1] - > https://openjdk.java.net/jeps/389 [2] - https://openjdk.java.net/jeps/393 [3] - > https://git.openjdk.java.net/jdk/pull/548 [4] - > https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md [5] - > http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Use separate constants for native invoker code size ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/634/files - new: https://git.openjdk.java.net/jdk/pull/634/files/830c5cea..c595a8dd Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=03-04 Stats: 12 lines in 4 files changed: 8 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/634.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/634/head:pull/634 PR: https://git.openjdk.java.net/jdk/pull/634 From mcimadamore at openjdk.java.net Mon Oct 19 15:01:29 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Mon, 19 Oct 2020 15:01:29 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v6] In-Reply-To: References: Message-ID: > This patch contains the changes associated with the first incubation round of the foreign linker access API incubation > (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and > associated pull request [3]). > The main goal of this API is to provide a way to call native functions from Java code without the need of intermediate > JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. I suggest reading the > writeup [4] I put together few weeks ago, which illustrates what the foreign linker support is, and how it should be > used by clients. 
Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews. For this reason, > I'm attaching a webrev which contains only the differences between this PR and the memory access PR. I will be > periodically uploading new webrevs, as new iterations come out, to try and make the life of reviewers as simple as > possible. A big thank-you to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the hotspot changes you > see here, and without their help, the foreign linker support wouldn't be what it is today. As usual, a big thank-you to > Paul Sandoz, who provided many insights (often by trying the bits first hand). Thanks Maurizio > Webrev: > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev > > Javadoc: > > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html > > Specdiff (relative to [3]): > > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254232 > > > > ### API Changes > > The API changes are actually rather slim: > > * `LibraryLookup` > * This class allows clients to look up symbols in native libraries; the interface is fairly simple; you can load a library > by name, or absolute path, and then look up symbols on that library. > * `FunctionDescriptor` > * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory > layouts for the function arguments/return type. A function descriptor is used to describe the signature of a native > function. > * `CLinker` > * This is the real star of the show. A `CLinker` has two main methods: `downcallHandle` and `upcallStub`; the first takes > a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a > `MethodHandle` instance which can be used to call the target native symbol.
The second takes an existing method handle, > and a `FunctionDescriptor` and returns a new `MemorySegment` corresponding to a code stub allocated by the VM which > acts as a trampoline from native code to the user-provided method handle. This is very useful for implementing upcalls. > * This class also contains the various layout constants that should be used by clients when describing native signatures > (e.g. `C_LONG` and friends); these layouts contain additional ABI classification information (in the form of layout > attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the native call to take > place. > * Finally, this class provides some helper functions e.g. so that clients can convert Java strings into C strings and > back. > * `NativeScope` > * This is a helper class which allows clients to group together logically related allocations; that is, rather than > allocating separate memory segments using separate *try-with-resources* constructs, a `NativeScope` allows clients to > use a _single_ block, and allocate all the required segments there. This is not only a usability boost, but also a > performance boost, since not all allocation requests will be turned into `malloc` calls. > * `MemorySegment` > * Only one method added here - namely `handoff(NativeScope)` which allows a segment to be transferred onto an existing > native scope. > > ### Safety > > The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle. For > instance, the description of the native signature might be wrong (e.g. have too many arguments) - and the runtime has, > in the general case, no way to detect such mismatches. For these reasons, obtaining a `CLinker` instance is > a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as > is the case for other restricted methods in the foreign memory API).
### Implementation changes The Java changes > associated with `LibraryLookup` are relatively straightforward; the only interesting thing to note here is that library > loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same restrictions which apply to > JNI library loading (e.g. same library cannot be loaded by different classloaders). As for `NativeScope` the changes > are again relatively straightforward; it is an API which sits neatly on top of the foreign memory access API, providing > some kind of allocation service which shares the same underlying memory segment(s), and turns an allocation request > into a segment slice, which is a much less expensive operation. `NativeScope` comes in two variants: there are native > scopes for which the allocation size is known a priori, and native scopes which can grow - these two schemes are > implemented by two separate subclasses of `AbstractNativeScopeImpl`. Of course the bulk of the changes are to support > the `CLinker` downcall/upcall routines. These changes cut pretty deep into the JVM; I'll briefly summarize the goal of > some of these changes - for further details, Jorn has put together a detailed writeup which explains the rationale > behind the VM support, with some references to the code [5]. The main idea behind foreign linker is to infer, given a > Java method type (expressed as a `MethodType` instance) and the description of the signature of a native function > (expressed as a `FunctionDescriptor` instance) a _recipe_ that can be used to turn a Java call into the corresponding > native call targeting the requested native function. This inference scheme can be defined in a pretty straightforward > fashion by looking at the various ABI specifications (for instance, see [6] for the SysV ABI, which is the one used on > Linux/Mac). The various `CallArranger` classes, of which we have a flavor for each supported platform, do exactly that > kind of inference.
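The notion of a _recipe_ that is inferred once and then followed on every call can be sketched in plain Java. Everything here is an assumption made for illustration: `RecipeSketch`, its `Move` step and the 14-slot buffer layout (six integer slots followed by eight vector slots) are hypothetical, and the sketch derives the recipe from a `MethodType` alone, whereas the real linker also consults the `FunctionDescriptor`. Stack-passed arguments and non-numeric types are not modeled.

```java
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "infer a recipe once, apply it per call".
final class RecipeSketch {
    static final int INT_SLOTS = 6; // slots 0..5 stand for integer registers
    static final int VEC_SLOTS = 8; // slots 6..13 stand for vector registers

    // One step of the recipe: copy argument argIndex into buffer slot `slot`.
    static final class Move {
        final int argIndex;
        final int slot;

        Move(int argIndex, int slot) {
            this.argIndex = argIndex;
            this.slot = slot;
        }
    }

    // Inference: runs once per signature.
    static List<Move> infer(MethodType type) {
        List<Move> recipe = new ArrayList<>();
        int nextInt = 0;
        int nextVec = INT_SLOTS;
        for (int i = 0; i < type.parameterCount(); i++) {
            Class<?> p = type.parameterType(i);
            boolean fp = (p == float.class || p == double.class);
            int slot = fp ? nextVec++ : nextInt++;
            int limit = fp ? INT_SLOTS + VEC_SLOTS : INT_SLOTS;
            if (slot >= limit) {
                throw new UnsupportedOperationException("stack arguments are not modeled");
            }
            recipe.add(new Move(i, slot));
        }
        return recipe;
    }

    // Interpretation: runs on every call and fills a fresh buffer.
    static long[] apply(List<Move> recipe, Object... args) {
        long[] buffer = new long[INT_SLOTS + VEC_SLOTS];
        for (Move m : recipe) {
            Object v = args[m.argIndex];
            long bits;
            if (v instanceof Double) {
                bits = Double.doubleToRawLongBits((Double) v);
            } else if (v instanceof Float) {
                bits = Float.floatToRawIntBits((Float) v) & 0xFFFFFFFFL;
            } else {
                bits = ((Number) v).longValue(); // numeric arguments only
            }
            buffer[m.slot] = bits;
        }
        return buffer;
    }
}
```

The per-call buffer allocated in `apply` is the kind of overhead that a specialized or intrinsified strategy would aim to avoid when the recipe is a compile-time constant.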
For the inference process to work, we need to attach extra information to memory layouts; it is no > longer sufficient to know e.g. that a layout is 32/64 bits - we need to know whether it is meant to represent a > floating point value, or an integral value; this knowledge is required because floating point values are passed in different > registers by most ABIs. For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which > contain the required classification attributes (e.g. a `CLinker.TypeKind` enum value). The runtime extracts this > attribute, and performs classification accordingly. A native call is decomposed into a sequence of basic, primitive > operations, called `Binding` (see the great javadoc on the `Binding.java` class for more info). There are many such > bindings - for instance the `Move` binding is used to move a value into a specific machine register/stack slot. So, the > main job of the various `CallArranger` classes is to determine, given a Java `MethodType` and `FunctionDescriptor` > what is the set of bindings associated with the downcall/upcall. At the heart of the foreign linker support is the > `ProgrammableInvoker` class. This class effectively generates a `MethodHandle` which follows the steps described by the > various bindings obtained by `CallArranger`. There are actually various strategies to interpret these bindings - listed > below: > * basic interpreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see > `BindingInterpreter`), except for the `Move` bindings. For these bindings, the move is implemented by allocating > a *buffer* (whose size is ABI specific) and by moving all the lowered values into positions within this buffer.
The > buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them into > their expected registers/stack slots (note that each position in the buffer corresponds to a different register). This > is the most general invocation mode, the most "customizable" one, but also the slowest - since for every call there is > some extra allocation which takes place. > > * specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based interpreter, > we generate a method handle chain which effectively interprets all the bindings (again, except `Move` ones). > > * intrinsified mode; this is typically used in combination with the specialized interpreted mode described above > (although it can also be used with the Java-based binding interpreter). The goal here is to remove the buffer > allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (e.g. the set of > bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field), > then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go for an > intermediate buffer. This gives us back performance that is on par with JNI. > > For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there. We plan to > add support for intrinsified modes there as well, which should considerably boost performance (probably well beyond > what JNI can offer at the moment, since the upcall support in JNI is not very well optimized). Again, for further > reading on the internals of the foreign linker support, please refer to [5]. > #### Test changes > > Many new tests have been added to validate the foreign linker support; we have high level tests (see `StdLibTest`) > which aim at testing the linker from the perspective of code that clients could write.
But we also have deeper > combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI > implementation. There are also some great tests (see the `callarranger` folder) which test the various `CallArranger`s > for all the possible platforms; these tests adopt more of a white-box approach - that is, instead of treating the > linker machinery as a black box and verifying that the support works by checking that the native call returned the results > we expected, these tests aim at checking that the set of bindings generated by the call arranger is correct. This also > means that we can test the classification logic for Windows, Mac and Linux regardless of the platform we're executing > on. Some additional microbenchmarks have been added to compare the performance of downcall/upcall with JNI. [1] - > https://openjdk.java.net/jeps/389 [2] - https://openjdk.java.net/jeps/393 [3] - > https://git.openjdk.java.net/jdk/pull/548 [4] - > https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md [5] - > http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html Maurizio Cimadamore has updated the pull request incrementally with two additional commits since the last revision: - Fix incorrect capitalization in one copyright header - Update copyright years, and add classpath exception to files that were missing it ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/634/files - new: https://git.openjdk.java.net/jdk/pull/634/files/c595a8dd..7d6eadc7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=04-05 Stats: 117 lines in 56 files changed: 56 ins; 0 del; 61 mod Patch: https://git.openjdk.java.net/jdk/pull/634.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/634/head:pull/634 PR: https://git.openjdk.java.net/jdk/pull/634 From never at openjdk.java.net
Mon Oct 19 18:13:17 2020 From: never at openjdk.java.net (Tom Rodriguez) Date: Mon, 19 Oct 2020 18:13:17 GMT Subject: RFR: 8254793: [JVMCI] improve speculation encoding [v4] In-Reply-To: References: Message-ID: <27ZvtQC3PmOThzKCdwEUF_IBjfjERYr8nML5YZtcg58=.22c3b997-2de3-4380-b053-fe0c363bb975@github.com> On Mon, 19 Oct 2020 12:06:07 GMT, Doug Simon wrote: >> Marked as reviewed by dlong (Reviewer). > > @dean-long @vnkozlov I've made a few new changes based on @tkrodriguez 's comments. Please let me know if it still > looks good. I've also rerun mach5 testing on the new changes and everything passes. New version looks good. ------------- PR: https://git.openjdk.java.net/jdk/pull/667 From kvn at openjdk.java.net Mon Oct 19 18:17:18 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 19 Oct 2020 18:17:18 GMT Subject: RFR: 8253525: Implement getInstanceSize/sizeOf intrinsics [v3] In-Reply-To: References: Message-ID: On Mon, 19 Oct 2020 06:57:24 GMT, Aleksey Shipilev wrote: >> This is a fork off the SizeOf JEP, JDK-8249196. There is already the entry point in JDK that can use the intrinsic like >> this: `Instrumentation.getInstanceSize`. Therefore, we can implement the C1/C2 intrinsic now, hook it up to >> `Instrumentation`, and let the tools use that fast path today. With this patch, JOL is able to be close to >> `deepSizeOf` implementation from SizeOf JEP. >> Example performance improvements for sizing up a custom linked list: >> >> Benchmark (size) Mode Cnt Score Error Units >> >> # Default >> LinkedChainBench.linkedChain 1 avgt 5 705.835 ± 8.051 ns/op >> LinkedChainBench.linkedChain 10 avgt 5 3148.874 ± 37.856 ns/op >> LinkedChainBench.linkedChain 100 avgt 5 28693.256 ± 142.254 ns/op >> LinkedChainBench.linkedChain 1000 avgt 5 290161.590 ± 4594.631 ns/op >> >> # Instrumentation attached, no intrinsics >> LinkedChainBench.linkedChain 1 avgt 5 159.659 ± 19.238 ns/op >> LinkedChainBench.linkedChain 10 avgt 5 717.659 ±
22.540 ns/op >> LinkedChainBench.linkedChain 100 avgt 5 7739.394 ± 111.683 ns/op >> LinkedChainBench.linkedChain 1000 avgt 5 80724.238 ± 2887.794 ns/op >> >> # Instrumentation attached, new intrinsics >> LinkedChainBench.linkedChain 1 avgt 5 95.254 ± 0.808 ns/op >> LinkedChainBench.linkedChain 10 avgt 5 261.564 ± 8.524 ns/op >> LinkedChainBench.linkedChain 100 avgt 5 3367.192 ± 21.128 ns/op >> LinkedChainBench.linkedChain 1000 avgt 5 34148.851 ± 373.080 ns/op > > Aleksey Shipilev has refreshed the contents of this pull request, and previous commits have been removed. The > incremental views will show differences compared to the previous content of the PR. Always run graalunit testing with new intrinsics. You need to adjust Graal test: src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java ------------- Changes requested by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/650 From kvn at openjdk.java.net Mon Oct 19 18:22:11 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 19 Oct 2020 18:22:11 GMT Subject: RFR: 8254793: [JVMCI] improve speculation encoding [v5] In-Reply-To: References: Message-ID: On Sun, 18 Oct 2020 21:51:23 GMT, Doug Simon wrote: >> This PR changes the encoding of a `jdk.vm.ci.hotspot.HotSpotSpeculationLog.HotSpotSpeculation` from a long to an int. >> The `Thread::_pending_failed_speculation` field remains as a `jlong` since it is already exposed to JVMCI Java code >> already via VMStructs and this PR does not update its usage in Graal. > > Doug Simon has updated the pull request incrementally with two additional commits since the last revision: > > - require SHA1 algorithm to be present > - simplified changes such that only the length component of an encode speculation is reduced (to 5 bits) Still good. ------------- Marked as reviewed by kvn (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/667 From kvn at openjdk.java.net Mon Oct 19 18:36:15 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 19 Oct 2020 18:36:15 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v4] In-Reply-To: References: Message-ID: On Sun, 18 Oct 2020 18:39:18 GMT, Jatin Bhateja wrote: >> Summary: >> >> 1) Partial in-lining technique avoids call overhead penalty for small array copy operations with size less than 32 >> bytes. 2) At runtime, a conditional check based on copy length either calls an array-copy stub or executes an optimized >> instruction sequence using AVX-512 masked instructions emitted at the call site. 3) New runtime flag >> ArrayCopyPartialInlineSize=0/32(default)/64 bytes determines the maximum size for partial in-lining. 4) Based on the >> perf results seen in benchmarks currently partial in-lining is performed only for arraycopy involving sub-word types >> (bool/byte/char/short). Once PR-61 gets integrated we can extend this patch to cover all the primitive types. 
>> Performance Results: >> System : CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz >> Micros : test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java >> ArrayCopyPartialInlineSize : 32 >> >> JMH | Block Size | Baseline (ns/op) | Partial Inlining (ns/op) | Gain >> -- | -- | -- | -- | -- >> ArrayCopyAligned.testByte | 1 | 5.417 | 2.696 | 2.009272997 >> ArrayCopyAligned.testByte | 3 | 5.494 | 2.702 | 2.03330866 >> ArrayCopyAligned.testByte | 5 | 5.417 | 2.637 | 2.05422829 >> ArrayCopyAligned.testByte | 10 | 5.343 | 2.703 | 1.976692564 >> ArrayCopyAligned.testByte | 20 | 5.837 | 2.636 | 2.214339909 >> ArrayCopyAligned.testByte | 70 | 5.86 | 6 | 0.976666667 >> ArrayCopyAligned.testByte | 150 | 6.766 | 6.906 | 0.979727773 >> ArrayCopyAligned.testByte | 300 | 7.605 | 7.952 | 0.956363179 >> ArrayCopyAligned.testByte | 600 | 11.989 | 12.007 | 0.998500874 >> ArrayCopyAligned.testByte | 1200 | 16.447 | 16.585 | 0.991679228 >> ArrayCopyAligned.testChar | 1 | 5.02 | 2.828 | 1.775106082 >> ArrayCopyAligned.testChar | 3 | 5.129 | 2.762 | 1.85698769 >> ArrayCopyAligned.testChar | 5 | 5.041 | 2.762 | 1.82512672 >> ArrayCopyAligned.testChar | 10 | 5.716 | 2.762 | 2.069514844 >> ArrayCopyAligned.testChar | 20 | 5.111 | 5.399 | 0.946656788 >> ArrayCopyAligned.testChar | 70 | 6.271 | 6.242 | 1.004645947 >> ArrayCopyAligned.testChar | 150 | 7.45 | 7.599 | 0.980392157 >> ArrayCopyAligned.testChar | 300 | 9.904 | 10.112 | 0.97943038 >> ArrayCopyAligned.testChar | 600 | 17.131 | 17.167 | 0.997902953 >> ArrayCopyAligned.testChar | 1200 | 29.556 | 29.851 | 0.990117584 >> ArrayCopyUnalignedBoth.testByte | 1 | 5.419 | 2.702 | 2.005551443 >> ArrayCopyUnalignedBoth.testByte | 3 | 5.558 | 2.636 | 2.108497724 >> ArrayCopyUnalignedBoth.testByte | 5 | 5.43 | 2.636 | 2.059939302 >> ArrayCopyUnalignedBoth.testByte | 10 | 5.378 | 2.637 | 2.039438756 >> ArrayCopyUnalignedBoth.testByte | 20 | 5.914 | 2.636 | 2.243550836 >> ArrayCopyUnalignedBoth.testByte | 70 | 5.882 | 5.954 | 
0.987907289 >> ArrayCopyUnalignedBoth.testByte | 150 | 6.784 | 6.88 | 0.986046512 >> ArrayCopyUnalignedBoth.testByte | 300 | 7.635 | 7.968 | 0.958207831 >> ArrayCopyUnalignedBoth.testByte | 600 | 12.226 | 12.129 | 1.007997362 >> ArrayCopyUnalignedBoth.testByte | 1200 | 16.992 | 20.717 | 0.820195974 >> ArrayCopyUnalignedBoth.testChar | 1 | 5.019 | 2.828 | 1.774752475 >> ArrayCopyUnalignedBoth.testChar | 3 | 5.163 | 2.763 | 1.868621064 >> ArrayCopyUnalignedBoth.testChar | 5 | 5.042 | 2.827 | 1.783516095 >> ArrayCopyUnalignedBoth.testChar | 10 | 5.718 | 2.828 | 2.021923621 >> ArrayCopyUnalignedBoth.testChar | 20 | 5.111 | 5.404 | 0.945780903 >> ArrayCopyUnalignedBoth.testChar | 70 | 6.367 | 6.235 | 1.02117081 >> ArrayCopyUnalignedBoth.testChar | 150 | 7.367 | 8.269 | 0.890917886 >> ArrayCopyUnalignedBoth.testChar | 300 | 10.358 | 10.642 | 0.973313287 >> ArrayCopyUnalignedBoth.testChar | 600 | 20.84 | 17.522 | 1.189361945 >> ArrayCopyUnalignedBoth.testChar | 1200 | 31.895 | 31.892 | 1.000094067 >> ArrayCopyUnalignedDst.testByte | 1 | 5.455 | 2.637 | 2.068638604 >> ArrayCopyUnalignedDst.testByte | 3 | 5.562 | 2.702 | 2.058475204 >> ArrayCopyUnalignedDst.testByte | 5 | 5.427 | 2.702 | 2.008512213 >> ArrayCopyUnalignedDst.testByte | 10 | 5.367 | 2.696 | 1.990727003 >> ArrayCopyUnalignedDst.testByte | 20 | 5.839 | 2.637 | 2.214258627 >> ArrayCopyUnalignedDst.testByte | 70 | 5.888 | 5.968 | 0.986595174 >> ArrayCopyUnalignedDst.testByte | 150 | 6.785 | 6.773 | 1.001771741 >> ArrayCopyUnalignedDst.testByte | 300 | 7.606 | 7.972 | 0.954089313 >> ArrayCopyUnalignedDst.testByte | 600 | 11.986 | 21.195 | 0.565510734 >> ArrayCopyUnalignedDst.testByte | 1200 | 16.54 | 16.784 | 0.985462345 >> ArrayCopyUnalignedDst.testChar | 1 | 5.02 | 2.827 | 1.775733994 >> ArrayCopyUnalignedDst.testChar | 3 | 5.131 | 2.762 | 1.857711803 >> ArrayCopyUnalignedDst.testChar | 5 | 5.038 | 2.762 | 1.82404055 >> ArrayCopyUnalignedDst.testChar | 10 | 5.718 | 2.762 | 2.070238957 >> 
ArrayCopyUnalignedDst.testChar | 20 | 5.113 | 5.401 | 0.946676541 >> ArrayCopyUnalignedDst.testChar | 70 | 6.222 | 6.214 | 1.001287416 >> ArrayCopyUnalignedDst.testChar | 150 | 7.367 | 8.125 | 0.906707692 >> ArrayCopyUnalignedDst.testChar | 300 | 10.204 | 10.082 | 1.012100774 >> ArrayCopyUnalignedDst.testChar | 600 | 16.978 | 17.135 | 0.990837467 >> ArrayCopyUnalignedDst.testChar | 1200 | 32.351 | 31.996 | 1.011095137 >> ArrayCopyUnalignedSrc.testByte | 1 | 5.414 | 2.696 | 2.008160237 >> ArrayCopyUnalignedSrc.testByte | 3 | 5.494 | 2.637 | 2.083428138 >> ArrayCopyUnalignedSrc.testByte | 5 | 5.431 | 2.637 | 2.059537353 >> ArrayCopyUnalignedSrc.testByte | 10 | 5.344 | 2.703 | 1.977062523 >> ArrayCopyUnalignedSrc.testByte | 20 | 5.834 | 2.696 | 2.163946588 >> ArrayCopyUnalignedSrc.testByte | 70 | 5.883 | 6.009 | 0.979031453 >> ArrayCopyUnalignedSrc.testByte | 150 | 6.729 | 6.87 | 0.979475983 >> ArrayCopyUnalignedSrc.testByte | 300 | 7.603 | 7.97 | 0.953952321 >> ArrayCopyUnalignedSrc.testByte | 600 | 12.004 | 12.16 | 0.987171053 >> ArrayCopyUnalignedSrc.testByte | 1200 | 16.534 | 16.643 | 0.9934507 >> ArrayCopyUnalignedSrc.testChar | 1 | 5.021 | 2.762 | 1.81788559 >> ArrayCopyUnalignedSrc.testChar | 3 | 5.13 | 2.762 | 1.857349747 >> ArrayCopyUnalignedSrc.testChar | 5 | 5.042 | 2.827 | 1.783516095 >> ArrayCopyUnalignedSrc.testChar | 10 | 5.726 | 2.761 | 2.073886273 >> ArrayCopyUnalignedSrc.testChar | 20 | 5.112 | 5.401 | 0.94649139 >> ArrayCopyUnalignedSrc.testChar | 70 | 6.113 | 6.227 | 0.981692629 >> ArrayCopyUnalignedSrc.testChar | 150 | 7.493 | 7.888 | 0.949923935 >> ArrayCopyUnalignedSrc.testChar | 300 | 10.234 | 10.501 | 0.97457385 >> ArrayCopyUnalignedSrc.testChar | 600 | 17.175 | 17.142 | 1.001925096 >> ArrayCopyUnalignedSrc.testChar | 1200 | 31.926 | 31.987 | 0.998092975 >> >> Detailed Reports: >> Baseline : >> [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt) 
>> WithOpt : >> [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt) > > Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now > contains four commits: > - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 > - Replacing explicit type checks with existing type checking routines > - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 > - 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions. There is a regression after the 8252847 changes: 8254890. It should be fixed before we proceed with these changes. ------------- Changes requested by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/302 From rrich at openjdk.java.net Mon Oct 19 18:48:29 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Mon, 19 Oct 2020 18:48:29 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v14] In-Reply-To: References: Message-ID: > Hi, > > this is the continuation of the review of the implementation for: > > https://bugs.openjdk.java.net/browse/JDK-8227745 > https://bugs.openjdk.java.net/browse/JDK-8233915 > > It allows for JIT optimizations based on escape analysis even if JVMTI agents acquire capabilities to access references > to objects that are subject to such optimizations, e.g. scalar replacement. The implementation reverts such > optimizations just before access, very much as when switching from JIT compiled execution to the interpreter, aka > "deoptimization". Webrev.8 was the last one before the transition to Git/Github: > > http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ > > Thanks, Richard. Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 29 commits: - Merge branch 'master' into JDK-8227745 - Removed cross_modify_fence from JT::wait_for_object_deoptimization(). See JDK-8254264. - handle_special_runtime_exit_condition(): wait (blocked) for obj. deoptimization _before_ async ex. check. - Removed unused parameter from EscapeBarrierSuspendHandshake. - Adaptations to JDK-8254263: Remove special_runtime_exit_condition() check from ~ThreadInVMForHandshake() With JDK-8254263 the special_runtime_exit_condition() check was removed from ~ThreadInVMForHandshake() because now a thread never becomes unsafe when processing its own handshakes. EscapeBarrier uses handshakes to sync with the target thread for object deoptimization so we add a check for object deoptimization to ThreadSafepointState::handle_polling_page_exception(). In JavaThread::wait_for_object_deoptimization() we must check is_obj_deopt_suspend() again after handshake/safepoint processing because a handshake for obj. deopt suspend could have been processed. - Adaptions to lazy/concurrent thread stack processing for ZGC (JEP 376) - EATests.java improvements - Merge branch 'master' into JDK-8227745 - The constructor of StackFrameStream takes more parameters after JDK-8253180 - Merge branch 'master' into JDK-8227745 - ... 
and 19 more: https://git.openjdk.java.net/jdk/compare/1da28de8...8d09747b ------------- Changes: https://git.openjdk.java.net/jdk/pull/119/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=119&range=13 Stats: 5860 lines in 53 files changed: 5642 ins; 116 del; 102 mod Patch: https://git.openjdk.java.net/jdk/pull/119.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/119/head:pull/119 PR: https://git.openjdk.java.net/jdk/pull/119 From dnsimon at openjdk.java.net Mon Oct 19 18:57:16 2020 From: dnsimon at openjdk.java.net (Doug Simon) Date: Mon, 19 Oct 2020 18:57:16 GMT Subject: RFR: 8254793: [JVMCI] improve speculation encoding [v5] In-Reply-To: References: Message-ID: <-63qOvCVLehglWFDatNpzX2PT3z3L2fW8dwmL5PTQV0=.14c5a73a-e7c6-4cb6-89b4-83f6bba3396f@github.com> On Mon, 19 Oct 2020 18:19:17 GMT, Vladimir Kozlov wrote: >> Doug Simon has updated the pull request incrementally with two additional commits since the last revision: >> >> - require SHA1 algorithm to be present >> - simplified changes such that only the length component of an encode speculation is reduced (to 5 bits) > > Still good. Thanks for the reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/667 From dnsimon at openjdk.java.net Mon Oct 19 19:09:19 2020 From: dnsimon at openjdk.java.net (Doug Simon) Date: Mon, 19 Oct 2020 19:09:19 GMT Subject: Integrated: 8254793: [JVMCI] improve speculation encoding In-Reply-To: References: Message-ID: On Wed, 14 Oct 2020 22:01:19 GMT, Doug Simon wrote: > This PR changes the encoding of a `jdk.vm.ci.hotspot.HotSpotSpeculationLog.HotSpotSpeculation` from a long to an int. > The `Thread::_pending_failed_speculation` field remains as a `jlong` since it is already exposed to JVMCI Java code > already via VMStructs and this PR does not update its usage in Graal. This pull request has now been integrated. 
Changeset: f42c0322 Author: Doug Simon URL: https://git.openjdk.java.net/jdk/commit/f42c0322 Stats: 70 lines in 9 files changed: 50 ins; 1 del; 19 mod 8254793: [JVMCI] improve speculation encoding Reviewed-by: kvn, dlong, never ------------- PR: https://git.openjdk.java.net/jdk/pull/667 From rrich at openjdk.java.net Mon Oct 19 19:41:14 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Mon, 19 Oct 2020 19:41:14 GMT Subject: RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v10] In-Reply-To: References: Message-ID: On Wed, 14 Oct 2020 20:50:45 GMT, Richard Reingruber wrote: >> Good. > >> >> >> Good. > > Thanks for the review, Vladimir (@vnkozlov)! > I'm still (stress) testing adaptations to lazy/concurrent thread stack processing for ZGC. > --Richard. Thanks once more @TheRealMDoerr, @GoeLin, @dholmes-ora, @sspitsyn, @vnkozlov, @robehn, @pchilano for feedback and reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/119 From kvn at openjdk.java.net Mon Oct 19 20:23:12 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 19 Oct 2020 20:23:12 GMT Subject: RFR: 8254995: [x86] ControlWord::print(), rc/pc variables might not be initialized In-Reply-To: References: Message-ID: On Mon, 19 Oct 2020 10:20:18 GMT, Aleksey Shipilev wrote: > Static analyzers complain that in `ControlWord::print()`, `rc`/`pc` variables might not be initialized. This never > happens in practice, because `rounding_control()` and `precision_control()` return the good values. We can make it > cleaner to silence the compiler. Testing: > - [x] Linux x86_64 tier1 We prefer to use fatal() with printing unexpected value. ------------- Changes requested by kvn (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/731 From kvn at openjdk.java.net Mon Oct 19 20:29:18 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 19 Oct 2020 20:29:18 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v9] In-Reply-To: References: Message-ID: On Mon, 19 Oct 2020 11:14:22 GMT, Fei Yang wrote: >> Contributed-by: ard.biesheuvel at linaro.org, dongbo4 at huawei.com >> >> This added an intrinsic for SHA3 using aarch64 v8.2 SHA3 Crypto Extensions. >> Reference implementation for core SHA-3 transform using ARMv8.2 Crypto Extensions: >> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/crypto/sha3-ce-core.S?h=v5.4.52 >> >> Trivial adaptation in SHA3. implCompress is needed for the purpose of adding the intrinsic. >> For SHA3, we need to pass one extra parameter "digestLength" to the stub for the calculation of block size. >> "digestLength" is also used in for the EOR loop before keccak to differentiate different SHA3 variants. >> >> We added jtreg tests for SHA3 and used QEMU system emulator which supports SHA3 instructions to test the functionality. >> Patch passed jtreg tier1-3 tests with QEMU system emulator. >> Also verified with jtreg tier1-3 tests without SHA3 instructions on aarch64-linux-gnu and x86_64-linux-gnu, to make >> sure that there's no regression. >> We used one existing JMH test for performance test: test/micro/org/openjdk/bench/java/security/MessageDigests.java >> We measured the performance benefit with an aarch64 cycle-accurate simulator. >> Patch delivers 20% - 40% performance improvement depending on specific SHA3 digest length and size of the message. >> >> For now, this feature will not be enabled automatically for aarch64. We can auto-enable this when it is fully tested on >> real hardware. But for the above testing purposes, this is auto-enabled when the corresponding hardware feature is >> detected. 
> > Fei Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains > 11 commits: > - Merge master > - Remove unnecessary code changes in vm_version_aarch64.cpp > - Merge master > - Merge master > - Merge master > - Merge master > - Add sha3 instructions to cpu/aarch64/aarch64-asmtest.py and regenerate the test in assembler_aarch64.cpp:asm_check > - Rebase > - Merge master > - Fix trailing whitespace issue > - ... and 1 more: https://git.openjdk.java.net/jdk/compare/e9be2db7...05551701 Always run graalunit testing with new intrinsics. You need to adjust Graal test: src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java ------------- Changes requested by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/207 From vlivanov at openjdk.java.net Mon Oct 19 20:40:23 2020 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Mon, 19 Oct 2020 20:40:23 GMT Subject: RFR: 8255000: C2: Unify IGVN processing when loop opts are over Message-ID: <5hxd0p94QlH5wKSuHZOGXBfPxeuFUxTdLtIEH7Lxxxo=.e8fbd981-e866-486e-9e08-0c3376a2c529@github.com> There is a number of use cases when ideal nodes (either individual or even the whole classes) delay some transformations until loop optimizations are over. Unfortunately, there are multiple solutions in the code base with their pros and cons: either a custom per-node class logic (e.g., for range check dependent `CastII` or `Opaque4` nodes) or `Compile::major_progress() == 0` as a signal that loop opts are over. I propose to introduce a unified approach to reliably process nodes that require (or may benefit from) IGVN pass once loop opts are finally over and migrate existing use cases to it. 
After some experimentation, I decided not to rely on `Compile::major_progress()` because: * it's hard to reason about its properties (there are many places where it is adjusted); * attempts to verify its monotonicity using asserts triggered too many sporadic failures. So, I wasn't persuaded that `Compile::major_progress() == 0` is reliable enough and introduced a dedicated flag (`Compile::post_loop_opts_phase()`) which signals that loop opts are over. (The patch - 69a93d4 commit - is on top of 8255026 cleanup which is reviewed separately.) Testing: tier1-tier5 ------------- Commit messages: - Unify post loop opts IGVN - 8255026: C2: Miscellaneous cleanups in Compile and PhaseIdealLoop code Changes: https://git.openjdk.java.net/jdk/pull/751/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=751&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255000 Stats: 451 lines in 15 files changed: 111 ins; 156 del; 184 mod Patch: https://git.openjdk.java.net/jdk/pull/751.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/751/head:pull/751 PR: https://git.openjdk.java.net/jdk/pull/751 From dholmes at openjdk.java.net Mon Oct 19 22:03:21 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 19 Oct 2020 22:03:21 GMT Subject: RFR: 8249675: x86: frequency extraction from cpu brand string is incomplete In-Reply-To: <4cLDGQwOXZqws-dr9QulUFnxYJ646Y_ObUzfjEfSfD4=.4f10936d-286c-406e-b72d-4849e3d4a5a6@github.com> References: <4cLDGQwOXZqws-dr9QulUFnxYJ646Y_ObUzfjEfSfD4=.4f10936d-286c-406e-b72d-4849e3d4a5a6@github.com> Message-ID: On Mon, 19 Oct 2020 12:12:51 GMT, Markus Grönlund wrote: > Greetings, > > `VM_Version_Ext::max_qualified_cpu_freq_from_brand_string()` attempts to extract the CPU frequency, by inspecting the > CPU brand string (as per the document "Intel® Processor Identification and the CPUID Instruction. Application note 485 > May 2012").
There is a bug with the current implementation, because it is naive in using the following construct: > > const char* Hz_location = strchr(brand_string, 'H'); > > This likely works for most CPU models / brands, but not when the brand string is for example of the following form: > > "Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz" > > The 'H' in "9850H" will be matched, but since there is no 'z' in the next position, the code will fall through and > report a frequency of 0. > Testing: > - [x] jdk_jfr > - [x] debug verification for brand string "Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz" > > Comment: The doc link is stale and therefore removed. No stable links for the Application note 485 versions could be > located, hence the doc is here referenced by name instead. > Thanks > Markus The changes seem fine, but couldn't you just use strrchr() to find the last H in the string? ------------- PR: https://git.openjdk.java.net/jdk/pull/736 From dholmes at openjdk.java.net Tue Oct 20 02:31:11 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 20 Oct 2020 02:31:11 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended In-Reply-To: References: Message-ID: <0UtoOZMbgTY-1hGACS1n5swvRCa9za_e3uu-XEVOidM=.85baf76c-e91b-492a-8841-10d9e49e3ec4@github.com> On Mon, 19 Oct 2020 09:59:34 GMT, Robbin Ehn wrote: > The main point of this change-set is to make it easier to implement S/R on top of handshakes. > Which is a prerequisite for removing _suspend_flag (which duplicates the handshake functionality). > But we also remove some complicated S/R methods. > > We basically just put in everything in the handshake closure, so the diff just looks much worse than what it is. > > TraceSuspendDebugBits have an ifdef, but in both cases it now just returns. > But I was unsure if I should remove now or when is_ext_suspend_completed() is removed. > > Passes multiple t1-5 runs, locally it passes many jck:vm/nsk_jvmti/nsk_jdi/jdk-jdi runs.
Hi Robbin, IIUC the "waiting" part of `wait_for_ext_suspend_completion` is now implicitly handled in the handshake - correct? Overall this seems like a great simplification. A few minor comments below. Thanks, David src/hotspot/share/runtime/thread.cpp line 579: > 577: // That trace is very chatty. > 578: return; > 579: #else Without the !is_wait check none of the code below line 583 is reachable now. I would remove this now. src/hotspot/share/prims/jvmtiEnv.cpp line 1648: > 1646: op.doit(java_thread, true); > 1647: } else { > 1648: Handshake::execute(&op, java_thread); This pattern is repeated a lot - we should be able to incorporate it into the op itself by passing in `java_thread`. src/hotspot/share/prims/jvmtiEnvBase.cpp line 1525: > 1523: Thread* current_thread = Thread::current(); > 1524: HandleMark hm(current_thread); > 1525: JavaThread* java_thread = target->as_Java_thread(); Contrast with the same three lines at L1390 - we should use the same boilerplate in each `doit`. And ideally refactor into some shared code somewhere (future RFE). ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/729 From rehn at openjdk.java.net Tue Oct 20 06:18:13 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 20 Oct 2020 06:18:13 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended In-Reply-To: <0UtoOZMbgTY-1hGACS1n5swvRCa9za_e3uu-XEVOidM=.85baf76c-e91b-492a-8841-10d9e49e3ec4@github.com> References: <0UtoOZMbgTY-1hGACS1n5swvRCa9za_e3uu-XEVOidM=.85baf76c-e91b-492a-8841-10d9e49e3ec4@github.com> Message-ID: On Tue, 20 Oct 2020 02:14:56 GMT, David Holmes wrote: >> The main point of this change-set is to make it easier to implement S/R on top of handshakes. >> Which is a prerequisite for removing _suspend_flag (which duplicates the handshake functionality). >> But we also remove some complicated S/R methods. 
>> >> We basically just put in everything in the handshake closure, so the diff just looks much worse than what it is. >> >> TraceSuspendDebugBits have an ifdef, but in both cases it now just returns. >> But I was unsure if I should remove now or when is_ext_suspend_completed() is removed. >> >> Passes multiple t1-5 runs, locally it passes many jck:vm/nsk_jvmti/nsk_jdi/jdk-jdi runs. > > src/hotspot/share/runtime/thread.cpp line 579: > >> 577: // That trace is very chatty. >> 578: return; >> 579: #else > > Without the !is_wait check none of the code below line 583 is reachable now. I would remove this now. Since only the destructor contains any actual functionality, which is now unreachable, should I remove the entire TraceSuspendDebugBits? ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From thartmann at openjdk.java.net Tue Oct 20 06:27:19 2020 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Tue, 20 Oct 2020 06:27:19 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v4] In-Reply-To: References: Message-ID: On Mon, 19 Oct 2020 18:33:22 GMT, Vladimir Kozlov wrote: >> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now >> contains four commits: >> - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 >> - Replacing explicit type checks with existing type checking routines >> - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 >> - 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions. > > There is regression after 8252847 changes: 8254890. > It should be fixed before we proceed with these changes. [JDK-8254890](https://bugs.openjdk.java.net/browse/JDK-8254890) is a closed bug because it contains confidential information. I've filed [JDK-8255039](https://bugs.openjdk.java.net/browse/JDK-8255039). 
------------- PR: https://git.openjdk.java.net/jdk/pull/302 From david.holmes at oracle.com Tue Oct 20 06:36:48 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 20 Oct 2020 16:36:48 +1000 Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended In-Reply-To: References: <0UtoOZMbgTY-1hGACS1n5swvRCa9za_e3uu-XEVOidM=.85baf76c-e91b-492a-8841-10d9e49e3ec4@github.com> Message-ID: <19c09caa-fb95-4508-46a0-15be2b4ad1fc@oracle.com> On 20/10/2020 4:18 pm, Robbin Ehn wrote: > On Tue, 20 Oct 2020 02:14:56 GMT, David Holmes wrote: > >>> The main point of this change-set is to make it easier to implement S/R on top of handshakes. >>> Which is a prerequisite for removing _suspend_flag (which duplicates the handshake functionality). >>> But we also remove some complicated S/R methods. >>> >>> We basically just put in everything in the handshake closure, so the diff just looks much worse than what it is. >>> >>> TraceSuspendDebugBits have an ifdef, but in both cases it now just returns. >>> But I was unsure if I should remove now or when is_ext_suspend_completed() is removed. >>> >>> Passes multiple t1-5 runs, locally it passes many jck:vm/nsk_jvmti/nsk_jdi/jdk-jdi runs. >> >> src/hotspot/share/runtime/thread.cpp line 579: >> >>> 577: // That trace is very chatty. >>> 578: return; >>> 579: #else >> >> Without the !is_wait check none of the code below line 583 is reachable now. I would remove this now. > > Since only the destructor contains any actual functionality, which is now unreachable, should I remove the entire > TraceSuspendDebugBits? Go for it! 
:) Davids > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/729 > From rehn at openjdk.java.net Tue Oct 20 07:15:25 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 20 Oct 2020 07:15:25 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended In-Reply-To: <0UtoOZMbgTY-1hGACS1n5swvRCa9za_e3uu-XEVOidM=.85baf76c-e91b-492a-8841-10d9e49e3ec4@github.com> References: <0UtoOZMbgTY-1hGACS1n5swvRCa9za_e3uu-XEVOidM=.85baf76c-e91b-492a-8841-10d9e49e3ec4@github.com> Message-ID: On Tue, 20 Oct 2020 02:28:35 GMT, David Holmes wrote: >> The main point of this change-set is to make it easier to implement S/R on top of handshakes. >> Which is a prerequisite for removing _suspend_flag (which duplicates the handshake functionality). >> But we also remove some complicated S/R methods. >> >> We basically just put in everything in the handshake closure, so the diff just looks much worse than what it is. >> >> TraceSuspendDebugBits have an ifdef, but in both cases it now just returns. >> But I was unsure if I should remove now or when is_ext_suspend_completed() is removed. >> >> Passes multiple t1-5 runs, locally it passes many jck:vm/nsk_jvmti/nsk_jdi/jdk-jdi runs. > > Hi Robbin, > > IIUC the "waiting" part of `wait_for_ext_suspend_completion` is now implicitly handled in the handshake - correct? > > Overall this seems like a great simplification. > > A few minor comments below. > > Thanks, > David Hi David, thanks for having a look. > Hi Robbin, > > IIUC the "waiting" part of `wait_for_ext_suspend_completion` is now implicitly handled in the handshake - correct? A suspended Java thread may never transition back to java, never execute any more bytecodes. The old 'fully' suspended gives more guarantees which we need to do some operations. When we are in a handshake, the handshake gives us those guarantees, so it is enough that thread is considered suspended. 
So the answer is we wait until we are allowed to execute those operations (thread handshake safe), which is not identical with fully suspended. But fully suspended is an implementation detail, the agent only knows about suspended. > > Overall this seems like a great simplification. > > A few minor comments below. > > Thanks, > David ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From rehn at openjdk.java.net Tue Oct 20 07:20:29 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 20 Oct 2020 07:20:29 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended In-Reply-To: <0UtoOZMbgTY-1hGACS1n5swvRCa9za_e3uu-XEVOidM=.85baf76c-e91b-492a-8841-10d9e49e3ec4@github.com> References: <0UtoOZMbgTY-1hGACS1n5swvRCa9za_e3uu-XEVOidM=.85baf76c-e91b-492a-8841-10d9e49e3ec4@github.com> Message-ID: On Tue, 20 Oct 2020 02:22:15 GMT, David Holmes wrote: >> The main point of this change-set is to make it easier to implement S/R on top of handshakes. >> Which is a prerequisite for removing _suspend_flag (which duplicates the handshake functionality). >> But we also remove some complicated S/R methods. >> >> We basically just put in everything in the handshake closure, so the diff just looks much worse than what it is. >> >> TraceSuspendDebugBits have an ifdef, but in both cases it now just returns. >> But I was unsure if I should remove now or when is_ext_suspend_completed() is removed. >> >> Passes multiple t1-5 runs, locally it passes many jck:vm/nsk_jvmti/nsk_jdi/jdk-jdi runs. > > src/hotspot/share/prims/jvmtiEnv.cpp line 1648: > >> 1646: op.doit(java_thread, true); >> 1647: } else { >> 1648: Handshake::execute(&op, java_thread); > > This pattern is repeated a lot - we should be able to incorporate it into the op itself by passing in `java_thread`. My suggestion here is that we fix this in Handshake::execute() in separate RFE. 
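For readers following the "direct call if self, handshake otherwise" pattern discussed here, a minimal sketch of what folding it into one helper could look like. This is not HotSpot code: `JavaThread` is a stand-in struct, and `run_on_thread` and its `handshake` callback (standing in for `Handshake::execute`) are hypothetical names used only for illustration.

```cpp
#include <functional>

// Stand-in for HotSpot's JavaThread; only identity matters in this sketch.
struct JavaThread { int id; };

using ThreadOp = std::function<void(JavaThread*)>;

// Hypothetical helper: runs `op` directly when the target is the calling
// thread, otherwise passes it to `handshake`, which stands in for
// Handshake::execute(). Returns true if the operation ran directly.
static bool run_on_thread(JavaThread* target, JavaThread* current,
                          const ThreadOp& op,
                          const std::function<void(const ThreadOp&, JavaThread*)>& handshake) {
  if (target == current) {
    op(target);            // self case: no handshake needed
    return true;
  }
  handshake(op, target);   // remote case: let the handshake machinery run it
  return false;
}
```

Whether such a check belongs in the operation or in the handshake entry point is exactly the open question in this thread; the sketch only shows that each call site would shrink to a single call.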
> src/hotspot/share/prims/jvmtiEnvBase.cpp line 1525: > >> 1523: Thread* current_thread = Thread::current(); >> 1524: HandleMark hm(current_thread); >> 1525: JavaThread* java_thread = target->as_Java_thread(); > > Contrast with the same three lines at L1390 - we should use the same boilerplate in each `doit`. And ideally refactor > into some shared code somewhere (future RFE). Yes, that would be good. ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From david.holmes at oracle.com Tue Oct 20 07:26:27 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 20 Oct 2020 17:26:27 +1000 Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended In-Reply-To: References: <0UtoOZMbgTY-1hGACS1n5swvRCa9za_e3uu-XEVOidM=.85baf76c-e91b-492a-8841-10d9e49e3ec4@github.com> Message-ID: <4dd48ddc-d4cf-a86b-4a36-f6c707db422a@oracle.com> On 20/10/2020 5:20 pm, Robbin Ehn wrote: > On Tue, 20 Oct 2020 02:22:15 GMT, David Holmes wrote: > >>> The main point of this change-set is to make it easier to implement S/R on top of handshakes. >>> Which is a prerequisite for removing _suspend_flag (which duplicates the handshake functionality). >>> But we also remove some complicated S/R methods. >>> >>> We basically just put in everything in the handshake closure, so the diff just looks much worse than what it is. >>> >>> TraceSuspendDebugBits have an ifdef, but in both cases it now just returns. >>> But I was unsure if I should remove now or when is_ext_suspend_completed() is removed. >>> >>> Passes multiple t1-5 runs, locally it passes many jck:vm/nsk_jvmti/nsk_jdi/jdk-jdi runs. >> >> src/hotspot/share/prims/jvmtiEnv.cpp line 1648: >> >>> 1646: op.doit(java_thread, true); >>> 1647: } else { >>> 1648: Handshake::execute(&op, java_thread); >> >> This pattern is repeated a lot - we should be able to incorporate it into the op itself by passing in `java_thread`. > > My suggestion here is that we fix this in Handshake::execute() in separate RFE. 
Not clear to me this is so generic that it applies to Handshake::execute rather than being part of the operation. ?? David ----- >> src/hotspot/share/prims/jvmtiEnvBase.cpp line 1525: >> >>> 1523: Thread* current_thread = Thread::current(); >>> 1524: HandleMark hm(current_thread); >>> 1525: JavaThread* java_thread = target->as_Java_thread(); >> >> Contrast with the same three lines at L1390 - we should use the same boilerplate in each `doit`. And ideally refactor >> into some shared code somewhere (future RFE). > > Yes, that would be good. > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/729 > From fyang at openjdk.java.net Tue Oct 20 08:07:24 2020 From: fyang at openjdk.java.net (Fei Yang) Date: Tue, 20 Oct 2020 08:07:24 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v9] In-Reply-To: References: Message-ID: On Mon, 19 Oct 2020 20:26:22 GMT, Vladimir Kozlov wrote: > Always run graalunit testing with new intrinsics. You need to adjust Graal test: > src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java Thanks for looking at this. 
We did run graalunit testing and added the following change in our first commit: diff --git a/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java b/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java index f0e17947460..8f3f4ed9323 100644 --- a/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java +++ b/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java @@ -601,6 +601,7 @@ public class CheckGraalIntrinsics extends GraalTest { if (!config.useSHA512Intrinsics()) { add(ignore, "sun/security/provider/SHA5." + shaCompressName + "([BI)V"); } + add(toBeInvestigated, "sun/security/provider/SHA3." + shaCompressName + "([BI)V"); } ------------- PR: https://git.openjdk.java.net/jdk/pull/207 From shade at openjdk.java.net Tue Oct 20 08:10:18 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 20 Oct 2020 08:10:18 GMT Subject: RFR: 8255041: Zero: remove old JSR 292 support leftovers Message-ID: JDK-8000780 removed `ZeroInterpreter::method_handle_entry`, but left its helpers around. These have no uses, and can be eliminated. Attention @rkennke, who did the JDK-8000780 a while ago. 
Testing: - [x] Linux x86_64 fastdebug zero images - [ ] Linux x86_64 release zero bootcycle-images ------------- Commit messages: - 8255041: Zero: remove old JSR 292 support leftovers Changes: https://git.openjdk.java.net/jdk/pull/758/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=758&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255041 Stats: 76 lines in 3 files changed: 0 ins; 76 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/758.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/758/head:pull/758 PR: https://git.openjdk.java.net/jdk/pull/758 From shade at openjdk.java.net Tue Oct 20 09:07:24 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 20 Oct 2020 09:07:24 GMT Subject: RFR: 8254995: [x86] ControlWord::print(), rc/pc variables might not be initialized [v2] In-Reply-To: References: Message-ID: > Static analyzers complain that in `ControlWord::print()`, `rc`/`pc` variables might not be initialized. This never > happens in practice, because `rounding_control()` and `precision_control()` return the good values. We can make it > cleaner to silence the compiler. 
Testing: > - [x] Linux x86_64 tier1 Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Use fatal(), initialize to NULL and add comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/731/files - new: https://git.openjdk.java.net/jdk/pull/731/files/dc6c1187..eb1705ac Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=731&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=731&range=00-01 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/731.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/731/head:pull/731 PR: https://git.openjdk.java.net/jdk/pull/731 From mgronlun at openjdk.java.net Tue Oct 20 09:36:07 2020 From: mgronlun at openjdk.java.net (Markus Grönlund) Date: Tue, 20 Oct 2020 09:36:07 GMT Subject: RFR: 8249675: x86: frequency extraction from cpu brand string is incomplete In-Reply-To: References: <4cLDGQwOXZqws-dr9QulUFnxYJ646Y_ObUzfjEfSfD4=.4f10936d-286c-406e-b72d-4849e3d4a5a6@github.com> Message-ID: On Mon, 19 Oct 2020 22:00:07 GMT, David Holmes wrote: > > > The changes seem fine, but couldn't you just use strrchr() to find the last H in the string? Hi David, Yes, that was my immediate change. I thought of two counterarguments, admittedly weak, that made me decide against it: 1. If a suffix is introduced to the brand string, it could invalidate strrchr() usage. 2. If the same seek strategy is used as described in Application note 485 (left-to-right two char matching), it is easier to map the descriptions to the code.
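To make the left-to-right two-character matching concrete, here is a minimal standalone sketch. It is not the code in the patch: `freq_from_brand_string` is a hypothetical name, and the sketch assumes the frequency appears as a decimal number followed directly by `MHz`, `GHz` or `THz`.

```cpp
#include <cctype>
#include <cmath>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper, not the HotSpot implementation. Scans the brand string
// left to right for 'H' immediately followed by 'z', then reads the multiplier
// letter before it and the decimal number before that.
// Returns the frequency in Hz, or 0 if nothing parsable is found.
static int64_t freq_from_brand_string(const char* brand) {
  if (brand == nullptr) return 0;

  // Two-character match: an 'H' that is not followed by 'z' (as in "9850H")
  // is skipped instead of ending the search.
  const char* hz = nullptr;
  for (const char* p = brand; p[0] != '\0'; ++p) {
    if (p[0] == 'H' && p[1] == 'z') { hz = p; break; }
  }
  if (hz == nullptr || hz == brand) return 0;

  double multiplier = 0.0;
  switch (hz[-1]) {
    case 'M': multiplier = 1e6;  break;
    case 'G': multiplier = 1e9;  break;
    case 'T': multiplier = 1e12; break;
    default:  return 0;
  }

  // Walk back over the digits and decimal point preceding the multiplier.
  const char* unit = hz - 1;
  const char* num = unit;
  while (num > brand &&
         (std::isdigit(static_cast<unsigned char>(num[-1])) || num[-1] == '.')) {
    --num;
  }
  if (num == unit) return 0;  // no number in front of the unit

  return static_cast<int64_t>(std::llround(std::strtod(num, nullptr) * multiplier));
}
```

With the brand string from the bug report, "Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz", the 'H' in "9850H" is passed over and the match lands on "GHz".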
Thanks Markus ------------- PR: https://git.openjdk.java.net/jdk/pull/736 From shade at openjdk.java.net Tue Oct 20 09:45:20 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 20 Oct 2020 09:45:20 GMT Subject: RFR: 8253525: Implement getInstanceSize/sizeOf intrinsics [v3] In-Reply-To: References: Message-ID: On Mon, 19 Oct 2020 18:13:59 GMT, Vladimir Kozlov wrote: >> Aleksey Shipilev has refreshed the contents of this pull request, and previous commits have been removed. The >> incremental views will show differences compared to the previous content of the PR. > > Always run graalunit testing with new intrinsics. > You need to adjust Graal test: > src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java I just ran `CONF=linux-x86_64-server-fastdebug make clean run-test TEST=compiler/graalunit` without problems even without the `CheckGraalIntrinsics.java` changes. Does the test actually work? ------------- PR: https://git.openjdk.java.net/jdk/pull/650 From rehn at openjdk.java.net Tue Oct 20 09:56:12 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 20 Oct 2020 09:56:12 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended In-Reply-To: References: <0UtoOZMbgTY-1hGACS1n5swvRCa9za_e3uu-XEVOidM=.85baf76c-e91b-492a-8841-10d9e49e3ec4@github.com> Message-ID: On Tue, 20 Oct 2020 07:12:01 GMT, Robbin Ehn wrote: >> Hi Robbin, >> >> IIUC the "waiting" part of `wait_for_ext_suspend_completion` is now implicitly handled in the handshake - correct? >> >> Overall this seems like a great simplification. >> >> A few minor comments below. >> >> Thanks, >> David > > Hi David, thanks for having a look. > >> Hi Robbin, >> >> IIUC the "waiting" part of `wait_for_ext_suspend_completion` is now implicitly handled in the handshake - correct? > > A suspended Java thread may never transition back to java, never execute any more bytecodes. 
> The old 'fully' suspended gives more guarantees which we need to do some operations. > When we are in a handshake, the handshake gives us those guarantees, so it is enough that thread is considered > suspended. > So the answer is we wait until we are allowed to execute those operations (thread handshake safe), which is not > identical with fully suspended. But fully suspended is an implementation detail, the agent only knows about suspended. >> >> Overall this seems like a great simplification. >> >> A few minor comments below. >> >> Thanks, >> David > Not clear to me this is so generic that it applies to Handshake::execute > rather than being part of the operation. ?? We only have two other cases (other than JVM TI where we always do this). This one, async exception: if (thread == receiver) { // Exception is getting thrown at self so no VM_Operation needed. THROW_OOP(java_throwable); } else { // Use a VM_Operation to throw the exception. Thread::send_async_exception(java_thread, java_throwable); } And biased locking revoke where we don't do this, because it makes no sense revoking a lock if you have the bias :) In that case we could assert we never try to revoke a self owned bias. ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From stefank at openjdk.java.net Tue Oct 20 10:22:18 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 20 Oct 2020 10:22:18 GMT Subject: RFR: 8255047: Add HotSpot flag to use with debuggers that restrict the CPU… Message-ID: <_0YMtDNWAGXhg-_EyMf0c8GoNE4cf4wf1hpyr9j-sNs=.2ec5bf8b-023e-4bbd-9692-f74cd6c97f8d@github.com> Some debuggers don't work well with many threads, and/or incompletely restrict the number of used CPUs to one. This flag is intended as a catch-all for HotSpot developers (not available in product builds) to allow us to more easily use those debuggers.
Currently, the proposal is to let the flag fix a few things:
1) Turn down the number of JVM threads
2) Turn off NUMA
3) Force processor_id() to return 0 instead of values above processor_count()

(1) is purely ergonomics: gdb, rr, and valgrind are faster and seem to work much better with fewer threads. The values
would still be overridable by devs.

(2) and (3) deal with the fact that some debuggers change the reported processor count, but don't change the processor
ids returned by sched_getcpu. This causes problems for ZGC and NUMA, which both assume that they can rely on
os::processor_id() < os::processor_count().

The current proposed flag name is -XX:+LimitedCPUsDebugging. I'm not entirely happy with that name, but I haven't been
able to find a better name.

An alternative to having one flag is to split this into two flags, and maybe that would solve the naming problem.
However, the usability aspects will be worse.

If we can't find a suitable name, I'd rather introduce a flag called:
-XX:DebuggerWorkarounds or -XX:DebuggerWorkaround1

Any suggestions / opinions? I really do want to at least fix the (2, 3) problem, because I keep having to add this to
every single branch I'm working on.
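
Item (3) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the pull request: `toy_processor_id` and its parameters are hypothetical names, and the boolean stands in for the proposed develop flag. The sketch treats any id that is not below the processor count as out of range, which matches the `os::processor_id() < os::processor_count()` invariant mentioned above.

```cpp
#include <cassert>

// Hypothetical helper for illustration only; not HotSpot code.
//   raw_id:            id as reported by the OS (e.g. via sched_getcpu)
//   processor_count:   count the VM believes it has (a debugger may lower it)
//   workaround_active: stands in for the proposed develop flag
inline int toy_processor_id(int raw_id, int processor_count, bool workaround_active) {
  assert(processor_count > 0 && "need at least one processor");
  if (workaround_active && raw_id >= processor_count) {
    // Out-of-range id: fall back to 0 so that id < count holds again.
    return 0;
  }
  return raw_id;
}
```

With the workaround off, the raw id is passed through unchanged, so code that indexes per-CPU data by this id would still see the out-of-range value.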
-------------

Commit messages:
 - 8255047: Add HotSpot flag to use with debuggers that restrict the CPU count

Changes: https://git.openjdk.java.net/jdk/pull/763/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=763&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8255047
  Stats: 27 lines in 4 files changed: 25 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/763.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/763/head:pull/763

PR: https://git.openjdk.java.net/jdk/pull/763

From shade at openjdk.java.net Tue Oct 20 10:24:35 2020
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 20 Oct 2020 10:24:35 GMT
Subject: RFR: 8253525: Implement getInstanceSize/sizeOf intrinsics [v4]
In-Reply-To:
References:
Message-ID:

> This is a fork off the SizeOf JEP, JDK-8249196. There is already the entry point in JDK that can use the intrinsic like
> this: `Instrumentation.getInstanceSize`. Therefore, we can implement the C1/C2 intrinsic now, hook it up to
> `Instrumentation`, and let the tools use that fast path today. With this patch, JOL is able to be close to the
> `deepSizeOf` implementation from the SizeOf JEP.
> Example performance improvements for sizing up a custom linked list:
>
> Benchmark                     (size)  Mode  Cnt       Score      Error  Units
>
> # Default
> LinkedChainBench.linkedChain       1  avgt    5     705.835 ±    8.051  ns/op
> LinkedChainBench.linkedChain      10  avgt    5    3148.874 ±   37.856  ns/op
> LinkedChainBench.linkedChain     100  avgt    5   28693.256 ±  142.254  ns/op
> LinkedChainBench.linkedChain    1000  avgt    5  290161.590 ± 4594.631  ns/op
>
> # Instrumentation attached, no intrinsics
> LinkedChainBench.linkedChain       1  avgt    5     159.659 ±   19.238  ns/op
> LinkedChainBench.linkedChain      10  avgt    5     717.659 ±   22.540  ns/op
> LinkedChainBench.linkedChain     100  avgt    5    7739.394 ±  111.683  ns/op
> LinkedChainBench.linkedChain    1000  avgt    5   80724.238 ± 2887.794  ns/op
>
> # Instrumentation attached, new intrinsics
> LinkedChainBench.linkedChain       1  avgt    5      95.254 ±    0.808  ns/op
> LinkedChainBench.linkedChain      10  avgt    5     261.564 ±    8.524  ns/op
> LinkedChainBench.linkedChain     100  avgt    5    3367.192 ±   21.128  ns/op
> LinkedChainBench.linkedChain    1000  avgt    5   34148.851 ±  373.080  ns/op

Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:

  Add new intrinsics to toBeInvestigated list in CheckGraalIntrinsics.java

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/650/files
  - new: https://git.openjdk.java.net/jdk/pull/650/files/6160f6a8..132f2c50

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=650&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=650&range=02-03

  Stats: 5 lines in 1 file changed: 5 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/650.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/650/head:pull/650

PR: https://git.openjdk.java.net/jdk/pull/650

From shade at openjdk.java.net Tue Oct 20 10:24:35 2020
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 20 Oct 2020 10:24:35 GMT
Subject: RFR: 8253525: Implement getInstanceSize/sizeOf intrinsics [v3]
In-Reply-To:
References:
Message-ID:

On Mon, 19 Oct 2020 18:13:59 GMT, Vladimir Kozlov wrote:

>> Aleksey Shipilev has refreshed the contents of this pull request, and previous commits have been removed. The
>> incremental views will show differences compared to the previous content of the PR.
>
> Always run graalunit testing with new intrinsics.
> You need to adjust Graal test:
> src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java

@vnkozlov I added the new block in `CheckGraalIntrinsics.java`, and made sure it passes the entire
`compiler/graalunit` with [JDK-8254785](https://bugs.openjdk.java.net/browse/JDK-8254785) applied.
------------- PR: https://git.openjdk.java.net/jdk/pull/650 From rehn at openjdk.java.net Tue Oct 20 10:32:26 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 20 Oct 2020 10:32:26 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v2] In-Reply-To: References: Message-ID: <5R147e7WLawIyLWxjBQUNK22Y19rDsT32HYDCUXep7g=.a2344642-ca05-429e-a733-4ead81eb4f20@github.com> > The main point of this change-set is to make it easier to implement S/R on top of handshakes. > Which is a prerequisite for removing _suspend_flag (which duplicates the handshake functionality). > But we also remove some complicated S/R methods. > > We basically just put in everything in the handshake closure, so the diff just looks much worse than what it is. > > TraceSuspendDebugBits have an ifdef, but in both cases it now just returns. > But I was unsure if I should remove now or when is_ext_suspend_completed() is removed. > > Passes multiple t1-5 runs, locally it passes many jck:vm/nsk_jvmti/nsk_jdi/jdk-jdi runs. 
Robbin Ehn has updated the pull request incrementally with two additional commits since the last revision:

 - Removed TraceSuspendDebugBits
 - Removed unused method is_ext_suspend_completed_with_lock

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/729/files
  - new: https://git.openjdk.java.net/jdk/pull/729/files/7e77a04f..386b930f

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=729&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=729&range=00-01

  Stats: 94 lines in 2 files changed: 0 ins; 88 del; 6 mod
  Patch: https://git.openjdk.java.net/jdk/pull/729.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/729/head:pull/729

PR: https://git.openjdk.java.net/jdk/pull/729

From coleenp at openjdk.java.net Tue Oct 20 11:51:32 2020
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Tue, 20 Oct 2020 11:51:32 GMT
Subject: RFR: 8233343: Deprecate -XX:+CriticalJNINatives flag which implements Java…
Message-ID:

This change deprecates the -XX:+CriticalJNINatives flag and removes the develop flag -XX:+StressCriticalJNINatives.
See CSR for more details.

This change also removes the lazy GC lock in the critical native transition, and runs the critical native function as
thread_in_Java. I add a safepoint check at the end of the native function and transition to native and poll again for
the safepoint after the function if a safepoint is requested.

Tested with tier 1-6 (we have a few tests that use this). And built on
linux-x86-open,linux-s390x-open,linux-arm32-debug,linux-ppc64le-debug.
-------------

Commit messages:
 - 8233343: Deprecate -XX:+CriticalJNINatives flag which implements JavaCritical native functions

Changes: https://git.openjdk.java.net/jdk/pull/764/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=764&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8233343
  Stats: 1054 lines in 18 files changed: 106 ins; 898 del; 50 mod
  Patch: https://git.openjdk.java.net/jdk/pull/764.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/764/head:pull/764

PR: https://git.openjdk.java.net/jdk/pull/764

From eosterlund at openjdk.java.net Tue Oct 20 11:56:12 2020
From: eosterlund at openjdk.java.net (Erik Österlund)
Date: Tue, 20 Oct 2020 11:56:12 GMT
Subject: RFR: 8254878: Move last piece of ZArray to GrowableArray [v2]
In-Reply-To:
References:
Message-ID:

On Sun, 18 Oct 2020 13:24:28 GMT, Per Liden wrote:

>> ZArray used to be a separate implementation of a dynamically allocated/growable array. It now instead inherits from
>> GrowableCHeapArray, and extends it with a transfer() function. I propose we rename this function to swap() and move it
>> to GrowableArrayWithAllocator, since this function could be generally useful. It would also mean that ZArray could be
>> just a typedef/using of GrowableCHeapArray.
>
> Per Liden has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev
> excludes the unrelated changes brought in by the merge/rebase. The pull request contains one additional commit since
> the last revision:
> 8254878: Move last piece of ZArray to GrowableArray

Looks good.

-------------

Marked as reviewed by eosterlund (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/694

From dholmes at openjdk.java.net Tue Oct 20 11:58:22 2020
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 20 Oct 2020 11:58:22 GMT
Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v2]
In-Reply-To: <5R147e7WLawIyLWxjBQUNK22Y19rDsT32HYDCUXep7g=.a2344642-ca05-429e-a733-4ead81eb4f20@github.com>
References: <5R147e7WLawIyLWxjBQUNK22Y19rDsT32HYDCUXep7g=.a2344642-ca05-429e-a733-4ead81eb4f20@github.com>
Message-ID:

On Tue, 20 Oct 2020 10:32:26 GMT, Robbin Ehn wrote:

>> The main point of this change-set is to make it easier to implement S/R on top of handshakes.
>> Which is a prerequisite for removing _suspend_flag (which duplicates the handshake functionality).
>> But we also remove some complicated S/R methods.
>>
>> We basically just put in everything in the handshake closure, so the diff just looks much worse than what it is.
>>
>> TraceSuspendDebugBits have an ifdef, but in both cases it now just returns.
>> But I was unsure if I should remove now or when is_ext_suspend_completed() is removed.
>>
>> Passes multiple t1-5 runs, locally it passes many jck:vm/nsk_jvmti/nsk_jdi/jdk-jdi runs.
>
> Robbin Ehn has updated the pull request incrementally with two additional commits since the last revision:
>
> - Removed TraceSuspendDebugBits
> - Removed unused method is_ext_suspend_completed_with_lock

Still looks good.

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/729

From rehn at openjdk.java.net Tue Oct 20 12:49:11 2020
From: rehn at openjdk.java.net (Robbin Ehn)
Date: Tue, 20 Oct 2020 12:49:11 GMT
Subject: RFR: 8233343: Deprecate -XX:+CriticalJNINatives flag which implements Java…
In-Reply-To:
References:
Message-ID:

On Tue, 20 Oct 2020 11:41:17 GMT, Coleen Phillimore wrote:

> This change deprecates the -XX:+CriticalJNINatives flag and removes the develop flag -XX:+StressCriticalJNINatives.
> See CSR for more details.
> This change also removes the lazy GC lock in the critical native transition, and runs the critical native function as
> thread_in_Java. I add a safepoint check at the end of the native function and transition to native and poll again for
> the safepoint after the function if a safepoint is requested. Tested with tier 1-6 (we have a few tests that use
> this). And built on linux-x86-open,linux-s390x-open,linux-arm32-debug,linux-ppc64le-debug.

Thank you @coleenp!

Looks good, really nice delta of -800 LOC!

-------------

Marked as reviewed by rehn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/764

From rehn at openjdk.java.net Tue Oct 20 12:56:22 2020
From: rehn at openjdk.java.net (Robbin Ehn)
Date: Tue, 20 Oct 2020 12:56:22 GMT
Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v2]
In-Reply-To:
References: <0UtoOZMbgTY-1hGACS1n5swvRCa9za_e3uu-XEVOidM=.85baf76c-e91b-492a-8841-10d9e49e3ec4@github.com>
Message-ID:

On Tue, 20 Oct 2020 12:06:57 GMT, Coleen Phillimore wrote:

>> Yes, that would be good.
>
> Why don't you just do:
> JavaThread* java_thread = JavaThread::current();
> HandleMark hm(java_thread);
>
> JavaThread::current is the same thing as what you have.

Hi Coleen, we are in a handshake.
The target java_thread must be treated as different from the current thread.
The current thread can be the same, but it can also be another Java thread or the VM thread.
So we only know that the executing thread is a Thread, but our target is a JavaThread.

Here we want to create a HandleMark in the current thread.
Then we want a JavaThread pointer to the target for later use.

Does that explain it?
------------- PR: https://git.openjdk.java.net/jdk/pull/729 From pliden at openjdk.java.net Tue Oct 20 13:18:10 2020 From: pliden at openjdk.java.net (Per Liden) Date: Tue, 20 Oct 2020 13:18:10 GMT Subject: Integrated: 8254878: Move last piece of ZArray to GrowableArray In-Reply-To: References: Message-ID: On Fri, 16 Oct 2020 09:22:22 GMT, Per Liden wrote: > ZArray used to be a separate implementation of a dynamically allocated/growable array. It now instead inherits from > GrowableCHeapArray, and extends it with a transfer() function. I propose we rename this function to swap() and move it > to GrowableArrayWithAllocator, since this function could be generally useful. It would also mean that ZArray could be > just a typedef/using of GrowableCHeapArray. This pull request has now been integrated. Changeset: cdc8c401 Author: Per Liden URL: https://git.openjdk.java.net/jdk/commit/cdc8c401 Stats: 33 lines in 5 files changed: 6 ins; 23 del; 4 mod 8254878: Move last piece of ZArray to GrowableArray Reviewed-by: stefank, eosterlund ------------- PR: https://git.openjdk.java.net/jdk/pull/694 From phedlin at openjdk.java.net Tue Oct 20 13:37:25 2020 From: phedlin at openjdk.java.net (Patric Hedlin) Date: Tue, 20 Oct 2020 13:37:25 GMT Subject: RFR: 8248411: [aarch64] Insufficient error handling when CodeBuffer is exhausted Message-ID: Trampoline call generation (in the macro-assembler) may run out of CodeBuffer space without the proper error handling, resulting in asserts such as: # Internal Error (.../open/src/hotspot/share/asm/codeBuffer.hpp:198), pid=845, tid=859 # assert(allocates2(pc)) failed: relocation addr must be in this section This update extends the error handling for such error cases to cover all uses of `trampoline_call()`, direct and indirect. Failure registration/recording is retained in the "**aarch64.ad**" file. 
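
The failure mode described above can be illustrated with a toy model. This is a sketch under stated assumptions and not HotSpot's macro-assembler: `ToyCodeBuffer`, `toy_trampoline_call`, `toy_emit_runtime_call` and the byte sizes are all hypothetical. It only shows the general idea of checking for space before emitting, returning a failure to the caller, and recording that failure at one place instead of tripping an assert later.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for a fixed-size code buffer (hypothetical, illustration only).
struct ToyCodeBuffer {
  std::size_t capacity;
  std::size_t used;
  bool failed;  // set once a caller records that code emission ran out of space

  explicit ToyCodeBuffer(std::size_t cap) : capacity(cap), used(0), failed(false) {}

  std::size_t remaining() const { return capacity - used; }

  void emit(std::size_t n) {
    assert(n <= remaining() && "caller must check space first");
    used += n;
  }
};

// Made-up sizes for the call instruction and its trampoline stub.
const std::size_t kToyCallSize = 4;
const std::size_t kToyStubSize = 16;

// Direct use: returns false, emitting nothing, if call plus stub do not fit.
bool toy_trampoline_call(ToyCodeBuffer& cb) {
  if (cb.remaining() < kToyCallSize + kToyStubSize) {
    return false;
  }
  cb.emit(kToyStubSize);
  cb.emit(kToyCallSize);
  return true;
}

// Indirect use: propagates the failure and records it for the compile driver.
bool toy_emit_runtime_call(ToyCodeBuffer& cb) {
  if (!toy_trampoline_call(cb)) {
    cb.failed = true;
    return false;
  }
  return true;
}
```

In this model a 24-byte buffer accepts one call (20 bytes) and rejects the second, leaving `used` unchanged and `failed` set, so the caller can bail out of the compilation cleanly.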
------------- Commit messages: - 8248411: [aarch64] Insufficient error handling when CodeBuffer is exhausted Changes: https://git.openjdk.java.net/jdk/pull/765/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=765&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8248411 Stats: 148 lines in 5 files changed: 83 ins; 7 del; 58 mod Patch: https://git.openjdk.java.net/jdk/pull/765.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/765/head:pull/765 PR: https://git.openjdk.java.net/jdk/pull/765 From fyang at openjdk.java.net Tue Oct 20 13:42:27 2020 From: fyang at openjdk.java.net (Fei Yang) Date: Tue, 20 Oct 2020 13:42:27 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v10] In-Reply-To: References: Message-ID: > Contributed-by: ard.biesheuvel at linaro.org, dongbo4 at huawei.com > > This added an intrinsic for SHA3 using aarch64 v8.2 SHA3 Crypto Extensions. > Reference implementation for core SHA-3 transform using ARMv8.2 Crypto Extensions: > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/crypto/sha3-ce-core.S?h=v5.4.52 > > Trivial adaptation in SHA3. implCompress is needed for the purpose of adding the intrinsic. > For SHA3, we need to pass one extra parameter "digestLength" to the stub for the calculation of block size. > "digestLength" is also used in for the EOR loop before keccak to differentiate different SHA3 variants. > > We added jtreg tests for SHA3 and used QEMU system emulator which supports SHA3 instructions to test the functionality. > Patch passed jtreg tier1-3 tests with QEMU system emulator. > Also verified with jtreg tier1-3 tests without SHA3 instructions on aarch64-linux-gnu and x86_64-linux-gnu, to make > sure that there's no regression. > We used one existing JMH test for performance test: test/micro/org/openjdk/bench/java/security/MessageDigests.java > We measured the performance benefit with an aarch64 cycle-accurate simulator. 
> Patch delivers 20% - 40% performance improvement depending on specific SHA3 digest length and size of the message. > > For now, this feature will not be enabled automatically for aarch64. We can auto-enable this when it is fully tested on > real hardware. But for the above testing purposes, this is auto-enabled when the corresponding hardware feature is > detected. Fei Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits: - Fix trailing whitespace issue reported by jcheck - Merge master - Merge master - Remove unnecessary code changes in vm_version_aarch64.cpp - Merge master - Merge master - Merge master - Merge master - Add sha3 instructions to cpu/aarch64/aarch64-asmtest.py and regenerate the test in assembler_aarch64.cpp:asm_check - Rebase - ... and 3 more: https://git.openjdk.java.net/jdk/compare/cdc8c401...d32c8ad7 ------------- Changes: https://git.openjdk.java.net/jdk/pull/207/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=207&range=09 Stats: 1262 lines in 36 files changed: 1007 ins; 22 del; 233 mod Patch: https://git.openjdk.java.net/jdk/pull/207.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/207/head:pull/207 PR: https://git.openjdk.java.net/jdk/pull/207 From phedlin at openjdk.java.net Tue Oct 20 13:43:16 2020 From: phedlin at openjdk.java.net (Patric Hedlin) Date: Tue, 20 Oct 2020 13:43:16 GMT Subject: RFR: 8248411: [aarch64] Insufficient error handling when CodeBuffer is exhausted In-Reply-To: References: Message-ID: <5Ub9ckbkXf-ZT92nfaVNICNBg1yQmQ5AAGlmtOGFiC4=.702475a7-1d59-4ab2-a271-6306a5983c18@github.com> On Tue, 20 Oct 2020 13:31:59 GMT, Patric Hedlin wrote: > Trampoline call generation (in the macro-assembler) may run out of CodeBuffer space without the proper error handling, > resulting in asserts such as: # Internal Error (.../open/src/hotspot/share/asm/codeBuffer.hpp:198), pid=845, tid=859 > # assert(allocates2(pc)) failed: relocation addr must 
be in this section > This update extends the error handling for such error cases to cover all uses of `trampoline_call()`, direct and > indirect. Failure registration/recording is retained in the "**aarch64.ad**" file. Testing tier1-3 Repeated testing with hs-tier1-3 using `-XX:+StressCodeBuffers`. Repeated testing with `jtreg:compiler/profiling/spectrapredefineclass/Launcher.java` using `-XX:+StressCodeBuffers`. ------------- PR: https://git.openjdk.java.net/jdk/pull/765 From mbaesken at openjdk.java.net Tue Oct 20 13:45:11 2020 From: mbaesken at openjdk.java.net (Matthias Baesken) Date: Tue, 20 Oct 2020 13:45:11 GMT Subject: RFR: JDK-8254889: name_and_sig_as_C_string usages in frame coding without ResourceMark [v2] In-Reply-To: References: <2zlELXVF1ZB4uK35AHD6VrwfhHms2JdygTsv1qHmwVQ=.68dc2250-1329-457f-9964-8998d3e3c94f@github.com> Message-ID: On Fri, 16 Oct 2020 21:36:22 GMT, David Holmes wrote: > As long as there is a ResourceMark in the caller there is no issue here - though the code should be documented in that > case. hi David, I checked the callers of frame void describe(FrameValues& values, int frame_no); more closely and I think they are good (both JavaThread::print_frame_layout and trace_method_handle_stub have a ResourceMark) ; so should I place a comment in frame.cpp above describe (something like "Attention -callers need a ResourceMark") ? 
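
For readers less familiar with the pattern under discussion, here is a toy model of why a function that returns resource-allocated strings relies on the caller holding a mark. It is a sketch only: `ToyResourceArea`, `ToyResourceMark` and `toy_name_as_c_string` are hypothetical names and do not reproduce HotSpot's `ResourceArea`/`ResourceMark` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Toy bump allocator standing in for a per-thread resource area (hypothetical).
struct ToyResourceArea {
  std::vector<char> storage;
  std::size_t top;

  explicit ToyResourceArea(std::size_t n) : storage(n), top(0) {}

  char* allocate(std::size_t n) {
    assert(top <= storage.size());
    if (n > storage.size() - top) return nullptr;  // out of space
    char* p = storage.data() + top;
    top += n;
    return p;
  }
};

// RAII mark: remembers the allocation top and rolls back to it on scope exit.
class ToyResourceMark {
  ToyResourceArea& _area;
  std::size_t _saved_top;
 public:
  explicit ToyResourceMark(ToyResourceArea& area) : _area(area), _saved_top(area.top) {}
  ~ToyResourceMark() { _area.top = _saved_top; }
  ToyResourceMark(const ToyResourceMark&) = delete;
  ToyResourceMark& operator=(const ToyResourceMark&) = delete;
};

// Attention: callers need a ToyResourceMark, otherwise every call leaks
// area space until some outer mark (if any) is released.
const char* toy_name_as_c_string(ToyResourceArea& area, const char* name) {
  std::size_t n = std::strlen(name) + 1;
  char* p = area.allocate(n);
  if (p != nullptr) std::memcpy(p, name, n);
  return p;
}
```

The returned pointer is only valid while the caller's mark is alive, which is why documenting the requirement at the callee is useful when the callee itself does not create a mark.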
------------- PR: https://git.openjdk.java.net/jdk/pull/698 From rrich at openjdk.java.net Tue Oct 20 14:13:21 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Tue, 20 Oct 2020 14:13:21 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v2] In-Reply-To: References: <5R147e7WLawIyLWxjBQUNK22Y19rDsT32HYDCUXep7g=.a2344642-ca05-429e-a733-4ead81eb4f20@github.com> Message-ID: <010C_PcZp7Ol1SpH7__fIb5SPJrbFaQMUa0DN_R2rtM=.49983d69-0901-4081-9043-fbfaef0e4c61@github.com> On Tue, 20 Oct 2020 11:55:49 GMT, David Holmes wrote: >> Robbin Ehn has updated the pull request incrementally with two additional commits since the last revision: >> >> - Removed TraceSuspendDebugBits >> - Removed unused method is_ext_suspend_completed_with_lock > > Still looks good. Hi, this is a good change, because it is a simplification and it it makes the stack walks safe by doing them as part of a handshake. The change conflicts with #119 though. This one is ready to be pushed since last week but was delayed due to other interferences. Would you mind me integrating #119 first? After integration it would be needed to pull 2 EscapeBarriers out of handshakes. Of course I would help do that. Thanks, Richard. ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From coleenp at openjdk.java.net Tue Oct 20 14:17:17 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 20 Oct 2020 14:17:17 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v2] In-Reply-To: References: <0UtoOZMbgTY-1hGACS1n5swvRCa9za_e3uu-XEVOidM=.85baf76c-e91b-492a-8841-10d9e49e3ec4@github.com> Message-ID: On Tue, 20 Oct 2020 12:53:15 GMT, Robbin Ehn wrote: >> Why don't you just do: >> JavaThread* java_thread = JavaThread::current(); >> HandleMark hm(java_thread); >> >> JavaThread::current is the same thing as what you have. >> >> Oh I see there are two different threads. nevermind. > > Hi Coleen, we are in a handshake. 
> Target java_thread must be treat as different from current thread. > It can be same, but also another thread java thread or VM thread. > So we only know the the thread executing is a Thread, but our target is a JavaThread. > > Here we want to create a HandleMark in current thread. > Then we want a JavaThread pointer to the target for later use. > > Does that explain it? I see it now, different threads. ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From rehn at openjdk.java.net Tue Oct 20 14:22:23 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 20 Oct 2020 14:22:23 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v2] In-Reply-To: <010C_PcZp7Ol1SpH7__fIb5SPJrbFaQMUa0DN_R2rtM=.49983d69-0901-4081-9043-fbfaef0e4c61@github.com> References: <5R147e7WLawIyLWxjBQUNK22Y19rDsT32HYDCUXep7g=.a2344642-ca05-429e-a733-4ead81eb4f20@github.com> <010C_PcZp7Ol1SpH7__fIb5SPJrbFaQMUa0DN_R2rtM=.49983d69-0901-4081-9043-fbfaef0e4c61@github.com> Message-ID: On Tue, 20 Oct 2020 14:10:49 GMT, Richard Reingruber wrote: > Hi, > > this is a good change, because it is a simplification and it it makes the stack walks safe by doing them as part of a > handshake. > The change conflicts with #119 though. This one is ready to be pushed since last week but was delayed due to other > interferences. Would you mind me integrating #119 first? After integration it would be needed to pull 2 EscapeBarriers > out of handshakes. Of course I would help do that. Thanks, Richard. Hey Richard, go ahead and integrate your 119 first, I'll hold off and do the merge once you integrated. 
-------------

PR: https://git.openjdk.java.net/jdk/pull/729

From dcubed at openjdk.java.net Tue Oct 20 14:56:15 2020
From: dcubed at openjdk.java.net (Daniel D.Daugherty)
Date: Tue, 20 Oct 2020 14:56:15 GMT
Subject: RFR: 8255047: Add HotSpot flag to use with debuggers that restrict the CPU…
In-Reply-To: <_0YMtDNWAGXhg-_EyMf0c8GoNE4cf4wf1hpyr9j-sNs=.2ec5bf8b-023e-4bbd-9692-f74cd6c97f8d@github.com>
References: <_0YMtDNWAGXhg-_EyMf0c8GoNE4cf4wf1hpyr9j-sNs=.2ec5bf8b-023e-4bbd-9692-f74cd6c97f8d@github.com>
Message-ID:

On Tue, 20 Oct 2020 10:10:12 GMT, Stefan Karlsson wrote:

> Some debuggers don't work well with many threads, and/or incompletely restrict the number of used CPUs to one.
>
> This flag is intended as a catch-all for HotSpot developers (not available in product builds) to allow us to more
> easily use those debuggers.
> Currently, the proposal is to let the flag fix a few things:
> 1) Turn down the number of JVM threads
> 2) Turn off NUMA
> 3) Force processor_id() to return 0 instead of values above processor_count()
>
> (1) is purely ergonomics: gdb, rr, and valgrind are faster and seem to work much better with fewer threads. The values
> would still be overridable by devs.
> (2) and (3) deal with the fact that some debuggers change the reported processor count, but don't change the processor
> ids returned by sched_getcpu. This causes problems for ZGC and NUMA, which both assume that they can rely on
> os::processor_id() < os::processor_count(). The current proposed flag name is -XX:+LimitedCPUsDebugging. I'm not
> entirely happy with that name, but I haven't been able to find a better name.
> An alternative to having one flag is to split this into two flags, and maybe that would solve the naming problem.
> However, the usability aspects will be worse.
> If we can't find a suitable name, I'd rather introduce a flag called:
> -XX:DebuggerWorkarounds or -XX:DebuggerWorkaround1
>
> Any suggestions / opinions?
I really do want to at least fix the (2, 3) problem, because I keep having to add this to > every single branch I'm working on. Perhaps: `UseDebuggerErgo` for the main option name with `UseDebuggerErgo1` ... `UseDebuggerErgoN` for the suboptions where `UseDebuggerErgo` enables all the numbered `UseDebuggerErgo` options in one go. Update: Yes, this is the way that I have always wished that `UseNewCode*` worked. ------------- PR: https://git.openjdk.java.net/jdk/pull/763 From rrich at openjdk.java.net Tue Oct 20 15:36:19 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Tue, 20 Oct 2020 15:36:19 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v2] In-Reply-To: References: <5R147e7WLawIyLWxjBQUNK22Y19rDsT32HYDCUXep7g=.a2344642-ca05-429e-a733-4ead81eb4f20@github.com> <010C_PcZp7Ol1SpH7__fIb5SPJrbFaQMUa0DN_R2rtM=.49983d69-0901-4081-9043-fbfaef0e4c61@github.com> Message-ID: On Tue, 20 Oct 2020 14:18:57 GMT, Robbin Ehn wrote: > > > > Hi, > > this is a good change, because it is a simplification and it it makes the stack walks safe by doing them as part of a > > handshake. The change conflicts with #119 though. This one is ready to be pushed since last week but was delayed due to > > other interferences. Would you mind me integrating #119 first? After integration it would be needed to pull 2 > > EscapeBarriers out of handshakes. Of course I would help do that. Thanks, Richard. > > Hey Richard, go ahead and integrate your 119 first, I'll hold off and do the merge once you integrated. Thanks Robbin! 
------------- PR: https://git.openjdk.java.net/jdk/pull/729 From rrich at openjdk.java.net Tue Oct 20 15:38:18 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Tue, 20 Oct 2020 15:38:18 GMT Subject: Integrated: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: Message-ID: On Thu, 10 Sep 2020 20:48:23 GMT, Richard Reingruber wrote: > Hi, > > this is the continuation of the review of the implementation for: > > https://bugs.openjdk.java.net/browse/JDK-8227745 > https://bugs.openjdk.java.net/browse/JDK-8233915 > > It allows for JIT optimizations based on escape analysis even if JVMTI agents acquire capabilities to access references > to objects that are subject to such optimizations, e.g. scalar replacement. The implementation reverts such > optimizations just before access very much as when switching from JIT compiled execution to the interpreter, aka > "deoptimization". Webrev.8 was the last one before before the transition to Git/Github: > > http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ > > Thanks, Richard. This pull request has now been integrated. 
Changeset: 40f847e2 Author: Richard Reingruber URL: https://git.openjdk.java.net/jdk/commit/40f847e2 Stats: 5860 lines in 53 files changed: 5642 ins; 116 del; 102 mod 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents 8233915: JVMTI FollowReferences: Java Heap Leak not found because of C2 Scalar Replacement Reviewed-by: mdoerr, goetz, sspitsyn, kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/119 From psandoz at openjdk.java.net Tue Oct 20 16:20:19 2020 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Tue, 20 Oct 2020 16:20:19 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4] In-Reply-To: References: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> <7-p-Kc9lQyyuoWdNtmgbXbwkxsgk4oQGKmFSCcMpvnU=.97810c01-3200-4767-bbd4-35d53c2bc5ca@github.com> <6Voyfr_s-ieyRA-8Rtvvpz7tkhhicA8sY2d2KTp3Kmw=.fa256bae-2143-4b43-bfea-5837ad31eb6a@github.com> <7XjzEn5DggliDrvjhrGwXZL5r4lsqeGF9SGLmRr5L84=.a4481a62-4ecf-4e3f-98f3-70e548c67b52@github.com> <9y5m4zfsDZMdIZ6CT38BzO0tpFMuFxUswAb pjfDny-w=.44c7ada7-bf77-45f1-b5a6-b542731d6685@github.com> Message-ID: On Thu, 15 Oct 2020 17:58:29 GMT, CoreyAshford wrote: >> Please update >> [compiler/graalunit/HotspotTest.java](https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/graalunit/HotspotTest.java), >> and add the intrinsic signature. > >> Please update >> [compiler/graalunit/HotspotTest.java](https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/graalunit/HotspotTest.java), >> and add the intrinsic signature. > > It looks like that is auto-generated, but I will figure out what to modify so that the signature is added. 
@CoreyAshford apologies i pointed to the "umbrella" test that runs Graal unit tests, the actual test is [CheckGraalIntrinsics](https://github.com/openjdk/jdk/blob/master/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java) See this PR for an example: https://github.com/openjdk/jdk/pull/762 ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From mchung at openjdk.java.net Tue Oct 20 16:38:31 2020 From: mchung at openjdk.java.net (Mandy Chung) Date: Tue, 20 Oct 2020 16:38:31 GMT Subject: RFR: 8188055: (ref) Add Reference::refersTo predicate [v5] In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 05:22:57 GMT, Kim Barrett wrote: >> @kimbarrett your reworded text is okay. I think "if it initially had some other referent value" can be dropped. >> >> For a `Reference` constructed with a `null` referent, we can clarify in the spec that such reference object will never >> get cleared and enqueued. I suggest to file a separate issue to follow up. > >> @kimbarrett your reworded text is okay. I think "if it initially had some other referent value" can be dropped. >> >> For a `Reference` constructed with a `null` referent, we can clarify in the spec that such reference object will never >> get cleared and enqueued. I suggest to file a separate issue to follow up. > > I don't think that clause can be dropped, because of explicit clearing (by clear() or enqueue()) rather than by the > GC. If the reference was constructed with a null referent, ref.refersTo(null) cannot tell whether ref.clear() has been > called. > Mandy's comment implied that references with a null referent never get enqueued. Otherwise when would they get > enqueued? There would be nothing to trigger it. Sorry I should have been clearer. What I try to say is that `Reference(null)` will never be cleared and enqueued by GC since its referent is `null`. 
Kim is right that `Reference(null)` can be explicitly cleared and enqueued via `Reference::enqueue`. `Reference::clear` on such an "empty" reference object is essentially a no-op.

> > But the more we discuss this the more I think allowing an initial null referent was a mistake in the first place. :(
>
> I agree, but here we are. Very hard to know what the compatibility impact of changing that would be.

There are existing use cases depending on `Reference(null)`, for example as a special instance representing an empty reference or as the head of a doubly-linked list of references. This was discussed two years ago [1].

[1] https://mail.openjdk.java.net/pipermail/core-libs-dev/2018-July/054325.html

-------------

PR: https://git.openjdk.java.net/jdk/pull/498

From github.com+51754783+coreyashford at openjdk.java.net Tue Oct 20 16:45:23 2020
From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford)
Date: Tue, 20 Oct 2020 16:45:23 GMT
Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4]
In-Reply-To: References: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> <7-p-Kc9lQyyuoWdNtmgbXbwkxsgk4oQGKmFSCcMpvnU=.97810c01-3200-4767-bbd4-35d53c2bc5ca@github.com> <6Voyfr_s-ieyRA-8Rtvvpz7tkhhicA8sY2d2KTp3Kmw=.fa256bae-2143-4b43-bfea-5837ad31eb6a@github.com> <7XjzEn5DggliDrvjhrGwXZL5r4lsqeGF9SGLmRr5L84=.a4481a62-4ecf-4e3f-98f3-70e548c67b52@github.com> <9y5m4zfsDZMdIZ6CT38BzO0tpFMuFxUswAb pjfDny-w=.44c7ada7-bf77-45f1-b5a6-b542731d6685@github.com> Message-ID: <_Kp2I32RPuFG2TsngEkrL1CEHpYFbT5k2yLSpJ1r4-w=.d6734741-0392-4cc7-acef-81367a598ff4@github.com>

On Tue, 20 Oct 2020 16:17:15 GMT, Paul Sandoz wrote:

>>> Please update
>>> [compiler/graalunit/HotspotTest.java](https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/graalunit/HotspotTest.java),
>>> and add the intrinsic signature.
>>
>> It looks like that is auto-generated, but I will figure out what to modify so that the signature is added.
> > @CoreyAshford apologies i pointed to the "umbrella" test that runs Graal unit tests, the actual test is > [CheckGraalIntrinsics](https://github.com/openjdk/jdk/blob/master/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java) > See this PR for an example: https://github.com/openjdk/jdk/pull/762 > @CoreyAshford apologies i pointed to the "umbrella" test that runs Graal unit tests, the actual test is > [CheckGraalIntrinsics](https://github.com/openjdk/jdk/blob/master/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java) > See this PR for an example: #762 Thank you for providing the more accurate location. I will push commits to fix this a bit later today. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From shade at openjdk.java.net Tue Oct 20 17:16:18 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 20 Oct 2020 17:16:18 GMT Subject: RFR: 8255065: Zero: accessor_entry misses the IRIW case Message-ID: While doing a change in related area, I noticed there is no IRIW handling block in `ZeroInterpreter::accessor_entry` when reading volatile fields. This probably breaks PPC64 Zero. 
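For readers unfamiliar with the term: IRIW ("independent reads of independent writes") is the litmus test in which two threads each write a different volatile field and two other threads read both fields in opposite orders. Java's volatile semantics forbid the two readers from disagreeing about the order of the writes; on CPUs that are not multiple-copy atomic, the interpreter needs an extra fence on volatile reads to uphold that. Below is a minimal Java sketch of the litmus test. `IriwLitmus`, `State` and `countForbidden` are hypothetical names, and this is not the HotSpot change itself. A count of zero proves nothing about a given JVM; a non-zero count would point at exactly the kind of bug described here.

```java
final class IriwLitmus {
    /** Shared state for one trial; x and y are volatile on purpose. */
    static final class State {
        volatile int x, y;
        int r1, r2, r3, r4; // read by the main thread only after join()
    }

    /** Runs the given number of trials and counts forbidden outcomes. */
    static int countForbidden(int trials) {
        int forbidden = 0;
        for (int i = 0; i < trials; i++) {
            State s = new State();
            Thread[] threads = {
                new Thread(() -> s.x = 1),                     // writer 1
                new Thread(() -> s.y = 1),                     // writer 2
                new Thread(() -> { s.r1 = s.x; s.r2 = s.y; }), // reader A: x, then y
                new Thread(() -> { s.r3 = s.y; s.r4 = s.x; })  // reader B: y, then x
            };
            for (Thread t : threads) {
                t.start();
            }
            try {
                for (Thread t : threads) {
                    t.join(); // join() makes the readers' results visible here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for a trial", e);
            }
            // Reader A saw x written but not y; reader B saw y written but not x.
            // The readers disagree on the write order, which volatile forbids.
            if (s.r1 == 1 && s.r2 == 0 && s.r3 == 1 && s.r4 == 0) {
                forbidden++;
            }
        }
        return forbidden;
    }

    public static void main(String[] args) {
        System.out.println("forbidden outcomes: " + countForbidden(1000));
    }
}
```

Starting fresh threads per trial keeps the sketch short but makes the interesting interleavings rare; a serious harness (jcstress, for instance) is the better tool for actually hunting this.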
There is a block in `bytecodeInterpreter.cpp` for [common field access](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/interpreter/zero/bytecodeInterpreter.cpp#L1899-L1901):

    if (cache->is_volatile()) {
      if (support_IRIW_for_not_multiple_copy_atomic_cpu) {
        OrderAccess::fence();
      }

Attention @TheRealMDoerr ;)

Testing:
 - [x] Linux x86_64 zero fastdebug build (includes jmod generation with Zero)

-------------

Commit messages:
 - 8255065: Zero: accessor_entry misses the IRIW case

Changes: https://git.openjdk.java.net/jdk/pull/766/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=766&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8255065
Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod
Patch: https://git.openjdk.java.net/jdk/pull/766.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/766/head:pull/766

PR: https://git.openjdk.java.net/jdk/pull/766

From mcimadamore at openjdk.java.net Tue Oct 20 17:23:26 2020
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Tue, 20 Oct 2020 17:23:26 GMT
Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v7]
In-Reply-To: References: Message-ID:

> This patch contains the changes associated with the first incubation round of the foreign linker access API (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and associated pull request [3]).
>
> The main goal of this API is to provide a way to call native functions from Java code without the need of intermediate JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. I suggest reading the writeup [4] I put together a few weeks ago, which illustrates what the foreign linker support is, and how it should be used by clients.
>
> Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews.
> For these reasons, I'm attaching a webrev which contains only the differences between this PR and the memory access PR. I will be periodically uploading new webrevs, as new iterations come out, to try and make the life of reviewers as simple as possible.
>
> A big thanks to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the hotspot changes you see here, and without their help, the foreign linker support wouldn't be what it is today. As usual, a big thanks to Paul Sandoz, who provided many insights (often by trying the bits first hand).
>
> Thanks
> Maurizio
>
> Webrev:
> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev
>
> Javadoc:
>
> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html
>
> Specdiff (relative to [3]):
>
> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html
>
> CSR:
>
> https://bugs.openjdk.java.net/browse/JDK-8254232
>
> ### API Changes
>
> The API changes are actually rather slim:
>
> * `LibraryLookup`
>   * This class allows clients to look up symbols in native libraries; the interface is fairly simple; you can load a library by name, or absolute path, and then look up symbols on that library.
> * `FunctionDescriptor`
>   * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory layouts for the function arguments/return type. A function descriptor is used to describe the signature of a native function.
> * `CLinker`
>   * This is the real star of the show. A `CLinker` has two main methods: `downcallHandle` and `upcallStub`; the first takes a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a `MethodHandle` instance which can be used to call the target native symbol.
>     The second takes an existing method handle and a `FunctionDescriptor`, and returns a new `MemorySegment` corresponding to a code stub allocated by the VM which acts as a trampoline from native code to the user-provided method handle. This is very useful for implementing upcalls.
>   * This class also contains the various layout constants that should be used by clients when describing native signatures (e.g. `C_LONG` and friends); these layouts contain additional ABI classification information (in the form of layout attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the native call to take place.
>   * Finally, this class provides some helper functions e.g. so that clients can convert Java strings into C strings and back.
> * `NativeScope`
>   * This is a helper class which allows clients to group together logically related allocations; that is, rather than allocating separate memory segments using separate *try-with-resources* constructs, a `NativeScope` allows clients to use a _single_ block, and allocate all the required segments there. This is not only a usability boost, but also a performance boost, since not all allocation requests will be turned into `malloc` calls.
> * `MemorySegment`
>   * Only one method added here - namely `handoff(NativeScope)` which allows a segment to be transferred onto an existing native scope.
>
> ### Safety
>
> The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle. For instance, the description of the native signature might be wrong (e.g. have too many arguments) - and the runtime has, in the general case, no way to detect such mismatches. For these reasons, obtaining a `CLinker` instance is a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as is the case for other restricted methods in the foreign memory API).
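The performance remark about `NativeScope` above is essentially arena (bump) allocation: one backing block is allocated up front, and each request is answered with a slice of it. The sketch below does not use the incubating `jdk.incubator.foreign` API; it mimics the idea with a plain direct `ByteBuffer`, and `SliceArena`, `allocate` and `bytesUsed` are hypothetical names chosen for illustration, not part of the patch.

```java
import java.nio.ByteBuffer;

final class SliceArena {
    private final ByteBuffer block; // one up-front allocation backs every slice
    private int offset;             // next free byte in the block

    SliceArena(int capacityBytes) {
        this.block = ByteBuffer.allocateDirect(capacityBytes);
    }

    /** Hands out the next size bytes as an independent view of the shared block. */
    ByteBuffer allocate(int size) {
        if (size < 0 || size > block.capacity() - offset) {
            throw new IllegalArgumentException("arena exhausted or bad size: " + size);
        }
        ByteBuffer view = block.duplicate();        // shares memory, has its own position/limit
        view.position(offset).limit(offset + size); // window onto [offset, offset + size)
        offset += size;                             // bump the pointer; no new native allocation
        return view.slice();
    }

    int bytesUsed() {
        return offset;
    }

    public static void main(String[] args) {
        SliceArena arena = new SliceArena(64);
        ByteBuffer a = arena.allocate(8);
        ByteBuffer b = arena.allocate(16);
        a.putLong(0, 42L);
        b.putInt(0, 7);
        System.out.println(a.getLong(0) + " " + b.getInt(0) + " used=" + arena.bytesUsed());
    }
}
```

A fixed-capacity arena like this one corresponds to the "allocation size known a priori" variant mentioned later in the description; a growable variant would chain additional blocks when the current one runs out.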
> ### Implementation changes
>
> The Java changes associated with `LibraryLookup` are relatively straightforward; the only interesting thing to note here is that library loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same restrictions which apply to JNI library loading (e.g. same library cannot be loaded by different classloaders).
>
> As for `NativeScope` the changes are again relatively straightforward; it is an API which sits neatly on top of the foreign memory access API, providing some kind of allocation service which shares the same underlying memory segment(s), and turns an allocation request into a segment slice, which is a much less expensive operation. `NativeScope` comes in two variants: there are native scopes for which the allocation size is known a priori, and native scopes which can grow - these two schemes are implemented by two separate subclasses of `AbstractNativeScopeImpl`.
>
> Of course the bulk of the changes are to support the `CLinker` downcall/upcall routines. These changes cut pretty deep into the JVM; I'll briefly summarize the goal of some of these changes - for further details, Jorn has put together a detailed writeup which explains the rationale behind the VM support, with some references to the code [5].
>
> The main idea behind foreign linker is to infer, given a Java method type (expressed as a `MethodType` instance) and the description of the signature of a native function (expressed as a `FunctionDescriptor` instance) a _recipe_ that can be used to turn a Java call into the corresponding native call targeting the requested native function.
>
> This inference scheme can be defined in a pretty straightforward fashion by looking at the various ABI specifications (for instance, see [6] for the SysV ABI, which is the one used on Linux/Mac). The various `CallArranger` classes, of which we have a flavor for each supported platform, do exactly that kind of inference.
> For the inference process to work, we need to attach extra information to memory layouts; it is no longer sufficient to know e.g. that a layout is 32/64 bits - we need to know whether it is meant to represent a floating point value, or an integral value; this knowledge is required because floating point values are passed in different registers by most ABIs. For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which contain the required classification attributes (e.g. a `CLinker.TypeKind` enum value). The runtime extracts this attribute, and performs classification accordingly.
>
> A native call is decomposed into a sequence of basic, primitive operations, called `Binding` (see the great javadoc on the `Binding.java` class for more info). There are many such bindings - for instance the `Move` binding is used to move a value into a specific machine register/stack slot. So, the main job of the various `CallArranger` classes is to determine, given a Java `MethodType` and `FunctionDescriptor`, what is the set of bindings associated with the downcall/upcall.
>
> At the heart of the foreign linker support is the `ProgrammableInvoker` class. This class effectively generates a `MethodHandle` which follows the steps described by the various bindings obtained by `CallArranger`. There are actually various strategies to interpret these bindings - listed below:
>
> * basic interpreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see `BindingInterpreter`), except for the `Move` bindings. For these bindings, the move is implemented by allocating a *buffer* (whose size is ABI specific) and by moving all the lowered values into positions within this buffer.
>   The buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them into their expected registers/stack slots (note that each position in the buffer corresponds to a different register). This is the most general invocation mode, the most "customizable" one, but also the slowest - since for every call there is some extra allocation which takes place.
>
> * specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based interpreter, we generate a method handle chain which effectively interprets all the bindings (again, except `Move` ones).
>
> * intrinsified mode; this is typically used in combination with the specialized interpreted mode described above (although it can also be used with the Java-based binding interpreter). The goal here is to remove the buffer allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (e.g. the set of bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field), then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go for an intermediate buffer. This gives us back performance that is on par with JNI.
>
> For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there. We plan to add support for intrinsified modes there as well, which should considerably boost performance (probably well beyond what JNI can offer at the moment, since the upcall support in JNI is not very well optimized).
>
> Again, for further reading on the internals of the foreign linker support, please refer to [5].
>
> #### Test changes
>
> Many new tests have been added to validate the foreign linker support; we have high level tests (see `StdLibTest`) which aim at testing the linker from the perspective of code that clients could write.
> But we also have deeper combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI implementation. There are also some great tests (see the `callarranger` folder) which test the various `CallArranger`s for all the possible platforms; these tests adopt more of a white-box approach - that is, instead of treating the linker machinery as a black box and verifying that the support works by checking that the native call returned the results we expected, these tests aim at checking that the set of bindings generated by the call arranger is correct. This also means that we can test the classification logic for Windows, Mac and Linux regardless of the platform we're executing on.
>
> Some additional microbenchmarks have been added to compare the performance of downcall/upcall with JNI.
>
> [1] - https://openjdk.java.net/jeps/389
> [2] - https://openjdk.java.net/jeps/393
> [3] - https://git.openjdk.java.net/jdk/pull/548
> [4] - https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md
> [5] - http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html

Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits:

 - Merge branch 'master' into 8254231_linker
 - Fix incorrect capitalization in one copyright header
 - Update copyright years, and add classpath exception to files that were missing it
 - Use separate constants for native invoker code size
 - Re-add file erroneously deleted (detected as rename)
 - Re-add erroneously removed files
 - Merge branch 'master' into 8254231_linker
 - Fix tests
 - Fix more whitespaces
 - Fix whitespaces
 - Remove rejected file
 - ...
and 15 more: https://git.openjdk.java.net/jdk/compare/cb6167b2...502bd980 ------------- Changes: https://git.openjdk.java.net/jdk/pull/634/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=06 Stats: 75609 lines in 264 files changed: 72724 ins; 1608 del; 1277 mod Patch: https://git.openjdk.java.net/jdk/pull/634.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/634/head:pull/634 PR: https://git.openjdk.java.net/jdk/pull/634 From kvn at openjdk.java.net Tue Oct 20 17:33:22 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 20 Oct 2020 17:33:22 GMT Subject: RFR: 8254995: [x86] ControlWord::print(), rc/pc variables might not be initialized [v2] In-Reply-To: References: Message-ID: <9u8J3j6-SMr2nud13Js4JcyVnVn8qBIv8JvfCPuzCJg=.0ba9acef-13d8-4952-ab60-aa1780118fd1@github.com> On Tue, 20 Oct 2020 09:07:24 GMT, Aleksey Shipilev wrote: >> Static analyzers complain that in `ControlWord::print()`, `rc`/`pc` variables might not be initialized. This never >> happens in practice, because `rounding_control()` and `precision_control()` return the good values. We can make it >> cleaner to silence the compiler. Testing: >> - [x] Linux x86_64 tier1 > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Use fatal(), initialize to NULL and add comments Good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/731 From kvn at openjdk.java.net Tue Oct 20 17:41:20 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 20 Oct 2020 17:41:20 GMT Subject: RFR: 8253525: Implement getInstanceSize/sizeOf intrinsics [v3] In-Reply-To: References: Message-ID: <0E5sXAWBENg290o8HpKfufH-e69Ue4EfAR074HuNOt4=.3e8af6ad-4aaf-4427-b008-642ca7138f02@github.com> On Mon, 19 Oct 2020 06:57:24 GMT, Aleksey Shipilev wrote: >> This is fork off the SizeOf JEP, JDK-8249196. 
>> There is already the entry point in JDK that can use the intrinsic like this: `Instrumentation.getInstanceSize`. Therefore, we can implement the C1/C2 intrinsic now, hook it up to `Instrumentation`, and let the tools use that fast path today. With this patch, JOL is able to be close to `deepSizeOf` implementation from SizeOf JEP.
>>
>> Example performance improvements for sizing up a custom linked list:
>>
>> Benchmark                     (size)  Mode  Cnt       Score      Error  Units
>>
>> # Default
>> LinkedChainBench.linkedChain       1  avgt    5     705.835 ±    8.051  ns/op
>> LinkedChainBench.linkedChain      10  avgt    5    3148.874 ±   37.856  ns/op
>> LinkedChainBench.linkedChain     100  avgt    5   28693.256 ±  142.254  ns/op
>> LinkedChainBench.linkedChain    1000  avgt    5  290161.590 ± 4594.631  ns/op
>>
>> # Instrumentation attached, no intrinsics
>> LinkedChainBench.linkedChain       1  avgt    5     159.659 ±   19.238  ns/op
>> LinkedChainBench.linkedChain      10  avgt    5     717.659 ±   22.540  ns/op
>> LinkedChainBench.linkedChain     100  avgt    5    7739.394 ±  111.683  ns/op
>> LinkedChainBench.linkedChain    1000  avgt    5   80724.238 ± 2887.794  ns/op
>>
>> # Instrumentation attached, new intrinsics
>> LinkedChainBench.linkedChain       1  avgt    5      95.254 ±    0.808  ns/op
>> LinkedChainBench.linkedChain      10  avgt    5     261.564 ±    8.524  ns/op
>> LinkedChainBench.linkedChain     100  avgt    5    3367.192 ±   21.128  ns/op
>> LinkedChainBench.linkedChain    1000  avgt    5   34148.851 ±  373.080  ns/op
>
> Aleksey Shipilev has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.

test/jdk/java/lang/instrument/GetObjectSizeTest.java line 2:

> 1: /*
> 2: * Copyright (c) 2020, Red Hat, Inc. All rights reserved.

You can't replace the Copyright lines of one company with another's.
-------------

PR: https://git.openjdk.java.net/jdk/pull/650

From kvn at openjdk.java.net Tue Oct 20 17:41:19 2020
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Tue, 20 Oct 2020 17:41:19 GMT
Subject: RFR: 8253525: Implement getInstanceSize/sizeOf intrinsics [v4]
In-Reply-To: References: Message-ID:

On Tue, 20 Oct 2020 10:24:35 GMT, Aleksey Shipilev wrote:

>> This is fork off the SizeOf JEP, JDK-8249196. There is already the entry point in JDK that can use the intrinsic like this: `Instrumentation.getInstanceSize`. Therefore, we can implement the C1/C2 intrinsic now, hook it up to `Instrumentation`, and let the tools use that fast path today. With this patch, JOL is able to be close to `deepSizeOf` implementation from SizeOf JEP.
>>
>> Example performance improvements for sizing up a custom linked list:
>>
>> Benchmark                     (size)  Mode  Cnt       Score      Error  Units
>>
>> # Default
>> LinkedChainBench.linkedChain       1  avgt    5     705.835 ±    8.051  ns/op
>> LinkedChainBench.linkedChain      10  avgt    5    3148.874 ±   37.856  ns/op
>> LinkedChainBench.linkedChain     100  avgt    5   28693.256 ±  142.254  ns/op
>> LinkedChainBench.linkedChain    1000  avgt    5  290161.590 ± 4594.631  ns/op
>>
>> # Instrumentation attached, no intrinsics
>> LinkedChainBench.linkedChain       1  avgt    5     159.659 ±   19.238  ns/op
>> LinkedChainBench.linkedChain      10  avgt    5     717.659 ±   22.540  ns/op
>> LinkedChainBench.linkedChain     100  avgt    5    7739.394 ±  111.683  ns/op
>> LinkedChainBench.linkedChain    1000  avgt    5   80724.238 ± 2887.794  ns/op
>>
>> # Instrumentation attached, new intrinsics
>> LinkedChainBench.linkedChain       1  avgt    5      95.254 ±    0.808  ns/op
>> LinkedChainBench.linkedChain      10  avgt    5     261.564 ±    8.524  ns/op
>> LinkedChainBench.linkedChain     100  avgt    5    3367.192 ±   21.128  ns/op
>> LinkedChainBench.linkedChain    1000  avgt    5   34148.851 ±  373.080  ns/op
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
>
> Add new intrinsics to toBeInvestigated list in CheckGraalIntrinsics.java

Changes requested by kvn (Reviewer).

src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java line 531:

> 529: "sun/instrument/InstrumentationImpl.getObjectSize0(JLjava/lang/Object;)J");
> 530: }
> 531:

Agree with this.

-------------

PR: https://git.openjdk.java.net/jdk/pull/650

From shade at openjdk.java.net Tue Oct 20 17:46:18 2020
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 20 Oct 2020 17:46:18 GMT
Subject: RFR: 8253525: Implement getInstanceSize/sizeOf intrinsics [v3]
In-Reply-To: <0E5sXAWBENg290o8HpKfufH-e69Ue4EfAR074HuNOt4=.3e8af6ad-4aaf-4427-b008-642ca7138f02@github.com> References: <0E5sXAWBENg290o8HpKfufH-e69Ue4EfAR074HuNOt4=.3e8af6ad-4aaf-4427-b008-642ca7138f02@github.com> Message-ID:

On Tue, 20 Oct 2020 17:33:56 GMT, Vladimir Kozlov wrote:

>> Aleksey Shipilev has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.
>
> test/jdk/java/lang/instrument/GetObjectSizeTest.java line 2:
>
>> 1: /*
>> 2: * Copyright (c) 2020, Red Hat, Inc. All rights reserved.
>
> You can't replace Copyright lines of one company with other.

Well, I am replacing the entire file. There is a recent precedent for a similar change [here](https://github.com/openjdk/jdk/commit/6d13c766#diff-0daf75421f8fdb55a5640742ef6f12730fe1b370cc864311c188ad99996b51fe). Either that should be reversed, or this one accepted.
------------- PR: https://git.openjdk.java.net/jdk/pull/650 From shade at openjdk.java.net Tue Oct 20 17:46:18 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 20 Oct 2020 17:46:18 GMT Subject: RFR: 8253525: Implement getInstanceSize/sizeOf intrinsics [v3] In-Reply-To: References: <0E5sXAWBENg290o8HpKfufH-e69Ue4EfAR074HuNOt4=.3e8af6ad-4aaf-4427-b008-642ca7138f02@github.com> Message-ID: On Tue, 20 Oct 2020 17:42:00 GMT, Aleksey Shipilev wrote: >> test/jdk/java/lang/instrument/GetObjectSizeTest.java line 2: >> >>> 1: /* >>> 2: * Copyright (c) 2020, Red Hat, Inc. All rights reserved. >> >> You can't replace Copyright lines of one company with other. > > Well, I am replacing the entire file. There is a recent precedent of the similar change > [here](https://github.com/openjdk/jdk/commit/6d13c766#diff-0daf75421f8fdb55a5640742ef6f12730fe1b370cc864311c188ad99996b51fe). > Either that should be reversed, or this one accepted. ...or I can put new test into a separate file. But the existing test is quite inferior compared to the new one, so it does not seem to make a lot of sense to keep it. ------------- PR: https://git.openjdk.java.net/jdk/pull/650 From rrich at openjdk.java.net Tue Oct 20 17:56:14 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Tue, 20 Oct 2020 17:56:14 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v2] In-Reply-To: References: <5R147e7WLawIyLWxjBQUNK22Y19rDsT32HYDCUXep7g=.a2344642-ca05-429e-a733-4ead81eb4f20@github.com> <010C_PcZp7Ol1SpH7__fIb5SPJrbFaQMUa0DN_R2rtM=.49983d69-0901-4081-9043-fbfaef0e4c61@github.com> Message-ID: <-vefsX3p5IkTBZvrIepZdJ-Q1G9HoSDsyVj2M3dkyKc=.03e18d1d-341f-45cc-a25b-c2894ae373e8@github.com> On Tue, 20 Oct 2020 15:33:59 GMT, Richard Reingruber wrote: >>> Hi, >>> >>> this is a good change, because it is a simplification and it it makes the stack walks safe by doing them as part of a >>> handshake. >>> The change conflicts with #119 though. 
>>> This one is ready to be pushed since last week but was delayed due to other interferences. Would you mind me integrating #119 first? After integration it would be needed to pull 2 EscapeBarriers out of handshakes. Of course I would help do that. Thanks, Richard.
>>
>> Hey Richard, go ahead and integrate your 119 first, I'll hold off and do the merge once you integrated.
>
> Thanks Robbin!

Hi Robbin,

for merging master after the integration of #119, I'd suggest resolving the conflicts by choosing the alternative from this PR and then applying
https://github.com/reinrich/jdk/commit/6fa91e344ed5bf6d877e3f5a2d0d1920591fd441
(is there a more elegant way to propose a patch?)

I successfully tested

    make run-test TEST=test/jdk/com/sun/jdi/EATests.java

which also covers PopFrame and ForceEarlyReturn. More tests are running. Unfortunately, it is too late for our team's nightly tests.

Thanks, Richard.

-------------

PR: https://git.openjdk.java.net/jdk/pull/729

From kvn at openjdk.java.net Tue Oct 20 17:57:13 2020
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Tue, 20 Oct 2020 17:57:13 GMT
Subject: RFR: 8253525: Implement getInstanceSize/sizeOf intrinsics [v3]
In-Reply-To: References: <0E5sXAWBENg290o8HpKfufH-e69Ue4EfAR074HuNOt4=.3e8af6ad-4aaf-4427-b008-642ca7138f02@github.com> Message-ID:

On Tue, 20 Oct 2020 17:43:07 GMT, Aleksey Shipilev wrote:

>> Well, I am replacing the entire file.
>> There is a recent precedent of the similar change
>> [here](https://github.com/openjdk/jdk/commit/6d13c766#diff-0daf75421f8fdb55a5640742ef6f12730fe1b370cc864311c188ad99996b51fe).
>> Either that should be reversed, or this one accepted.
>
> ...or I can put new test into a separate file. But the existing test is quite inferior compared to the new one, so it does not seem to make a lot of sense to keep it.

It was a mistake in 8253191 (I filed a bug). If you modify an existing file (even if you keep only the test name the same), you have to preserve the original Copyright line and add a new Copyright line. You don't need to create a new file. We have a lot of cases with 2 or more Copyright lines - it is normal:
https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/vectorization/TestVectorsNotSavedAtSafepoint.java

-------------

PR: https://git.openjdk.java.net/jdk/pull/650

From shade at openjdk.java.net Tue Oct 20 18:00:09 2020
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 20 Oct 2020 18:00:09 GMT
Subject: Integrated: 8254995: [x86] ControlWord::print(), rc/pc variables might not be initialized
In-Reply-To: References: Message-ID:

On Mon, 19 Oct 2020 10:20:18 GMT, Aleksey Shipilev wrote:

> Static analyzers complain that in `ControlWord::print()`, `rc`/`pc` variables might not be initialized. This never happens in practice, because `rounding_control()` and `precision_control()` return the good values. We can make it cleaner to silence the compiler.
>
> Testing:
> - [x] Linux x86_64 tier1

This pull request has now been integrated.
Changeset: ee6eb986
Author: Aleksey Shipilev
URL: https://git.openjdk.java.net/jdk/commit/ee6eb986
Stats: 6 lines in 1 file changed: 6 ins; 0 del; 0 mod

8254995: [x86] ControlWord::print(), rc/pc variables might not be initialized

Reviewed-by: kvn

-------------

PR: https://git.openjdk.java.net/jdk/pull/731

From kvn at openjdk.java.net Tue Oct 20 18:15:13 2020
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Tue, 20 Oct 2020 18:15:13 GMT
Subject: RFR: 8253525: Implement getInstanceSize/sizeOf intrinsics [v3]
In-Reply-To: References: <0E5sXAWBENg290o8HpKfufH-e69Ue4EfAR074HuNOt4=.3e8af6ad-4aaf-4427-b008-642ca7138f02@github.com> Message-ID:

On Tue, 20 Oct 2020 17:54:21 GMT, Vladimir Kozlov wrote:

>> ...or I can put new test into a separate file. But the existing test is quite inferior compared to the new one, so it does not seem to make a lot of sense to keep it.
>
> It was mistake in 8253191 (I file bug). If you modify existing file (even if you keep only test name the same) you have to preserve original Copyright and add new Copyright line. You don't need create new file. We have a lot of cases with 2 or more Copyright lines - it is normal:
> https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/vectorization/TestVectorsNotSavedAtSafepoint.java

I filed 8255067 to restore the Copyright line in the TestUnsignedByteCompare.java test file.

-------------

PR: https://git.openjdk.java.net/jdk/pull/650

From mdoerr at openjdk.java.net Tue Oct 20 18:22:13 2020
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Tue, 20 Oct 2020 18:22:13 GMT
Subject: RFR: 8255065: Zero: accessor_entry misses the IRIW case
In-Reply-To: References: Message-ID:

On Tue, 20 Oct 2020 17:09:59 GMT, Aleksey Shipilev wrote:

> While doing a change in related area, I noticed there is no IRIW handling block in `ZeroInterpreter::accessor_entry` when reading volatile fields. This probably breaks PPC64 Zero.
> There is a block in `bytecodeInterpreter.cpp` for [common field access](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/interpreter/zero/bytecodeInterpreter.cpp#L1899-L1901):
>
>     if (cache->is_volatile()) {
>       if (support_IRIW_for_not_multiple_copy_atomic_cpu) {
>         OrderAccess::fence();
>       }
>
> Attention @TheRealMDoerr ;)
>
> Testing:
> - [x] Linux x86_64 zero fastdebug build (includes jmod generation with Zero)

Good catch! I don't think anybody uses zero for PPC64, but it's better to get this fixed. Thanks.

-------------

Marked as reviewed by mdoerr (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/766

From jbhateja at openjdk.java.net Tue Oct 20 19:22:25 2020
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Tue, 20 Oct 2020 19:22:25 GMT
Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v5]
In-Reply-To: References: Message-ID:

> Summary:
>
> 1) Partial in-lining technique avoids call overhead penalty for small array copy operations with size less than 32 bytes.
> 2) At runtime, a conditional check based on copy length either calls an array-copy stub or executes an optimized instruction sequence using AVX-512 masked instructions emitted at the call site.
> 3) New runtime flag ArrayCopyPartialInlineSize=0/32(default)/64 bytes determines the maximum size for partial in-lining.
> 4) Based on the perf results seen in benchmarks currently partial in-lining is performed only for arraycopy involving sub-word types (bool/byte/char/short). Once PR-61 gets integrated we can extend this patch to cover all the primitive types.
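The control flow described in point 2 can be pictured with a small Java sketch. This is only an analogy at the source level: the real change emits AVX-512 masked loads and stores from C2 at the call site, which plain Java cannot express. `PartialInlineCopy`, `PARTIAL_INLINE_SIZE` and `takesInlinePath` are hypothetical names, and treating the threshold as inclusive is an assumption made for the sketch.

```java
final class PartialInlineCopy {
    /** Illustrative stand-in for ArrayCopyPartialInlineSize, in bytes (assumed inclusive). */
    static final int PARTIAL_INLINE_SIZE = 32;

    /** True if a copy of this many bytes takes the inlined path in this sketch. */
    static boolean takesInlinePath(int lengthBytes) {
        return lengthBytes <= PARTIAL_INLINE_SIZE;
    }

    static void copy(byte[] src, int srcPos, byte[] dst, int dstPos, int length) {
        if (length < 0 || srcPos < 0 || dstPos < 0
                || srcPos > src.length - length || dstPos > dst.length - length) {
            throw new IndexOutOfBoundsException("bad copy range");
        }
        if (takesInlinePath(length)) {
            // Stand-in for one masked vector load followed by one masked store:
            // read every element first, then write, so overlapping ranges stay correct.
            byte[] lanes = new byte[length];
            for (int i = 0; i < length; i++) {
                lanes[i] = src[srcPos + i];
            }
            for (int i = 0; i < length; i++) {
                dst[dstPos + i] = lanes[i];
            }
        } else {
            // Larger copies take the regular out-of-line path (the stub call).
            System.arraycopy(src, srcPos, dst, dstPos, length);
        }
    }

    public static void main(String[] args) {
        byte[] src = {1, 2, 3, 4, 5};
        byte[] dst = new byte[5];
        copy(src, 1, dst, 0, 3);
        System.out.println(java.util.Arrays.toString(dst));
    }
}
```

The point of the length check is that short copies never pay for a call; everything above the threshold behaves exactly as before.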
> Performance Results:
> System : CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz
> Micros : test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java
> ArrayCopyPartialInlineSize : 32
>
> JMH | Block Size | Baseline (ns/op) | Partial Inlining (ns/op) | Gain
> -- | -- | -- | -- | --
> ArrayCopyAligned.testByte | 1 | 5.417 | 2.696 | 2.009272997
> ArrayCopyAligned.testByte | 3 | 5.494 | 2.702 | 2.03330866
> ArrayCopyAligned.testByte | 5 | 5.417 | 2.637 | 2.05422829
> ArrayCopyAligned.testByte | 10 | 5.343 | 2.703 | 1.976692564
> ArrayCopyAligned.testByte | 20 | 5.837 | 2.636 | 2.214339909
> ArrayCopyAligned.testByte | 70 | 5.86 | 6 | 0.976666667
> ArrayCopyAligned.testByte | 150 | 6.766 | 6.906 | 0.979727773
> ArrayCopyAligned.testByte | 300 | 7.605 | 7.952 | 0.956363179
> ArrayCopyAligned.testByte | 600 | 11.989 | 12.007 | 0.998500874
> ArrayCopyAligned.testByte | 1200 | 16.447 | 16.585 | 0.991679228
> ArrayCopyAligned.testChar | 1 | 5.02 | 2.828 | 1.775106082
> ArrayCopyAligned.testChar | 3 | 5.129 | 2.762 | 1.85698769
> ArrayCopyAligned.testChar | 5 | 5.041 | 2.762 | 1.82512672
> ArrayCopyAligned.testChar | 10 | 5.716 | 2.762 | 2.069514844
> ArrayCopyAligned.testChar | 20 | 5.111 | 5.399 | 0.946656788
> ArrayCopyAligned.testChar | 70 | 6.271 | 6.242 | 1.004645947
> ArrayCopyAligned.testChar | 150 | 7.45 | 7.599 | 0.980392157
> ArrayCopyAligned.testChar | 300 | 9.904 | 10.112 | 0.97943038
> ArrayCopyAligned.testChar | 600 | 17.131 | 17.167 | 0.997902953
> ArrayCopyAligned.testChar | 1200 | 29.556 | 29.851 | 0.990117584
> ArrayCopyUnalignedBoth.testByte | 1 | 5.419 | 2.702 | 2.005551443
> ArrayCopyUnalignedBoth.testByte | 3 | 5.558 | 2.636 | 2.108497724
> ArrayCopyUnalignedBoth.testByte | 5 | 5.43 | 2.636 | 2.059939302
> ArrayCopyUnalignedBoth.testByte | 10 | 5.378 | 2.637 | 2.039438756
> ArrayCopyUnalignedBoth.testByte | 20 | 5.914 | 2.636 | 2.243550836
> ArrayCopyUnalignedBoth.testByte | 70 | 5.882 | 5.954 | 0.987907289
> ArrayCopyUnalignedBoth.testByte | 150 | 6.784 | 6.88 | 0.986046512
> ArrayCopyUnalignedBoth.testByte | 300 | 7.635 | 7.968 | 0.958207831
> ArrayCopyUnalignedBoth.testByte | 600 | 12.226 | 12.129 | 1.007997362
> ArrayCopyUnalignedBoth.testByte | 1200 | 16.992 | 20.717 | 0.820195974
> ArrayCopyUnalignedBoth.testChar | 1 | 5.019 | 2.828 | 1.774752475
> ArrayCopyUnalignedBoth.testChar | 3 | 5.163 | 2.763 | 1.868621064
> ArrayCopyUnalignedBoth.testChar | 5 | 5.042 | 2.827 | 1.783516095
> ArrayCopyUnalignedBoth.testChar | 10 | 5.718 | 2.828 | 2.021923621
> ArrayCopyUnalignedBoth.testChar | 20 | 5.111 | 5.404 | 0.945780903
> ArrayCopyUnalignedBoth.testChar | 70 | 6.367 | 6.235 | 1.02117081
> ArrayCopyUnalignedBoth.testChar | 150 | 7.367 | 8.269 | 0.890917886
> ArrayCopyUnalignedBoth.testChar | 300 | 10.358 | 10.642 | 0.973313287
> ArrayCopyUnalignedBoth.testChar | 600 | 20.84 | 17.522 | 1.189361945
> ArrayCopyUnalignedBoth.testChar | 1200 | 31.895 | 31.892 | 1.000094067
> ArrayCopyUnalignedDst.testByte | 1 | 5.455 | 2.637 | 2.068638604
> ArrayCopyUnalignedDst.testByte | 3 | 5.562 | 2.702 | 2.058475204
> ArrayCopyUnalignedDst.testByte | 5 | 5.427 | 2.702 | 2.008512213
> ArrayCopyUnalignedDst.testByte | 10 | 5.367 | 2.696 | 1.990727003
> ArrayCopyUnalignedDst.testByte | 20 | 5.839 | 2.637 | 2.214258627
> ArrayCopyUnalignedDst.testByte | 70 | 5.888 | 5.968 | 0.986595174
> ArrayCopyUnalignedDst.testByte | 150 | 6.785 | 6.773 | 1.001771741
> ArrayCopyUnalignedDst.testByte | 300 | 7.606 | 7.972 | 0.954089313
> ArrayCopyUnalignedDst.testByte | 600 | 11.986 | 21.195 | 0.565510734
> ArrayCopyUnalignedDst.testByte | 1200 | 16.54 | 16.784 | 0.985462345
> ArrayCopyUnalignedDst.testChar | 1 | 5.02 | 2.827 | 1.775733994
> ArrayCopyUnalignedDst.testChar | 3 | 5.131 | 2.762 | 1.857711803
> ArrayCopyUnalignedDst.testChar | 5 | 5.038 | 2.762 | 1.82404055
> ArrayCopyUnalignedDst.testChar | 10 | 5.718 | 2.762 | 2.070238957
> ArrayCopyUnalignedDst.testChar | 20 | 5.113 | 5.401 | 0.946676541
> ArrayCopyUnalignedDst.testChar | 70 | 6.222 | 6.214 | 1.001287416
> ArrayCopyUnalignedDst.testChar | 150 | 7.367 | 8.125 | 0.906707692
> ArrayCopyUnalignedDst.testChar | 300 | 10.204 | 10.082 | 1.012100774
> ArrayCopyUnalignedDst.testChar | 600 | 16.978 | 17.135 | 0.990837467
> ArrayCopyUnalignedDst.testChar | 1200 | 32.351 | 31.996 | 1.011095137
> ArrayCopyUnalignedSrc.testByte | 1 | 5.414 | 2.696 | 2.008160237
> ArrayCopyUnalignedSrc.testByte | 3 | 5.494 | 2.637 | 2.083428138
> ArrayCopyUnalignedSrc.testByte | 5 | 5.431 | 2.637 | 2.059537353
> ArrayCopyUnalignedSrc.testByte | 10 | 5.344 | 2.703 | 1.977062523
> ArrayCopyUnalignedSrc.testByte | 20 | 5.834 | 2.696 | 2.163946588
> ArrayCopyUnalignedSrc.testByte | 70 | 5.883 | 6.009 | 0.979031453
> ArrayCopyUnalignedSrc.testByte | 150 | 6.729 | 6.87 | 0.979475983
> ArrayCopyUnalignedSrc.testByte | 300 | 7.603 | 7.97 | 0.953952321
> ArrayCopyUnalignedSrc.testByte | 600 | 12.004 | 12.16 | 0.987171053
> ArrayCopyUnalignedSrc.testByte | 1200 | 16.534 | 16.643 | 0.9934507
> ArrayCopyUnalignedSrc.testChar | 1 | 5.021 | 2.762 | 1.81788559
> ArrayCopyUnalignedSrc.testChar | 3 | 5.13 | 2.762 | 1.857349747
> ArrayCopyUnalignedSrc.testChar | 5 | 5.042 | 2.827 | 1.783516095
> ArrayCopyUnalignedSrc.testChar | 10 | 5.726 | 2.761 | 2.073886273
> ArrayCopyUnalignedSrc.testChar | 20 | 5.112 | 5.401 | 0.94649139
> ArrayCopyUnalignedSrc.testChar | 70 | 6.113 | 6.227 | 0.981692629
> ArrayCopyUnalignedSrc.testChar | 150 | 7.493 | 7.888 | 0.949923935
> ArrayCopyUnalignedSrc.testChar | 300 | 10.234 | 10.501 | 0.97457385
> ArrayCopyUnalignedSrc.testChar | 600 | 17.175 | 17.142 | 1.001925096
> ArrayCopyUnalignedSrc.testChar | 1200 | 31.926 | 31.987 | 0.998092975
>
> Detailed Reports:
> Baseline :
> [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt)
> WithOpt : >
[http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt) Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: - Merge remote-tracking branch 'upstream' into JDK-8252848 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 - Replacing explicit type checks with existing type checking routines - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 - 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions. ------------- Changes: https://git.openjdk.java.net/jdk/pull/302/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=302&range=04 Stats: 518 lines in 23 files changed: 494 ins; 0 del; 24 mod Patch: https://git.openjdk.java.net/jdk/pull/302.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/302/head:pull/302 PR: https://git.openjdk.java.net/jdk/pull/302 From rehn at openjdk.java.net Tue Oct 20 19:45:17 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 20 Oct 2020 19:45:17 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v2] In-Reply-To: <-vefsX3p5IkTBZvrIepZdJ-Q1G9HoSDsyVj2M3dkyKc=.03e18d1d-341f-45cc-a25b-c2894ae373e8@github.com> References: <5R147e7WLawIyLWxjBQUNK22Y19rDsT32HYDCUXep7g=.a2344642-ca05-429e-a733-4ead81eb4f20@github.com> <010C_PcZp7Ol1SpH7__fIb5SPJrbFaQMUa0DN_R2rtM=.49983d69-0901-4081-9043-fbfaef0e4c61@github.com> <-vefsX3p5IkTBZvrIepZdJ-Q1G9HoSDsyVj2M3dkyKc=.03e18d1d-341f-45cc-a25b-c2894ae373e8@github.com> Message-ID: <_akbNkG-trhkT8U9QefeEYexY06Ztv6F2HiFHZF3alI=.d7f4266e-1460-433c-8e20-8dea52d927aa@github.com> On Tue, 20 Oct 2020 17:53:47 GMT, Richard Reingruber wrote: >>> >>> >>> > Hi, >>> > this is a good change, because it is a simplification and it it makes the stack walks safe by doing them as part of a >>> 
> handshake. The change conflicts with #119 though. That one has been ready to be pushed since last week but was delayed due to >>> > other interferences. Would you mind me integrating #119 first? After integration it would be necessary to pull 2 >>> > EscapeBarriers out of handshakes. Of course I would help do that. Thanks, Richard. >>> >>> Hey Richard, go ahead and integrate your 119 first, I'll hold off and do the merge once you have integrated. >> >> Thanks Robbin! > > Hi Robbin, > > for merging master after integration of #119 I'd suggest resolving the > conflicts by choosing the alternative from this PR and then applying > https://github.com/reinrich/jdk/commit/6fa91e344ed5bf6d877e3f5a2d0d1920591fd441 > (is there a more elegant way to propose a patch?) > > I successfully tested > > make run-test TEST=test/jdk/com/sun/jdi/EATests.java > which also covers PopFrame and ForceEarlyReturn. > > More tests are running. > > Unfortunately it is too late for our team's nightly tests. > > Thanks, Richard. Thanks, I'm exploring what we need to execute the EB inside the handshake. So far I think that really needs to go in a separate PR, since it becomes really unrelated to this.... picking up your change. > Hi Robbin, > > for merging master after integration of #119 I'd suggest resolving the > conflicts by choosing the alternative from this PR and then applying > [reinrich at 6fa91e3](https://github.com/reinrich/jdk/commit/6fa91e344ed5bf6d877e3f5a2d0d1920591fd441) > (is there a more elegant way to propose a patch?) > > I successfully tested > > ``` > make run-test TEST=test/jdk/com/sun/jdi/EATests.java > ``` > > which also covers PopFrame and ForceEarlyReturn. > > More tests are running. > > Unfortunately it is too late for our team's nightly tests. > > Thanks, Richard.
------------- PR: https://git.openjdk.java.net/jdk/pull/729 From rrich at openjdk.java.net Tue Oct 20 19:57:08 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Tue, 20 Oct 2020 19:57:08 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v2] In-Reply-To: <_akbNkG-trhkT8U9QefeEYexY06Ztv6F2HiFHZF3alI=.d7f4266e-1460-433c-8e20-8dea52d927aa@github.com> References: <5R147e7WLawIyLWxjBQUNK22Y19rDsT32HYDCUXep7g=.a2344642-ca05-429e-a733-4ead81eb4f20@github.com> <010C_PcZp7Ol1SpH7__fIb5SPJrbFaQMUa0DN_R2rtM=.49983d69-0901-4081-9043-fbfaef0e4c61@github.com> <-vefsX3p5IkTBZvrIepZdJ-Q1G9HoSDsyVj2M3dkyKc=.03e18d1d-341f-45cc-a25b-c2894ae373e8@github.com> <_akbNkG-trhkT8U9QefeEYexY06Ztv6F2HiFHZF3alI=.d7f4266e-1460-433c-8e20-8dea52d927aa@github.com> Message-ID: On Tue, 20 Oct 2020 19:42:38 GMT, Robbin Ehn wrote: > Thanks, I'm exploring what we need to execute the EB inside the handshake. I want to experiment with object reallocation without referencing a frame. I think it should be possible to reallocate objects given only the corresponding compiled pc. If so, then a handshake/vm operation can fail with the request to reallocate objects at a pc. This can be done concurrently and then the handshake/vm operation can be restarted. ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From mdoerr at openjdk.java.net Tue Oct 20 20:58:11 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Tue, 20 Oct 2020 20:58:11 GMT Subject: RFR: 8233343: Deprecate -XX:+CriticalJNINatives flag which implements Java… In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 12:46:29 GMT, Robbin Ehn wrote: >> This change deprecates the -XX:+CriticalJNINatives flag and removes the develop flag -XX:+StressCriticalJNINatives. >> See CSR for more details. >> This change also removes the lazy GC lock in the critical native transition, and runs the critical native function as >> thread_in_Java.
I add a safepoint check at the end of the native function and transition to native and poll again for >> the safepoint after the function if a safepoint is requested. Tested with tier 1-6 (we have a few tests that use >> this). And built on linux-x86-open,linux-s390x-open,linux-arm32-debug,linux-ppc64le-debug. > > Thanks you @coleenp! > Looks good, really nice delta of -800 LOC ! Makes sense to me. But I have a couple of remarks/suggestions: - Object pinning for T_ARRAY on x86 shouldn't be needed any more since we stay in _thread_in_Java - Transition to _thread_in_native is pointless if we transition to _thread_in_native_trans immediately afterwards - I think the tests should also run on os.arch=="ppc64" | os.arch=="ppc64le" | os.arch=="s390x", but we should double-check if they really work Thanks for taking care of all platforms! ------------- PR: https://git.openjdk.java.net/jdk/pull/764 From coleenp at openjdk.java.net Tue Oct 20 21:58:16 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 20 Oct 2020 21:58:16 GMT Subject: RFR: 8233343: Deprecate -XX:+CriticalJNINatives flag which implements =?UTF-8?B?SmF2YeKApg==?= In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 20:55:37 GMT, Martin Doerr wrote: >> Thanks you @coleenp! >> Looks good, really nice delta of -800 LOC ! > > Makes sense to me. But I have a couple of remarks/suggestions: > - Object pinning for T_ARRAY on x86 shouldn't be needed any more since we stay in _thread_in_Java > - Transition to _thread_in_native is pointless if we transition to _thread_in_native_trans immediately afterwards > - I think the tests should also run on os.arch=="ppc64" | os.arch=="ppc64le" | os.arch=="s390x", but we should > double-check if they really work > Thanks for taking care of all platforms! @TheRealMDoerr Thank you for reviewing this and your comments. - yes, I can remove object pinning since it's no longer needed. 
- The transition to native is pointless but I was a bit unnerved about going from _thread_in_Java to _thread_in_native_trans even though there are no consistency checks to show it's an illegal transition. On the other hand, the dummy transition to native isn't nice either so I'll make this change and retest. - I can't test with ppc or s390 but can you modify the requires in the two tests in this patch and let me know if it works? ------------- PR: https://git.openjdk.java.net/jdk/pull/764 From david.holmes at oracle.com Tue Oct 20 23:02:00 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 21 Oct 2020 09:02:00 +1000 Subject: RFR: JDK-8254889: name_and_sig_as_C_string usages in frame coding without ResourceMark [v2] In-Reply-To: References: <2zlELXVF1ZB4uK35AHD6VrwfhHms2JdygTsv1qHmwVQ=.68dc2250-1329-457f-9964-8998d3e3c94f@github.com> Message-ID: <52801d78-c247-e0d7-daac-caf37cc2b803@oracle.com> On 20/10/2020 11:45 pm, Matthias Baesken wrote: > On Fri, 16 Oct 2020 21:36:22 GMT, David Holmes wrote: > >> As long as there is a ResourceMark in the caller there is no issue here - though the code should be documented in that >> case. > > Hi David, I checked the callers of frame::describe(FrameValues& values, int frame_no) > more closely and I think they are good (both JavaThread::print_frame_layout and trace_method_handle_stub have a > ResourceMark); so should I place a comment in frame.cpp above describe (something like "Attention - callers need a > ResourceMark")? Yes - also noting that the RA-allocated string is returned to the caller.
Thanks, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/698 > From kvn at openjdk.java.net Tue Oct 20 23:11:19 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 20 Oct 2020 23:11:19 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v10] In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 13:42:27 GMT, Fei Yang wrote: >> Contributed-by: ard.biesheuvel at linaro.org, dongbo4 at huawei.com >> >> This added an intrinsic for SHA3 using aarch64 v8.2 SHA3 Crypto Extensions. >> Reference implementation for core SHA-3 transform using ARMv8.2 Crypto Extensions: >> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/crypto/sha3-ce-core.S?h=v5.4.52 >> >> Trivial adaptation in SHA3. implCompress is needed for the purpose of adding the intrinsic. >> For SHA3, we need to pass one extra parameter "digestLength" to the stub for the calculation of block size. >> "digestLength" is also used in for the EOR loop before keccak to differentiate different SHA3 variants. >> >> We added jtreg tests for SHA3 and used QEMU system emulator which supports SHA3 instructions to test the functionality. >> Patch passed jtreg tier1-3 tests with QEMU system emulator. >> Also verified with jtreg tier1-3 tests without SHA3 instructions on aarch64-linux-gnu and x86_64-linux-gnu, to make >> sure that there's no regression. >> We used one existing JMH test for performance test: test/micro/org/openjdk/bench/java/security/MessageDigests.java >> We measured the performance benefit with an aarch64 cycle-accurate simulator. >> Patch delivers 20% - 40% performance improvement depending on specific SHA3 digest length and size of the message. >> >> For now, this feature will not be enabled automatically for aarch64. We can auto-enable this when it is fully tested on >> real hardware. But for the above testing purposes, this is auto-enabled when the corresponding hardware feature is >> detected. 
> > Fei Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains > 13 commits: > - Fix trailing whitespace issue reported by jcheck > - Merge master > - Merge master > - Remove unnecessary code changes in vm_version_aarch64.cpp > - Merge master > - Merge master > - Merge master > - Merge master > - Add sha3 instructions to cpu/aarch64/aarch64-asmtest.py and regenerate the test in assembler_aarch64.cpp:asm_check > - Rebase > - ... and 3 more: https://git.openjdk.java.net/jdk/compare/cdc8c401...d32c8ad7 Someone in Oracle have to run tier1-tier3 testing with these changes to make sure nothing is broken. I don't want to repeat 8254790. src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java line 604: > 602: add(ignore, "sun/security/provider/SHA5." + shaCompressName + "([BI)V"); > 603: } > 604: add(toBeInvestigated, "sun/security/provider/SHA3." + shaCompressName + "([BI)V"); This should be under `if (isJDK16OrHigher())` check. Something like this: https://github.com/openjdk/jdk/pull/650/files#diff-d1f378fc1b7fe041309e854d40b3a95a91e63fdecf0ecd9826b7c95eaeba314eR527 You can wait when Aleksey push it and update your changes ------------- Changes requested by kvn (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/207 From kvn at openjdk.java.net Tue Oct 20 23:22:10 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 20 Oct 2020 23:22:10 GMT Subject: RFR: 8248411: [aarch64] Insufficient error handling when CodeBuffer is exhausted In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 13:31:59 GMT, Patric Hedlin wrote: > Trampoline call generation (in the macro-assembler) may run out of CodeBuffer space without the proper error handling, > resulting in asserts such as: # Internal Error (.../open/src/hotspot/share/asm/codeBuffer.hpp:198), pid=845, tid=859 > # assert(allocates2(pc)) failed: relocation addr must be in this section > This update extends the error handling for such error cases to cover all uses of `trampoline_call()`, direct and > indirect. Failure registration/recording is retained in the "**aarch64.ad**" file. output.cpp change looks fine. aarch64 changes should be reviewed who familiar. ------------- PR: https://git.openjdk.java.net/jdk/pull/765 From zgu at openjdk.java.net Wed Oct 21 00:20:13 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Wed, 21 Oct 2020 00:20:13 GMT Subject: RFR: 8233343: Deprecate -XX:+CriticalJNINatives flag which implements =?UTF-8?B?SmF2YeKApg==?= In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 21:55:36 GMT, Coleen Phillimore wrote: > @TheRealMDoerr Thank you for reviewing this and your comments. > > * yes, I can remove object pinning since it's no longer needed. > * The transition to native is pointless but I was a bit unnerved from going from _thread_in_Java to > _thread_in_native_trans even though there's no consistency checks to show it's an illegal transition. On the other > hand, the dummy transition to native isn't nice either so I'll make this change and retest. > * I can't test with ppc or s390 but can you modify the requires in the two tests in this patch and let me know if it > works? Agree, object pinning is no longer needed for Shenandoah. 
------------- PR: https://git.openjdk.java.net/jdk/pull/764 From github.com+51754783+coreyashford at openjdk.java.net Wed Oct 21 00:54:17 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Wed, 21 Oct 2020 00:54:17 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4] In-Reply-To: <_Kp2I32RPuFG2TsngEkrL1CEHpYFbT5k2yLSpJ1r4-w=.d6734741-0392-4cc7-acef-81367a598ff4@github.com> References: <45FtTQB1m6HyZSASY42STMkQffIWlVPibWn9_r00xYs=.daad2653-2571-491f-8dd7-5954fe4ece00@github.com> <7-p-Kc9lQyyuoWdNtmgbXbwkxsgk4oQGKmFSCcMpvnU=.97810c01-3200-4767-bbd4-35d53c2bc5ca@github.com> <6Voyfr_s-ieyRA-8Rtvvpz7tkhhicA8sY2d2KTp3Kmw=.fa256bae-2143-4b43-bfea-5837ad31eb6a@github.com> <7XjzEn5DggliDrvjhrGwXZL5r4lsqeGF9SGLmRr5L84=.a4481a62-4ecf-4e3f-98f3-70e548c67b52@github.com> <9y5m4zfsDZMdIZ6CT38BzO0tpFMuFxUswAb pjfDny-w=.44c7ada7-bf77-45f1-b5a6-b542731d6685@github.com> <_Kp2I32RPuFG2TsngEkrL1CEHpYFbT5k2yLSpJ1r4-w=.d6734741-0392-4cc7-acef-81367a598ff4@github.com> Message-ID: On Tue, 20 Oct 2020 16:42:46 GMT, CoreyAshford wrote: > @CoreyAshford apologies i pointed to the "umbrella" test that runs Graal unit tests, the actual test is > [CheckGraalIntrinsics](https://github.com/openjdk/jdk/blob/master/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java) > See this PR for an example: #762 I believe I have a fix for this, but I'm having trouble running the test case. I wasn't even aware of this test suite before, so I've been trying to figure out how to correctly run it. 
Here's what I'm seeing: % pwd % ~/git-trees/mx/mx --java-home /home/cjashfor/git-trees/jdk/build/linux-x86_64-server-slowdebug/jdk -v unittest CheckGraalIntrinsics Setting environment variable JAVA_HOME=/home/cjashfor/git-trees/jdk/build/linux-x86_64-server-slowdebug/jdk from --java-home env JAVA_HOME=/home/cjashfor/git-trees/jdk/build/linux-x86_64-server-slowdebug/jdk MX_SUBPROCESS_COMMAND_FILE=/tmp/mx_subprocess_command.LLxLCx MX_HOME=/home/cjashfor/git-trees/mx MX_PRIMARY_SUITE_PATH=/home/cjashfor/git-trees/jdk/src/hotspot MX__SUITEMODEL=sibling \ [all files are up to date - skipping com.oracle.mxtool.junit.jdk9] [all files are up to date - skipping com.oracle.mxtool.junit] [skipping JUNIT_TOOL] Traceback (most recent call last): File "/home/cjashfor/git-trees/mx/mx.py", line 17030, in main() File "/home/cjashfor/git-trees/mx/mx.py", line 17011, in main retcode = c(command_args) File "/home/cjashfor/git-trees/mx/mx_commands.py", line 147, in __call__ return self.command_function(*args, **kwargs) File "/home/cjashfor/git-trees/mx/mx_unittest.py", line 487, in unittest _unittest(args, ['@Test', '@Parameters'], junit_args, **parsed_args.__dict__) File "/home/cjashfor/git-trees/mx/mx_unittest.py", line 319, in _unittest _run_tests(args, harness, vmLauncher, annotations, testfile, blacklist, whitelist, regex, mx.suite(suite) if suite else None) File "/home/cjashfor/git-trees/mx/mx_unittest.py", line 179, in _run_tests if vmLauncher.jdk().javaCompliance < p.javaCompliance: AttributeError: 'NoneType' object has no attribute 'javaCompliance' Any ideas? 
------------- PR: https://git.openjdk.java.net/jdk/pull/293 From kvn at openjdk.java.net Wed Oct 21 01:28:16 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 21 Oct 2020 01:28:16 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v5] In-Reply-To: References: Message-ID: On Mon, 12 Oct 2020 21:41:37 GMT, CoreyAshford wrote: >> This patch set encompasses the following commits: >> >> - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for >> the intrinsic. The API is similar to the existing encodeBlock intrinsic. >> - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic >> implementation >> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. >> - Adds a JMH microbenchmark for both Base64 encoding and decoding. >> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. > > CoreyAshford has updated the pull request incrementally with one additional commit since the last revision: > > Per Martin Doerr's v4 review: fix regression, and speed up return time for buffers that are too small > > - Check for case where the result of subtracting 12 from the source > length produces a negative number. To do this efficiently, I added the > instruction definition for mcrxrx, which is implemented on Power9+. > > - Rearrange the code so that minimal initialization is performed before > checking the size, so that the intrinsic can return quickly in the event > that the buffer is too small to process. Please update src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java ------------- Changes requested by kvn (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/293 From github.com+51754783+coreyashford at openjdk.java.net Wed Oct 21 01:39:29 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Wed, 21 Oct 2020 01:39:29 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v6] In-Reply-To: References: Message-ID: > This patch set encompasses the following commits: > > - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for > the intrinsic. The API is similar to the existing encodeBlock intrinsic. > - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic > implementation > - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. > - Adds a JMH microbenchmark for both Base64 encoding and decoding. > - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. CoreyAshford has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: - CheckGraalIntrinsics.java: Disable testing of decodeBlock intrinsic until implemented for AMD64/x86 - Merge branch 'master' of https://git.openjdk.java.net/jdk into base64_decode_intrinsic - stubGenerator_ppc.cpp: remove unnecessary complexity for checking < 0 after srawi. - Per Martin Doerr's v4 review: fix regression, and speed up return time for buffers that are too small - Check for case where the result of subtracting 12 from the source length produces a negative number. To do this efficiently, I added the instruction definition for mcrxrx, which is implemented on Power9+. - Rearrange the code so that minimal initialization is performed before checking the size, so that the intrinsic can return quickly in the event that the buffer is too small to process. - TestBase64.java: fix comment to correctly reflect actual intrinsic names.
The intrinsic names that are visible with -XX:+PrintCompilation are encode and decode, rather than encodeBlock and decodeBlock. - stubGenerator_ppc.cpp: fix regression caused by change to using loop counter My original fix didn't account for the case where sl < block_size. In the event sl < block_size, the shifted sl will become zero, so it should jump to the code that computes how much data was processed - 0 - and return. - stubGenerator_ppc.cpp: Fix multiple issues as per Martin Doerr's v2 review * Remove extraneous comma from SAP copyright notice * Move align(32) to the head of the loop rather than the beginning of the unwound code * Simplified looping condition to use a loop counter instead of a final address. This eliminated the need for the "end" variable, and essentially replaced it with CTR, which is computed using a simple bitwise shift of the size. * Re-ran benchmarks against loop_unrolls values: 1, 2, 4, 8, 16 to find optimal value, now 4. * Corrected a typo in the word "elements" - vm_version_ppc.cpp: per Martin Doerr's review of v2: fix copy/paste error - vmIntrinsics.cpp: Per Martin Doerr's v2 review: rearrange order of case statement to be consistent with others. - runtime.cpp: per Martin Doerr's review of v2, correct comment as per current semantics of decodeBlock() * The reference to "ofs" seems to be a copy/paste error. * -1 is no longer returned from decodeBlock() in the event of a non-base64 character being encountered; only a count of bytes written to dst. - ... 
and 12 more: https://git.openjdk.java.net/jdk/compare/3ccf4877...dcd15d57 ------------- Changes: https://git.openjdk.java.net/jdk/pull/293/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=293&range=05 Stats: 1894 lines in 25 files changed: 1866 ins; 4 del; 24 mod Patch: https://git.openjdk.java.net/jdk/pull/293.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/293/head:pull/293 PR: https://git.openjdk.java.net/jdk/pull/293 From github.com+51754783+coreyashford at openjdk.java.net Wed Oct 21 01:39:29 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Wed, 21 Oct 2020 01:39:29 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v5] In-Reply-To: References: Message-ID: On Wed, 21 Oct 2020 01:25:33 GMT, Vladimir Kozlov wrote: > Please update > src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java I will push the code, but I haven't been successful in running the test (see https://github.com/openjdk/jdk/pull/293#issuecomment-713223068 ) ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From github.com+51754783+coreyashford at openjdk.java.net Wed Oct 21 01:51:26 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Wed, 21 Oct 2020 01:51:26 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v7] In-Reply-To: References: Message-ID: <7Dgp_C-H8LsbF3tPHinVmb5bT_LoLZLYZUx9eSqigCA=.894148e4-a3f2-42c3-ab19-13e134e66853@github.com> > This patch set encompasses the following commits: > > - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for > the intrinsic. The API is similar to the existing encodeBlock intrinsic.
> - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic > implementation > - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. > - Adds a JMH microbenchmark for both Base64 encoding and decoding. > - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. CoreyAshford has updated the pull request incrementally with one additional commit since the last revision: CheckGraalIntrinsics.java: fix copy/paste error ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/293/files - new: https://git.openjdk.java.net/jdk/pull/293/files/dcd15d57..f93614dc Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=293&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=293&range=05-06 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/293.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/293/head:pull/293 PR: https://git.openjdk.java.net/jdk/pull/293 From kbarrett at openjdk.java.net Wed Oct 21 02:28:30 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 21 Oct 2020 02:28:30 GMT Subject: RFR: 8188055: (ref) Add Reference::refersTo predicate [v5] In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 16:35:02 GMT, Mandy Chung wrote: >>> @kimbarrett your reworded text is okay. I think "if it initially had some other referent value" can be dropped. >>> >>> For a `Reference` constructed with a `null` referent, we can clarify in the spec that such a reference object will never >>> get cleared and enqueued. I suggest filing a separate issue to follow up. >> >> I don't think that clause can be dropped, because of explicit clearing (by clear() or enqueue()) rather than by the >> GC. If the reference was constructed with a null referent, ref.refersTo(null) cannot tell whether ref.clear() has been >> called.
> >> Mandy's comment implied that references with a null referent never get enqueued. Otherwise when would they get >> enqueued? There would be nothing to trigger it. > > Sorry I should have been clearer. What I try to say is that `Reference(null)` > will never be cleared and enqueued by GC since its referent is `null`. > > Kim is right that `Reference(null)` can be explicitly cleared and enqueued > via `Reference::enqueue`. `Reference::clear` on such an "empty" reference > object is essentially a no-op. Whoever creates an "empty" reference would > not intend to be cleared. > >> > But the more we discuss this the more I think allowing an initial null referent was a mistake in the first place. :( >> >> I agree, but here we are. Very hard to know what the compatibility impact of changing that would be. > > There are existing use cases depending on `Reference(null)` for example as a special > instance be an empty reference or the head of a doubly-linked list of references. > This was discussed two years ago [1]. > > [1] https://mail.openjdk.java.net/pipermail/core-libs-dev/2018-July/054325.html David, Mandy, and I discussed the wording in refersTo javadoc and reached a consensus that is reflected in 3a15b6a. ------------- PR: https://git.openjdk.java.net/jdk/pull/498 From kbarrett at openjdk.java.net Wed Oct 21 02:28:30 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 21 Oct 2020 02:28:30 GMT Subject: RFR: 8188055: (ref) Add Reference::refersTo predicate [v6] In-Reply-To: References: Message-ID: <0dhF_xxcp1VoUowwdZenB2qWa9ILcZjTMe3lsaRrg7k=.3c633db8-f745-4353-ad34-a64fbc96d4e0@github.com> > Finally returning to this review that was started in April 2020. I've > recast it as a github PR. I think the security concern raised by Gil > has been adequately answered. 
> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-April/029203.html > https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-July/030401.html > https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-August/030677.html > https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-September/030793.html > > Please review a new function: java.lang.ref.Reference.refersTo. > > This function is needed to test the referent of a Reference object without > artificially extending the lifetime of the referent object, as may happen > when calling Reference.get. Some garbage collectors require extending the > lifetime of a weak referent when accessed, in order to maintain collector > invariants. Lifetime extension may occur with any collector when the > Reference is a SoftReference, as calling get indicates recent access. This > new function also allows testing the referent of a PhantomReference, which > can't be accessed by calling get. > > The new function uses native methods whose implementations are in the VM so > they can use the Access API. It is the intent that these methods will be > intrinsified by optimizing compilers like C2 or graal, but that hasn't been > implemented yet. Bear that in mind before rushing off to change existing > uses of Reference.get. > > There are two native methods involved, one in Reference and an override in > PhantomReference, both package private in java.lang.ref. The reason for this > split is to simplify the intrinsification. This is a change from the version > from April 2020; that version had a single native method in Reference, > implemented using the ON_UNKNOWN_OOP_REF Access reference strength category. > However, adding support for that category in the compilers adds significant > implementation effort and complexity. Splitting avoids that complexity. > > Testing: > mach5 tier1 > Locally (linux-x64) verified the new test passes with various garbage collectors. 
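For readers following the thread, the description above can be illustrated with a minimal Java sketch. It assumes a JDK in which `Reference.refersTo` is available (JDK 16 or later); the class name `RefersToSketch` and its helper methods are hypothetical and are not part of the patch under review.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Hypothetical demo class; only refersTo/get/clear/reachabilityFence are JDK API.
final class RefersToSketch {

    // Tests the referent without calling get(), so no new strong reference
    // to the referent is created by the check itself.
    static boolean weakRefersTo() {
        Object referent = new Object();
        WeakReference<Object> ref = new WeakReference<>(referent);
        boolean result = ref.refersTo(referent) && !ref.refersTo(new Object());
        Reference.reachabilityFence(referent); // keep the referent alive until here
        return result;
    }

    // PhantomReference.get() always returns null, but refersTo can still
    // compare the referent against a candidate object.
    static boolean phantomRefersTo() {
        Object referent = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> ref = new PhantomReference<>(referent, queue);
        boolean result = ref.get() == null && ref.refersTo(referent);
        Reference.reachabilityFence(referent);
        return result;
    }

    // After an explicit clear(), the reference refers to null.
    static boolean clearedRefersToNull() {
        WeakReference<Object> ref = new WeakReference<>(new Object());
        ref.clear();
        return ref.refersTo(null);
    }

    public static void main(String[] args) {
        System.out.println("weak:    " + weakRefersTo());
        System.out.println("phantom: " + phantomRefersTo());
        System.out.println("cleared: " + clearedRefersToNull());
    }
}
```

As the description notes, the methods are not yet intrinsified, so a sketch like this says nothing about performance relative to `Reference.get`.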
Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: improve wording in refersTo javadoc ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/498/files - new: https://git.openjdk.java.net/jdk/pull/498/files/ab4e519b..3a15b6a9 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=498&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=498&range=04-05 Stats: 7 lines in 1 file changed: 0 ins; 2 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/498.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/498/head:pull/498 PR: https://git.openjdk.java.net/jdk/pull/498 From dholmes at openjdk.java.net Wed Oct 21 02:58:26 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 21 Oct 2020 02:58:26 GMT Subject: RFR: 8188055: (ref) Add Reference::refersTo predicate [v6] In-Reply-To: <0dhF_xxcp1VoUowwdZenB2qWa9ILcZjTMe3lsaRrg7k=.3c633db8-f745-4353-ad34-a64fbc96d4e0@github.com> References: <0dhF_xxcp1VoUowwdZenB2qWa9ILcZjTMe3lsaRrg7k=.3c633db8-f745-4353-ad34-a64fbc96d4e0@github.com> Message-ID: On Wed, 21 Oct 2020 02:28:30 GMT, Kim Barrett wrote: >> Finally returning to this review that was started in April 2020. I've >> recast it as a github PR. I think the security concern raised by Gil >> has been adequately answered. >> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-April/029203.html >> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-July/030401.html >> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-August/030677.html >> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-September/030793.html >> >> Please review a new function: java.lang.ref.Reference.refersTo. >> >> This function is needed to test the referent of a Reference object without >> artificially extending the lifetime of the referent object, as may happen >> when calling Reference.get. 
Some garbage collectors require extending the >> lifetime of a weak referent when accessed, in order to maintain collector >> invariants. Lifetime extension may occur with any collector when the >> Reference is a SoftReference, as calling get indicates recent access. This >> new function also allows testing the referent of a PhantomReference, which >> can't be accessed by calling get. >> >> The new function uses native methods whose implementations are in the VM so >> they can use the Access API. It is the intent that these methods will be >> intrinsified by optimizing compilers like C2 or graal, but that hasn't been >> implemented yet. Bear that in mind before rushing off to change existing >> uses of Reference.get. >> >> There are two native methods involved, one in Reference and an override in >> PhantomReference, both package private in java.lang.ref. The reason for this >> split is to simplify the intrinsification. This is a change from the version >> from April 2020; that version had a single native method in Reference, >> implemented using the ON_UNKNOWN_OOP_REF Access reference strength category. >> However, adding support for that category in the compilers adds significant >> implementation effort and complexity. Splitting avoids that complexity. >> >> Testing: >> mach5 tier1 >> Locally (linux-x64) verified the new test passes with various garbage collectors. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > improve wording in refersTo javadoc Update looks good. Need to reflect the change in the CSR. Thanks. David ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/498 From jbhateja at openjdk.java.net Wed Oct 21 06:12:26 2020 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Wed, 21 Oct 2020 06:12:26 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v6] In-Reply-To: References: Message-ID: > Summary: > > 1) Partial in-lining technique avoids call overhead penalty for small array copy operations with size less than 32 > bytes. 2) At runtime, a conditional check based on copy length either calls an array-copy stub or executes an optimized > instruction sequence using AVX-512 masked instructions emitted at the call site. 3) New runtime flag > ArrayCopyPartialInlineSize=0/32(default)/64 bytes determines the maximum size for partial in-lining. 4) Based on the > perf results seen in benchmarks currently partial in-lining is performed only for arraycopy involving sub-word types > (bool/byte/char/short). Once PR-61 gets integrated we can extend this patch to cover all the primitive types. 
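The runtime check in point 2 of the summary can be pictured with a small Java sketch. This is only an illustration of the dispatch idea, not the C2 code in the patch: `PartialInlineSketch`, `takesInlinePath` and `copyBytes` are hypothetical names, the plain loops stand in for the AVX-512 masked load/store sequence, and treating the threshold as inclusive is an assumption.

```java
// Hypothetical illustration of partial inlining; not HotSpot code.
final class PartialInlineSketch {

    // Mirrors the default value of the ArrayCopyPartialInlineSize flag (bytes).
    static final int PARTIAL_INLINE_SIZE_BYTES = 32;

    // The copy-length check: small copies stay at the call site, larger ones
    // go to the array-copy stub. The inclusive bound is an assumption.
    static boolean takesInlinePath(int elementCount, int elementSizeBytes) {
        long copyBytes = (long) elementCount * elementSizeBytes;
        return copyBytes >= 0 && copyBytes <= PARTIAL_INLINE_SIZE_BYTES;
    }

    static void copyBytes(byte[] src, int srcPos, byte[] dst, int dstPos, int length) {
        if (takesInlinePath(length, Byte.BYTES)) {
            // Stand-in for one masked vector load followed by one masked store:
            // the whole block is read before anything is written.
            byte[] block = new byte[length];
            for (int i = 0; i < length; i++) {
                block[i] = src[srcPos + i];
            }
            for (int i = 0; i < length; i++) {
                dst[dstPos + i] = block[i];
            }
        } else {
            // Stand-in for the call to the array-copy stub.
            System.arraycopy(src, srcPos, dst, dstPos, length);
        }
    }
}
```

With a 32-byte threshold, a copy of 20 bytes takes the inline path while a copy of 20 chars (40 bytes) does not, which matches where the gains stop in the results that follow.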
> Performance Results: > System : CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz > Micros : test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java > ArrayCopyPartialInlineSize : 32 > > JMH | Block Size | Baseline (ns/op) | Partial Inling (ns/op) | Gain > -- | -- | -- | -- | -- > ArrayCopyAligned.testByte | 1 | 5.417 | 2.696 | 2.009272997 > ArrayCopyAligned.testByte | 3 | 5.494 | 2.702 | 2.03330866 > ArrayCopyAligned.testByte | 5 | 5.417 | 2.637 | 2.05422829 > ArrayCopyAligned.testByte | 10 | 5.343 | 2.703 | 1.976692564 > ArrayCopyAligned.testByte | 20 | 5.837 | 2.636 | 2.214339909 > ArrayCopyAligned.testByte | 70 | 5.86 | 6 | 0.976666667 > ArrayCopyAligned.testByte | 150 | 6.766 | 6.906 | 0.979727773 > ArrayCopyAligned.testByte | 300 | 7.605 | 7.952 | 0.956363179 > ArrayCopyAligned.testByte | 600 | 11.989 | 12.007 | 0.998500874 > ArrayCopyAligned.testByte | 1200 | 16.447 | 16.585 | 0.991679228 > ArrayCopyAligned.testChar | 1 | 5.02 | 2.828 | 1.775106082 > ArrayCopyAligned.testChar | 3 | 5.129 | 2.762 | 1.85698769 > ArrayCopyAligned.testChar | 5 | 5.041 | 2.762 | 1.82512672 > ArrayCopyAligned.testChar | 10 | 5.716 | 2.762 | 2.069514844 > ArrayCopyAligned.testChar | 20 | 5.111 | 5.399 | 0.946656788 > ArrayCopyAligned.testChar | 70 | 6.271 | 6.242 | 1.004645947 > ArrayCopyAligned.testChar | 150 | 7.45 | 7.599 | 0.980392157 > ArrayCopyAligned.testChar | 300 | 9.904 | 10.112 | 0.97943038 > ArrayCopyAligned.testChar | 600 | 17.131 | 17.167 | 0.997902953 > ArrayCopyAligned.testChar | 1200 | 29.556 | 29.851 | 0.990117584 > ArrayCopyUnalignedBoth.testByte | 1 | 5.419 | 2.702 | 2.005551443 > ArrayCopyUnalignedBoth.testByte | 3 | 5.558 | 2.636 | 2.108497724 > ArrayCopyUnalignedBoth.testByte | 5 | 5.43 | 2.636 | 2.059939302 > ArrayCopyUnalignedBoth.testByte | 10 | 5.378 | 2.637 | 2.039438756 > ArrayCopyUnalignedBoth.testByte | 20 | 5.914 | 2.636 | 2.243550836 > ArrayCopyUnalignedBoth.testByte | 70 | 5.882 | 5.954 | 0.987907289 > 
ArrayCopyUnalignedBoth.testByte | 150 | 6.784 | 6.88 | 0.986046512 > ArrayCopyUnalignedBoth.testByte | 300 | 7.635 | 7.968 | 0.958207831 > ArrayCopyUnalignedBoth.testByte | 600 | 12.226 | 12.129 | 1.007997362 > ArrayCopyUnalignedBoth.testByte | 1200 | 16.992 | 20.717 | 0.820195974 > ArrayCopyUnalignedBoth.testChar | 1 | 5.019 | 2.828 | 1.774752475 > ArrayCopyUnalignedBoth.testChar | 3 | 5.163 | 2.763 | 1.868621064 > ArrayCopyUnalignedBoth.testChar | 5 | 5.042 | 2.827 | 1.783516095 > ArrayCopyUnalignedBoth.testChar | 10 | 5.718 | 2.828 | 2.021923621 > ArrayCopyUnalignedBoth.testChar | 20 | 5.111 | 5.404 | 0.945780903 > ArrayCopyUnalignedBoth.testChar | 70 | 6.367 | 6.235 | 1.02117081 > ArrayCopyUnalignedBoth.testChar | 150 | 7.367 | 8.269 | 0.890917886 > ArrayCopyUnalignedBoth.testChar | 300 | 10.358 | 10.642 | 0.973313287 > ArrayCopyUnalignedBoth.testChar | 600 | 20.84 | 17.522 | 1.189361945 > ArrayCopyUnalignedBoth.testChar | 1200 | 31.895 | 31.892 | 1.000094067 > ArrayCopyUnalignedDst.testByte | 1 | 5.455 | 2.637 | 2.068638604 > ArrayCopyUnalignedDst.testByte | 3 | 5.562 | 2.702 | 2.058475204 > ArrayCopyUnalignedDst.testByte | 5 | 5.427 | 2.702 | 2.008512213 > ArrayCopyUnalignedDst.testByte | 10 | 5.367 | 2.696 | 1.990727003 > ArrayCopyUnalignedDst.testByte | 20 | 5.839 | 2.637 | 2.214258627 > ArrayCopyUnalignedDst.testByte | 70 | 5.888 | 5.968 | 0.986595174 > ArrayCopyUnalignedDst.testByte | 150 | 6.785 | 6.773 | 1.001771741 > ArrayCopyUnalignedDst.testByte | 300 | 7.606 | 7.972 | 0.954089313 > ArrayCopyUnalignedDst.testByte | 600 | 11.986 | 21.195 | 0.565510734 > ArrayCopyUnalignedDst.testByte | 1200 | 16.54 | 16.784 | 0.985462345 > ArrayCopyUnalignedDst.testChar | 1 | 5.02 | 2.827 | 1.775733994 > ArrayCopyUnalignedDst.testChar | 3 | 5.131 | 2.762 | 1.857711803 > ArrayCopyUnalignedDst.testChar | 5 | 5.038 | 2.762 | 1.82404055 > ArrayCopyUnalignedDst.testChar | 10 | 5.718 | 2.762 | 2.070238957 > ArrayCopyUnalignedDst.testChar | 20 | 5.113 | 5.401 | 0.946676541 > 
ArrayCopyUnalignedDst.testChar | 70 | 6.222 | 6.214 | 1.001287416 > ArrayCopyUnalignedDst.testChar | 150 | 7.367 | 8.125 | 0.906707692 > ArrayCopyUnalignedDst.testChar | 300 | 10.204 | 10.082 | 1.012100774 > ArrayCopyUnalignedDst.testChar | 600 | 16.978 | 17.135 | 0.990837467 > ArrayCopyUnalignedDst.testChar | 1200 | 32.351 | 31.996 | 1.011095137 > ArrayCopyUnalignedSrc.testByte | 1 | 5.414 | 2.696 | 2.008160237 > ArrayCopyUnalignedSrc.testByte | 3 | 5.494 | 2.637 | 2.083428138 > ArrayCopyUnalignedSrc.testByte | 5 | 5.431 | 2.637 | 2.059537353 > ArrayCopyUnalignedSrc.testByte | 10 | 5.344 | 2.703 | 1.977062523 > ArrayCopyUnalignedSrc.testByte | 20 | 5.834 | 2.696 | 2.163946588 > ArrayCopyUnalignedSrc.testByte | 70 | 5.883 | 6.009 | 0.979031453 > ArrayCopyUnalignedSrc.testByte | 150 | 6.729 | 6.87 | 0.979475983 > ArrayCopyUnalignedSrc.testByte | 300 | 7.603 | 7.97 | 0.953952321 > ArrayCopyUnalignedSrc.testByte | 600 | 12.004 | 12.16 | 0.987171053 > ArrayCopyUnalignedSrc.testByte | 1200 | 16.534 | 16.643 | 0.9934507 > ArrayCopyUnalignedSrc.testChar | 1 | 5.021 | 2.762 | 1.81788559 > ArrayCopyUnalignedSrc.testChar | 3 | 5.13 | 2.762 | 1.857349747 > ArrayCopyUnalignedSrc.testChar | 5 | 5.042 | 2.827 | 1.783516095 > ArrayCopyUnalignedSrc.testChar | 10 | 5.726 | 2.761 | 2.073886273 > ArrayCopyUnalignedSrc.testChar | 20 | 5.112 | 5.401 | 0.94649139 > ArrayCopyUnalignedSrc.testChar | 70 | 6.113 | 6.227 | 0.981692629 > ArrayCopyUnalignedSrc.testChar | 150 | 7.493 | 7.888 | 0.949923935 > ArrayCopyUnalignedSrc.testChar | 300 | 10.234 | 10.501 | 0.97457385 > ArrayCopyUnalignedSrc.testChar | 600 | 17.175 | 17.142 | 1.001925096 > ArrayCopyUnalignedSrc.testChar | 1200 | 31.926 | 31.987 | 0.998092975 > > Detailed Reports: > Baseline : > [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt) > WithOpt : > 
[http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt) Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: JDK-8252848 : Review comments resolution. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/302/files - new: https://git.openjdk.java.net/jdk/pull/302/files/3ff64896..a5d6c5de Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=302&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=302&range=04-05 Stats: 161 lines in 16 files changed: 24 ins; 88 del; 49 mod Patch: https://git.openjdk.java.net/jdk/pull/302.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/302/head:pull/302 PR: https://git.openjdk.java.net/jdk/pull/302 From shade at openjdk.java.net Wed Oct 21 06:19:11 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 21 Oct 2020 06:19:11 GMT Subject: Integrated: 8255065: Zero: accessor_entry misses the IRIW case In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 17:09:59 GMT, Aleksey Shipilev wrote: > While doing a change in related area, I noticed there is no IRIW handling block in `ZeroInterpreter::accessor_entry` > when reading volatile fields. This probably breaks PPC64 Zero. > There is a block in `bytecodeInterpreter.cpp` for [common field > access](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/interpreter/zero/bytecodeInterpreter.cpp#L1899-L1901): > if (cache->is_volatile()) { > if (support_IRIW_for_not_multiple_copy_atomic_cpu) { > OrderAccess::fence(); > } > > Attention @TheRealMDoerr ;) > > Testing: > - [x] Linux x86_64 zero fastdebug build (includes jmod generation with Zero) This pull request has now been integrated. 
Changeset: bd45191f Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/bd45191f Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod 8255065: Zero: accessor_entry misses the IRIW case Reviewed-by: mdoerr ------------- PR: https://git.openjdk.java.net/jdk/pull/766 From github.com+51754783+coreyashford at openjdk.java.net Wed Oct 21 06:25:19 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Wed, 21 Oct 2020 06:25:19 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v5] In-Reply-To: References: Message-ID: <9dnt9VysFlp4SZ_1AFvgcP2kekCynAXzHNHG1HWJ-jg=.6b0cec9f-aca2-465f-a224-1880e4223780@github.com> On Wed, 21 Oct 2020 01:33:48 GMT, CoreyAshford wrote: >> Please, update >> src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java > >> Please, update >> src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java > > I will push the code, but I haven't been successful in running the test (see > https://github.com/openjdk/jdk/pull/293#issuecomment-713223068 ) The latest push triggered a CI test run, which has a single failure on Windows x64 (hs/tier1 compiler): https://github.com/CoreyAshford/jdk/actions/runs/318910520 I'm not sure what to make of it yet. It doesn't look related, but I'm not sure. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From stefank at openjdk.java.net Wed Oct 21 07:27:07 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 21 Oct 2020 07:27:07 GMT Subject: RFR: 8255047: Add HotSpot flag to use with debuggers that restrict the =?UTF-8?B?Q1BV4oCm?= In-Reply-To: References: <_0YMtDNWAGXhg-_EyMf0c8GoNE4cf4wf1hpyr9j-sNs=.2ec5bf8b-023e-4bbd-9692-f74cd6c97f8d@github.com> Message-ID: On Tue, 20 Oct 2020 14:52:33 GMT, Daniel D. 
Daugherty wrote: >> Some debuggers don't work well with many threads, and/or incompletely restrict the number of used CPUs to one. >> >> This flag is intended as a catch-all for HotSpot developers (not available in product builds) to allow us to more >> easily use those debuggers. >> Currently, the proposal is to let the flag fix a few things: >> 1) Turn down the number of JVM threads >> 2) Turn off NUMA >> 3) Force processor_id() to return 0 instead of values above processor_count() >> >> (1) is purely ergonomics: gdb, rr, and valgrind are faster and seem to work much better with fewer threads. The values >> would still be overridable by devs. >> (2) and (3) deal with the fact that some debuggers change the reported processor count, but don't change the processor >> ids returned by sched_getcpu. This causes problems for ZGC and NUMA, which both assume that they can rely on >> os::processor_id() < os::processor_count(). The current proposed flag name is -XX:+LimitedCPUsDebugging. I'm not >> entirely happy with that name, but I haven't been able to find a better name. >> An alternative to having one flag is to split this into two flags, and maybe that would solve the naming problem. >> However, the usability aspects will be worse. >> If we can't find a suitable name, I'd rather introduce a flag called: >> -XX:DebuggerWorkarounds or -XX:DebuggerWorkaround1 >> >> Any suggestions / opinions? I really do want to at least fix the (2, 3) problem, because I keep having to add this to >> every single branch I'm working on. > > Perhaps: `UseDebuggerErgo` for the main option name with > `UseDebuggerErgo1` ... `UseDebuggerErgoN` for the suboptions > where `UseDebuggerErgo` enables all the numbered > `UseDebuggerErgo` options in one go. > > Update: Yes, this is the way that I have always wished that > `UseNewCode*` worked. @dcubed-ojdk Thanks for the suggestion. I like it.
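Item (3) of the quoted proposal amounts to restoring the invariant `processor_id() < processor_count()` when a debugger reports fewer processors than the OS actually schedules on. A minimal Java sketch of that clamp, under the assumption that any out-of-range id is mapped to 0; `ProcessorIdSketch` and `effectiveProcessorId` are hypothetical names, and the real change lives in HotSpot's C++ `os` layer:

```java
// Hypothetical illustration; not the HotSpot implementation.
final class ProcessorIdSketch {

    // reportedId:         id as returned by the OS (e.g. via sched_getcpu)
    // processorCount:     count as reported under the debugger
    // debuggerWorkaround: stands in for the proposed debugging flag
    static int effectiveProcessorId(int reportedId, int processorCount, boolean debuggerWorkaround) {
        if (debuggerWorkaround && reportedId >= processorCount) {
            return 0; // restore the invariant id < count
        }
        return reportedId;
    }
}
```

Without the workaround enabled the id is passed through unchanged, which is the situation that trips up code indexing per-CPU data by processor id.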
------------- PR: https://git.openjdk.java.net/jdk/pull/763 From github.com+4146708+a74nh at openjdk.java.net Wed Oct 21 07:47:33 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Wed, 21 Oct 2020 07:47:33 GMT Subject: RFR: 8221554: aarch64 cross-modifying code [v5] In-Reply-To: References: Message-ID: <6lRCpeDo_SOO6cIjEk2p-hFoCUlfcjQeP4FU_W0rCHM=.0b8123eb-5c04-4994-b568-42d44c3d3f19@github.com> > The AArch64 port uses maybe_isb in places where an ISB might be required > because the code may have safepointed. These maybe_isbs are very conservative > and are used in many places where a safepoint has not happened. > > cross_modify_fence was added in common code to place a barrier in all the > places after a safepoint has occurred. All the uses of it are in common code, > yet it remains unimplemented on AArch64. > > This set of patches implements cross_modify_fence for AArch64 and reconsiders > every use of maybe_isb, discarding many of them. In addition, it introduces > a new diagnostic option, which when enabled on AArch64 tests the correct > usage of the barriers. > > The advantage of this patch is threefold: > * Reducing the number of ISBs - giving a theoretical performance improvement. > * Use of common code instead of backend specific code. > * Additional test diagnostic options > > Patch 1: Split cross_modify_fence > ================================= > This is simply refactoring work split out to simplify the other two patches. > > instruction_fence() is provided by each target and simply places > a fence for the instruction stream. > > cross_modify_fence() is now a member of JavaThread and just calls > instruction_fence. This function will be extended in Patch 3. > > Patch 2: Use cross_modify_fence instead of maybe_isb > ==================================================== > > The [n] References refer to the comments for cross_modify_fence in > thread.hpp.
> > These are all the existing uses of maybe_isb in the AArch64 target: > > 1) Instances of Java code calling a VM function > * This encapsulates the changes to: > ** MacroAssembler::call_VM_leaf_base() > ** generate_fast_get_int_field0() > ** stubGenerator_aarch64 generate_throw_exception() > ** sharedRuntime_aarch64 generate_handler_blob() > ** SharedRuntime::generate_resolve_blob() > ** C1 LIR_Assembler::rt_call > ** C1 StubAssembler::call_RT(): used by generate_exception_throw, > generate_handle_exception, generate_code_for. > ** OptoRuntime::generate_exception_blob() > * Any changes will be caught due to calls to [2] or [3] by the VM function. > * Any calls that do not call [2] or [3] do not require an ISB. > * This patch is more optimal for these cases. > > 2) Instances of Java code calling a JNI function > * This encapsulates the changes to: > ** SharedRuntime::generate_native_wrapper() > ** TemplateInterpreterGenerator::generate_native_entry() > * A safepoint still in progress after the call will be caught by [4]. > * An ISB is still required for the case where there was a safepoint > but it completed during the call. This happens if the code doesn't > branch on safepoint_in_progress > * In the SharedRuntime version, the two possible calls to > reguard_yellow_pages and complete_monitor_unlocking_C are after the thread > goes back into its original state, so are covered by [2] and [3], the > same as a normal VM call. > * This patch is only more optimal for the two post-JNI calls. > > 3) Patching functions > * This encapsulates the changes to: > ** patch_callers_callsite() (called by gen_c2i_adapter()) > * This results in code being patched, but does not safepoint > * Therefore an ISB is required. > * This patch introduces no change here. > > 4) C1 MacroAssembler::emit_static_call_stub() > * Calls ISB (not maybe_isb) > * By design, the patching doesn't require that the up-to-date > destination is required for proper functioning.
> * However, the ISB makes it most likely that the new destination will > be picked up. > * This patch introduces no change here. > > Patch 3: Add cross modify fence verification > ============================================ > > The VerifyCrossModifyFence diagnostic flag enables confirmation to the correct > usage of instruction barriers. It can safely be enabled on any Java run. > > Enabling it will cause the following: > > * Once all threads have been brought to a safepoint, each thread will be > marked. > > * On a cross_modify_fence and safepoint_fence the mark for that thread > will be cleared. > > * On entry to a method and in a safepoint poll, then the thread is checked. > If it is marked, then the code will error. Alan Hayward has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Merge master Change-Id: I5e1715fdb11305191fe7bf86cbfb7a6da446b3dc - Remove inlasm_isb define Change-Id: I2d0ef8a78292dac875f3f65d2253981cdb7a497a - AArch64: Add cross modify fence verification - AArch64: Use cross_modify_fence instead of maybe_isb - Split cross_modify_fence ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/428/files - new: https://git.openjdk.java.net/jdk/pull/428/files/338eca42..c26b4dee Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=428&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=428&range=03-04 Stats: 341271 lines in 1082 files changed: 323486 ins; 12988 del; 4797 mod Patch: https://git.openjdk.java.net/jdk/pull/428.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/428/head:pull/428 PR: https://git.openjdk.java.net/jdk/pull/428 From mbaesken at openjdk.java.net Wed Oct 21 08:01:22 2020 From: mbaesken at openjdk.java.net (Matthias Baesken) Date: Wed, 21 Oct 2020 08:01:22 GMT Subject: RFR: JDK-8254889: 
name_and_sig_as_C_string usages in frame coding without ResourceMark [v3] In-Reply-To: References: Message-ID: > Hello, seems we have some usages of name_and_sig_as_C_string() in frame related HS coding without using a ResourceMark. > Please review. > Thanks, Matthias Matthias Baesken has updated the pull request incrementally with two additional commits since the last revision: - Merge branch 'JDK-8254889' of https://github.com/MBaesken/jdk into JDK-8254889 - JDK-8254889 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/698/files - new: https://git.openjdk.java.net/jdk/pull/698/files/6a13e8e2..0c977177 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=698&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=698&range=01-02 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/698.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/698/head:pull/698 PR: https://git.openjdk.java.net/jdk/pull/698 From jbhateja at openjdk.java.net Wed Oct 21 08:39:49 2020 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Wed, 21 Oct 2020 08:39:49 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v4] In-Reply-To: References: Message-ID: <1alTposTHTu9m3Tyy0ero3fHinyJPrPC67z-iAZasF0=.387b7be1-0513-43aa-bd1b-07b9bf933246@github.com> On Tue, 20 Oct 2020 06:24:18 GMT, Tobias Hartmann wrote: >> There is regression after 8252847 changes: 8254890. >> It should be fixed before we proceed with these changes. > > [JDK-8254890](https://bugs.openjdk.java.net/browse/JDK-8254890) is a closed bug because it contains confidential information. I've filed [JDK-8255039](https://bugs.openjdk.java.net/browse/JDK-8255039). > Hi Jatin, > > I'm ready to approve it, but I would like to kick it through some performance testing first. > > Best regards, > Nils Eliasson Hi Nils, I have incorporated your review feedback. 
Kindly do share your performance results. Best Regards, Jatin ------------- PR: https://git.openjdk.java.net/jdk/pull/302 From rehn at openjdk.java.net Wed Oct 21 08:40:49 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Wed, 21 Oct 2020 08:40:49 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v2] In-Reply-To: References: <5R147e7WLawIyLWxjBQUNK22Y19rDsT32HYDCUXep7g=.a2344642-ca05-429e-a733-4ead81eb4f20@github.com> <010C_PcZp7Ol1SpH7__fIb5SPJrbFaQMUa0DN_R2rtM=.49983d69-0901-4081-9043-fbfaef0e4c61@github.com> <-vefsX3p5IkTBZvrIepZdJ-Q1G9HoSDsyVj2M3dkyKc=.03e18d1d-341f-45cc-a25b-c2894ae373e8@github.com> <_akbNkG-trhkT8U9QefeEYexY06Ztv6F2HiFHZF3alI=.d7f4266e-1460-433c-8e20-8dea52d927aa@github.com> Message-ID: On Tue, 20 Oct 2020 19:54:06 GMT, Richard Reingruber wrote: >> Thanks, I'm exploring what we need to execute the EB inside the handshake. >> So far I think that really needs to go in a separate PR, since it becomes really unrelated to this.... picking up your change. >> >>> Hi Robbin, >>> >>> for merging master after integration of #119 I'd suggest to resolve the >>> conflicts by choosing the alternative from this pr and then apply >>> [reinrich at 6fa91e3](https://github.com/reinrich/jdk/commit/6fa91e344ed5bf6d877e3f5a2d0d1920591fd441) >>> (is there a more elegant way to propose a patch?) >>> >>> I successfully tested >>> >>> ``` >>> make run-test TEST=test/jdk/com/sun/jdi/EATests.java >>> ``` >>> >>> which also covers PopFrame and ForceEarlyReturn. >>> >>> More tests are running. >>> >>> For night tests of our team it is unfortunately too late. >>> >>> Thanks, Richard. > >> Thanks, I'm exploring what we need to execute the EB inside the handshake. > > I want to experiment with object reallocation without referencing a frame. I think it should be possible to reallocate objects given only the corresponding compiled pc. If so, then a handshake/vm operation can fail with the request to reallocate objects at a pc.
This can be done concurrently and then the handshake/vm operation can be restarted. I pushed the merge. (I managed to pick up a bad state in the first merge, so I did a second merge to get the fixes for that.) Please have a look. Still running tests, but there was some interest in this change-set (it seems to fix an unrelated bug also), so I published it. ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From rehn at openjdk.java.net Wed Oct 21 08:40:47 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Wed, 21 Oct 2020 08:40:47 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: References: Message-ID: > The main point of this change-set is to make it easier to implement S/R on top of handshakes. > Which is a prerequisite for removing _suspend_flag (which duplicates the handshake functionality). > But we also remove some complicated S/R methods. > > We basically just put everything in the handshake closure, so the diff just looks much worse than what it is. > > TraceSuspendDebugBits has an ifdef, but in both cases it now just returns. > But I was unsure if I should remove it now or when is_ext_suspend_completed() is removed. > > Passes multiple t1-5 runs, locally it passes many jck:vm/nsk_jvmti/nsk_jdi/jdk-jdi runs. Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains seven commits: - Fixed merge miss - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended - Merge fix from Richard - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended - Removed TraceSuspendDebugBits - Removed unused method is_ext_suspend_completed_with_lock - Utilize handshakes instead of is_thread_fully_suspended ------------- Changes: https://git.openjdk.java.net/jdk/pull/729/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=729&range=02 Stats: 606 lines in 6 files changed: 182 ins; 371 del; 53 mod Patch: https://git.openjdk.java.net/jdk/pull/729.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/729/head:pull/729 PR: https://git.openjdk.java.net/jdk/pull/729 From fyang at openjdk.java.net Wed Oct 21 09:10:57 2020 From: fyang at openjdk.java.net (Fei Yang) Date: Wed, 21 Oct 2020 09:10:57 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v10] In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 23:06:41 GMT, Vladimir Kozlov wrote: >> Fei Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits: >> >> - Fix trailing whitespace issue reported by jcheck >> - Merge master >> - Merge master >> - Remove unnecessary code changes in vm_version_aarch64.cpp >> - Merge master >> - Merge master >> - Merge master >> - Merge master >> - Add sha3 instructions to cpu/aarch64/aarch64-asmtest.py and regenerate the test in assembler_aarch64.cpp:asm_check >> - Rebase >> - ... and 3 more: https://git.openjdk.java.net/jdk/compare/cdc8c401...d32c8ad7 > > src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java line 604: > >> 602: add(ignore, "sun/security/provider/SHA5." + shaCompressName + "([BI)V"); >> 603: } >> 604: add(toBeInvestigated, "sun/security/provider/SHA3." 
+ shaCompressName + "([BI)V"); > > This should be under an `if (isJDK16OrHigher())` check. Something like this: > https://github.com/openjdk/jdk/pull/650/files#diff-d1f378fc1b7fe041309e854d40b3a95a91e63fdecf0ecd9826b7c95eaeba314eR527 > You can wait until Aleksey pushes it and then update your changes. OK. Will update with the following change after Aleksey's PR is integrated: --- a/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java +++ b/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java @@ -608,6 +608,10 @@ public class CheckGraalIntrinsics extends GraalTest { if (!config.useSHA512Intrinsics()) { add(ignore, "sun/security/provider/SHA5." + shaCompressName + "([BI)V"); } + + if (isJDK16OrHigher()) { + add(toBeInvestigated, "sun/security/provider/SHA3." + shaCompressName + "([BI)V"); + } } ------------- PR: https://git.openjdk.java.net/jdk/pull/207 From fyang at openjdk.java.net Wed Oct 21 09:23:57 2020 From: fyang at openjdk.java.net (Fei Yang) Date: Wed, 21 Oct 2020 09:23:57 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v10] In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 23:08:22 GMT, Vladimir Kozlov wrote: > Someone at Oracle has to run tier1-tier3 testing with these changes to make sure nothing is broken. I don't want to repeat 8254790. That's appreciated. On my side, I ran tier1-tier3 on both aarch64 Linux and x86_64 Linux. The test results on these two platforms look good for the latest changes.
------------- PR: https://git.openjdk.java.net/jdk/pull/207 From coleenp at openjdk.java.net Tue Oct 20 12:10:26 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 20 Oct 2020 12:10:26 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v2] In-Reply-To: References: <0UtoOZMbgTY-1hGACS1n5swvRCa9za_e3uu-XEVOidM=.85baf76c-e91b-492a-8841-10d9e49e3ec4@github.com> Message-ID: On Tue, 20 Oct 2020 07:17:24 GMT, Robbin Ehn wrote: >> src/hotspot/share/prims/jvmtiEnvBase.cpp line 1525: >> >>> 1523: Thread* current_thread = Thread::current(); >>> 1524: HandleMark hm(current_thread); >>> 1525: JavaThread* java_thread = target->as_Java_thread(); >> >> Contrast with the same three lines at L1390 - we should use the same boilerplate in each `doit`. And ideally refactor >> into some shared code somewhere (future RFE). > > Yes, that would be good. Why don't you just do: JavaThread* java_thread = JavaThread::current(); HandleMark hm(java_thread); JavaThread::current is the same thing as what you have. ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From mdoerr at openjdk.java.net Wed Oct 21 09:58:54 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 21 Oct 2020 09:58:54 GMT Subject: RFR: 8233343: Deprecate -XX:+CriticalJNINatives flag which implements Java… In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 11:41:17 GMT, Coleen Phillimore wrote: > This change deprecates the -XX:+CriticalJNINatives flag and removes the develop flag -XX:+StressCriticalJNINatives. See CSR for more details. > > This change also removes the lazy GC lock in the critical native transition, and runs the critical native function as thread_in_Java. I add a safepoint check at the end of the native function and transition to native and poll again for the safepoint after the function if a safepoint is requested. > > Tested with tier 1-6 (we have a few tests that use this).
And built on linux-x86-open,linux-s390x-open,linux-arm32-debug,linux-ppc64le-debug. Changes requested by mdoerr (Reviewer). src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 2192: > 2190: > 2191: // Use that pc we placed in r_return_pc a while back as the current frame anchor. > 2192: __ set_last_Java_frame(R1_SP, r_return_pc); These 2 lines need to be moved before the `if (!is_critical_native)` check. With this change, the 2 tests have passed with " | os.arch=="ppc64" | os.arch=="ppc64le" | os.arch=="s390x"" added. ------------- PR: https://git.openjdk.java.net/jdk/pull/764 From mcimadamore at openjdk.java.net Wed Oct 21 10:44:38 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Wed, 21 Oct 2020 10:44:38 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v8] In-Reply-To: References: Message-ID: <663pAFJdnevrreAcN1Zg9_pkE1kt9Vxxy9unNgQMvDk=.42b1e1ad-192a-418d-9271-c1830dd9fa82@github.com> > This patch contains the changes associated with the first incubation round of the foreign linker API > (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and associated pull request [3]). > > The main goal of this API is to provide a way to call native functions from Java code without the need of intermediate JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. I suggest reading the writeup [4] I put together a few weeks ago, which illustrates what the foreign linker support is, and how it should be used by clients. > > Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews. For this reason, I'm attaching a webrev which contains only the differences between this PR and the memory access PR. I will be periodically uploading new webrevs, as new iterations come out, to try and make the life of reviewers as simple as possible.
> > A big thanks to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the HotSpot changes you see here, and without their help, the foreign linker support wouldn't be what it is today. As usual, a big thanks to Paul Sandoz, who provided many insights (often by trying the bits first hand). > > Thanks > Maurizio > > Webrev: > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev > > Javadoc: > > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html > > Specdiff (relative to [3]): > > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254232 > > > > ### API Changes > > The API changes are actually rather slim: > > * `LibraryLookup` > * This class allows clients to look up symbols in native libraries; the interface is fairly simple; you can load a library by name, or absolute path, and then look up symbols in that library. > * `FunctionDescriptor` > * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory layouts for the function arguments/return type. A function descriptor is used to describe the signature of a native function. > * `CLinker` > * This is the real star of the show. A `CLinker` has two main methods: `downcallHandle` and `upcallStub`; the first takes a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a `MethodHandle` instance which can be used to call the target native symbol. The second takes an existing method handle, and a `FunctionDescriptor` and returns a new `MemorySegment` corresponding to a code stub allocated by the VM which acts as a trampoline from native code to the user-provided method handle. This is very useful for implementing upcalls. > * This class also contains the various layout constants that should be used by clients when describing native signatures (e.g.
`C_LONG` and friends); these layouts contain additional ABI classification information (in the form of layout attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the native call to take place. > * Finally, this class provides some helper functions e.g. so that clients can convert Java strings into C strings and back. > * `NativeScope` > * This is a helper class which allows clients to group together logically related allocations; that is, rather than allocating separate memory segments using separate *try-with-resources* constructs, a `NativeScope` allows clients to use a _single_ block, and allocate all the required segments there. This is not only a usability boost, but also a performance boost, since not all allocation requests will be turned into `malloc` calls. > * `MemorySegment` > * Only one method added here - namely `handoff(NativeScope)` which allows a segment to be transferred onto an existing native scope. > > ### Safety > > The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle. For instance, the description of the native signature might be wrong (e.g. have too many arguments) - and the runtime has, in the general case, no way to detect such mismatches. For these reasons, obtaining a `CLinker` instance is a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as is the case for other restricted methods in the foreign memory API). > > ### Implementation changes > > The Java changes associated with `LibraryLookup` are relatively straightforward; the only interesting thing to note here is that library loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same restrictions which apply to JNI library loading (e.g. the same library cannot be loaded by different class loaders).
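The `NativeScope` bullet above says that grouping allocations in one scope avoids turning every request into a `malloc` call. As a rough sketch of that idea only, the class below carves aligned slices out of a single direct `ByteBuffer`. The name `SlicingScope` and its methods are hypothetical and are not part of `jdk.incubator.foreign`; the sketch assumes a fixed capacity, and alignment is relative to the start of the backing buffer rather than to an absolute address.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only: this is NOT the jdk.incubator.foreign implementation.
final class SlicingScope implements AutoCloseable {
    private final ByteBuffer backing; // the single up-front allocation shared by every slice
    private int offset = 0;           // next free byte within the backing buffer
    private boolean closed = false;

    SlicingScope(int capacityBytes) {
        this.backing = ByteBuffer.allocateDirect(capacityBytes);
    }

    // Hands out a slice of the backing buffer; no further off-heap allocation is requested here.
    // Alignment is relative to the start of the backing buffer and must be a power of two.
    ByteBuffer allocate(int bytes, int alignment) {
        if (closed) {
            throw new IllegalStateException("scope is closed");
        }
        if (bytes < 0 || alignment <= 0 || Integer.bitCount(alignment) != 1) {
            throw new IllegalArgumentException("bad size or alignment");
        }
        long start = ((long) offset + alignment - 1) & -(long) alignment; // round up to alignment
        long end = start + bytes;
        if (end > backing.capacity()) {
            throw new IllegalStateException("scope exhausted");
        }
        ByteBuffer view = backing.duplicate(); // shares content, has independent position/limit
        view.position((int) start);
        view.limit((int) end);
        offset = (int) end;
        return view.slice();
    }

    int allocatedBytes() {
        return offset;
    }

    @Override
    public void close() {
        closed = true; // a real scope would also release the backing memory here
    }

    public static void main(String[] args) {
        try (SlicingScope scope = new SlicingScope(64)) {
            ByteBuffer a = scope.allocate(4, 4);
            ByteBuffer b = scope.allocate(8, 8);
            System.out.println(a.capacity() + " " + b.capacity() + " " + scope.allocatedBytes());
        }
    }
}
```

The single try-with-resources block in `main` plays the role of the _single_ block mentioned in the bullet: both slices live and die with the scope, and only the constructor asks for off-heap memory.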
> > As for `NativeScope` the changes are again relatively straightforward; it is an API which sits neatly on top of the foreign memory access API, providing some kind of allocation service which shares the same underlying memory segment(s), and turns an allocation request into a segment slice, which is a much less expensive operation. `NativeScope` comes in two variants: there are native scopes for which the allocation size is known a priori, and native scopes which can grow - these two schemes are implemented by two separate subclasses of `AbstractNativeScopeImpl`. > > Of course the bulk of the changes are to support the `CLinker` downcall/upcall routines. These changes cut pretty deep into the JVM; I'll briefly summarize the goal of some of these changes - for further details, Jorn has put together a detailed writeup which explains the rationale behind the VM support, with some references to the code [5]. > > The main idea behind the foreign linker is to infer, given a Java method type (expressed as a `MethodType` instance) and the description of the signature of a native function (expressed as a `FunctionDescriptor` instance) a _recipe_ that can be used to turn a Java call into the corresponding native call targeting the requested native function. > > This inference scheme can be defined in a pretty straightforward fashion by looking at the various ABI specifications (for instance, see [6] for the SysV ABI, which is the one used on Linux/Mac). The various `CallArranger` classes, of which we have a flavor for each supported platform, do exactly that kind of inference. > > For the inference process to work, we need to attach extra information to memory layouts; it is no longer sufficient to know e.g. that a layout is 32/64 bits - we need to know whether it is meant to represent a floating point value, or an integral value; this knowledge is required because floating-point values are passed in different registers by most ABIs.
For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which contain the required classification attributes (e.g. a `CLinker.TypeKind` enum value). The runtime extracts this attribute, and performs classification accordingly. > > A native call is decomposed into a sequence of basic, primitive operations, called `Binding` (see the great javadoc on the `Binding.java` class for more info). There are many such bindings - for instance the `Move` binding is used to move a value into a specific machine register/stack slot. So, the main job of the various `CallArranger` classes is to determine, given a Java `MethodType` and `FunctionDescriptor`, the set of bindings associated with the downcall/upcall. > > At the heart of the foreign linker support is the `ProgrammableInvoker` class. This class effectively generates a `MethodHandle` which follows the steps described by the various bindings obtained by `CallArranger`. There are actually various strategies to interpret these bindings - listed below: > > * basic interpreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see `BindingInterpreter`), except for the `Move` bindings. For these bindings, the move is implemented by allocating a *buffer* (whose size is ABI specific) and by moving all the lowered values into positions within this buffer. The buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them into their expected registers/stack slots (note that each position in the buffer corresponds to a different register). This is the most general invocation mode, the most "customizable" one, but also the slowest - since for every call there is some extra allocation which takes place.
> > * specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based interpreter, we generate a method handle chain which effectively interprets all the bindings (again, except `Move` ones). > > * intrinsified mode; this is typically used in combination with the specialized interpreted mode described above (although it can also be used with the Java-based binding interpreter). The goal here is to remove the buffer allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (e.g. the set of bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field), then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go for an intermediate buffer. This gives us back performance that is on par with JNI. > > For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there. We plan to add support for intrinsified modes there as well, which should considerably boost performance (probably well beyond what JNI can offer at the moment, since the upcall support in JNI is not very well optimized). > > Again, for further reading on the internals of the foreign linker support, please refer to [5]. > > #### Test changes > > Many new tests have been added to validate the foreign linker support; we have high-level tests (see `StdLibTest`) which aim at testing the linker from the perspective of code that clients could write. But we also have deeper combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI implementation.
There are also some great tests (see the `callarranger` folder) which test the various `CallArranger`s for all the possible platforms; these tests adopt more of a white-box approach - that is, instead of treating the linker machinery as a black box and verifying that the support works by checking that the native call returned the results we expected, these tests aim at checking that the set of bindings generated by the call arranger is correct. This also means that we can test the classification logic for Windows, Mac and Linux regardless of the platform we're executing on. > > Some additional microbenchmarks have been added to compare the performance of downcall/upcall with JNI. > > [1] - https://openjdk.java.net/jeps/389 > [2] - https://openjdk.java.net/jeps/393 > [3] - https://git.openjdk.java.net/jdk/pull/548 > [4] - https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md > [5] - http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Don't use JNI when generating native wrappers ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/634/files - new: https://git.openjdk.java.net/jdk/pull/634/files/502bd980..7cef16f4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=06-07 Stats: 478 lines in 17 files changed: 245 ins; 140 del; 93 mod Patch: https://git.openjdk.java.net/jdk/pull/634.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/634/head:pull/634 PR: https://git.openjdk.java.net/jdk/pull/634 From mcimadamore at openjdk.java.net Wed Oct 21 11:33:27 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Wed, 21 Oct 2020 11:33:27 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v9] In-Reply-To: References: Message-ID: > 
Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 27 commits: - Merge branch 'master' into 8254231_linker - Don't use JNI when generating native wrappers - Merge branch 'master' into 8254231_linker - Fix incorrect capitalization in one copyright header - Update copyright years, and add classpath exception to files that were missing it - Use separate constants for native invoker code size - Re-add file erroneously deleted (detected as rename) - Re-add erroneously removed files - Merge branch 'master' into 8254231_linker - Fix tests - Fix more whitespaces - ... and 17 more: https://git.openjdk.java.net/jdk/compare/da97ab5c...8c7b75da ------------- Changes: https://git.openjdk.java.net/jdk/pull/634/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=08 Stats: 75713 lines in 267 files changed: 72828 ins; 1608 del; 1277 mod Patch: https://git.openjdk.java.net/jdk/pull/634.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/634/head:pull/634 PR: https://git.openjdk.java.net/jdk/pull/634 From adinn at openjdk.java.net Wed Oct 21 11:34:15 2020 From: adinn at openjdk.java.net (Andrew Dinn) Date: Wed, 21 Oct 2020 11:34:15 GMT Subject: RFR: 8248411: [aarch64] Insufficient error handling when CodeBuffer is exhausted In-Reply-To: References: Message-ID: <02kM0GEZd0Twb-mEJxf0diGjUzXV3hcpOfXclg8gt6A=.d0d3a1af-9641-4179-ba9f-18401c05e6a4@github.com> On Tue, 20 Oct 2020 13:31:59 GMT, Patric Hedlin wrote: > Trampoline call generation (in the macro-assembler) may run out of CodeBuffer space without the proper error handling, resulting in asserts such as: > # Internal Error (.../open/src/hotspot/share/asm/codeBuffer.hpp:198), pid=845, tid=859 > # assert(allocates2(pc)) failed: relocation addr must be in this section > This update extends the error handling for such error cases to cover all uses of `trampoline_call()`, direct and indirect. Failure registration/recording is retained in the "**aarch64.ad**" file. The AArch64 changes are ok. 
<rant> I am not at all keen on the many format-only changes that are included in this patch since they introduce a lot of changed lines for the sole and rather specious benefit of adherence to a questionable orthographic authority. That's especially so with the relatively unscryable (sic) changes in output.cpp that modify declarations of the form `Foo *foo` to `Foo* foo`. One is initially left wondering what has changed only, at penny-drop, to replace that feeling with equal wonder as to why it was worth bothering, especially as there remain many thousands more such editorial opportunities. Meanwhile the substantive signal that constitutes the real patch is lost amid this noise. Of course, you may continue tilting at this windmill if you really wish to.</rant> ------------- Marked as reviewed by adinn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/765 From stefank at openjdk.java.net Wed Oct 21 11:36:25 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 21 Oct 2020 11:36:25 GMT Subject: RFR: 8255047: Add HotSpot flag to use with debuggers that restrict the CPU… [v2] In-Reply-To: <_0YMtDNWAGXhg-_EyMf0c8GoNE4cf4wf1hpyr9j-sNs=.2ec5bf8b-023e-4bbd-9692-f74cd6c97f8d@github.com> References: <_0YMtDNWAGXhg-_EyMf0c8GoNE4cf4wf1hpyr9j-sNs=.2ec5bf8b-023e-4bbd-9692-f74cd6c97f8d@github.com> Message-ID: > Some debuggers don't work well with many threads, and/or incompletely restrict the number of used CPUs to one. > > This flag is intended as a catch-all for HotSpot developers (not available in product builds) to allow us to more easily use those debuggers. > > Currently, the proposal is to let the flag fix a few things: > 1) Turn down the number of JVM threads > 2) Turn off NUMA > 3) Force processor_id() to return 0 instead of values above processor_count() > > (1) is purely ergonomics: gdb, rr, and valgrind are faster and seem to work much better with fewer threads. The values would still be overridable by devs.
> > (2) and (3) deal with the fact that some debuggers change the reported processor count, but don't change the processor ids returned by sched_getcpu. This causes problems for ZGC and NUMA, which both assume that they can rely on os::processor_id() < os::processor_count(). > > The current proposed flag name is -XX:+LimitedCPUsDebugging. I'm not entirely happy with that name, but I haven't been able to find a better name. > > An alternative to having one flag is to split this into two flags, and maybe that would solve the naming problem. However, the usability aspects will be worse. > > If we can't find a suitable name, I'd rather introduce a flag called: > -XX:DebuggerWorkarounds or -XX:DebuggerWorkaround1 > > Any suggestions / opinions? I really do want to at least fix the (2, 3) problem, because I keep having to add this to every single branch I'm working on. Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: 8255047: Add HotSpot flag to use with debuggers that restrict the CPU count ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/763/files - new: https://git.openjdk.java.net/jdk/pull/763/files/67e87758..a7aa7b9f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=763&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=763&range=00-01 Stats: 56 lines in 3 files changed: 37 ins; 14 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/763.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/763/head:pull/763 PR: https://git.openjdk.java.net/jdk/pull/763 From jvernee at openjdk.java.net Wed Oct 21 11:37:17 2020 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Wed, 21 Oct 2020 11:37:17 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v4] In-Reply-To: <1qSzjGTeTsGkvOvOIiXjY8JP944k3uLaq6KhkUP1vHE=.068f0409-ef1d-486a-8d74-c587619893e9@github.com> References: 
<0Zh0H5gSXzvHSstQ2w8NBM-P8yERRPouvhZJDNGvu4A=.6cde913f-7499-4c45-bc63-b717502b661e@github.com> <2moJ2056gzwWoleYccv21TpFYQHw5h9bA-IZCImplhs=.763198bf-06b0-4589-b01e-217ba84af94a@github.com> <1qSzjGTeTsGkvOvOIiXjY8JP944k3uLaq6KhkUP1vHE=.068f0409-ef1d-486a-8d74-c587619893e9@github.com> Message-ID: On Mon, 19 Oct 2020 11:24:45 GMT, Jorn Vernee wrote: >> I looked through some Hotspot runtime code and that looks ok. I saw a couple of strange things on my way through the code. See comments. > > Hi David, this code somewhat predates me, so I initially kept the JVM_ENTRY since that was what was already in place. IIRC the thread state transition was added later to be able to call JNI code, which checks that the thread state is native in some asserts. > > I've re-written this code, per @coleenp 's suggestion, to use VM code directly to replace what we were doing with JNI, so the thread state transition is also gone. > > I've looked at some of the *_ENTRY macros and the only one that seems to avoid the thread state transition is JVM_LEAF. I've switched the RegisterNatives functions we use to JVM_LEAF to avoid the redundant transitions. I also tried changing `PI_invokeNative` to JVM_LEAF, but since we can call back into Java from that, I run into a missing handle mark assert for some of the tests, so I've left that one as JVM_ENTRY (but removed some redundant braces). > > I've created a separate sub-pr against this PR's branch to make it easier to see what I've changed: https://github.com/mcimadamore/jdk/pull/1 (feel free to take a look). > > Thanks for the comments. I've fixed the following issues from review comments: - don't rely on `MethodHandles::adapter_code_size` (after private discussion) - update copyright years - use VM-internal API instead of JNI when parsing ABIDescriptor and BufferLayout objects while generating down/up call wrappers. As far as I see, that covers all open review comments. 
------------- PR: https://git.openjdk.java.net/jdk/pull/634 From stefank at openjdk.java.net Wed Oct 21 11:39:18 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 21 Oct 2020 11:39:18 GMT Subject: RFR: 8255047: Add HotSpot flag to use with debuggers that restrict the CPU… In-Reply-To: References: <_0YMtDNWAGXhg-_EyMf0c8GoNE4cf4wf1hpyr9j-sNs=.2ec5bf8b-023e-4bbd-9692-f74cd6c97f8d@github.com> Message-ID: <5_hRC71KTJKei2xU1RjTlLMpY8NP2qMWFLKCJ8lTQK4=.c560bcf5-6212-4ca6-a61a-67bdb0e69661@github.com> On Wed, 21 Oct 2020 07:24:52 GMT, Stefan Karlsson wrote: >> Perhaps: `UseDebuggerErgo` for the main option name with >> `UseDebuggerErgo1` ... `UseDebuggerErgoN` for the suboptions >> where `UseDebuggerErgo` enables all the numbered >> `UseDebuggerErgo` options in one go. >> >> Update: Yes, this is the way that I have always wished that >> `UseNewCode*` worked. > > @dcubed-ojdk Thanks for the suggestion. I like it. I've updated the patch with the suggestion from Dan. The flags now work as follows: -XX:+UseDebuggerErgo turns on all ergonomics / workarounds -XX:-/+UseDebuggerErgo1 fixes processor id vs processor count problems -XX:-/+UseDebuggerErgo2 limits the number of JVM threads and turns off NUMA ------------- PR: https://git.openjdk.java.net/jdk/pull/763 From ihse at openjdk.java.net Wed Oct 21 11:45:18 2020 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Wed, 21 Oct 2020 11:45:18 GMT Subject: RFR: 8188055: (ref) Add Reference::refersTo predicate [v6] In-Reply-To: <0dhF_xxcp1VoUowwdZenB2qWa9ILcZjTMe3lsaRrg7k=.3c633db8-f745-4353-ad34-a64fbc96d4e0@github.com> References: <0dhF_xxcp1VoUowwdZenB2qWa9ILcZjTMe3lsaRrg7k=.3c633db8-f745-4353-ad34-a64fbc96d4e0@github.com> Message-ID: On Wed, 21 Oct 2020 02:28:30 GMT, Kim Barrett wrote: >> Finally returning to this review that was started in April 2020. I've >> recast it as a github PR. I think the security concern raised by Gil >> has been adequately answered. 
>> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-April/029203.html >> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-July/030401.html >> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-August/030677.html >> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-September/030793.html >> >> Please review a new function: java.lang.ref.Reference.refersTo. >> >> This function is needed to test the referent of a Reference object without >> artificially extending the lifetime of the referent object, as may happen >> when calling Reference.get. Some garbage collectors require extending the >> lifetime of a weak referent when accessed, in order to maintain collector >> invariants. Lifetime extension may occur with any collector when the >> Reference is a SoftReference, as calling get indicates recent access. This >> new function also allows testing the referent of a PhantomReference, which >> can't be accessed by calling get. >> >> The new function uses native methods whose implementations are in the VM so >> they can use the Access API. It is the intent that these methods will be >> intrinsified by optimizing compilers like C2 or graal, but that hasn't been >> implemented yet. Bear that in mind before rushing off to change existing >> uses of Reference.get. >> >> There are two native methods involved, one in Reference and an override in >> PhantomReference, both package private in java.lang.ref. The reason for this >> split is to simplify the intrinsification. This is a change from the version >> from April 2020; that version had a single native method in Reference, >> implemented using the ON_UNKNOWN_OOP_REF Access reference strength category. >> However, adding support for that category in the compilers adds significant >> implementation effort and complexity. Splitting avoids that complexity. >> >> Testing: >> mach5 tier1 >> Locally (linux-x64) verified the new test passes with various garbage collectors. 
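As an illustration of the behaviour described in the quoted text, the following is a minimal sketch of how `refersTo` might be used in place of `get`. It assumes a JDK in which the method is available (it shipped in JDK 16); the class name `RefersToSketch` and the helper `stillRefersTo` are hypothetical and exist only for this example.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

final class RefersToSketch {
    // Hypothetical helper: tests the referent without calling get(),
    // so no new strong reference to the referent is created.
    static <T> boolean stillRefersTo(Reference<T> ref, T candidate) {
        return ref.refersTo(candidate);
    }

    public static void main(String[] args) {
        Object referent = new Object();
        WeakReference<Object> weak = new WeakReference<>(referent);
        PhantomReference<Object> phantom =
                new PhantomReference<>(referent, new ReferenceQueue<>());

        // The referent is strongly reachable here, so both tests succeed.
        System.out.println(stillRefersTo(weak, referent));    // true
        // PhantomReference.get() always returns null, but refersTo still works.
        System.out.println(phantom.get() == null);            // true
        System.out.println(stillRefersTo(phantom, referent)); // true

        // A cleared reference refers to null.
        weak.clear();
        System.out.println(stillRefersTo(weak, null));        // true

        // Keep the referent strongly reachable until this point.
        Reference.reachabilityFence(referent);
    }
}
```

The typical use is a check such as `ref.refersTo(null)` to detect a cleared reference, where `ref.get() == null` would otherwise risk extending the referent's lifetime under some collectors.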
> > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > improve wording in refersTo javadoc Marked as reviewed by ihse (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/498 From ihse at openjdk.java.net Wed Oct 21 11:45:19 2020 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Wed, 21 Oct 2020 11:45:19 GMT Subject: RFR: 8188055: (ref) Add Reference::refersTo predicate [v6] In-Reply-To: References: <0dhF_xxcp1VoUowwdZenB2qWa9ILcZjTMe3lsaRrg7k=.3c633db8-f745-4353-ad34-a64fbc96d4e0@github.com> Message-ID: On Wed, 21 Oct 2020 02:55:49 GMT, David Holmes wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> improve wording in refersTo javadoc > > Update looks good. Need to reflect the change in the CSR. > > Thanks. > David Build changes look good. ------------- PR: https://git.openjdk.java.net/jdk/pull/498 From jbhateja at openjdk.java.net Wed Oct 21 11:55:26 2020 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Wed, 21 Oct 2020 11:55:26 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v7] In-Reply-To: References: Message-ID: > Summary: > > 1) Partial in-lining technique avoids call overhead penalty for small array copy operations with size less than 32 bytes. > 2) At runtime, a conditional check based on copy length either calls an array-copy stub or executes an optimized instruction sequence using AVX-512 masked instructions emitted at the call site. > 3) New runtime flag ArrayCopyPartialInlineSize=0/32(default)/64 bytes determines the maximum size for partial in-lining. > 4) Based on the perf results seen in benchmarks currently partial in-lining is performed only for arraycopy involving sub-word types (bool/byte/char/short). Once PR-61 gets integrated we can extend this patch to cover all the primitive types. 
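The control flow in points 1) to 3) of the summary above can be modelled roughly as follows. This is only a conceptual sketch in plain Java, not the patch itself: the real change emits AVX-512 masked instructions in compiled code at the call site. The class `PartialInlineSketch`, the constant `PARTIAL_INLINE_SIZE_BYTES`, the two counters, and the choice of `<=` for the threshold comparison are all assumptions made for illustration.

```java
import java.util.Objects;

final class PartialInlineSketch {
    // Hypothetical stand-in for -XX:ArrayCopyPartialInlineSize (default 32 bytes).
    static final int PARTIAL_INLINE_SIZE_BYTES = 32;

    // Counters exist only so the sketch can report which path was taken.
    static int inlineCopies = 0;
    static int stubCalls = 0;

    static void copyBytes(byte[] src, int srcPos, byte[] dst, int dstPos, int length) {
        Objects.checkFromIndexSize(srcPos, length, src.length);
        Objects.checkFromIndexSize(dstPos, length, dst.length);
        if (length <= PARTIAL_INLINE_SIZE_BYTES) {
            // "Inlined" path: models one masked vector load followed by one
            // masked vector store. Everything is loaded before anything is
            // stored, so overlapping ranges behave like System.arraycopy.
            byte[] vector = new byte[length];
            for (int i = 0; i < length; i++) {
                vector[i] = src[srcPos + i];
            }
            for (int i = 0; i < length; i++) {
                dst[dstPos + i] = vector[i];
            }
            inlineCopies++;
        } else {
            // Larger copies take the out-of-line path (the array-copy stub).
            System.arraycopy(src, srcPos, dst, dstPos, length);
            stubCalls++;
        }
    }

    public static void main(String[] args) {
        byte[] src = new byte[100];
        for (int i = 0; i < src.length; i++) {
            src[i] = (byte) i;
        }
        byte[] dst = new byte[100];
        copyBytes(src, 0, dst, 0, 20); // small copy: inlined path
        copyBytes(src, 0, dst, 0, 70); // large copy: stub path
        System.out.println(inlineCopies + " inline, " + stubCalls + " stub"); // 1 inline, 1 stub
    }
}
```

The point of the length check is that small copies skip the call overhead entirely, while anything above the threshold behaves exactly as before.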
>
> Performance Results:
> System : CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz
> Micros : test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java
> ArrayCopyPartialInlineSize : 32
>
> JMH | Block Size | Baseline (ns/op) | Partial Inlining (ns/op) | Gain
> -- | -- | -- | -- | --
> ArrayCopyAligned.testByte | 1 | 5.417 | 2.696 | 2.009272997
> ArrayCopyAligned.testByte | 3 | 5.494 | 2.702 | 2.03330866
> ArrayCopyAligned.testByte | 5 | 5.417 | 2.637 | 2.05422829
> ArrayCopyAligned.testByte | 10 | 5.343 | 2.703 | 1.976692564
> ArrayCopyAligned.testByte | 20 | 5.837 | 2.636 | 2.214339909
> ArrayCopyAligned.testByte | 70 | 5.86 | 6 | 0.976666667
> ArrayCopyAligned.testByte | 150 | 6.766 | 6.906 | 0.979727773
> ArrayCopyAligned.testByte | 300 | 7.605 | 7.952 | 0.956363179
> ArrayCopyAligned.testByte | 600 | 11.989 | 12.007 | 0.998500874
> ArrayCopyAligned.testByte | 1200 | 16.447 | 16.585 | 0.991679228
> ArrayCopyAligned.testChar | 1 | 5.02 | 2.828 | 1.775106082
> ArrayCopyAligned.testChar | 3 | 5.129 | 2.762 | 1.85698769
> ArrayCopyAligned.testChar | 5 | 5.041 | 2.762 | 1.82512672
> ArrayCopyAligned.testChar | 10 | 5.716 | 2.762 | 2.069514844
> ArrayCopyAligned.testChar | 20 | 5.111 | 5.399 | 0.946656788
> ArrayCopyAligned.testChar | 70 | 6.271 | 6.242 | 1.004645947
> ArrayCopyAligned.testChar | 150 | 7.45 | 7.599 | 0.980392157
> ArrayCopyAligned.testChar | 300 | 9.904 | 10.112 | 0.97943038
> ArrayCopyAligned.testChar | 600 | 17.131 | 17.167 | 0.997902953
> ArrayCopyAligned.testChar | 1200 | 29.556 | 29.851 | 0.990117584
> ArrayCopyUnalignedBoth.testByte | 1 | 5.419 | 2.702 | 2.005551443
> ArrayCopyUnalignedBoth.testByte | 3 | 5.558 | 2.636 | 2.108497724
> ArrayCopyUnalignedBoth.testByte | 5 | 5.43 | 2.636 | 2.059939302
> ArrayCopyUnalignedBoth.testByte | 10 | 5.378 | 2.637 | 2.039438756
> ArrayCopyUnalignedBoth.testByte | 20 | 5.914 | 2.636 | 2.243550836
> ArrayCopyUnalignedBoth.testByte | 70 | 5.882 | 5.954 | 0.987907289
> ArrayCopyUnalignedBoth.testByte | 150 | 6.784 | 6.88 | 0.986046512
> ArrayCopyUnalignedBoth.testByte | 300 | 7.635 | 7.968 | 0.958207831
> ArrayCopyUnalignedBoth.testByte | 600 | 12.226 | 12.129 | 1.007997362
> ArrayCopyUnalignedBoth.testByte | 1200 | 16.992 | 20.717 | 0.820195974
> ArrayCopyUnalignedBoth.testChar | 1 | 5.019 | 2.828 | 1.774752475
> ArrayCopyUnalignedBoth.testChar | 3 | 5.163 | 2.763 | 1.868621064
> ArrayCopyUnalignedBoth.testChar | 5 | 5.042 | 2.827 | 1.783516095
> ArrayCopyUnalignedBoth.testChar | 10 | 5.718 | 2.828 | 2.021923621
> ArrayCopyUnalignedBoth.testChar | 20 | 5.111 | 5.404 | 0.945780903
> ArrayCopyUnalignedBoth.testChar | 70 | 6.367 | 6.235 | 1.02117081
> ArrayCopyUnalignedBoth.testChar | 150 | 7.367 | 8.269 | 0.890917886
> ArrayCopyUnalignedBoth.testChar | 300 | 10.358 | 10.642 | 0.973313287
> ArrayCopyUnalignedBoth.testChar | 600 | 20.84 | 17.522 | 1.189361945
> ArrayCopyUnalignedBoth.testChar | 1200 | 31.895 | 31.892 | 1.000094067
> ArrayCopyUnalignedDst.testByte | 1 | 5.455 | 2.637 | 2.068638604
> ArrayCopyUnalignedDst.testByte | 3 | 5.562 | 2.702 | 2.058475204
> ArrayCopyUnalignedDst.testByte | 5 | 5.427 | 2.702 | 2.008512213
> ArrayCopyUnalignedDst.testByte | 10 | 5.367 | 2.696 | 1.990727003
> ArrayCopyUnalignedDst.testByte | 20 | 5.839 | 2.637 | 2.214258627
> ArrayCopyUnalignedDst.testByte | 70 | 5.888 | 5.968 | 0.986595174
> ArrayCopyUnalignedDst.testByte | 150 | 6.785 | 6.773 | 1.001771741
> ArrayCopyUnalignedDst.testByte | 300 | 7.606 | 7.972 | 0.954089313
> ArrayCopyUnalignedDst.testByte | 600 | 11.986 | 21.195 | 0.565510734
> ArrayCopyUnalignedDst.testByte | 1200 | 16.54 | 16.784 | 0.985462345
> ArrayCopyUnalignedDst.testChar | 1 | 5.02 | 2.827 | 1.775733994
> ArrayCopyUnalignedDst.testChar | 3 | 5.131 | 2.762 | 1.857711803
> ArrayCopyUnalignedDst.testChar | 5 | 5.038 | 2.762 | 1.82404055
> ArrayCopyUnalignedDst.testChar | 10 | 5.718 | 2.762 | 2.070238957
> ArrayCopyUnalignedDst.testChar | 20 | 5.113 | 5.401 | 0.946676541
> ArrayCopyUnalignedDst.testChar | 70 | 6.222 | 6.214 | 1.001287416
> ArrayCopyUnalignedDst.testChar | 150 | 7.367 | 8.125 | 0.906707692
> ArrayCopyUnalignedDst.testChar | 300 | 10.204 | 10.082 | 1.012100774
> ArrayCopyUnalignedDst.testChar | 600 | 16.978 | 17.135 | 0.990837467
> ArrayCopyUnalignedDst.testChar | 1200 | 32.351 | 31.996 | 1.011095137
> ArrayCopyUnalignedSrc.testByte | 1 | 5.414 | 2.696 | 2.008160237
> ArrayCopyUnalignedSrc.testByte | 3 | 5.494 | 2.637 | 2.083428138
> ArrayCopyUnalignedSrc.testByte | 5 | 5.431 | 2.637 | 2.059537353
> ArrayCopyUnalignedSrc.testByte | 10 | 5.344 | 2.703 | 1.977062523
> ArrayCopyUnalignedSrc.testByte | 20 | 5.834 | 2.696 | 2.163946588
> ArrayCopyUnalignedSrc.testByte | 70 | 5.883 | 6.009 | 0.979031453
> ArrayCopyUnalignedSrc.testByte | 150 | 6.729 | 6.87 | 0.979475983
> ArrayCopyUnalignedSrc.testByte | 300 | 7.603 | 7.97 | 0.953952321
> ArrayCopyUnalignedSrc.testByte | 600 | 12.004 | 12.16 | 0.987171053
> ArrayCopyUnalignedSrc.testByte | 1200 | 16.534 | 16.643 | 0.9934507
> ArrayCopyUnalignedSrc.testChar | 1 | 5.021 | 2.762 | 1.81788559
> ArrayCopyUnalignedSrc.testChar | 3 | 5.13 | 2.762 | 1.857349747
> ArrayCopyUnalignedSrc.testChar | 5 | 5.042 | 2.827 | 1.783516095
> ArrayCopyUnalignedSrc.testChar | 10 | 5.726 | 2.761 | 2.073886273
> ArrayCopyUnalignedSrc.testChar | 20 | 5.112 | 5.401 | 0.94649139
> ArrayCopyUnalignedSrc.testChar | 70 | 6.113 | 6.227 | 0.981692629
> ArrayCopyUnalignedSrc.testChar | 150 | 7.493 | 7.888 | 0.949923935
> ArrayCopyUnalignedSrc.testChar | 300 | 10.234 | 10.501 | 0.97457385
> ArrayCopyUnalignedSrc.testChar | 600 | 17.175 | 17.142 | 1.001925096
> ArrayCopyUnalignedSrc.testChar | 1200 | 31.926 | 31.987 | 0.998092975
>
> Detailed Reports:
> Baseline : [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt)
> WithOpt :
[http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt) Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 44 commits: - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 - 8253474: Javadoc clean up in HttpsExchange, HttpsParameters, and HttpsServer Reviewed-by: dfuchs, michaelm - 8255000: C2: Unify IGVN processing when loop opts are over Reviewed-by: neliasso, iveresov, kvn - 8255026: C2: Miscellaneous cleanups in Compile and PhaseIdealLoop code Reviewed-by: thartmann, neliasso, redestad - 8253964: [Graal] UnschedulableGraphTest#test01fails with expected:<4> but was:<3> Reviewed-by: kvn, dlong - 8255065: Zero: accessor_entry misses the IRIW case Reviewed-by: mdoerr - 8254785: compiler/graalunit/HotspotTest.java failed with "missing Graal intrinsics for: java/lang/StringLatin1.indexOfChar([BIII)I" Reviewed-by: psandoz, iignatyev, kvn - 8254976: Re-enable swing jtreg tests which were broken due to samevm mode Reviewed-by: serb - 8255043: Incorrectly styled copyright text Reviewed-by: dholmes, trebari, jdv - 8255074: sun.nio.fs.WindowsPath::getPathForWin32Calls synchronizes on String object Reviewed-by: bpb - ... 
and 34 more: https://git.openjdk.java.net/jdk/compare/da97ab5c...67b5b9e0 ------------- Changes: https://git.openjdk.java.net/jdk/pull/302/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=302&range=06 Stats: 454 lines in 23 files changed: 430 ins; 0 del; 24 mod Patch: https://git.openjdk.java.net/jdk/pull/302.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/302/head:pull/302 PR: https://git.openjdk.java.net/jdk/pull/302 From coleenp at openjdk.java.net Wed Oct 21 11:59:26 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 21 Oct 2020 11:59:26 GMT Subject: RFR: 8233343: Deprecate -XX:+CriticalJNINatives flag which implements Java… [v2] In-Reply-To: References: Message-ID: > This change deprecates the -XX:+CriticalJNINatives flag and removes the develop flag -XX:+StressCriticalJNINatives. See CSR for more details. > > This change also removes the lazy GC lock in the critical native transition, and runs the critical native function as thread_in_Java. I add a safepoint check at the end of the native function and transition to native and poll again for the safepoint after the function if a safepoint is requested. > > Tested with tier 1-6 (we have a few tests that use this). And built on linux-x86-open,linux-s390x-open,linux-arm32-debug,linux-ppc64le-debug. Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: - Fixed test to run on other platforms, fixed copyrights, fixed ppc code. - Remove pin/unpin object code. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/764/files - new: https://git.openjdk.java.net/jdk/pull/764/files/fa23ec19..847feca7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=764&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=764&range=00-01 Stats: 280 lines in 9 files changed: 5 ins; 269 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/764.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/764/head:pull/764 PR: https://git.openjdk.java.net/jdk/pull/764 From coleenp at openjdk.java.net Wed Oct 21 11:59:27 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 21 Oct 2020 11:59:27 GMT Subject: RFR: 8233343: Deprecate -XX:+CriticalJNINatives flag which implements Java… [v2] In-Reply-To: References: Message-ID: On Wed, 21 Oct 2020 09:55:21 GMT, Martin Doerr wrote: >> Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: >> >> - Fixed test to run on other platforms, fixed copyrights, fixed ppc code. >> - Remove pin/unpin object code. > > src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 2192: > >> 2190: >> 2191: // Use that pc we placed in r_return_pc a while back as the current frame anchor. >> 2192: __ set_last_Java_frame(R1_SP, r_return_pc); > > These 2 lines need to get moved before if (!is_critical_native) check. > With this change, the 2 tests have passed with " | os.arch=="ppc64" | os.arch=="ppc64le" | os.arch=="s390x"" added. Ok, I moved those lines and added the other platforms for the basic CriticalNative tests that I modified. Thank you so much for testing this on the other platforms. 
------------- PR: https://git.openjdk.java.net/jdk/pull/764 From jbhateja at openjdk.java.net Wed Oct 21 12:01:28 2020 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Wed, 21 Oct 2020 12:01:28 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v8] In-Reply-To: References: Message-ID: > Summary: > > 1) Partial in-lining technique avoids call overhead penalty for small array copy operations with size less than 32 bytes. > 2) At runtime, a conditional check based on copy length either calls an array-copy stub or executes an optimized instruction sequence using AVX-512 masked instructions emitted at the call site. > 3) New runtime flag ArrayCopyPartialInlineSize=0/32(default)/64 bytes determines the maximum size for partial in-lining. > 4) Based on the perf results seen in benchmarks currently partial in-lining is performed only for arraycopy involving sub-word types (bool/byte/char/short). Once PR-61 gets integrated we can extend this patch to cover all the primitive types. 
> > Detailed Reports: > Baseline : [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt) > WithOpt : 
[http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt) Jatin Bhateja has updated the pull request incrementally with two additional commits since the last revision: - Merge branch 'JDK-8252848' of http://github.com/jatin-bhateja/jdk into JDK-8252848 - Merge remote-tracking branch 'upstream' into JDK-8252848 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/302/files - new: https://git.openjdk.java.net/jdk/pull/302/files/67b5b9e0..08724c33 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=302&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=302&range=06-07 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/302.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/302/head:pull/302 PR: https://git.openjdk.java.net/jdk/pull/302 From jbhateja at openjdk.java.net Wed Oct 21 12:13:27 2020 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Wed, 21 Oct 2020 12:13:27 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v9] In-Reply-To: References: Message-ID: > Summary: > > 1) Partial in-lining technique avoids call overhead penalty for small array copy operations with size less than 32 bytes. > 2) At runtime, a conditional check based on copy length either calls an array-copy stub or executes an optimized instruction sequence using AVX-512 masked instructions emitted at the call site. > 3) New runtime flag ArrayCopyPartialInlineSize=0/32(default)/64 bytes determines the maximum size for partial in-lining. > 4) Based on the perf results seen in benchmarks currently partial in-lining is performed only for arraycopy involving sub-word types (bool/byte/char/short). Once PR-61 gets integrated we can extend this patch to cover all the primitive types. 
> > Detailed Reports: > Baseline : [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt) > WithOpt : 
[http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt) Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/302/files - new: https://git.openjdk.java.net/jdk/pull/302/files/08724c33..12a7820e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=302&range=08 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=302&range=07-08 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/302.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/302/head:pull/302 PR: https://git.openjdk.java.net/jdk/pull/302 From mdoerr at openjdk.java.net Wed Oct 21 12:40:21 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 21 Oct 2020 12:40:21 GMT Subject: RFR: 8233343: Deprecate -XX:+CriticalJNINatives flag which implements =?UTF-8?B?SmF2YeKApg==?= [v2] In-Reply-To: References: Message-ID: <3fV1EpMn6j4jbL3aWfyNXjWwBkueD7MrrK8oOr1mOWw=.1dbb096d-db12-45b0-8c81-f484b78f185b@github.com> On Wed, 21 Oct 2020 11:59:26 GMT, Coleen Phillimore wrote: >> This change deprecates the -XX:+CriticalJNINatives flag and removes the develop flag -XX:+StressCriticalJNINatives. See CSR for more details. >> >> This change also removes the lazy GC lock in the critical native transition, and runs the critical native function as thread_in_Java. I add a safepoint check at the end of the native function and transition to native and poll again for the safepoint after the function if a safepoint is requested. >> >> Tested with tier 1-6 (we have a few tests that use this). 
And built on linux-x86-open,linux-s390x-open,linux-arm32-debug,linux-ppc64le-debug. > > Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: > > - Fixed test to run on other platforms, fixed copyrights, fixed ppc code. > - Remove pin/unpin object code. Looks good. Thanks! ------------- Marked as reviewed by mdoerr (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/764 From aph at openjdk.java.net Wed Oct 21 12:59:22 2020 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 21 Oct 2020 12:59:22 GMT Subject: RFR: 8248411: [aarch64] Insufficient error handling when CodeBuffer is exhausted In-Reply-To: <02kM0GEZd0Twb-mEJxf0diGjUzXV3hcpOfXclg8gt6A=.d0d3a1af-9641-4179-ba9f-18401c05e6a4@github.com> References: <02kM0GEZd0Twb-mEJxf0diGjUzXV3hcpOfXclg8gt6A=.d0d3a1af-9641-4179-ba9f-18401c05e6a4@github.com> Message-ID: On Wed, 21 Oct 2020 11:31:47 GMT, Andrew Dinn wrote: >> Trampoline call generation (in the macro-assembler) may run out of CodeBuffer space without the proper error handling, resulting in asserts such as: >> # Internal Error (.../open/src/hotspot/share/asm/codeBuffer.hpp:198), pid=845, tid=859 >> # assert(allocates2(pc)) failed: relocation addr must be in this section >> This update extends the error handling for such error cases to cover all uses of `trampoline_call()`, direct and indirect. Failure registration/recording is retained in the "**aarch64.ad**" file. > > The AArch64 changes are ok. > <rant> I am not at all keen on the many format-only changes that are included in this patch since they introduce a lot of changed lines for the sole and rather specious benefit of adherence to a questionable orthographic authority. That's especially so with the relatively unscryable (sic) changes in output.cpp that modify declarations of the form `Foo *foo` to `Foo* foo`. 
One is initially left wondering what has changed only, at penny-drop, to replace that feeling with equal wonder as to why it was worth bothering, especially as there remain many thousands more such editorial opportunities. Meanwhile the substantive signal that constitutes the real patch is lost amid this noise. Of course, you may continue tilting at this windmill if you really wish to.</rant>

I agree. I'm happy enough with code being tidied up as we go along, but not with a patch like this one, where it's not even clear that the result is an improvement. Also, it doesn't make sense to "tidy up" code that is nothing to do with the patch.

Changing `Foo *foo` to `Foo* foo` is simply wrong. The * operator binds to the right, so something like `int* a, b` looks like a and b are both pointers; they're not. That's why we write `int *a, b`: we should be writing for the reader.
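
The point about `*` binding to the declarator rather than to the type name can be checked by the compiler itself. The following is a minimal sketch, assuming a C++11 compiler; the namespace `decl_demo` and the variable names are made up for illustration and do not come from the patch under review.

```cpp
#include <type_traits>

namespace decl_demo {

// Star written next to the type name: it still applies only to `a`.
int* a, b;

// Same declaration with the star written next to the declarator.
int *c, d;

// decltype of a plain variable name yields its declared type.
static_assert(std::is_same<decltype(a), int*>::value, "a is a pointer to int");
static_assert(std::is_same<decltype(b), int>::value, "b is a plain int");
static_assert(std::is_same<decltype(c), int*>::value, "c is a pointer to int");
static_assert(std::is_same<decltype(d), int>::value, "d is a plain int");

}  // namespace decl_demo
```

Both spellings declare exactly one pointer and one `int`; the only difference is which reading the layout suggests to a human.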
-------------

PR: https://git.openjdk.java.net/jdk/pull/765

From mdoerr at openjdk.java.net Wed Oct 21 13:03:18 2020
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Wed, 21 Oct 2020 13:03:18 GMT
Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v7]
In-Reply-To: <7Dgp_C-H8LsbF3tPHinVmb5bT_LoLZLYZUx9eSqigCA=.894148e4-a3f2-42c3-ab19-13e134e66853@github.com>
References: <7Dgp_C-H8LsbF3tPHinVmb5bT_LoLZLYZUx9eSqigCA=.894148e4-a3f2-42c3-ab19-13e134e66853@github.com>
Message-ID: <0ZTSXtU_wMV3g5TjwKPUfsRFzMDDnfmGfz6BoqlVbOM=.d51d1653-5249-4bdc-8ce4-76d54bb86c22@github.com>

On Wed, 21 Oct 2020 01:51:26 GMT, CoreyAshford wrote:

>> This patch set encompasses the following commits:
>>
>> - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for the intrinsic. The API is similar to the existing encodeBlock intrinsic.
>> - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic implementation
>> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic.
>> - Adds a JMH microbenchmark for both Base64 encoding and decoding.
>> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding.
>
> CoreyAshford has updated the pull request incrementally with one additional commit since the last revision:
>
>   CheckGraalIntrinsics.java: fix copy/paste error

Marked as reviewed by mdoerr (Reviewer).
-------------

PR: https://git.openjdk.java.net/jdk/pull/293

From rehn at openjdk.java.net Wed Oct 21 13:44:16 2020
From: rehn at openjdk.java.net (Robbin Ehn)
Date: Wed, 21 Oct 2020 13:44:16 GMT
Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v2]
In-Reply-To:
References: <5R147e7WLawIyLWxjBQUNK22Y19rDsT32HYDCUXep7g=.a2344642-ca05-429e-a733-4ead81eb4f20@github.com> <010C_PcZp7Ol1SpH7__fIb5SPJrbFaQMUa0DN_R2rtM=.49983d69-0901-4081-9043-fbfaef0e4c61@github.com> <-vefsX3p5IkTBZvrIepZdJ-Q1G9HoSDsyVj2M3dkyKc=.03e18d1d-341f-45cc-a25b-c2894ae373e8@github.com> <_akbNkG-trhkT8U9QefeEYexY06Ztv6F2HiFHZF3alI=.d7f4266e-1460-433c-8e20-8dea52d927aa@github.com>
Message-ID:

On Wed, 21 Oct 2020 08:32:25 GMT, Robbin Ehn wrote:

>>> Thanks, I'm exploring what we need to execute the EB inside the handshake.
>>
>> I want to experiment with object reallocation without referencing a frame. I think it should be possible to reallocate objects given only the corresponding compiled pc. If so, then a handshake/vm operation can fail with the request to reallocate objects at a pc. This can be done concurrently and then the handshake/vm operation can be restarted.
>
> I pushed the merge. (I managed to pick up a bad state in the first merge, so I did a second merge to get the fixes for that.)
> Please have a look.
>
> Still running tests, but there was some interest in this change-set (it seems to fix an unrelated bug also) so I published it.

No issues in testing.
------------- PR: https://git.openjdk.java.net/jdk/pull/729 From dcubed at openjdk.java.net Wed Oct 21 13:46:22 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 21 Oct 2020 13:46:22 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v7] In-Reply-To: <0ZTSXtU_wMV3g5TjwKPUfsRFzMDDnfmGfz6BoqlVbOM=.d51d1653-5249-4bdc-8ce4-76d54bb86c22@github.com> References: <7Dgp_C-H8LsbF3tPHinVmb5bT_LoLZLYZUx9eSqigCA=.894148e4-a3f2-42c3-ab19-13e134e66853@github.com> <0ZTSXtU_wMV3g5TjwKPUfsRFzMDDnfmGfz6BoqlVbOM=.d51d1653-5249-4bdc-8ce4-76d54bb86c22@github.com> Message-ID: On Wed, 21 Oct 2020 13:00:09 GMT, Martin Doerr wrote: >> CoreyAshford has updated the pull request incrementally with one additional commit since the last revision: >> >> CheckGraalIntrinsics.java: fix copy/paste error > > Marked as reviewed by mdoerr (Reviewer). Buried in that GitHub test run link are the results for windows-x64-debug_testlogs_hs_tier1_compiler which includes this file (test-summary.txt): ============================== Test summary ============================== TEST TOTAL PASS FAIL ERROR >> jtreg:test/hotspot/jtreg:tier1_compiler 742 683 59 0 << ============================== TEST FAILURE I poked around, but I don't see any logs for the individual test failures. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From dcubed at openjdk.java.net Wed Oct 21 14:00:20 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 21 Oct 2020 14:00:20 GMT Subject: RFR: 8255047: Add HotSpot flag to use with debuggers that restrict the =?UTF-8?B?Q1BV4oCm?= [v2] In-Reply-To: References: <_0YMtDNWAGXhg-_EyMf0c8GoNE4cf4wf1hpyr9j-sNs=.2ec5bf8b-023e-4bbd-9692-f74cd6c97f8d@github.com> Message-ID: On Wed, 21 Oct 2020 11:36:25 GMT, Stefan Karlsson wrote: >> Some debuggers don't work well with many threads, and/or incompletely restricts the number of used CPUs to one. 
>>
>> This flag is intended as a catch-all for HotSpot developers (not available in product builds) to allow us to more easily use those debuggers.
>>
>> Currently, the proposal is to let the flag fix a few things:
>> 1) Turn down the number of JVM threads
>> 2) Turn off NUMA
>> 3) Force processor_id() to return 0 instead of values above processor_count()
>>
>> (1) is purely ergonomics: gdb, rr, and valgrind are faster and seem to work much better with fewer threads. The values would still be overridable by devs.
>>
>> (2) and (3) deal with the fact that some debuggers change the reported processor count, but don't change the processor ids returned by sched_getcpu. This causes problems for ZGC and NUMA, which both assume that they can rely on os::processor_id() < os::processor_count().
>>
>> The current proposed flag name is -XX:+LimitedCPUsDebugging. I'm not entirely happy with that name, but I haven't been able to find a better one.
>>
>> An alternative to having one flag is to split this into two flags, and maybe that would solve the naming problem. However, the usability aspects will be worse.
>>
>> If we can't find a suitable name, I'd rather introduce a flag called:
>> -XX:DebuggerWorkarounds or -XX:DebuggerWorkaround1
>>
>> Any suggestions / opinions? I really do want to at least fix the (2, 3) problem, because I keep having to add this to every single branch I'm working on.
>
> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision:
>
>   8255047: Add HotSpot flag to use with debuggers that restrict the CPU count

Thumbs up.

src/hotspot/os/linux/os_linux.cpp line 4784:

> 4782: #endif
> 4783:
> 4784:   assert(id >= 0 && id < _processor_count, "Invalid processor id [%d]", id);

Thanks for adding the bad id value to the assert() output.

-------------

Marked as reviewed by dcubed (Reviewer).
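
Item (3) of the quoted proposal, together with the assert shown in the review comment, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the HotSpot code: `clamp_processor_id` and its `debugger_workaround` parameter are hypothetical names standing in for `os::processor_id()` and the proposed develop flag.

```cpp
#include <cassert>

// Hypothetical stand-in for the logic discussed above; not HotSpot code.
// raw_id plays the role of the value sched_getcpu() reports, and
// debugger_workaround plays the role of the proposed develop flag.
inline int clamp_processor_id(int raw_id, int processor_count,
                              bool debugger_workaround) {
  if (debugger_workaround && raw_id >= processor_count) {
    // The debugger reduced the reported processor count but not the ids,
    // so fold every out-of-range id onto processor 0.
    return 0;
  }
  // Without the workaround the id must already be in range.
  assert(raw_id >= 0 && raw_id < processor_count && "Invalid processor id");
  return raw_id;
}
```

With the workaround enabled, an id of 7 on a machine reporting 4 processors is folded to 0, so code that indexes per-CPU data by processor id stays within bounds; in-range ids pass through unchanged.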
PR: https://git.openjdk.java.net/jdk/pull/763 From rrich at openjdk.java.net Wed Oct 21 14:19:26 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Wed, 21 Oct 2020 14:19:26 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: References: Message-ID: <27LjwNE2Xl_ceCaXzFOuF9slhZklqN9TvLt0Vsw2sMM=.9c9d6a08-d19c-4ed3-af8f-890ad1a0bdc5@github.com> On Wed, 21 Oct 2020 08:40:47 GMT, Robbin Ehn wrote: >> The main point of this change-set is to make it easier to implement S/R on top of handshakes. >> Which is a prerequisite for removing _suspend_flag (which duplicates the handshake functionality). >> But we also remove some complicated S/R methods. >> >> We basically just put in everything in the handshake closure, so the diff just looks much worse than what it is. >> >> TraceSuspendDebugBits have an ifdef, but in both cases it now just returns. >> But I was unsure if I should remove now or when is_ext_suspend_completed() is removed. >> >> Passes multiple t1-5 runs, locally it passes many jck:vm/nsk_jvmti/nsk_jdi/jdk-jdi runs. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: > > - Fixed merge miss > - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended > - Merge fix from Richard > - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended > - Removed TraceSuspendDebugBits > - Removed unused method is_ext_suspend_completed_with_lock > - Utilize handshakes instead of is_thread_fully_suspended The change is good. I've only added a few comments (nothing important really). Thanks, also for giving precedence to me ;) src/hotspot/share/prims/jvmtiEnvBase.cpp line 1631: > 1629: _state->set_pending_step_for_popframe(); > 1630: _result = JVMTI_ERROR_NONE; > 1631: } I'd suggest to eliminate jt and use java_thread instead. 
Also because you're using java_thread in line 1626. The assertion should check if `_state->get_thread() == target` then. src/hotspot/share/prims/jvmtiEnv.cpp line 1808: > 1806: } > 1807: if (java_lang_Class::is_primitive(k_mirror)) { > 1808: return JVMTI_ERROR_NONE; The call of JvmtiSuspendControl::print() seems to be eliminated. Ok for me. src/hotspot/share/prims/jvmtiEnvBase.cpp line 1454: > 1452: _state->set_earlyret_pending(); > 1453: _state->set_earlyret_oop(ret_ob_h()); > 1454: _state->set_earlyret_value(_value, _tos); Good that these updates are done with a handshake now. Maybe I'm missing s.th. but I don't see synchronization in the older version. src/hotspot/share/prims/jvmtiEnvBase.hpp line 310: > 308: GrowableArray *owned_monitors_list); > 309: static jvmtiError check_top_frame(Thread* current_thread, JavaThread* java_thread, > 310: jvalue value, TosState tos, Handle* ret_ob_h); Maybe fix indentation? src/hotspot/share/runtime/deoptimization.cpp line 1771: > 1769: Deoptimization::deoptimize_frame_internal(thread, id, reason); > 1770: } else { > 1771: VM_DeoptimizeFrame deopt(thread, id, reason); I guess VM_DeoptimizeFrame can be replaced with a handshake too now. src/hotspot/share/runtime/thread.cpp line 537: > 535: // cancelled). Returns true if the thread is externally suspended and > 536: // false otherwise. > 537: bool JavaThread::is_ext_suspend_completed() { I'd think `JavaThread::is_ext_suspend_completed` can be removed also (as a separate enhancement). It also duplicates code of the handshake mechanism. Just replace VM_ThreadSuspend with a handshake. ------------- Marked as reviewed by rrich (Committer). 
PR: https://git.openjdk.java.net/jdk/pull/729 From rrich at openjdk.java.net Wed Oct 21 14:23:26 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Wed, 21 Oct 2020 14:23:26 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: <27LjwNE2Xl_ceCaXzFOuF9slhZklqN9TvLt0Vsw2sMM=.9c9d6a08-d19c-4ed3-af8f-890ad1a0bdc5@github.com> References: <27LjwNE2Xl_ceCaXzFOuF9slhZklqN9TvLt0Vsw2sMM=.9c9d6a08-d19c-4ed3-af8f-890ad1a0bdc5@github.com> Message-ID: <0yF3nXkvKHHTheoHvF4IQJ9v98AjHl-7R9tSo1qrOec=.367302e8-9414-455d-a601-046cdf5a40c8@github.com> On Wed, 21 Oct 2020 14:06:22 GMT, Richard Reingruber wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: >> >> - Fixed merge miss >> - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended >> - Merge fix from Richard >> - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended >> - Removed TraceSuspendDebugBits >> - Removed unused method is_ext_suspend_completed_with_lock >> - Utilize handshakes instead of is_thread_fully_suspended > > src/hotspot/share/runtime/deoptimization.cpp line 1771: > >> 1769: Deoptimization::deoptimize_frame_internal(thread, id, reason); >> 1770: } else { >> 1771: VM_DeoptimizeFrame deopt(thread, id, reason); > > I guess VM_DeoptimizeFrame can be replaced with a handshake too now. 
Not in this pr of course :) ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From github.com+4146708+a74nh at openjdk.java.net Wed Oct 21 15:37:17 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Wed, 21 Oct 2020 15:37:17 GMT Subject: RFR: 8221554: aarch64 cross-modifying code [v4] In-Reply-To: <0qlO5eLJPQgvI1Lg3pO8YpUeoC2zGexNt3DgHptiQiA=.e7ce4706-a702-4e32-a976-eac4adc24771@github.com> References: <0qlO5eLJPQgvI1Lg3pO8YpUeoC2zGexNt3DgHptiQiA=.e7ce4706-a702-4e32-a976-eac4adc24771@github.com> Message-ID: On Mon, 19 Oct 2020 08:20:46 GMT, Robbin Ehn wrote: >> Alan Hayward has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove inlasm_isb define >> >> Change-Id: I2d0ef8a78292dac875f3f65d2253981cdb7a497a > > Seems fine to me, mostly look at shared code part. Patch merged to master - so that it's on top of pchilano's patch. Already tested that both the patches work fine together (did a complete run of all our tests with VerifyCrossModifyFence set to true). robehn has reviewed this patch, but I think I need a second review too (?) ------------- PR: https://git.openjdk.java.net/jdk/pull/428 From rkennke at openjdk.java.net Wed Oct 21 16:01:14 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 21 Oct 2020 16:01:14 GMT Subject: RFR: 8255041: Zero: remove old JSR 292 support leftovers In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 07:56:48 GMT, Aleksey Shipilev wrote: > JDK-8000780 removed `ZeroInterpreter::method_handle_entry`, but left its helpers around. These have no uses, and can be eliminated. Attention @rkennke, who did the JDK-8000780 a while ago. > > Testing: > - [x] Linux x86_64 fastdebug zero images > - [x] Linux x86_64 release zero bootcycle-images Looks good to me, thanks! ------------- Marked as reviewed by rkennke (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/758 From github.com+51754783+coreyashford at openjdk.java.net Wed Oct 21 16:31:23 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Wed, 21 Oct 2020 16:31:23 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v7] In-Reply-To: References: <7Dgp_C-H8LsbF3tPHinVmb5bT_LoLZLYZUx9eSqigCA=.894148e4-a3f2-42c3-ab19-13e134e66853@github.com> <0ZTSXtU_wMV3g5TjwKPUfsRFzMDDnfmGfz6BoqlVbOM=.d51d1653-5249-4bdc-8ce4-76d54bb86c22@github.com> Message-ID: On Wed, 21 Oct 2020 13:42:48 GMT, Daniel D. Daugherty wrote: > Buried in that GitHub test run link are the results for > windows-x64-debug_testlogs_hs_tier1_compiler > which includes this file (test-summary.txt): > > ``` > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > >> jtreg:test/hotspot/jtreg:tier1_compiler 742 683 59 0 << > ============================== > TEST FAILURE > ``` > > I poked around, but I don't see any logs for the individual > test failures. In a browser at the provide test link, on the upper right in the "three-dot menu", there's an option to download the logs. The only file containing failures is `6_Windows x64 (hstier1 compiler).txt`. There are quite a few exception backtraces in this log file, but they all look similar, saying: TEST RESULT: Failed. Execution failed: `main' threw exception: java.lang.NullPointerException: Cannot invoke "java.lang.Throwable.getCause()" because "" is null There's only one that's slightly different: TEST RESULT: Failed. 
Execution failed: `main' threw exception: java.lang.Error: Can't get full path name for '.', got exception java.lang.NullPointerException: Cannot invoke "java.lang.Throwable.getCause()" because "" is null ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From shade at openjdk.java.net Wed Oct 21 16:39:18 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 21 Oct 2020 16:39:18 GMT Subject: Integrated: 8255041: Zero: remove old JSR 292 support leftovers In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 07:56:48 GMT, Aleksey Shipilev wrote: > JDK-8000780 removed `ZeroInterpreter::method_handle_entry`, but left its helpers around. These have no uses, and can be eliminated. Attention @rkennke, who did the JDK-8000780 a while ago. > > Testing: > - [x] Linux x86_64 fastdebug zero images > - [x] Linux x86_64 release zero bootcycle-images This pull request has now been integrated. Changeset: 8d9e6d01 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/8d9e6d01 Stats: 76 lines in 3 files changed: 0 ins; 76 del; 0 mod 8255041: Zero: remove old JSR 292 support leftovers Reviewed-by: rkennke ------------- PR: https://git.openjdk.java.net/jdk/pull/758 From zgu at openjdk.java.net Wed Oct 21 17:07:18 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Wed, 21 Oct 2020 17:07:18 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v10] In-Reply-To: <5LY7Ty5Q2S8j9S30uXyeX0EE8AQWQ4BFFd02NYMzVio=.2e423faa-895a-42c9-ad9f-6c85844e1ff8@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> <5LY7Ty5Q2S8j9S30uXyeX0EE8AQWQ4BFFd02NYMzVio=.2e423faa-895a-42c9-ad9f-6c85844e1ff8@github.com> Message-ID: On Wed, 14 Oct 2020 14:32:28 GMT, Roman Kennke wrote: >> Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for 
the purpose of clarity) have been processed during a GC pause. Workloads that make heavy use of such weak references will therefore potentially cause significant GC pauses.
>>
>> There are 3 main items that contribute to pause time linear in the number of references, or worse:
>> - We need to scan and consider each reference on the various 'discovered' lists.
>> - We need to mark through the subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size.
>> - Finally, all no-longer-reachable references need to be enqueued in the 'pending list'
>>
>> The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid.
>>
>> The solution to this is two-fold:
>> 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires extending the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizable' reachability for all objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent.
>> 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by the Java reference handler thread).
In addition to that, we must ensure that no referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable. >> >> Testing: hotspot_gc_shenadoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification > > Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: > > - Add fallback support for new properties in ObjArrayChunkedTask > - Fix 32bit interpreter LRB-native call src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.hpp line 64: > 62: void load_reference_barrier_not_null(MacroAssembler* masm, Register dst, Address load_addr); > 63: void load_reference_barrier_native(MacroAssembler* masm, Register dst, Address load_addr, bool native); > 64: is_native parameter seems weird. Maybe invert to is_weak_ref? BTW, I think I am seeing compressed oops in conc-stack-scanning. src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp line 399: > 397: > 398: save_xmm_registers(masm); > 399: if (UseCompressedOops && !native) { What's problem you saw with compressed oop + native? src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.hpp line 42: > 40: WEAK > 41: }; > 42: private: Declare as enum class for stronger typing? 
src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp line 750: > 748: __ load_parameter(0, r0); > 749: __ load_parameter(1, r1); > 750: if (kind == ShenandoahBarrierSet::NATIVE) { Use "switch" statement to be consistent with above src/hotspot/share/gc/shenandoah/shenandoahReferenceProcessor.hpp line 139: > 137: > 138: // template > 139: // void keep_alive(oop reference, ReferenceType type) const; Remove this src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.hpp line 100: > 98: oop load_reference_barrier_not_null(oop obj); > 99: > 100: template class -> typename also? ------------- PR: https://git.openjdk.java.net/jdk/pull/505 From kvn at openjdk.java.net Wed Oct 21 17:19:21 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 21 Oct 2020 17:19:21 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v7] In-Reply-To: References: <7Dgp_C-H8LsbF3tPHinVmb5bT_LoLZLYZUx9eSqigCA=.894148e4-a3f2-42c3-ab19-13e134e66853@github.com> <0ZTSXtU_wMV3g5TjwKPUfsRFzMDDnfmGfz6BoqlVbOM=.d51d1653-5249-4bdc-8ce4-76d54bb86c22@github.com> Message-ID: On Wed, 21 Oct 2020 16:28:21 GMT, CoreyAshford wrote: >> Buried in that GitHub test run link are the results for >> windows-x64-debug_testlogs_hs_tier1_compiler >> which includes this file (test-summary.txt): >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >>>> jtreg:test/hotspot/jtreg:tier1_compiler 742 683 59 0 << >> ============================== >> TEST FAILURE >> I poked around, but I don't see any logs for the individual >> test failures. 
> >> Buried in that GitHub test run link are the results for >> windows-x64-debug_testlogs_hs_tier1_compiler >> which includes this file (test-summary.txt): >> >> ``` >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >> >> jtreg:test/hotspot/jtreg:tier1_compiler 742 683 59 0 << >> ============================== >> TEST FAILURE >> ``` >> >> I poked around, but I don't see any logs for the individual >> test failures. > > In a browser at the provide test link, on the upper right in the "three-dot menu", there's an option to download the logs. The only file containing failures is `6_Windows x64 (hstier1 compiler).txt`. There are quite a few exception backtraces in this log file, but they all look similar, saying: > TEST RESULT: Failed. Execution failed: `main' threw exception: java.lang.NullPointerException: Cannot invoke "java.lang.Throwable.getCause()" because "" is null > There's only one that's slightly different: > TEST RESULT: Failed. Execution failed: `main' threw exception: java.lang.Error: Can't get full path name for '.', got exception java.lang.NullPointerException: Cannot invoke "java.lang.Throwable.getCause()" because "" is null > I will push the code, but I haven't been successful in running the test (see [#293 (comment)](https://github.com/openjdk/jdk/pull/293#issuecomment-713223068) ) I submitted our testing and let you know results. I used your latest '06: Full' changes. It should run CheckGraalIntrinsics.java with pushed 8254785 fix which enable it again. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From shade at openjdk.java.net Wed Oct 21 17:31:37 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 21 Oct 2020 17:31:37 GMT Subject: RFR: 8253525: Implement getInstanceSize/sizeOf intrinsics [v5] In-Reply-To: References: Message-ID: > This is fork off the SizeOf JEP, JDK-8249196. 
There is already the entry point in JDK that can use the intrinsic like this: `Instrumentation.getInstanceSize`. Therefore, we can implement the C1/C2 intrinsic now, hook it up to `Instrumentation`, and let the tools use that fast path today.
>
> With this patch, JOL is able to be close to `deepSizeOf` implementation from SizeOf JEP.
>
> Example performance improvements for sizing up a custom linked list:
>
> Benchmark                    (size)  Mode  Cnt       Score      Error  Units
>
> # Default
> LinkedChainBench.linkedChain      1  avgt    5     705.835 ±    8.051  ns/op
> LinkedChainBench.linkedChain     10  avgt    5    3148.874 ±   37.856  ns/op
> LinkedChainBench.linkedChain    100  avgt    5   28693.256 ±  142.254  ns/op
> LinkedChainBench.linkedChain   1000  avgt    5  290161.590 ± 4594.631  ns/op
>
> # Instrumentation attached, no intrinsics
> LinkedChainBench.linkedChain      1  avgt    5     159.659 ±   19.238  ns/op
> LinkedChainBench.linkedChain     10  avgt    5     717.659 ±   22.540  ns/op
> LinkedChainBench.linkedChain    100  avgt    5    7739.394 ±  111.683  ns/op
> LinkedChainBench.linkedChain   1000  avgt    5   80724.238 ± 2887.794  ns/op
>
> # Instrumentation attached, new intrinsics
> LinkedChainBench.linkedChain      1  avgt    5      95.254 ±    0.808  ns/op
> LinkedChainBench.linkedChain     10  avgt    5     261.564 ±    8.524  ns/op
> LinkedChainBench.linkedChain    100  avgt    5    3367.192 ±   21.128  ns/op
> LinkedChainBench.linkedChain   1000  avgt    5   34148.851 ±  373.080  ns/op

Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains five additional commits since the last revision: - The new intrinsic-related test - Revert the change to test - Merge branch 'master' into JDK-8253525-sizeof-intrinsics - Add new intrinsics to toBeInvestigated list in CheckGraalIntrinsics.java - 8253525: Implement getInstanceSize/sizeOf intrinsics ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/650/files - new: https://git.openjdk.java.net/jdk/pull/650/files/132f2c50..482c2f24 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=650&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=650&range=03-04 Stats: 40731 lines in 676 files changed: 27719 ins; 10029 del; 2983 mod Patch: https://git.openjdk.java.net/jdk/pull/650.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/650/head:pull/650 PR: https://git.openjdk.java.net/jdk/pull/650 From shade at openjdk.java.net Wed Oct 21 17:31:37 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 21 Oct 2020 17:31:37 GMT Subject: RFR: 8253525: Implement getInstanceSize/sizeOf intrinsics [v3] In-Reply-To: References: <0E5sXAWBENg290o8HpKfufH-e69Ue4EfAR074HuNOt4=.3e8af6ad-4aaf-4427-b008-642ca7138f02@github.com> Message-ID: On Tue, 20 Oct 2020 18:12:06 GMT, Vladimir Kozlov wrote: >> It was mistake in 8253191 (I file bug). If you modify existing file (even if you keep only test name the same) you have to preserve original Copyright and add new Copyright line. You don't need create new file. >> We have a lot of cases with 2 or more Copyright lines - it is normal: >> https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/vectorization/TestVectorsNotSavedAtSafepoint.java > > I file 8255067 to restore Copyright line in TestUnsignedByteCompare.java test file. I made sure the new test is in the new file. 
------------- PR: https://git.openjdk.java.net/jdk/pull/650 From kvn at openjdk.java.net Wed Oct 21 17:36:22 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 21 Oct 2020 17:36:22 GMT Subject: RFR: 8253525: Implement getInstanceSize/sizeOf intrinsics [v5] In-Reply-To: References: Message-ID: On Wed, 21 Oct 2020 17:31:37 GMT, Aleksey Shipilev wrote: >> This is fork off the SizeOf JEP, JDK-8249196. There is already the entry point in JDK that can use the intrinsic like this: `Instrumentation.getInstanceSize`. Therefore, we can implement the C1/C2 intrinsic now, hook it up to `Instrumentation`, and let the tools use that fast path today. >> >> With this patch, JOL is able to be close to `deepSizeOf` implementation from SizeOf JEP. >> >> Example performance improvements for sizing up a custom linked list: >> >> Benchmark (size) Mode Cnt Score Error Units >> >> # Default >> LinkedChainBench.linkedChain 1 avgt 5 705.835 ± 8.051 ns/op >> LinkedChainBench.linkedChain 10 avgt 5 3148.874 ± 37.856 ns/op >> LinkedChainBench.linkedChain 100 avgt 5 28693.256 ± 142.254 ns/op >> LinkedChainBench.linkedChain 1000 avgt 5 290161.590 ± 4594.631 ns/op >> >> # Instrumentation attached, no intrinsics >> LinkedChainBench.linkedChain 1 avgt 5 159.659 ± 19.238 ns/op >> LinkedChainBench.linkedChain 10 avgt 5 717.659 ± 22.540 ns/op >> LinkedChainBench.linkedChain 100 avgt 5 7739.394 ± 111.683 ns/op >> LinkedChainBench.linkedChain 1000 avgt 5 80724.238 ± 2887.794 ns/op >> >> # Instrumentation attached, new intrinsics >> LinkedChainBench.linkedChain 1 avgt 5 95.254 ± 0.808 ns/op >> LinkedChainBench.linkedChain 10 avgt 5 261.564 ± 8.524 ns/op >> LinkedChainBench.linkedChain 100 avgt 5 3367.192 ± 21.128 ns/op >> LinkedChainBench.linkedChain 1000 avgt 5 34148.851 ± 373.080 ns/op > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains five additional commits since the last revision: > > - The new intrinsic-related test > - Revert the change to test > - Merge branch 'master' into JDK-8253525-sizeof-intrinsics > - Add new intrinsics to toBeInvestigated list in CheckGraalIntrinsics.java > - 8253525: Implement getInstanceSize/sizeOf intrinsics Good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/650 From shade at openjdk.java.net Wed Oct 21 17:40:16 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 21 Oct 2020 17:40:16 GMT Subject: RFR: 8253525: Implement getInstanceSize/sizeOf intrinsics [v5] In-Reply-To: References: Message-ID: <_TFbHUvH8zI29hGvpE3TGGNLGgi7PhP7rfed23up13U=.5a19aaed-cc1e-4d18-997d-f37616323e61@github.com> On Wed, 21 Oct 2020 17:33:27 GMT, Vladimir Kozlov wrote: >> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - The new intrinsic-related test >> - Revert the change to test >> - Merge branch 'master' into JDK-8253525-sizeof-intrinsics >> - Add new intrinsics to toBeInvestigated list in CheckGraalIntrinsics.java >> - 8253525: Implement getInstanceSize/sizeOf intrinsics > > Good. Thanks for review, @kvn! I would also like a review from someone from serviceability. ------------- PR: https://git.openjdk.java.net/jdk/pull/650 From dcubed at openjdk.java.net Wed Oct 21 17:48:28 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 21 Oct 2020 17:48:28 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: References: Message-ID: On Wed, 21 Oct 2020 08:40:47 GMT, Robbin Ehn wrote: >> The main point of this change-set is to make it easier to implement S/R on top of handshakes. 
>> Which is a prerequisite for removing _suspend_flag (which duplicates the handshake functionality). >> But we also remove some complicated S/R methods. >> >> We basically just put in everything in the handshake closure, so the diff just looks much worse than what it is. >> >> TraceSuspendDebugBits have an ifdef, but in both cases it now just returns. >> But I was unsure if I should remove now or when is_ext_suspend_completed() is removed. >> >> Passes multiple t1-5 runs, locally it passes many jck:vm/nsk_jvmti/nsk_jdi/jdk-jdi runs. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: > > - Fixed merge miss > - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended > - Merge fix from Richard > - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended > - Removed TraceSuspendDebugBits > - Removed unused method is_ext_suspend_completed_with_lock > - Utilize handshakes instead of is_thread_fully_suspended I don't think I have any "must fix" comments here. I'm going to assume that my confusion about why there is code from @reinrich's EscapeBarrier work here is because of the merging of conflicts... src/hotspot/share/prims/jvmtiEnv.cpp line 1646: > 1644: // java_thread - pre-checked > 1645: jvmtiError > 1646: JvmtiEnv::PopFrame(JavaThread* java_thread) { So I'm a bit confused why I'm seeing PopFrame() changes here that are related to @reinrich's EscapeBarrier work. I've seen mention of picking up a patch during this review from @reinrich so may that's why. I don't see anything wrong with the changes, but I am confused why they are here in this review. src/hotspot/share/prims/jvmtiEnv.cpp line 1715: > 1713: } > 1714: > 1715: SetFramePopClosure op(this, state, depth); The new closure is `SetFramePopClosure`, but the function we are in is `NotifyFramePop()` so that seems like a mismatch. 
Update: Okay, that's just a move of existing code so this "mismatch" is pre-existing. src/hotspot/share/prims/jvmtiEnv.cpp line 1718: > 1716: MutexLocker mu(JvmtiThreadState_lock); > 1717: if (java_thread == JavaThread::current()) { > 1718: op.doit(java_thread, true); Please add a comment after the `true` parameter to indicate the name of the doit() function's parameter, e.g., `true /* self */`. src/hotspot/share/prims/jvmtiEnvBase.cpp line 56: > 54: #include "runtime/threadSMR.hpp" > 55: #include "runtime/vframe.hpp" > 56: #include "runtime/vframe.inline.hpp" When you add `foo.inline.hpp` you should delete the `foo.hpp` include because the `foo.inline.hpp` file always includes the `foo.hpp` file. src/hotspot/share/prims/jvmtiEnvBase.cpp line 1311: > 1309: // It is to keep a ret_ob_h handle alive after return to the caller. > 1310: jvmtiError > 1311: JvmtiEnvBase::check_top_frame(Thread* current_thread, JavaThread* java_thread, Again, it is not clear why these changes to `check_top_frame` are here since they appear to be related to @reinrich's work. src/hotspot/share/prims/jvmtiEnvBase.cpp line 1398: > 1396: SetForceEarlyReturn op(state, value, tos); > 1397: if (java_thread == current_thread) { > 1398: op.doit(java_thread, true); Please add a comment after the `true` parameter to indicate the name of the doit() function's parameter, e.g., `true /* self */`.
src/hotspot/share/prims/jvmtiEnvBase.cpp line 1570: > 1568: > 1569: ResourceMark rm(current_thread); > 1570: // Check if there are more than one Java frame in this thread, that the top two frames typo: s/are more/is more/ src/hotspot/share/prims/jvmtiEnvBase.cpp line 1543: > 1541: HandleMark hm(current_thread); > 1542: JavaThread* java_thread = target->as_Java_thread(); > 1543: This would be useful here: `assert(_state->get_thread() == java_thread, "Must be");` src/hotspot/share/prims/jvmtiEnvBase.cpp line 1642: > 1640: > 1641: if (!self) { > 1642: if (!java_thread->is_external_suspend()) { You could join these two if-statements with `&&` and have one less indenting level... src/hotspot/share/prims/jvmtiEnvBase.cpp line 1661: > 1659: assert(vf->frame_pointer() != NULL, "frame pointer mustn't be NULL"); > 1660: if (java_thread->is_exiting() || java_thread->threadObj() == NULL) { > 1661: return; What's the `_result` value if this `return` executes? src/hotspot/share/prims/jvmtiEnvBase.hpp line 361: > 359: _tos(tos) {} > 360: void do_thread(Thread *target) { > 361: doit(target, false); Please add a comment after the `false` parameter to indicate the name of the doit() function's parameter, e.g., `false /* self */`. src/hotspot/share/prims/jvmtiEnvBase.hpp line 395: > 393: _depth(depth) {} > 394: void do_thread(Thread *target) { > 395: doit(target, false); Please add a comment after the `false` parameter to indicate the name of the doit() function's parameter, e.g., `false /* self */`. src/hotspot/share/runtime/deoptimization.cpp line 1755: > 1753: thread->is_handshake_safe_for(Thread::current()) || > 1754: SafepointSynchronize::is_at_safepoint(), > 1755: "can only deoptimize other thread at a safepoint"); Should that now be: `safepoint/handshake`?? src/hotspot/share/runtime/thread.cpp line 567: > 565: // > 566: // _thread_in_native -> _thread_in_native_trans -> _thread_blocked > 567: // This code should not be needed with the much simpler suspension mechanism.
(My agreement may come back to bite me if we have to change a suspend/resume bug in the near future. :-) ) src/hotspot/share/runtime/thread.cpp line 698: > 696: RememberProcessedThread rpt(this); > 697: oops_do_no_frames(f, cf); > 698: oops_do_frames(f, cf); In the comment above: // ... If we were // called by wait_for_ext_suspend_completion(), then it // will be doing the retries so we don't have to. `wait_for_ext_suspend_completion()` has been deleted so the comment needs work. ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/729 From dcubed at openjdk.java.net Wed Oct 21 17:48:31 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 21 Oct 2020 17:48:31 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: <27LjwNE2Xl_ceCaXzFOuF9slhZklqN9TvLt0Vsw2sMM=.9c9d6a08-d19c-4ed3-af8f-890ad1a0bdc5@github.com> References: <27LjwNE2Xl_ceCaXzFOuF9slhZklqN9TvLt0Vsw2sMM=.9c9d6a08-d19c-4ed3-af8f-890ad1a0bdc5@github.com> Message-ID: On Wed, 21 Oct 2020 13:46:14 GMT, Richard Reingruber wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: >> >> - Fixed merge miss >> - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended >> - Merge fix from Richard >> - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended >> - Removed TraceSuspendDebugBits >> - Removed unused method is_ext_suspend_completed_with_lock >> - Utilize handshakes instead of is_thread_fully_suspended > > src/hotspot/share/prims/jvmtiEnv.cpp line 1808: > >> 1806: } >> 1807: if (java_lang_Class::is_primitive(k_mirror)) { >> 1808: return JVMTI_ERROR_NONE; > > The call of JvmtiSuspendControl::print() seems to be eliminated. Ok for me. It's not clear to me why the `JvmtiSuspendControl::print()` is being eliminated. Please explain. 
The `TraceJVMTICalls` support is so that someone can diagnose what JVM/TI calls are being made, including context in some cases, so it seems wrong to delete this call. > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1454: > >> 1452: _state->set_earlyret_pending(); >> 1453: _state->set_earlyret_oop(ret_ob_h()); >> 1454: _state->set_earlyret_value(_value, _tos); > > Good that these updates are done with a handshake now. Maybe I'm missing s.th. but I don't see synchronization in the older version. Agreed. @sspitsyn - This makes me wonder if the lack of synchronization is the cause of some instability in the JVM/TI ForceEarlyReturn() testing. Update: The old code only made the updates if the thread was fully suspended so you won't have a race between the requesting thread and the target thread in that case. > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1631: > >> 1629: _state->set_pending_step_for_popframe(); >> 1630: _result = JVMTI_ERROR_NONE; >> 1631: } > > I'd suggest to eliminate jt and use java_thread instead. Also because you're using java_thread in line 1626. The assertion should check if `_state->get_thread() == target` then. Especially if an assert() is added above on L1543. > src/hotspot/share/runtime/thread.cpp line 537: > >> 535: // cancelled). Returns true if the thread is externally suspended and >> 536: // false otherwise. >> 537: bool JavaThread::is_ext_suspend_completed() { > > I'd think `JavaThread::is_ext_suspend_completed` can be removed also (as a separate enhancement). It also duplicates code of the handshake mechanism. Just replace VM_ThreadSuspend with a handshake. `is_ext_suspend_completed()` includes code that detects that a thread that is in `_thread_in_native_trans` and does not yet have a walkable stack has not completed suspension and we will do some retries in this function until the target thread gets stable. We have to make sure that the handshake mechanism has a similar stability guarantee or a stack walker may fail intermittently. 
------------- PR: https://git.openjdk.java.net/jdk/pull/729 From zgu at openjdk.java.net Wed Oct 21 18:54:16 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Wed, 21 Oct 2020 18:54:16 GMT Subject: RFR: 8233343: Deprecate -XX:+CriticalJNINatives flag which implements =?UTF-8?B?SmF2YeKApg==?= [v2] In-Reply-To: References: Message-ID: On Wed, 21 Oct 2020 11:59:26 GMT, Coleen Phillimore wrote: >> This change deprecates the -XX:+CriticalJNINatives flag and removes the develop flag -XX:+StressCriticalJNINatives. See CSR for more details. >> >> This change also removes the lazy GC lock in the critical native transition, and runs the critical native function as thread_in_Java. I add a safepoint check at the end of the native function and transition to native and poll again for the safepoint after the function if a safepoint is requested. >> >> Tested with tier 1-6 (we have a few tests that use this). And built on linux-x86-open,linux-s390x-open,linux-arm32-debug,linux-ppc64le-debug. > > Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: > > - Fixed test to run on other platforms, fixed copyrights, fixed ppc code. > - Remove pin/unpin object code. Marked as reviewed by zgu (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/764 From psandoz at openjdk.java.net Wed Oct 21 19:08:22 2020 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Wed, 21 Oct 2020 19:08:22 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v7] In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 17:23:26 GMT, Maurizio Cimadamore wrote: >> This patch contains the changes associated with the first incubation round of the foreign linker access API incubation >> (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and associated pull request [3]). 
>> >> The main goal of this API is to provide a way to call native functions from Java code without the need of intermediate JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. I suggest reading the writeup [4] I put together few weeks ago, which illustrates what the foreign linker support is, and how it should be used by clients. >> >> Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews. For this reasons, I'm attaching a webrev which contains only the differences between this PR and the memory access PR. I will be periodically uploading new webrevs, as new iterations come out, to try and make the life of reviewers as simple as possible. >> >> A big thank to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the hotspot changes you see here, and without their help, the foreign linker support wouldn't be what it is today. As usual, a big thank to Paul Sandoz, who provided many insights (often by trying the bits first hand). >> >> Thanks >> Maurizio >> >> Webrev: >> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev >> >> Javadoc: >> >> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html >> >> Specdiff (relative to [3]): >> >> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html >> >> CSR: >> >> https://bugs.openjdk.java.net/browse/JDK-8254232 >> >> >> >> ### API Changes >> >> The API changes are actually rather slim: >> >> * `LibraryLookup` >> * This class allows clients to lookup symbols in native libraries; the interface is fairly simple; you can load a library by name, or absolute path, and then lookup symbols on that library. >> * `FunctionDescriptor` >> * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory layouts for the function arguments/return type. 
A function descriptor is used to describe the signature of a native function. >> * `CLinker` >> * This is the real star of the show. A `CLinker` has two main methods: `downcallHandle` and `upcallStub`; the first takes a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a `MethodHandle` instance which can be used to call the target native symbol. The second takes an existing method handle, and a `FunctionDescriptor` and returns a new `MemorySegment` corresponding to a code stub allocated by the VM which acts as a trampoline from native code to the user-provided method handle. This is very useful for implementing upcalls. >> * This class also contains the various layout constants that should be used by clients when describing native signatures (e.g. `C_LONG` and friends); these layouts contain additional ABI classification information (in the form of layout attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the native call to take place. >> * Finally, this class provides some helper functions e.g. so that clients can convert Java strings into C strings and back. >> * `NativeScope` >> * This is a helper class which allows clients to group together logically related allocations; that is, rather than allocating separate memory segments using separate *try-with-resource* constructs, a `NativeScope` allows clients to use a _single_ block, and allocate all the required segments there. This is not only a usability boost, but also a performance boost, since not all allocation requests will be turned into `malloc` calls. >> * `MemorySegment` >> * Only one method added here - namely `handoff(NativeScope)` which allows a segment to be transferred onto an existing native scope. >> >> ### Safety >> >> The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle.
For instance, the description of the native signature might be wrong (e.g. have too many arguments) - and the runtime has, in the general case, no way to detect such mismatches. For these reasons, obtaining a `CLinker` instance is a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as is the case for other restricted methods in the foreign memory API). >> >> ### Implementation changes >> >> The Java changes associated with `LibraryLookup` are relatively straightforward; the only interesting thing to note here is that library loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same restrictions which apply to JNI library loading (e.g. same library cannot be loaded by different classloaders). >> >> As for `NativeScope` the changes are again relatively straightforward; it is an API which sits neatly on top of the foreign memory access API, providing some kind of allocation service which shares the same underlying memory segment(s), and turns an allocation request into a segment slice, which is a much less expensive operation. `NativeScope` comes in two variants: there are native scopes for which the allocation size is known a priori, and native scopes which can grow - these two schemes are implemented by two separate subclasses of `AbstractNativeScopeImpl`. >> >> Of course the bulk of the changes are to support the `CLinker` downcall/upcall routines. These changes cut pretty deep into the JVM; I'll briefly summarize the goal of some of these changes - for further details, Jorn has put together a detailed writeup which explains the rationale behind the VM support, with some references to the code [5].
>> >> The main idea behind foreign linker is to infer, given a Java method type (expressed as a `MethodType` instance) and the description of the signature of a native function (expressed as a `FunctionDescriptor` instance) a _recipe_ that can be used to turn a Java call into the corresponding native call targeting the requested native function. >> >> This inference scheme can be defined in a pretty straightforward fashion by looking at the various ABI specifications (for instance, see [6] for the SysV ABI, which is the one used on Linux/Mac). The various `CallArranger` classes, of which we have a flavor for each supported platform, do exactly that kind of inference. >> >> For the inference process to work, we need to attach extra information to memory layouts; it is no longer sufficient to know e.g. that a layout is 32/64 bits - we need to know whether it is meant to represent a floating point value, or an integral value; this knowledge is required because floating points are passed in different registers by most ABIs. For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which contain the required classification attributes (e.g. a `Clinker.TypeKind` enum value). The runtime extracts this attribute, and performs classification accordingly. >> >> A native call is decomposed into a sequence of basic, primitive operations, called `Binding` (see the great javadoc on the `Binding.java` class for more info). There are many such bindings - for instance the `Move` binding is used to move a value into a specific machine register/stack slot. So, the main job of the various `CallingArranger` classes is to determine, given a Java `MethodType` and `FunctionDescriptor` what is the set of bindings associated with the downcall/upcall. >> >> At the heart of the foreign linker support is the `ProgrammableInvoker` class. 
This class effectively generates a `MethodHandle` which follows the steps described by the various bindings obtained by `CallArranger`. There are actually various strategies to interpret these bindings - listed below: >> >> * basic interpreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see `BindingInterpreter`), except for the `Move` bindings. For these bindings, the move is implemented by allocating a *buffer* (whose size is ABI specific) and by moving all the lowered values into positions within this buffer. The buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them in their expected registers/stack slots (note that each position in the buffer corresponds to a different register). This is the most general invocation mode, the more "customizable" one, but also the slowest - since for every call there is some extra allocation which takes place. >> >> * specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based interpreter, we generate a method handle chain which effectively interprets all the bindings (again, except `Move` ones). >> >> * intrinsified mode; this is typically used in combination with the specialized interpreted mode described above (although it can also be used with the Java-based binding interpreter). The goal here is to remove the buffer allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (e.g. the set of bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field), then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go for an intermediate buffer. This gives us back performance that is on par with JNI. >> >> For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there.
We plan to add support for intrinsified modes there as well, which should considerably boost performance (probably well beyond what JNI can offer at the moment, since the upcall support in JNI is not very well optimized). >> >> Again, for more readings on the internals of the foreign linker support, please refer to [5]. >> >> #### Test changes >> >> Many new tests have been added to validate the foreign linker support; we have high level tests (see `StdLibTest`) which aim at testing the linker from the perspective of code that clients could write. But we also have deeper combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI implementation. There are also some great tests (see the `callarranger` folder) which test the various `CallArranger`s for all the possible platforms; these tests adopt more of a white-box approach - that is, instead of treating the linker machinery as a black box and verify that the support works by checking that the native call returned the results we expected, these tests aim at checking that the set of bindings generated by the call arranger is correct. This also means that we can test the classification logic for Windows, Mac and Linux regardless of the platform we're executing on. >> >> Some additional microbenchmarks have been added to compare the performance of downcall/upcall with JNI. >> >> [1] - https://openjdk.java.net/jeps/389 >> [2] - https://openjdk.java.net/jeps/393 >> [3] - https://git.openjdk.java.net/jdk/pull/548 >> [4] - https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md >> [5] - http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html > > Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 25 commits: > > - Merge branch 'master' into 8254231_linker > - Fix incorrect capitalization in one copyright header > - Update copyright years, and add classpath exception to files that were missing it > - Use separate constants for native invoker code size > - Re-add file erroneously deleted (detected as rename) > - Re-add erroneously removed files > - Merge branch 'master' into 8254231_linker > > - Fix tests > - Fix more whitespaces > - Fix whitespaces > - Remove rejected file > - ... and 15 more: https://git.openjdk.java.net/jdk/compare/cb6167b2...502bd980 Some of this is familiar to me from reviews in the `panama-foreign` repository, but less so than the memory API. I focused on the Java code, and ignored changes that are common with the memory API PR. If it helps I can provide a PR in the `panama-foreign` repository addressing editorial comments to the linker API. src/java.base/share/classes/java/lang/invoke/NativeMethodHandle.java line 36: > 34: import static java.lang.invoke.MethodHandleStatics.newInternalError; > 35: > 36: /** TODO */ Is the TODO to make this class public later and adjust the return type of `downcallHandle`? src/java.base/share/classes/java/lang/invoke/NativeMethodHandle.java line 145: > 143: */ > 144: private static class Lazy { > 145: static Class THIS_CLASS = NativeMethodHandle.class; final field? Is this field needed, as `NativeMethodHandle.class` could be used directly, or use a local variable instead in the static code block. src/java.base/share/classes/jdk/internal/access/JavaLangAccess.java line 46: > 44: import java.util.stream.Stream; > 45: > 46: import jdk.internal.loader.NativeLibrary; Unused import? src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/CLinker.java line 130: > 128: * @return the downcall method handle. > 129: * @throws IllegalArgumentException in the case of a carrier type and memory layout mismatch.
> 130: */ Add `@see LibraryLookup#lookup` src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/CLinker.java line 126: > 124: * > 125: * @param symbol downcall symbol. > 126: * @param type the method type. s/method type/carrier type ? src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/CLinker.java line 129: > 127: * @param function the function descriptor. > 128: * @return the downcall method handle. > 129: * @throws IllegalArgumentException in the case of a carrier type and memory layout mismatch. carrier type and function descriptor mismatch? src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/CLinker.java line 139: > 137: * > 138: *

The returned segment is not thread-confined, and it only features > 139: * the {@link MemorySegment#CLOSE} access mode. When the returned segment is closed, Implying that it is shared? If so might be better to state that directly (with a link), and can be closed explicitly or left until can be collected by the GC? src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/CLinker.java line 145: > 143: * @param function the function descriptor. > 144: * @return the native stub segment. > 145: * @throws IllegalArgumentException in the case of a carrier type and memory layout mismatch. What's carrier type here? `target.type()`? "IllegalArgumentException if the target's method type and the function descriptor mismatch" ? src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/CLinker.java line 201: > 199: } > 200: > 201: /** Extra spaces src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/CLinker.java line 212: > 210: * @param str the Java string to be converted into a C string. > 211: * @return a new native memory segment containing the converted C string. > 212: * @throws NullPointerException if either {@code str == null}. if {@code str == null}. src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/CLinker.java line 202: > 200: > 201: /** > 202: * Convert a Java string into a null-terminated C string, using the Converts (and in other places and for other methods) src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/CLinker.java line 314: > 312: * restricted methods, and use safe and supported functionalities, where possible. > 313: * @param addr the address at which the string is stored. > 314: * @param charset The {@linkplain java.nio.charset.Charset} to be used to compute the contents of the Java string.
s/linkplain/link (and in other places) src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/CLinker.java line 352: > 350: * @param charset The {@linkplain java.nio.charset.Charset} to be used to compute the contents of the Java string. > 351: * @return a Java string with the contents of the null-terminated C string at given address. > 352: * @throws NullPointerException if {@code addr == null} or charset == null (and in other places) src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/CLinker.java line 380: > 378: > 379: /** > 380: * Allocate memory of given size using malloc. What if allocation failed? `OutOfMemoryError`? src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/FunctionDescriptor.java line 60: > 58: private FunctionDescriptor(MemoryLayout resLayout, Map attributes, MemoryLayout... argLayouts) { > 59: this.resLayout = resLayout; > 60: this.attributes = Collections.unmodifiableMap(attributes); Since `attributes` is never exposed directly or indirectly via a set of keys/values/entries there is no need to wrap it. src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/FunctionDescriptor.java line 100: > 98: /** > 99: * Returns the return layout associated with this function. > 100: * @return the return s/the return/the return layout (and other places) src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/FunctionDescriptor.java line 145: > 143: * @throws NullPointerException if any of the new argument layouts is null. > 144: */ > 145: public FunctionDescriptor appendArgumentLayouts(MemoryLayout... addedLayouts) { Might consider using "with" as in "withAppendedArgumentLayouts", "withReturnLayout", "withVoidReturnLayout" src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/LibraryLookup.java line 37: > 35: > 36: /** > 37: * A native library lookup. Exposes lookup operation for searching symbols, see {@link LibraryLookup#lookup(String)}. 
s/Exposes lookup/Exposes a lookup src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/LibraryLookup.java line 91: > 89: > 90: /** > 91: * Lookups a symbol with given name in this library. The returned symbol maintains a strong reference to this lookup object. s/Lookups/Looks up src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/LibraryLookup.java line 130: > 128: > 129: /** > 130: * Obtain a library lookup object corresponding to a library identified by given library name. Mention the context in which the library is found i.e. what ever the equivalent of LD_LIBRARY_PATH is in Java (the system property name escapes me at this moment). src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/NativeScope.java line 44: > 42: * by off-heap memory. Native scopes can be either bounded or unbounded, depending on whether the size > 43: * of the native scope is known statically. If an application knows before-hand how much memory it needs to allocate, > 44: * then using a bounded native scope will typically provide better performances than independently allocating the memory s/performances/performance src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/NativeScope.java line 54: > 52: * To allow for more usability, it is possible for a native scope to reclaim ownership of an existing memory segment > 53: * (see {@link MemorySegment#handoff(NativeScope)}). This might be useful to allow one or more segments which were independently > 54: * created to share the same life-cycle as a given native scope - which in turns enables client to group all memory s/enables client/enables a client src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/NativeScope.java line 85: > 83: * @return a segment for the newly allocated memory block. > 84: * @throws OutOfMemoryError if there is not enough space left in this native scope, that is, if > 85: * {@code limit() - size() < layout.byteSize()}. Where do the `limit` and `size` methods come from? 
src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/NativeScope.java line 374: > 372: * } > 373: * @param elementLayout the array element layout. > 374: * @param size the array element count. s/size/length or count? ------------- PR: https://git.openjdk.java.net/jdk/pull/634 From psandoz at openjdk.java.net Wed Oct 21 19:08:25 2020 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Wed, 21 Oct 2020 19:08:25 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v9] In-Reply-To: References: Message-ID: On Wed, 21 Oct 2020 11:33:27 GMT, Maurizio Cimadamore wrote: >> This patch contains the changes associated with the first incubation round of the foreign linker access API incubation >> (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and associated pull request [3]). >> >> The main goal of this API is to provide a way to call native functions from Java code without the need for intermediate JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. I suggest reading the writeup [4] I put together a few weeks ago, which illustrates what the foreign linker support is, and how it should be used by clients. >> >> Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews. For this reason, I'm attaching a webrev which contains only the differences between this PR and the memory access PR. I will be periodically uploading new webrevs, as new iterations come out, to try and make the life of reviewers as simple as possible. >> >> A big thank you to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the hotspot changes you see here, and without their help, the foreign linker support wouldn't be what it is today. As usual, a big thank you to Paul Sandoz, who provided many insights (often by trying the bits first hand).
>> >> Thanks >> Maurizio >> >> Webrev: >> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev >> >> Javadoc: >> >> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html >> >> Specdiff (relative to [3]): >> >> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html >> >> CSR: >> >> https://bugs.openjdk.java.net/browse/JDK-8254232 >> >> >> >> ### API Changes >> >> The API changes are actually rather slim: >> >> * `LibraryLookup` >> * This class allows clients to look up symbols in native libraries; the interface is fairly simple; you can load a library by name, or absolute path, and then look up symbols in that library. >> * `FunctionDescriptor` >> * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory layouts for the function arguments/return type. A function descriptor is used to describe the signature of a native function. >> * `CLinker` >> * This is the real star of the show. A `CLinker` has two main methods: `downcallHandle` and `upcallStub`; the first takes a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a `MethodHandle` instance which can be used to call the target native symbol. The second takes an existing method handle, and a `FunctionDescriptor` and returns a new `MemorySegment` corresponding to a code stub allocated by the VM which acts as a trampoline from native code to the user-provided method handle. This is very useful for implementing upcalls. >> * This class also contains the various layout constants that should be used by clients when describing native signatures (e.g. `C_LONG` and friends); these layouts contain additional ABI classification information (in the form of layout attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the native call to take place.
>> * Finally, this class provides some helper functions e.g. so that clients can convert Java strings into C strings and back. >> * `NativeScope` >> * This is a helper class which allows clients to group together logically related allocations; that is, rather than allocating separate memory segments using separate *try-with-resources* constructs, a `NativeScope` allows clients to use a _single_ block, and allocate all the required segments there. This is not only a usability boost, but also a performance boost, since not all allocation requests will be turned into `malloc` calls. >> * `MemorySegment` >> * Only one method added here - namely `handoff(NativeScope)` which allows a segment to be transferred onto an existing native scope. >> >> ### Safety >> >> The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle. For instance, the description of the native signature might be wrong (e.g. have too many arguments) - and the runtime has, in the general case, no way to detect such mismatches. For these reasons, obtaining a `CLinker` instance is a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as is the case for other restricted methods in the foreign memory API). >> >> ### Implementation changes >> >> The Java changes associated with `LibraryLookup` are relatively straightforward; the only interesting thing to note here is that library loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same restrictions which apply to JNI library loading (e.g. the same library cannot be loaded by different classloaders).
>> >> As for `NativeScope` the changes are again relatively straightforward; it is an API which sits neatly on top of the foreign memory access API, providing some kind of allocation service which shares the same underlying memory segment(s), and turns an allocation request into a segment slice, which is a much less expensive operation. `NativeScope` comes in two variants: there are native scopes for which the allocation size is known a priori, and native scopes which can grow - these two schemes are implemented by two separate subclasses of `AbstractNativeScopeImpl`. >> >> Of course the bulk of the changes are to support the `CLinker` downcall/upcall routines. These changes cut pretty deep into the JVM; I'll briefly summarize the goal of some of these changes - for further details, Jorn has put together a detailed writeup which explains the rationale behind the VM support, with some references to the code [5]. >> >> The main idea behind the foreign linker is to infer, given a Java method type (expressed as a `MethodType` instance) and the description of the signature of a native function (expressed as a `FunctionDescriptor` instance) a _recipe_ that can be used to turn a Java call into the corresponding native call targeting the requested native function. >> >> This inference scheme can be defined in a pretty straightforward fashion by looking at the various ABI specifications (for instance, see [6] for the SysV ABI, which is the one used on Linux/Mac). The various `CallArranger` classes, of which we have a flavor for each supported platform, do exactly that kind of inference. >> >> For the inference process to work, we need to attach extra information to memory layouts; it is no longer sufficient to know e.g. that a layout is 32/64 bits - we need to know whether it is meant to represent a floating point value, or an integral value; this knowledge is required because floating-point values are passed in different registers by most ABIs.
For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which contain the required classification attributes (e.g. a `CLinker.TypeKind` enum value). The runtime extracts this attribute, and performs classification accordingly. >> >> A native call is decomposed into a sequence of basic, primitive operations, called `Binding` (see the great javadoc on the `Binding.java` class for more info). There are many such bindings - for instance the `Move` binding is used to move a value into a specific machine register/stack slot. So, the main job of the various `CallArranger` classes is to determine, given a Java `MethodType` and `FunctionDescriptor`, the set of bindings associated with the downcall/upcall. >> >> At the heart of the foreign linker support is the `ProgrammableInvoker` class. This class effectively generates a `MethodHandle` which follows the steps described by the various bindings obtained by `CallArranger`. There are actually various strategies to interpret these bindings - listed below: >> >> * basic interpreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see `BindingInterpreter`), except for the `Move` bindings. For these bindings, the move is implemented by allocating a *buffer* (whose size is ABI-specific) and by moving all the lowered values into positions within this buffer. The buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them into their expected registers/stack slots (note that each position in the buffer corresponds to a different register). This is the most general invocation mode, the most "customizable" one, but also the slowest - since for every call there is some extra allocation which takes place.
>> >> * specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based interpreter, we generate a method handle chain which effectively interprets all the bindings (again, except `Move` ones). >> >> * intrinsified mode; this is typically used in combination with the specialized interpreted mode described above (although it can also be used with the Java-based binding interpreter). The goal here is to remove the buffer allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (e.g. the set of bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field), then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go through an intermediate buffer. This gives us back performance that is on par with JNI. >> >> For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there. We plan to add support for intrinsified modes there as well, which should considerably boost performance (probably well beyond what JNI can offer at the moment, since the upcall support in JNI is not very well optimized). >> >> Again, for more reading on the internals of the foreign linker support, please refer to [5]. >> >> #### Test changes >> >> Many new tests have been added to validate the foreign linker support; we have high level tests (see `StdLibTest`) which aim at testing the linker from the perspective of code that clients could write. But we also have deeper combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI implementation.
There are also some great tests (see the `callarranger` folder) which test the various `CallArranger`s for all the possible platforms; these tests adopt more of a white-box approach - that is, instead of treating the linker machinery as a black box and verifying that the support works by checking that the native call returned the results we expected, these tests aim at checking that the set of bindings generated by the call arranger is correct. This also means that we can test the classification logic for Windows, Mac and Linux regardless of the platform we're executing on. >> >> Some additional microbenchmarks have been added to compare the performance of downcall/upcall with JNI. >> >> [1] - https://openjdk.java.net/jeps/389 >> [2] - https://openjdk.java.net/jeps/393 >> [3] - https://git.openjdk.java.net/jdk/pull/548 >> [4] - https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md >> [5] - http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html > > Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 27 commits: > > - Merge branch 'master' into 8254231_linker > - Don't use JNI when generating native wrappers > > - Merge branch 'master' into 8254231_linker > - Fix incorrect capitalization in one copyright header > - Update copyright years, and add classpath exception to files that were missing it > - Use separate constants for native invoker code size > - Re-add file erroneously deleted (detected as rename) > - Re-add erroneously removed files > - Merge branch 'master' into 8254231_linker > > - Fix tests > - Fix more whitespaces > - ...
and 17 more: https://git.openjdk.java.net/jdk/compare/da97ab5c...8c7b75da src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/AbstractNativeScope.java line 41: > 39: private static final int SCOPE_MASK = MemorySegment.READ | MemorySegment.WRITE; // no terminal operations allowed > 40: > 41: AbstractNativeScope(Thread ownerThread) { Since all concrete classes pass in `Thread.currentThread()` we can make this a no-arg constructor. The concrete classes can be marked as final? src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/AbstractNativeScope.java line 120: > 118: } > 119: } > 120: throw new AssertionError("Cannot get here!"); This code is a little confusing, effectively using an exception for control flow, within a retry loop. I recommend performing an explicit bounds check to determine if a new segment of `BLOCK_SIZE` is required from which to slice into. It will also be faster than the exceptional case. src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/CABI.java line 33: > 31: AArch64; > 32: > 33: public static CABI current() { I suspect this might be called often, create a private static final constant? src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/abi/Binding.java line 203: > 201: * -------------------- > 202: */ > 203: public abstract class Binding { Some design considerations, to consider later maybe. The IR representation could be simplified to use record classes (which should be exiting preview in 16), implementing a Binding interface. The interpreter and specializer (compiler) could be separate if need be, operating on a sequence of instructions that just hold the data. Pattern matching could be used on the binding instances. It may be simpler and more efficient if the compiler generated explicit byte code rather than using MH combinators. 
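A minimal sketch of the record-based IR idea raised in the Binding.java comment above, assuming a JDK where records and `instanceof` patterns are final (16 or later). The names `ToyBinding`, `Push`, `Dup`, `Add` and `ToyBindingInterpreter` are hypothetical and do not mirror the real `Binding` operators; the point is only that the instructions are plain data and the interpreter is a separate class that pattern-matches on them.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical, simplified binding IR; these are NOT the operators defined in Binding.java.
interface ToyBinding {}

record Push(long value) implements ToyBinding {}   // push a constant onto the operand stack
record Dup() implements ToyBinding {}              // duplicate the top of the stack
record Add() implements ToyBinding {}              // pop two values and push their sum

final class ToyBindingInterpreter {
    // The interpreter is kept separate from the IR: it only reads the data each record holds.
    static long run(List<ToyBinding> program) {
        Deque<Long> stack = new ArrayDeque<>();
        for (ToyBinding b : program) {
            if (b instanceof Push p) {
                stack.push(p.value());
            } else if (b instanceof Dup) {
                stack.push(stack.peek());
            } else if (b instanceof Add) {
                stack.push(stack.pop() + stack.pop());
            } else {
                throw new IllegalArgumentException("Unknown binding: " + b);
            }
        }
        return stack.pop();
    }
}
```

A specializer could walk the same `List<ToyBinding>` and emit code instead of executing it, which is the separation the comment hints at.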
src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/abi/BindingInterpreter.java line 50: > 48: } > 49: > 50: interface StoreFunc { Annotate with `@FunctionalInterface`? src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/abi/CallingSequence.java line 51: > 49: } > 50: > 51: public Stream argBindings() { Duplicate methods `argBindings` and `argumentBindings`? src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/abi/CallingSequenceBuilder.java line 123: > 121: } > 122: > 123: private static final Set boxTags = EnumSet.of( s/boxTags/BOX_TAGS ------------- PR: https://git.openjdk.java.net/jdk/pull/634 From kvn at openjdk.java.net Wed Oct 21 19:24:13 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 21 Oct 2020 19:24:13 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v10] In-Reply-To: References: Message-ID: <06kvIM_3abB-35pPdgFfbvwCND6oe9QCBqXBQ8iIrZ4=.64ae7da4-be02-46cc-afde-ffeb9ec9d703@github.com> On Wed, 21 Oct 2020 09:19:57 GMT, Fei Yang wrote: > > Someone in Oracle have to run tier1-tier3 testing with these changes to make sure nothing is broken. I don't want to repeat 8254790. > > That's appreciated. > On my side, I run tier1-tier3 both on aarch64 linux and x86_64 linux. > The test result on these two platforms looks good for the latest changes. I started testing of 09: version. >> src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java line 604: >> >>> 602: add(ignore, "sun/security/provider/SHA5." + shaCompressName + "([BI)V"); >>> 603: } >>> 604: add(toBeInvestigated, "sun/security/provider/SHA3." + shaCompressName + "([BI)V"); >> >> This should be under `if (isJDK16OrHigher())` check. Something like this: >> https://github.com/openjdk/jdk/pull/650/files#diff-d1f378fc1b7fe041309e854d40b3a95a91e63fdecf0ecd9826b7c95eaeba314eR527 >> You can wait when Aleksey push it and update your changes > > OK. 
Will update with the following change after Aleksey's PR is integrated: > > --- a/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java > +++ b/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java > @@ -608,6 +608,10 @@ public class CheckGraalIntrinsics extends GraalTest { > if (!config.useSHA512Intrinsics()) { > add(ignore, "sun/security/provider/SHA5." + shaCompressName + "([BI)V"); > } > + > + if (isJDK16OrHigher()) { > + add(toBeInvestigated, "sun/security/provider/SHA3." + shaCompressName + "([BI)V"); > + } > } Yes, please, do that. ------------- PR: https://git.openjdk.java.net/jdk/pull/207 From rkennke at openjdk.java.net Wed Oct 21 20:12:22 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 21 Oct 2020 20:12:22 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v10] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> <5LY7Ty5Q2S8j9S30uXyeX0EE8AQWQ4BFFd02NYMzVio=.2e423faa-895a-42c9-ad9f-6c85844e1ff8@github.com> Message-ID: On Mon, 19 Oct 2020 14:42:25 GMT, Zhengyu Gu wrote: >> Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: >> >> - Add fallback support for new properties in ObjArrayChunkedTask >> - Fix 32bit interpreter LRB-native call > > src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.hpp line 64: > >> 62: void load_reference_barrier_not_null(MacroAssembler* masm, Register dst, Address load_addr); >> 63: void load_reference_barrier_native(MacroAssembler* masm, Register dst, Address load_addr, bool native); >> 64: > > is_native parameter seems weird. Maybe invert to is_weak_ref? > BTW, I think I am seeing compressed oops in conc-stack-scanning. I think it's clearer as it is. 
The motivation for this is that native references are always oops, while weak reference's referents can be oops or narrowOops. Which means that we need to call a different method for native-refs (the always-oops entry point). The interesting differentiator is native vs. not-native, because one is always-oops, the other can be narrowOop. Weak vs not-weak is not as clear because there can also be weak/phantom native-refs. > src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp line 399: > >> 397: >> 398: save_xmm_registers(masm); >> 399: if (UseCompressedOops && !native) { > > What's problem you saw with compressed oop + native? None, but see above. > src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.hpp line 42: > >> 40: WEAK >> 41: }; >> 42: private: > > Declare as enum class for stronger typing? Ok will do. > src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp line 750: > >> 748: __ load_parameter(0, r0); >> 749: __ load_parameter(1, r1); >> 750: if (kind == ShenandoahBarrierSet::NATIVE) { > > Use "switch" statement to be consistent with above OK. > src/hotspot/share/gc/shenandoah/shenandoahReferenceProcessor.hpp line 139: > >> 137: >> 138: // template >> 139: // void keep_alive(oop reference, ReferenceType type) const; > > Remove this Ok. > src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.hpp line 100: > >> 98: oop load_reference_barrier_not_null(oop obj); >> 99: >> 100: template > > class -> typename also? Right. Will do. 
------------- PR: https://git.openjdk.java.net/jdk/pull/505 From kvn at openjdk.java.net Wed Oct 21 20:16:16 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 21 Oct 2020 20:16:16 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v7] In-Reply-To: <7Dgp_C-H8LsbF3tPHinVmb5bT_LoLZLYZUx9eSqigCA=.894148e4-a3f2-42c3-ab19-13e134e66853@github.com> References: <7Dgp_C-H8LsbF3tPHinVmb5bT_LoLZLYZUx9eSqigCA=.894148e4-a3f2-42c3-ab19-13e134e66853@github.com> Message-ID: <9IvmeUEG1iYCXnh7wM4GcijH3T4tMI7XPrtrO1Wxm8M=.bc17c4ef-258a-4dc1-8001-a618e9596fd5@github.com> On Wed, 21 Oct 2020 01:51:26 GMT, CoreyAshford wrote: >> This patch set encompasses the following commits: >> >> - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for the intrinsic. The API is similar to the existing encodeBlock intrinsic. >> - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic implementation >> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. >> - Adds a JMH microbenchmark for both Base64 decoding and encoding. >> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. > > CoreyAshford has updated the pull request incrementally with one additional commit since the last revision: > > CheckGraalIntrinsics.java: fix copy/paste error Changes requested by kvn (Reviewer). test/hotspot/jtreg/compiler/intrinsics/base64/TestBase64.java line 29: > 27: * @summary tests java.util.Base64 > 28: * @library /test/lib / > 29: * @build sun.hotspot.WhiteBox jdk.test.lib.Utils Tier3 testing shows failures: java.lang.NoClassDefFoundError: jdk/test/lib/Platform @iignatev pointed that you should not add jdk.test.lib.Utils to @build here - it will be built by jtreg in correct order.
------------- PR: https://git.openjdk.java.net/jdk/pull/293 From coleenp at openjdk.java.net Wed Oct 21 20:42:11 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 21 Oct 2020 20:42:11 GMT Subject: RFR: 8233343: Deprecate -XX:+CriticalJNINatives flag which implements JavaCritical native functions [v2] In-Reply-To: References: Message-ID: On Wed, 21 Oct 2020 18:50:32 GMT, Zhengyu Gu wrote: >> Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: >> >> - Fixed test to run on other platforms, fixed copyrights, fixed ppc code. >> - Remove pin/unpin object code. > > Marked as reviewed by zgu (Reviewer). Thank you for the code reviews! ------------- PR: https://git.openjdk.java.net/jdk/pull/764 From coleenp at openjdk.java.net Wed Oct 21 20:42:12 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 21 Oct 2020 20:42:12 GMT Subject: Integrated: 8233343: Deprecate -XX:+CriticalJNINatives flag which implements JavaCritical native functions In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 11:41:17 GMT, Coleen Phillimore wrote: > This change deprecates the -XX:+CriticalJNINatives flag and removes the develop flag -XX:+StressCriticalJNINatives. See CSR for more details. > > This change also removes the lazy GC lock in the critical native transition, and runs the critical native function as thread_in_Java. I add a safepoint check at the end of the native function and transition to native and poll again for the safepoint after the function if a safepoint is requested. > > Tested with tier 1-6 (we have a few tests that use this). And built on linux-x86-open,linux-s390x-open,linux-arm32-debug,linux-ppc64le-debug. This pull request has now been integrated. 
Changeset: 56ea490f Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/56ea490f Stats: 1279 lines in 18 files changed: 77 ins; 1133 del; 69 mod 8233343: Deprecate -XX:+CriticalJNINatives flag which implements JavaCritical native functions Reviewed-by: rehn, mdoerr, zgu ------------- PR: https://git.openjdk.java.net/jdk/pull/764 From github.com+51754783+coreyashford at openjdk.java.net Wed Oct 21 20:43:31 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Wed, 21 Oct 2020 20:43:31 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v7] In-Reply-To: <9IvmeUEG1iYCXnh7wM4GcijH3T4tMI7XPrtrO1Wxm8M=.bc17c4ef-258a-4dc1-8001-a618e9596fd5@github.com> References: <7Dgp_C-H8LsbF3tPHinVmb5bT_LoLZLYZUx9eSqigCA=.894148e4-a3f2-42c3-ab19-13e134e66853@github.com> <9IvmeUEG1iYCXnh7wM4GcijH3T4tMI7XPrtrO1Wxm8M=.bc17c4ef-258a-4dc1-8001-a618e9596fd5@github.com> Message-ID: On Wed, 21 Oct 2020 20:12:47 GMT, Vladimir Kozlov wrote: >> CoreyAshford has updated the pull request incrementally with one additional commit since the last revision: >> >> CheckGraalIntrinsics.java: fix copy/paste error > > test/hotspot/jtreg/compiler/intrinsics/base64/TestBase64.java line 29: > >> 27: * @summary tests java.util.Base64 >> 28: * @library /test/lib / >> 29: * @build sun.hotspot.WhiteBox jdk.test.lib.Utils > > Tier3 testing shows failures: java.lang.NoClassDefFoundError: jdk/test/lib/Platform > @iignatev pointed that you should not add jdk.test.lib.Utils to @build here - it will be built by jtreg in correct order. Done. Thank you. 
------------- PR: https://git.openjdk.java.net/jdk/pull/293 From github.com+51754783+coreyashford at openjdk.java.net Wed Oct 21 20:43:30 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Wed, 21 Oct 2020 20:43:30 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v8] In-Reply-To: References: Message-ID: > This patch set encompasses the following commits: > > - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for the intrinsic. The API is similar to the existing encodeBlock intrinsic. > - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic implementation > - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. > - Adds a JMH microbenchmark for both Base64 decoding and encoding. > - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. CoreyAshford has updated the pull request incrementally with one additional commit since the last revision: TestBase64.java: remove jdk.test.lib.Utils from @build which was causing Tier3 failures.
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/293/files - new: https://git.openjdk.java.net/jdk/pull/293/files/f93614dc..8e15d971 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=293&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=293&range=06-07 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/293.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/293/head:pull/293 PR: https://git.openjdk.java.net/jdk/pull/293 From rkennke at openjdk.java.net Wed Oct 21 20:48:29 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 21 Oct 2020 20:48:29 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v11] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity) have been processed in a GC pause. Workloads that make heavy use of such weak references will therefore potentially cause significant GC pauses. > > There are 3 main items that contribute to pause time linear in the number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through the subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid.
> > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires to extend the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. > 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable. 
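To illustrate the two kinds of reachability described in item 1, here is a minimal single-threaded sketch. It is not Shenandoah's marking bitmap: the names `Mark` and `TwoLevelMarkTable` are hypothetical, objects are identified by a plain index, and the real implementation would have to update the marks atomically because marking runs concurrently. It only shows the rule that a finalizably reachable object can be upgraded to strongly reachable, but never downgraded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mark states, ordered so that a larger value is "stronger".
enum class Mark : uint8_t { Unmarked = 0, Finalizable = 1, Strong = 2 };

// Toy mark table indexed by object number (an assumption for illustration;
// a real collector maps heap addresses to bits in a bitmap).
class TwoLevelMarkTable {
  std::vector<Mark> _marks;

 public:
  explicit TwoLevelMarkTable(size_t num_objects)
      : _marks(num_objects, Mark::Unmarked) {}

  // Records that obj is reachable with the given strength.
  // Returns true if the caller has to (re)visit the object's fields:
  // either this is the first visit, or the object was upgraded from
  // finalizably reachable to strongly reachable.
  bool mark(size_t obj, Mark wanted) {
    assert(wanted != Mark::Unmarked);
    Mark current = _marks[obj];
    if (static_cast<uint8_t>(current) >= static_cast<uint8_t>(wanted)) {
      return false;  // already marked at least this strongly
    }
    _marks[obj] = wanted;
    return true;
  }

  Mark get(size_t obj) const { return _marks[obj]; }
};
```

A marking loop built on this would push an object onto its work queue whenever `mark` returns true, carrying the strength along so that everything reached from a FinalReference referent is marked `Finalizable` unless a strong path is found later.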
> > Testing: hotspot_gc_shenadoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Change ShenandoahLRBKind to be an enum class instead of plain enum, and some minor touch-ups ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/505/files - new: https://git.openjdk.java.net/jdk/pull/505/files/46dc1b75..f2a9bb61 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=10 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=09-10 Stats: 112 lines in 10 files changed: 47 ins; 20 del; 45 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From kvn at openjdk.java.net Wed Oct 21 21:02:17 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 21 Oct 2020 21:02:17 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v7] In-Reply-To: <9IvmeUEG1iYCXnh7wM4GcijH3T4tMI7XPrtrO1Wxm8M=.bc17c4ef-258a-4dc1-8001-a618e9596fd5@github.com> References: <7Dgp_C-H8LsbF3tPHinVmb5bT_LoLZLYZUx9eSqigCA=.894148e4-a3f2-42c3-ab19-13e134e66853@github.com> <9IvmeUEG1iYCXnh7wM4GcijH3T4tMI7XPrtrO1Wxm8M=.bc17c4ef-258a-4dc1-8001-a618e9596fd5@github.com> Message-ID: On Wed, 21 Oct 2020 20:12:55 GMT, Vladimir Kozlov wrote: >> CoreyAshford has updated the pull request incrementally with one additional commit since the last revision: >> >> CheckGraalIntrinsics.java: fix copy/paste error > > Changes requested by kvn (Reviewer). Note, tier1 and tier2 passed clean. But I have to rebuild it with updated test and run tier3 again. CheckGraalIntrinsics.java passed. 
------------- PR: https://git.openjdk.java.net/jdk/pull/293 From rrich at openjdk.java.net Wed Oct 21 21:12:21 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Wed, 21 Oct 2020 21:12:21 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: References: Message-ID: On Wed, 21 Oct 2020 17:45:20 GMT, Daniel D. Daugherty wrote: > I'm going to assume that my confusion about why there > is code from @reinrich's EscapeBarrier work here is > because of the merging of conflicts... That's correct. #119 got integrated and this pr needs to resolve a few locations because it moves code that has EscapeBarriers into handshakes. EBs cannot be executed in a handshake as they can safepoint doing heap allocations so they are moved before the handshake. ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From dholmes at openjdk.java.net Wed Oct 21 22:50:14 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 21 Oct 2020 22:50:14 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: References: Message-ID: On Wed, 21 Oct 2020 08:40:47 GMT, Robbin Ehn wrote: >> The main point of this change-set is to make it easier to implement S/R on top of handshakes. >> Which is a prerequisite for removing _suspend_flag (which duplicates the handshake functionality). >> But we also remove some complicated S/R methods. >> >> We basically just put in everything in the handshake closure, so the diff just looks much worse than what it is. >> >> TraceSuspendDebugBits have an ifdef, but in both cases it now just returns. >> But I was unsure if I should remove now or when is_ext_suspend_completed() is removed. >> >> Passes multiple t1-5 runs, locally it passes many jck:vm/nsk_jvmti/nsk_jdi/jdk-jdi runs. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains seven commits: > > - Fixed merge miss > - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended > - Merge fix from Richard > - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended > - Removed TraceSuspendDebugBits > - Removed unused method is_ext_suspend_completed_with_lock > - Utilize handshakes instead of is_thread_fully_suspended Incremental updates seem fine to me. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/729 From dholmes at openjdk.java.net Wed Oct 21 23:01:15 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 21 Oct 2020 23:01:15 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: References: <27LjwNE2Xl_ceCaXzFOuF9slhZklqN9TvLt0Vsw2sMM=.9c9d6a08-d19c-4ed3-af8f-890ad1a0bdc5@github.com> Message-ID: On Wed, 21 Oct 2020 17:12:43 GMT, Daniel D. Daugherty wrote: >> src/hotspot/share/prims/jvmtiEnvBase.cpp line 1631: >> >>> 1629: _state->set_pending_step_for_popframe(); >>> 1630: _result = JVMTI_ERROR_NONE; >>> 1631: } >> >> I'd suggest to eliminate jt and use java_thread instead. Also because you're using java_thread in line 1626. The assertion should check if `_state->get_thread() == target` then. > > Especially if an assert() is added above on L1543. Agreed - this code has become confused about what thread variables are present and their relationship. ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From dholmes at openjdk.java.net Wed Oct 21 23:01:17 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 21 Oct 2020 23:01:17 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: References: Message-ID: On Wed, 21 Oct 2020 17:15:01 GMT, Daniel D. Daugherty wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains seven commits: >> >> - Fixed merge miss >> - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended >> - Merge fix from Richard >> - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended >> - Removed TraceSuspendDebugBits >> - Removed unused method is_ext_suspend_completed_with_lock >> - Utilize handshakes instead of is_thread_fully_suspended > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1661: > >> 1659: assert(vf->frame_pointer() != NULL, "frame pointer mustn't be NULL"); >> 1660: if (java_thread->is_exiting() || java_thread->threadObj() == NULL) { >> 1661: return; > > What's the `_result` value if this `return` executes? The default "not alive" value. ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From fyang at openjdk.java.net Wed Oct 21 23:42:33 2020 From: fyang at openjdk.java.net (Fei Yang) Date: Wed, 21 Oct 2020 23:42:33 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v11] In-Reply-To: References: Message-ID: > Contributed-by: ard.biesheuvel at linaro.org, dongbo4 at huawei.com > > This added an intrinsic for SHA3 using aarch64 v8.2 SHA3 Crypto Extensions. > Reference implementation for core SHA-3 transform using ARMv8.2 Crypto Extensions: > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/crypto/sha3-ce-core.S?h=v5.4.52 > > Trivial adaptation in SHA3. implCompress is needed for the purpose of adding the intrinsic. > For SHA3, we need to pass one extra parameter "digestLength" to the stub for the calculation of block size. > "digestLength" is also used in for the EOR loop before keccak to differentiate different SHA3 variants. > > We added jtreg tests for SHA3 and used QEMU system emulator which supports SHA3 instructions to test the functionality. > Patch passed jtreg tier1-3 tests with QEMU system emulator. 
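As a side note on why "digestLength" is enough to derive the block size: in SHA-3 the Keccak state is 200 bytes and the capacity is twice the digest length, so the block size (rate) is what remains. The sketch below only restates that arithmetic; the function name `sha3_block_size` is made up for illustration and is not taken from the stub or from the Java sources.

```cpp
#include <cassert>

// Keccak-f[1600] operates on a 1600-bit (200-byte) state.
constexpr int kKeccakStateBytes = 200;

// Block size (rate) in bytes for a SHA-3 variant, given its digest length
// in bytes. The capacity is 2 * digest length; the rate is the remainder.
constexpr int sha3_block_size(int digest_length_bytes) {
  return kKeccakStateBytes - 2 * digest_length_bytes;
}
```

For the four standard variants this gives 144 bytes (SHA3-224), 136 bytes (SHA3-256), 104 bytes (SHA3-384) and 72 bytes (SHA3-512).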
> Also verified with jtreg tier1-3 tests without SHA3 instructions on aarch64-linux-gnu and x86_64-linux-gnu, to make sure that there's no regression. > > We used one existing JMH test for performance test: test/micro/org/openjdk/bench/java/security/MessageDigests.java > We measured the performance benefit with an aarch64 cycle-accurate simulator. > Patch delivers 20% - 40% performance improvement depending on specific SHA3 digest length and size of the message. > > For now, this feature will not be enabled automatically for aarch64. We can auto-enable this when it is fully tested on real hardware. But for the above testing purposes, this is auto-enabled when the corresponding hardware feature is detected. Fei Yang has updated the pull request incrementally with one additional commit since the last revision: Add if (isJDK16OrHigher()) check for SHA3 in CheckGraalIntrinsics.java ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/207/files - new: https://git.openjdk.java.net/jdk/pull/207/files/d32c8ad7..b43f9197 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=207&range=10 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=207&range=09-10 Stats: 4 lines in 1 file changed: 3 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/207.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/207/head:pull/207 PR: https://git.openjdk.java.net/jdk/pull/207 From fyang at openjdk.java.net Thu Oct 22 00:49:12 2020 From: fyang at openjdk.java.net (Fei Yang) Date: Thu, 22 Oct 2020 00:49:12 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v10] In-Reply-To: <06kvIM_3abB-35pPdgFfbvwCND6oe9QCBqXBQ8iIrZ4=.64ae7da4-be02-46cc-afde-ffeb9ec9d703@github.com> References: <06kvIM_3abB-35pPdgFfbvwCND6oe9QCBqXBQ8iIrZ4=.64ae7da4-be02-46cc-afde-ffeb9ec9d703@github.com> Message-ID: On Wed, 21 Oct 2020 19:20:28 GMT, Vladimir Kozlov wrote: >> OK. 
Will update with the following change after Aleksey's PR is integrated: >> >> --- a/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java >> +++ b/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java >> @@ -608,6 +608,10 @@ public class CheckGraalIntrinsics extends GraalTest { >> if (!config.useSHA512Intrinsics()) { >> add(ignore, "sun/security/provider/SHA5." + shaCompressName + "([BI)V"); >> } >> + >> + if (isJDK16OrHigher()) { >> + add(toBeInvestigated, "sun/security/provider/SHA3." + shaCompressName + "([BI)V"); >> + } >> } > > Yes, please, do that. Done. Commit: https://github.com/openjdk/jdk/pull/207/commits/b43f91970d44e6e0c1b3b4ef452ec388ecbecb83 I think this will not conflict with Aleksey's PR as we modify in different places of CheckGraalIntrinsics.java ------------- PR: https://git.openjdk.java.net/jdk/pull/207 From dholmes at openjdk.java.net Thu Oct 22 00:57:13 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 22 Oct 2020 00:57:13 GMT Subject: RFR: 8255047: Add HotSpot flag to use with debuggers that restrict the =?UTF-8?B?Q1BV4oCm?= [v2] In-Reply-To: References: <_0YMtDNWAGXhg-_EyMf0c8GoNE4cf4wf1hpyr9j-sNs=.2ec5bf8b-023e-4bbd-9692-f74cd6c97f8d@github.com> Message-ID: <9R41Eop1StEd-T4WGzs0UY5-7y122G0dorGSd0QjL2A=.0133182a-cdf7-4c45-a56b-2980b405ba9f@github.com> On Wed, 21 Oct 2020 11:36:25 GMT, Stefan Karlsson wrote: >> Some debuggers don't work well with many threads, and/or incompletely restricts the number of used CPUs to one. >> >> This flag is intended as a catch-all for HotSpot developers (not available in product builds) to allow us to more easily use those debuggers. 
>> >> Currently, the proposal is to let the flag fix a few things: >> 1) Turn down the number of JVM threads >> 2) Turn off NUMA >> 3) Force processor_id() to return 0 instead of values above processor_count() >> >> (1) is purely ergonomics: gdb, rr, valgrind is faster and seems to work much better with fewer threads. The values would still be overridable by devs. >> >> (2) and (3) deals with the fact that some debuggers change the reported processor count, but don't change the processor ids returned by sched_getcpu. This causes problems for ZGC and NUMA, that both assumes that they can rely on os::processor_id() < os::processor_count(). >> >> The current proposed flag name is -XX:+LimitedCPUsDebugging. I'm not entirely happy with that name, but I been able to find a better name. >> >> An alternative to having one flag, is to split this into two flags, and maybe that would solve the naming problem. However, the usability aspects will be worse. >> >> If we can't find a suitable name, I rather introduce a flag called: >> -XX:DebuggerWorkarounds or -XX:DebuggerWorkaround1 >> >> Any suggestions / opinions? I really do want to at least fix the (2, 3) problem, because I keep having to add this to every single branch I'm working on. > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > 8255047: Add HotSpot flag to use with debuggers that restrict the CPU count LGTM! Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/763 From kvn at openjdk.java.net Thu Oct 22 01:40:16 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 22 Oct 2020 01:40:16 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v8] In-Reply-To: References: Message-ID: On Wed, 21 Oct 2020 20:43:30 GMT, CoreyAshford wrote: >> This patch set encompasses the following commits: >> >> - Adds a new HotSpot intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for the intrinsic. The API is similar to the existing encodeBlock intrinsic. >> - Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic implementation >> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. >> - Adds a JMH microbenchmark for both Base64 encoding and decoding. >> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. > > CoreyAshford has updated the pull request incrementally with one additional commit since the last revision: > > TestBase64.java: remove jdk.test.lib.Utils from @build which was causing Tier3 failures. Tier3 testing clean with updated test. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/293 From github.com+51754783+coreyashford at openjdk.java.net Thu Oct 22 03:46:16 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Thu, 22 Oct 2020 03:46:16 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v8] In-Reply-To: References: Message-ID: <_DbhZNXM5Jq69gF-xg91xtj6ctkRAiltu0TRJK-SN98=.acafc7c4-971b-47a7-8051-9ae94071139a@github.com> On Thu, 22 Oct 2020 01:36:59 GMT, Vladimir Kozlov wrote: > Tier3 testing clean with updated test. Thank you for identifying the problem, the fix, then rebuilding and rerunning the tests!
------------- PR: https://git.openjdk.java.net/jdk/pull/293 From kvn at openjdk.java.net Thu Oct 22 04:02:15 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 22 Oct 2020 04:02:15 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v11] In-Reply-To: References: Message-ID: On Wed, 21 Oct 2020 23:42:33 GMT, Fei Yang wrote: >> Contributed-by: ard.biesheuvel at linaro.org, dongbo4 at huawei.com >> >> This added an intrinsic for SHA3 using aarch64 v8.2 SHA3 Crypto Extensions. >> Reference implementation for core SHA-3 transform using ARMv8.2 Crypto Extensions: >> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/crypto/sha3-ce-core.S?h=v5.4.52 >> >> Trivial adaptation in SHA3. implCompress is needed for the purpose of adding the intrinsic. >> For SHA3, we need to pass one extra parameter "digestLength" to the stub for the calculation of block size. >> "digestLength" is also used in for the EOR loop before keccak to differentiate different SHA3 variants. >> >> We added jtreg tests for SHA3 and used QEMU system emulator which supports SHA3 instructions to test the functionality. >> Patch passed jtreg tier1-3 tests with QEMU system emulator. >> Also verified with jtreg tier1-3 tests without SHA3 instructions on aarch64-linux-gnu and x86_64-linux-gnu, to make sure that there's no regression. >> >> We used one existing JMH test for performance test: test/micro/org/openjdk/bench/java/security/MessageDigests.java >> We measured the performance benefit with an aarch64 cycle-accurate simulator. >> Patch delivers 20% - 40% performance improvement depending on specific SHA3 digest length and size of the message. >> >> For now, this feature will not be enabled automatically for aarch64. We can auto-enable this when it is fully tested on real hardware. But for the above testing purposes, this is auto-enabled when the corresponding hardware feature is detected. 
> > Fei Yang has updated the pull request incrementally with one additional commit since the last revision: > > Add if (isJDK16OrHigher()) check for SHA3 in CheckGraalIntrinsics.java tier1,2,3 passed. I verified that new SHA3 tests were run and passed. But because SHA3 is not enabled for now (even on aarch64), it does not test asm code. At least testing verified that changes in shared code do not cause any issues. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/207 From fyang at openjdk.java.net Thu Oct 22 04:23:11 2020 From: fyang at openjdk.java.net (Fei Yang) Date: Thu, 22 Oct 2020 04:23:11 GMT Subject: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v11] In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 03:59:45 GMT, Vladimir Kozlov wrote: > tier1,2,3 passed. I verified that new SHA3 tests were run and passed. > But because SHA3 is not enabled for now (even on aarch64), it does not test asm code. > At least testing verified that changes in shared code do not cause any issues. Great to hear that :-) Thanks for the effort. With that testing result and reviews from three reviewers, I think it's safe to integrate. ------------- PR: https://git.openjdk.java.net/jdk/pull/207 From fyang at openjdk.java.net Thu Oct 22 04:44:21 2020 From: fyang at openjdk.java.net (Fei Yang) Date: Thu, 22 Oct 2020 04:44:21 GMT Subject: Integrated: 8252204: AArch64: Implement SHA3 accelerator/intrinsic In-Reply-To: References: Message-ID: On Wed, 16 Sep 2020 16:36:54 GMT, Fei Yang wrote: > Contributed-by: ard.biesheuvel at linaro.org, dongbo4 at huawei.com > > This added an intrinsic for SHA3 using aarch64 v8.2 SHA3 Crypto Extensions. > Reference implementation for core SHA-3 transform using ARMv8.2 Crypto Extensions: > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/crypto/sha3-ce-core.S?h=v5.4.52 > > Trivial adaptation in SHA3.
implCompress is needed for the purpose of adding the intrinsic. > For SHA3, we need to pass one extra parameter "digestLength" to the stub for the calculation of block size. > "digestLength" is also used in for the EOR loop before keccak to differentiate different SHA3 variants. > > We added jtreg tests for SHA3 and used QEMU system emulator which supports SHA3 instructions to test the functionality. > Patch passed jtreg tier1-3 tests with QEMU system emulator. > Also verified with jtreg tier1-3 tests without SHA3 instructions on aarch64-linux-gnu and x86_64-linux-gnu, to make sure that there's no regression. > > We used one existing JMH test for performance test: test/micro/org/openjdk/bench/java/security/MessageDigests.java > We measured the performance benefit with an aarch64 cycle-accurate simulator. > Patch delivers 20% - 40% performance improvement depending on specific SHA3 digest length and size of the message. > > For now, this feature will not be enabled automatically for aarch64. We can auto-enable this when it is fully tested on real hardware. But for the above testing purposes, this is auto-enabled when the corresponding hardware feature is detected. This pull request has now been integrated. 
Changeset: b25d8940 Author: Fei Yang URL: https://git.openjdk.java.net/jdk/commit/b25d8940 Stats: 1265 lines in 36 files changed: 1010 ins; 22 del; 233 mod 8252204: AArch64: Implement SHA3 accelerator/intrinsic Co-authored-by: Ard Biesheuvel Co-authored-by: Dong Bo Reviewed-by: aph, kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/207 From stefank at openjdk.java.net Thu Oct 22 06:38:12 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 22 Oct 2020 06:38:12 GMT Subject: RFR: 8255047: Add HotSpot flag to use with debuggers that restrict the =?UTF-8?B?Q1BV4oCm?= [v2] In-Reply-To: <9R41Eop1StEd-T4WGzs0UY5-7y122G0dorGSd0QjL2A=.0133182a-cdf7-4c45-a56b-2980b405ba9f@github.com> References: <_0YMtDNWAGXhg-_EyMf0c8GoNE4cf4wf1hpyr9j-sNs=.2ec5bf8b-023e-4bbd-9692-f74cd6c97f8d@github.com> <9R41Eop1StEd-T4WGzs0UY5-7y122G0dorGSd0QjL2A=.0133182a-cdf7-4c45-a56b-2980b405ba9f@github.com> Message-ID: On Thu, 22 Oct 2020 00:54:54 GMT, David Holmes wrote: >> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: >> >> 8255047: Add HotSpot flag to use with debuggers that restrict the CPU count > > LGTM! > > Thanks, > David Thanks @dcubed-ojdk and @dholmes-ora! ------------- PR: https://git.openjdk.java.net/jdk/pull/763 From stefank at openjdk.java.net Thu Oct 22 07:35:21 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 22 Oct 2020 07:35:21 GMT Subject: RFR: 8237363 remove oop iterate verification Message-ID: There's verification code in the "oop iterate" framework that asserts that a pointer is "is in the heap". This works for most GCs, but ZGC *can* eagerly decommit the old relocation set pages, which means that pointers to the old / from copy of the object could point to memory that is currently not a part of the current heap. To combat this in the past I've added a way for oop iterate closures to turn off this verification. 
However, every single time we add a new closure we have to consider if we can allow this verification check or if we have to remove it. Personally, I think this is a false abstraction and also widens the oop iterate closure interface. I previously proposed a patch that moved the verification code down into the oop iterate closures. It wasn't a huge patch, but I got push-back that it was convenient for other GCs to get this automatic verification, and the review stalled. In this new patch I propose a different way to retain the verification. The realization is that most oop iterate closures have to deal with both compressed and non-compressed oops, so the code typically looks like this: template <class T> inline void G1ScanCardClosure::do_oop_work(T* p) { T o = RawAccess<>::oop_load(p); if (CompressedOops::is_null(o)) { return; } oop obj = CompressedOops::decode_not_null(o); Therefore the suggested new place to put the is_in verification is in the CompressedOops::decode*. This injects the assert into almost all non-ZGC closures, and also to places that don't use the oop iterate closure framework. I think this is a neat workaround, and hope this patch is accepted this time. I've tested this patch a few weeks ago, but will rerun the relevant tiers.
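The idea of moving the "is in heap" check into the decode step can be sketched outside of HotSpot as follows. Everything here is a stand-in: `ToyHeap`, `narrow_oop` and the base/shift layout are assumptions made for the example, not the actual `CompressedOops` or `CollectedHeap` code. The point is only that the checked decode asserts on its result, while a raw variant stays available for callers that do their own verification or must skip it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a compressed (narrow) oop value.
using narrow_oop = uint32_t;

// Toy heap description: a contiguous range plus a decode shift.
struct ToyHeap {
  uintptr_t base;   // start address of the heap
  uintptr_t size;   // heap size in bytes
  int shift;        // log2 of the object alignment

  bool is_in(uintptr_t addr) const {
    return addr >= base && addr < base + size;
  }

  // Raw decode: no verification, for callers that verify on their own.
  uintptr_t decode_raw(narrow_oop v) const {
    return base + (static_cast<uintptr_t>(v) << shift);
  }

  // Checked decode: every caller gets the is_in assert for free,
  // whether or not it goes through an oop iterate closure.
  uintptr_t decode_not_null(narrow_oop v) const {
    assert(v != 0 && "narrow oop must not be null");
    uintptr_t result = decode_raw(v);
    assert(is_in(result) && "decoded oop must be in the heap");
    return result;
  }
};
```

With this split, a closure written in the usual load, null-check, decode style picks up the verification without the closure interface having to know about it.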
------------- Commit messages: - Remove assert macros - Merge branch 'master' into 8237363_remove_oop_iterate_verification - Merge branch 'master' into 8237363_remove_oop_iterate_verification - 8237363: Remove automatic is in heap verification in OopIterateClosure Changes: https://git.openjdk.java.net/jdk/pull/797/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=797&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8237363 Stats: 117 lines in 17 files changed: 25 ins; 83 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/797.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/797/head:pull/797 PR: https://git.openjdk.java.net/jdk/pull/797 From stefank at openjdk.java.net Thu Oct 22 07:35:21 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 22 Oct 2020 07:35:21 GMT Subject: RFR: 8237363 remove oop iterate verification In-Reply-To: References: Message-ID: <7w2CxRtmft5OYgHC47LvVEWDbRTkb1beUxk_esk2Dec=.f451aae0-1b20-455e-84fd-52f43201d96c@github.com> On Thu, 22 Oct 2020 07:30:16 GMT, Stefan Karlsson wrote: > There's verification code in the "oop iterate" framework that asserts that a pointer is "is in the heap". This works for most GCs, but ZGC *can* eagerly decommit the old relocation set pages, which means that pointers to the old / from copy of the object could point to memory that is currently not a part of the current heap. > > To combat this in the past I've added a way for oop iterate closures to turn off this verification. However, every single time we add a new closure we have to consider if we can allow this verification check or if we have to remove it. Personally, I think this is a false abstraction and also widens the oop iterate closure interface. I previously proposed a patch that moved the verification code down into the oop iterate closures. It wasn't a huge patch, but I got push-back that it was convenient for other GCs to get this automatic verification, and the review stalled. 
> > In this new patch I propose a different way to retain the verification. The realization is that most oop iterate closures have to deal with both compressed and non-compressed oops, so the code typically looks like this: > > template <class T> > inline void G1ScanCardClosure::do_oop_work(T* p) { > T o = RawAccess<>::oop_load(p); > if (CompressedOops::is_null(o)) { > return; > } > oop obj = CompressedOops::decode_not_null(o); > > Therefore the suggested new place to put the is_in verification is in the CompressedOops::decode*. This injects the assert into almost all non-ZGC closures, and also to places that don't use the oop iterate closure framework. I think this is a neat workaround, and hope this patch is accepted this time. > > I've tested this patch a few weeks ago, but will rerun the relevant tiers. This is mostly of concern to hotspot-gc, but it touches compressed oops, so I'll move this to hotspot instead. ------------- PR: https://git.openjdk.java.net/jdk/pull/797 From rrich at openjdk.java.net Thu Oct 22 07:43:23 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Thu, 22 Oct 2020 07:43:23 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: References: <27LjwNE2Xl_ceCaXzFOuF9slhZklqN9TvLt0Vsw2sMM=.9c9d6a08-d19c-4ed3-af8f-890ad1a0bdc5@github.com> Message-ID: On Wed, 21 Oct 2020 17:03:45 GMT, Daniel D. Daugherty wrote: >> src/hotspot/share/prims/jvmtiEnvBase.cpp line 1454: >> >>> 1452: _state->set_earlyret_pending(); >>> 1453: _state->set_earlyret_oop(ret_ob_h()); >>> 1454: _state->set_earlyret_value(_value, _tos); >> >> Good that these updates are done with a handshake now. Maybe I'm missing s.th. but I don't see synchronization in the older version. > > Agreed. @sspitsyn - This makes me wonder if the lack of > synchronization is the cause of some instability in the > JVM/TI ForceEarlyReturn() testing.
> > Update: The old code only made the updates if the thread was fully > suspended so you won't have a race between the requesting thread > and the target thread in that case. Yes, I meant synchronization between racing agent threads. Surely a corner case. ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From rrich at openjdk.java.net Thu Oct 22 07:43:26 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Thu, 22 Oct 2020 07:43:26 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: References: Message-ID: On Wed, 21 Oct 2020 16:45:53 GMT, Daniel D. Daugherty wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: >> >> - Fixed merge miss >> - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended >> - Merge fix from Richard >> - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended >> - Removed TraceSuspendDebugBits >> - Removed unused method is_ext_suspend_completed_with_lock >> - Utilize handshakes instead of is_thread_fully_suspended > > src/hotspot/share/prims/jvmtiEnv.cpp line 1646: > >> 1644: // java_thread - pre-checked >> 1645: jvmtiError >> 1646: JvmtiEnv::PopFrame(JavaThread* java_thread) { > > So I'm a bit confused why I'm seeing PopFrame() changes here that are > related to @reinrich's EscapeBarrier work. I've seen mention of picking > up a patch during this review from @reinrich so may that's why. I don't > see anything wrong with the changes, but I am confused why they are > here in this review. This change moves code with EscapeBarriers (integrated with #119) into a handshake. That does not work because object reallocation can safepoint. So the EBs are pulled out of the handshake. 
------------- PR: https://git.openjdk.java.net/jdk/pull/729 From shade at openjdk.java.net Thu Oct 22 07:47:15 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 22 Oct 2020 07:47:15 GMT Subject: RFR: 8237363 remove oop iterate verification In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 07:30:16 GMT, Stefan Karlsson wrote: > There's verification code in the "oop iterate" framework that asserts that a pointer is "is in the heap". This works for most GCs, but ZGC *can* eagerly decommit the old relocation set pages, which means that pointers to the old / from copy of the object could point to memory that is currently not a part of the current heap. > > To combat this in the past I've added a way for oop iterate closures to turn off this verification. However, every single time we add a new closure we have to consider if we can allow this verification check or if we have to remove it. Personally, I think this is a false abstraction and also widens the oop iterate closure interface. I previously proposed a patch that moved the verification code down into the oop iterate closures. It wasn't a huge patch, but I got push-back that it was convenient for other GCs to get this automatic verification, and the review stalled. > > In this new patch I propose a different way to retain the verification. The realization is that most oop iterate closures have to deal with both compressed and non-compressed oops, so the code typically looks like this: > > template <class T> > inline void G1ScanCardClosure::do_oop_work(T* p) { > T o = RawAccess<>::oop_load(p); > if (CompressedOops::is_null(o)) { > return; > } > oop obj = CompressedOops::decode_not_null(o); > > Therefore the suggested new place to put the is_in verification is in the CompressedOops::decode*. This injects the assert into almost all non-ZGC closures, and also to places that don't use the oop iterate closure framework. I think this is a neat workaround, and hope this patch is accepted this time.
> > I've tested this patch a few weeks ago, but will rerun the relevant tiers. This has two minor drawbacks for GC implementations that verify oops with their own asserts (like Shenandoah): they would call into `CollectedHeap::is_in` twice (once from shared code assert, and once from their own), and then also fail with non-rich assert (in the shared code) when something goes wrong. Of course, that can be mitigated by calling into `_raw` versions. src/hotspot/share/memory/filemap.cpp line 1740: > 1738: narrowOop n = CompressedOops::narrow_oop_cast(offset); > 1739: if (with_current_oop_encoding_mode) { > 1740: return cast_from_oop

(CompressedOops::decode_raw_not_null(n)); Why does this line skip verification now? ------------- PR: https://git.openjdk.java.net/jdk/pull/797 From eosterlund at openjdk.java.net Thu Oct 22 07:49:16 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 22 Oct 2020 07:49:16 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: References: Message-ID: On Wed, 21 Oct 2020 08:40:47 GMT, Robbin Ehn wrote: >> The main point of this change-set is to make it easier to implement S/R on top of handshakes. >> Which is a prerequisite for removing _suspend_flag (which duplicates the handshake functionality). >> But we also remove some complicated S/R methods. >> >> We basically just put in everything in the handshake closure, so the diff just looks much worse than what it is. >> >> TraceSuspendDebugBits have an ifdef, but in both cases it now just returns. >> But I was unsure if I should remove now or when is_ext_suspend_completed() is removed. >> >> Passes multiple t1-5 runs, locally it passes many jck:vm/nsk_jvmti/nsk_jdi/jdk-jdi runs. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: > > - Fixed merge miss > - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended > - Merge fix from Richard > - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended > - Removed TraceSuspendDebugBits > - Removed unused method is_ext_suspend_completed_with_lock > - Utilize handshakes instead of is_thread_fully_suspended Just wondering why the escape barrier for force early return uses a stack depth is 0. Either that is wrong, or the escape barrier is not needed in the first place here. I think. 
src/hotspot/share/prims/jvmtiEnvBase.cpp line 1390:

> 1388: return JVMTI_ERROR_OUT_OF_MEMORY;
> 1389: }
> 1390: if (!eb.deoptimize_objects(0)) {

Why is the depth 0 here? That makes no sense to me. My understanding of this is that we have extracted the object deopt that would "normally" (since last week?) be done in JvmtiEnvBase::check_top_frame. And it is walking 1 frame, so shouldn't the depth be 1?

-------------

PR: https://git.openjdk.java.net/jdk/pull/729

From eosterlund at openjdk.java.net Thu Oct 22 07:54:17 2020
From: eosterlund at openjdk.java.net (Erik Österlund)
Date: Thu, 22 Oct 2020 07:54:17 GMT
Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3]
In-Reply-To:
References:
Message-ID: <3Ys1ehxM15rARcQvGbk3jOkmZupTUQnxSGYdUETQuGU=.5e2bec65-9188-4cb0-8df3-631dcccfd60a@github.com>

On Wed, 21 Oct 2020 08:40:47 GMT, Robbin Ehn wrote:

>> The main point of this change-set is to make it easier to implement S/R on top of handshakes.
>> Which is a prerequisite for removing _suspend_flag (which duplicates the handshake functionality).
>> But we also remove some complicated S/R methods.
>>
>> We basically just put in everything in the handshake closure, so the diff just looks much worse than what it is.
>>
>> TraceSuspendDebugBits have an ifdef, but in both cases it now just returns.
>> But I was unsure if I should remove now or when is_ext_suspend_completed() is removed.
>>
>> Passes multiple t1-5 runs, locally it passes many jck:vm/nsk_jvmti/nsk_jdi/jdk-jdi runs.
>
> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains seven commits:
>
> - Fixed merge miss
> - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended
> - Merge fix from Richard
> - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended
> - Removed TraceSuspendDebugBits
> - Removed unused method is_ext_suspend_completed_with_lock
> - Utilize handshakes instead of is_thread_fully_suspended

src/hotspot/share/prims/jvmtiEnv.cpp line 1663:

> 1661: return JVMTI_ERROR_OUT_OF_MEMORY;
> 1662: }
> 1663: if (!eb.deoptimize_objects(1)) {

Oh and why is the depth 1 here, when two frames are deoptimized? Maybe I missed something.

-------------

PR: https://git.openjdk.java.net/jdk/pull/729

From rehn at openjdk.java.net Thu Oct 22 08:07:22 2020
From: rehn at openjdk.java.net (Robbin Ehn)
Date: Thu, 22 Oct 2020 08:07:22 GMT
Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3]
In-Reply-To:
References: <27LjwNE2Xl_ceCaXzFOuF9slhZklqN9TvLt0Vsw2sMM=.9c9d6a08-d19c-4ed3-af8f-890ad1a0bdc5@github.com>
Message-ID:

On Wed, 21 Oct 2020 16:47:39 GMT, Daniel D. Daugherty wrote:

>> src/hotspot/share/prims/jvmtiEnv.cpp line 1808:
>>
>>> 1806: }
>>> 1807: if (java_lang_Class::is_primitive(k_mirror)) {
>>> 1808: return JVMTI_ERROR_NONE;
>>
>> The call of JvmtiSuspendControl::print() seems to be eliminated. Ok for me.
>
> It's not clear to me why the `JvmtiSuspendControl::print()` is being
> eliminated. Please explain. The `TraceJVMTICalls` support is so that
> someone can diagnose what JVM/TI calls are being made, including
> context in some cases, so it seems wrong to delete this call.

TraceJVMTICalls is a define local to the file jvmtiEnv.cpp, set to false, which added some more logging for three of the JVM TI functions. You must first read the code to see if TraceJVMTICalls affects the functions you have an issue with and then change it to true; if you're lucky it's one of those three. And then you need to recompile. Before commit you need to set it to false again. Why not just temporarily add the JvmtiSuspendControl::print()/relevant logging instead? Which you still need to do if it's not one of those three functions. Since this code is not in jvmtiEnv.cpp, we also would need to move TraceJVMTICalls to global scope in some header. Turning TraceJVMTICalls into UL is a good idea I guess, but not in scope of this :)

>> src/hotspot/share/runtime/thread.cpp line 537:
>>
>>> 535: // cancelled). Returns true if the thread is externally suspended and
>>> 536: // false otherwise.
>>> 537: bool JavaThread::is_ext_suspend_completed() {
>>
>> I'd think `JavaThread::is_ext_suspend_completed` can be removed also (as a separate enhancement). It also duplicates code of the handshake mechanism. Just replace VM_ThreadSuspend with a handshake.
>
> `is_ext_suspend_completed()` includes code that detects that a thread
> that is in `_thread_in_native_trans` and does not yet have a walkable
> stack has not completed suspension and we will do some retries in
> this function until the target thread gets stable. We have to make sure
> that the handshake mechanism has a similar stability guarantee or a
> stack walker may fail intermittently.

A handshake can only be executed at a safepoint poll site, which means that if the stack is walkable in all safepoints it is also walkable in handshakes. And we would be in so much trouble if it were not walkable in all safepoints :)

-------------

PR: https://git.openjdk.java.net/jdk/pull/729

From rehn at openjdk.java.net Thu Oct 22 08:07:29 2020
From: rehn at openjdk.java.net (Robbin Ehn)
Date: Thu, 22 Oct 2020 08:07:29 GMT
Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3]
In-Reply-To:
References:
Message-ID:

On Wed, 21 Oct 2020 16:54:48 GMT, Daniel D. Daugherty wrote:

>> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains seven commits: >> >> - Fixed merge miss >> - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended >> - Merge fix from Richard >> - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended >> - Removed TraceSuspendDebugBits >> - Removed unused method is_ext_suspend_completed_with_lock >> - Utilize handshakes instead of is_thread_fully_suspended > > src/hotspot/share/prims/jvmtiEnv.cpp line 1718: > >> 1716: MutexLocker mu(JvmtiThreadState_lock); >> 1717: if (java_thread == JavaThread::current()) { >> 1718: op.doit(java_thread, true); > > Please add a comment after the `true` parameter to > indicate the name of the doit() function's parameter, > e.g., `true /* self */`. Fixed > src/hotspot/share/prims/jvmtiEnvBase.cpp line 56: > >> 54: #include "runtime/threadSMR.hpp" >> 55: #include "runtime/vframe.hpp" >> 56: #include "runtime/vframe.inline.hpp" > > When you add `foo.inline.hpp` you delete `foo.hpp` because > the `foo.inline.hpp` file always includes the `foo.hpp` file. Fixed > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1311: > >> 1309: // It is to keep a ret_ob_h handle alive after return to the caller. >> 1310: jvmtiError >> 1311: JvmtiEnvBase::check_top_frame(Thread* current_thread, JavaThread* java_thread, > > Again, it is not clear why these changes to `check_top_frame` are here since they > appear to be related to @reinrich's work. Long story: Before async handshakes was integrate I had a patch which does the same as this. This change was accidentally slipped into async handshakes change-set and was integrated. Richard notice this, I told him he could revert it in his change-set for EB, so he did. But now we need this change, so here it comes once more! VM thread is allowed to execute these handshakes, thus when calling check_top_frame() from SetForceEarlyReturn::doit() it's just a Thread*. 
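
The `doit()`/`do_thread()` split that the review comments above keep referring to can be pictured with a small stand-alone model. This is only a sketch under stated assumptions: `ToyThread`, `ToyForceEarlyReturnOp` and `request` are made-up names, not HotSpot classes, and the real closures do far more work. It illustrates the one point made above, namely that the operation body takes a plain thread pointer plus a `self` flag, so the same code serves both the direct call and the handshake callback, which may be run by some other thread such as the VM thread.

```cpp
#include <cassert>

// Toy stand-in for a thread; NOT HotSpot's Thread or JavaThread.
struct ToyThread {
  bool earlyret_pending = false;
};

// Hypothetical closure modelled on the review comments above.
class ToyForceEarlyReturnOp {
 public:
  // Handshake entry point: whoever executes the handshake (possibly the
  // VM thread) calls this, so the operation is never "self" here.
  void do_thread(ToyThread* target) {
    doit(target, false /* self */);
  }

  // Shared body for both paths.
  void doit(ToyThread* target, bool self) {
    target->earlyret_pending = true;
    _ran_as_self = self;
    _done = true;
  }

  bool done() const { return _done; }
  bool ran_as_self() const { return _ran_as_self; }

 private:
  bool _done = false;
  bool _ran_as_self = false;
};

// Caller side: run directly when the target is the current thread,
// otherwise go through the handshake entry point.
inline void request(ToyThread* current, ToyThread* target,
                    ToyForceEarlyReturnOp& op) {
  assert(current != nullptr && target != nullptr);
  if (target == current) {
    op.doit(target, true /* self */);
  } else {
    op.do_thread(target);  // stands in for executing a handshake
  }
}
```

Calling `request(&a, &a, op)` takes the direct path with `self` set, while `request(&a, &b, op)` takes the handshake-style path; the `/* self */` comments follow the convention asked for in the review.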
> src/hotspot/share/prims/jvmtiEnvBase.cpp line 1398: > >> 1396: SetForceEarlyReturn op(state, value, tos); >> 1397: if (java_thread == current_thread) { >> 1398: op.doit(java_thread, true); > > Please add a comment after the true parameter to > indicate the name of the doit() function's parameter, > e.g., `true /* self */`. Fixed > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1543: > >> 1541: HandleMark hm(current_thread); >> 1542: JavaThread* java_thread = target->as_Java_thread(); >> 1543: > > This would be useful here: > > `assert(_state->get_thread() == java_thread, "Must be");` Fixed. > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1570: > >> 1568: >> 1569: ResourceMark rm(current_thread); >> 1570: // Check if there are more than one Java frame in this thread, that the top two frames > > typo: s/are more/is more/ Fixed > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1642: > >> 1640: >> 1641: if (!self) { >> 1642: if (!java_thread->is_external_suspend()) { > > You could join these two if-statements with `&&` and have > one less indenting level... Fixed > src/hotspot/share/prims/jvmtiEnvBase.hpp line 361: > >> 359: _tos(tos) {} >> 360: void do_thread(Thread *target) { >> 361: doit(target, false); > > Please add a comment after the true parameter to > indicate the name of the doit() function's parameter, > e.g., `false /* self */`. Fixed > src/hotspot/share/prims/jvmtiEnvBase.hpp line 395: > >> 393: _depth(depth) {} >> 394: void do_thread(Thread *target) { >> 395: doit(target, false); > > Please add a comment after the true parameter to > indicate the name of the doit() function's parameter, > e.g., `false /* self */`. Fixed > src/hotspot/share/runtime/deoptimization.cpp line 1755: > >> 1753: thread->is_handshake_safe_for(Thread::current()) || >> 1754: SafepointSynchronize::is_at_safepoint(), >> 1755: "can only deoptimize other thread at a safepoint"); > > Should that now be: `safepoint/handshake`?? 
Fixed

> src/hotspot/share/runtime/thread.cpp line 698:
>
>> 696: RememberProcessedThread rpt(this);
>> 697: oops_do_no_frames(f, cf);
>> 698: oops_do_frames(f, cf);
>
> In the comment above:
> // ... If we were
> // called by wait_for_ext_suspend_completion(), then it
> // will be doing the retries so we don't have to.
> `wait_for_ext_suspend_completion()` has been deleted so the
> comment needs work.

I just deleted the no longer relevant parts.

-------------

PR: https://git.openjdk.java.net/jdk/pull/729

From rehn at openjdk.java.net Thu Oct 22 08:07:30 2020
From: rehn at openjdk.java.net (Robbin Ehn)
Date: Thu, 22 Oct 2020 08:07:30 GMT
Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3]
In-Reply-To:
References: <27LjwNE2Xl_ceCaXzFOuF9slhZklqN9TvLt0Vsw2sMM=.9c9d6a08-d19c-4ed3-af8f-890ad1a0bdc5@github.com>
Message-ID:

On Thu, 22 Oct 2020 06:50:37 GMT, Richard Reingruber wrote:

>> Agreed. @sspitsyn - This makes me wonder if the lack of
>> synchronization is the cause of some instability in the
>> JVM/TI ForceEarlyReturn() testing.
>>
>> Update: The old code only made the updates if the thread was fully
>> suspended so you won't have a race between the requesting thread
>> and the target thread in that case.
>
> Yes, I meant synchronization between racing agent threads. Surely a corner case.

Since we hold neither Threads_lock nor SR_lock, nothing is stopping the resume at this point AFAICT. Now this might be illegal, but it can happen if you are not really careful, especially in tests like Kitchensink where two modules might use S/R on the same thread. Also, if two threads are calling this, the second thread might have passed:

if (_state->is_earlyret_pending()) {

when we do the setting:

_state->set_earlyret_pending();

But now it's protected; even if this never manifested as a bug, now we are sure it will not.
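
The race described above is a check-then-act problem: one requester tests the pending flag, a second one tests it before the first has set it, and both proceed. A minimal sketch of the fix follows, assuming a toy state class. `ToyEarlyRetState` and `try_set_earlyret_pending` are hypothetical names, not the JVM TI thread state API, and the `std::mutex` merely stands in for the serialization that, per the message above, now comes from doing the check and the update inside the handshake.

```cpp
#include <cassert>
#include <mutex>

// Toy model of the per-thread state; NOT HotSpot's JvmtiThreadState.
class ToyEarlyRetState {
 public:
  // Check and set as one step. Returns true if this caller installed the
  // request, false if another request was already pending.
  bool try_set_earlyret_pending() {
    std::lock_guard<std::mutex> guard(_lock);
    if (_pending) {
      return false;  // a racing requester got here first
    }
    _pending = true;
    return true;
  }

  bool is_earlyret_pending() {
    std::lock_guard<std::mutex> guard(_lock);
    return _pending;
  }

 private:
  std::mutex _lock;  // stands in for the serialization a handshake gives
  bool _pending = false;
};

// Two requests in a row: only the first one may win.
inline void demo_two_requesters() {
  ToyEarlyRetState state;
  bool first = state.try_set_earlyret_pending();
  bool second = state.try_set_earlyret_pending();
  assert(first && !second);
}
```

The point of the design is that the test and the update cannot be separated by another requester, whichever mechanism provides the exclusion.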
------------- PR: https://git.openjdk.java.net/jdk/pull/729 From rehn at openjdk.java.net Thu Oct 22 08:07:30 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 22 Oct 2020 08:07:30 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: References: <27LjwNE2Xl_ceCaXzFOuF9slhZklqN9TvLt0Vsw2sMM=.9c9d6a08-d19c-4ed3-af8f-890ad1a0bdc5@github.com> Message-ID: <6edg514T7j4mK6ugFkgnj9hhe3_Unk5V7zgrOBTAg5A=.2203f384-6baa-417f-9da9-73e00fb5406e@github.com> On Wed, 21 Oct 2020 22:57:26 GMT, David Holmes wrote: >> Especially if an assert() is added above on L1543. > > Agreed - this code has become confused about what thread variables are present and their relationship. Fixed and moved assert. ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From rehn at openjdk.java.net Thu Oct 22 08:07:30 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 22 Oct 2020 08:07:30 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: References: Message-ID: On Wed, 21 Oct 2020 22:57:59 GMT, David Holmes wrote: >> src/hotspot/share/prims/jvmtiEnvBase.cpp line 1661: >> >>> 1659: assert(vf->frame_pointer() != NULL, "frame pointer mustn't be NULL"); >>> 1660: if (java_thread->is_exiting() || java_thread->threadObj() == NULL) { >>> 1661: return; >> >> What's the `_result` value if this `return` executes? > > The default "not alive" value. Added comment. 
------------- PR: https://git.openjdk.java.net/jdk/pull/729 From rehn at openjdk.java.net Thu Oct 22 08:07:32 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 22 Oct 2020 08:07:32 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: <27LjwNE2Xl_ceCaXzFOuF9slhZklqN9TvLt0Vsw2sMM=.9c9d6a08-d19c-4ed3-af8f-890ad1a0bdc5@github.com> References: <27LjwNE2Xl_ceCaXzFOuF9slhZklqN9TvLt0Vsw2sMM=.9c9d6a08-d19c-4ed3-af8f-890ad1a0bdc5@github.com> Message-ID: On Wed, 21 Oct 2020 14:02:59 GMT, Richard Reingruber wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: >> >> - Fixed merge miss >> - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended >> - Merge fix from Richard >> - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended >> - Removed TraceSuspendDebugBits >> - Removed unused method is_ext_suspend_completed_with_lock >> - Utilize handshakes instead of is_thread_fully_suspended > > src/hotspot/share/prims/jvmtiEnvBase.hpp line 310: > >> 308: GrowableArray *owned_monitors_list); >> 309: static jvmtiError check_top_frame(Thread* current_thread, JavaThread* java_thread, >> 310: jvalue value, TosState tos, Handle* ret_ob_h); > > Maybe fix indentation? 
Fixed

-------------

PR: https://git.openjdk.java.net/jdk/pull/729

From rehn at openjdk.java.net Thu Oct 22 08:07:32 2020
From: rehn at openjdk.java.net (Robbin Ehn)
Date: Thu, 22 Oct 2020 08:07:32 GMT
Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3]
In-Reply-To: <0yF3nXkvKHHTheoHvF4IQJ9v98AjHl-7R9tSo1qrOec=.367302e8-9414-455d-a601-046cdf5a40c8@github.com>
References: <27LjwNE2Xl_ceCaXzFOuF9slhZklqN9TvLt0Vsw2sMM=.9c9d6a08-d19c-4ed3-af8f-890ad1a0bdc5@github.com> <0yF3nXkvKHHTheoHvF4IQJ9v98AjHl-7R9tSo1qrOec=.367302e8-9414-455d-a601-046cdf5a40c8@github.com>
Message-ID:

On Wed, 21 Oct 2020 14:20:21 GMT, Richard Reingruber wrote:

>> src/hotspot/share/runtime/deoptimization.cpp line 1771:
>>
>>> 1769: Deoptimization::deoptimize_frame_internal(thread, id, reason);
>>> 1770: } else {
>>> 1771: VM_DeoptimizeFrame deopt(thread, id, reason);
>>
>> I guess VM_DeoptimizeFrame can be replaced with a handshake too now.
>
> Not in this pr of course :)

I think so.

-------------

PR: https://git.openjdk.java.net/jdk/pull/729

From rehn at openjdk.java.net Thu Oct 22 08:07:34 2020
From: rehn at openjdk.java.net (Robbin Ehn)
Date: Thu, 22 Oct 2020 08:07:34 GMT
Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3]
In-Reply-To:
References:
Message-ID:

On Wed, 21 Oct 2020 20:31:27 GMT, Erik Österlund wrote:

>> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains seven commits: >> >> - Fixed merge miss >> - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended >> - Merge fix from Richard >> - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended >> - Removed TraceSuspendDebugBits >> - Removed unused method is_ext_suspend_completed_with_lock >> - Utilize handshakes instead of is_thread_fully_suspended > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1390: > >> 1388: return JVMTI_ERROR_OUT_OF_MEMORY; >> 1389: } >> 1390: if (!eb.deoptimize_objects(0)) { > > Why is the depth 0 here? That makes no sense to me. My understanding of this is that we have extracted the object deopt that would "normally" (since last week?) be done in JvmtiEnvBase::check_top_frame. And it is walking 1 frame, so shouldn't the depth be 1? @reinrich did I mess something up when merging this in? > src/hotspot/share/prims/jvmtiEnv.cpp line 1663: > >> 1661: return JVMTI_ERROR_OUT_OF_MEMORY; >> 1662: } >> 1663: if (!eb.deoptimize_objects(1)) { > > Oh and why is the depth 1 here, when two frames are deoptimized? Maybe I missed something. @reinrich did I mess something up when merging this in? ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From rrich at openjdk.java.net Thu Oct 22 08:17:16 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Thu, 22 Oct 2020 08:17:16 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 08:04:44 GMT, Robbin Ehn wrote: >> src/hotspot/share/prims/jvmtiEnvBase.cpp line 1390: >> >>> 1388: return JVMTI_ERROR_OUT_OF_MEMORY; >>> 1389: } >>> 1390: if (!eb.deoptimize_objects(0)) { >> >> Why is the depth 0 here? That makes no sense to me. My understanding of this is that we have extracted the object deopt that would "normally" (since last week?) be done in JvmtiEnvBase::check_top_frame. 
And it is walking 1 frame, so shouldn't the depth be 1?
>
> @reinrich did I mess something up when merging this in?

Stack frames are counted beginning at 0. The top frame has depth 0. So object deoptimization happens in the top frame.

Still, the method used is not optimal because it assumes that objects of frames within the given depth are accessed and their escape state is changed. But caller methods have potentially been optimized based on the escape state, therefore it searches for caller frames passing ArgEscape objects and deoptimizes these too. With ForceEarlyReturn no objects are accessed, but it is so uncommon that I did not bother optimizing this. Should I?

-------------

PR: https://git.openjdk.java.net/jdk/pull/729

From redestad at openjdk.java.net Thu Oct 22 08:20:18 2020
From: redestad at openjdk.java.net (Claes Redestad)
Date: Thu, 22 Oct 2020 08:20:18 GMT
Subject: RFR: 8255208: CodeStrings passed to Disassembler::decode are ignored
Message-ID:

CodeStrings passed directly to Disassembler::decode are wrongly ignored.

This patch started out as a cleanup to clean out CodeStrings, but I realized as I was about to remove the suspiciously unused CodeStrings in the disassembler that them being unused was likely a bug. For example, -XX:+PrintInterpreter (on debug builds) included messages which can help in digesting the output (if nothing else by emitting text hooks to the place in the source code where the asm is generated):

0x00007f1f7093becb: je 0x00007f1f7093bee5
;; call_VM_base: heap base corrupted?    <<< omitted
0x00007f1f7093bed1: mov $0x7f1f90c7ecb8,%rdi
0x00007f1f7093bedb: and $0xfffffffffffffff0,%rsp
0x00007f1f7093bedf: callq 0x00007f1f9046c0a0 = MacroAssembler::debug64(char*, long, long*)

While PrintInterpreter is the only case that appears directly affected, restoring this capability seems useful in general.
The cleaning up of the code also has some nice side-effects such as reducing the size of a CodeBuffer from 432 to 408 bytes and marginally improving the static size of the JVM (as measured on linux-x64)

-------------

Commit messages:
 - Coalesce non-product fields in CodeBuffer
 - Merge branch 'master' into less_CodeStrings
 - Remove unnecessary args, minor cleanups, pass strings when decoding a CodeBuffer
 - Issue with printing codeStrings via InterpreterCodelet due using assign and strings getting freed. Restructure and add tracing code to allow verifying we still don't leak CodeStrings
 - Minor fixes
 - Minor fixes
 - Clean-up decode_env _strings initialization
 - Fix copy
 - Revert CodeStrings removal from disassemble - the fact this code was unused appears to be a regression
 - Remove _strings from InterpreterCodelet
 - ... and 11 more: https://git.openjdk.java.net/jdk/compare/1191a633...93c688ca

Changes: https://git.openjdk.java.net/jdk/pull/788/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=788&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8255208
Stats: 264 lines in 11 files changed: 60 ins; 146 del; 58 mod
Patch: https://git.openjdk.java.net/jdk/pull/788.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/788/head:pull/788

PR: https://git.openjdk.java.net/jdk/pull/788

From stefank at openjdk.java.net Thu Oct 22 08:26:22 2020
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Thu, 22 Oct 2020 08:26:22 GMT
Subject: RFR: 8237363 remove oop iterate verification
In-Reply-To:
References:
Message-ID:

On Thu, 22 Oct 2020 07:40:03 GMT, Aleksey Shipilev wrote:

>> There's verification code in the "oop iterate" framework that asserts that a pointer "is in the heap". This works for most GCs, but ZGC *can* eagerly decommit the old relocation set pages, which means that pointers to the old / from copy of the object could point to memory that is currently not a part of the current heap.
>>
>> To combat this in the past I've added a way for oop iterate closures to turn off this verification. However, every single time we add a new closure we have to consider if we can allow this verification check or if we have to remove it. Personally, I think this is a false abstraction and also widens the oop iterate closure interface. I previously proposed a patch that moved the verification code down into the oop iterate closures. It wasn't a huge patch, but I got push-back that it was convenient for other GCs to get this automatic verification, and the review stalled.
>>
>> In this new patch I propose a different way to retain the verification. The realization is that most oop iterate closures have to deal with both compressed and non-compressed oops, so the code typically looks like this:
>>
>> template <class T>
>> inline void G1ScanCardClosure::do_oop_work(T* p) {
>>   T o = RawAccess<>::oop_load(p);
>>   if (CompressedOops::is_null(o)) {
>>     return;
>>   }
>>   oop obj = CompressedOops::decode_not_null(o);
>>
>> Therefore the suggested new place to put the is_in verification is in the CompressedOops::decode*. This injects the assert into almost all non-ZGC closures, and also to places that don't use the oop iterate closure framework. I think this is a neat workaround, and hope this patch is accepted this time.
>>
>> I've tested this patch a few weeks ago, but will rerun the relevant tiers.
>
> src/hotspot/share/memory/filemap.cpp line 1740:
>
>> 1738: narrowOop n = CompressedOops::narrow_oop_cast(offset);
>> 1739: if (with_current_oop_encoding_mode) {
>> 1740: return cast_from_oop<address>(CompressedOops::decode_raw_not_null(n));
>
> Why does this line skip verification now?

This code is used to set up the CDS archive. At that point in time, the heap isn't mapped yet, and there's no heap region at the suggested offset, so the new assert fails.

HeapRegion::is_in_reserved (this=0x0, p=0x7bfe00000)
HeapRegion::is_in (this=0x0, p=0x7bfe00000)
G1CollectedHeap::is_in (this=0x7ffff003a100, p=0x7bfe00000)
CompressedOops::decode_not_null (v=(unknown: -134479872))
FileMapInfo::decode_start_address (this=0x7ffff05d3e60, spc=0x7ffff05d4000, with_current_oop_encoding_mode=true)
FileMapInfo::start_address_as_decoded_with_current_oop_encoding_mode
FileMapInfo::get_heap_regions_range_with_current_oop_encoding_mode
FileMapInfo::map_heap_regions_impl
FileMapInfo::map_heap_regions
MetaspaceShared::map_archives
MetaspaceShared::initialize_runtime_shared_and_meta_spaces ()
Metaspace::global_initialize
universe_init ()
init_globals ()
Threads::create_vm

Note that the heap region is null (this=0x0). This also means that the returned oop is not actually a valid oop at all, and this can sort of be seen by the immediate cast to address.

-------------

PR: https://git.openjdk.java.net/jdk/pull/797

From rrich at openjdk.java.net Thu Oct 22 08:26:17 2020
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Thu, 22 Oct 2020 08:26:17 GMT
Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3]
In-Reply-To:
References:
Message-ID:

On Thu, 22 Oct 2020 08:04:48 GMT, Robbin Ehn wrote:

>> src/hotspot/share/prims/jvmtiEnv.cpp line 1663:
>>
>>> 1661: return JVMTI_ERROR_OUT_OF_MEMORY;
>>> 1662: }
>>> 1663: if (!eb.deoptimize_objects(1)) {
>>
>> Oh and why is the depth 1 here, when two frames are deoptimized? Maybe I missed something.
>
> @reinrich did I mess something up when merging this in?

Depth 1 means top frame and its caller. In UpdateForPopTopFrameClosure::doit() line 1606(?) the 2 top frames are deoptimized.
Reallocating objects while a frame pop request is processed does not work if reallocation fails therefore we use an EscapeBarrier to eagerly reallocate objects beforehand. ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From stefank at openjdk.java.net Thu Oct 22 08:31:13 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 22 Oct 2020 08:31:13 GMT Subject: Integrated: 8255047: Add HotSpot UseDebuggerErgo flags In-Reply-To: <_0YMtDNWAGXhg-_EyMf0c8GoNE4cf4wf1hpyr9j-sNs=.2ec5bf8b-023e-4bbd-9692-f74cd6c97f8d@github.com> References: <_0YMtDNWAGXhg-_EyMf0c8GoNE4cf4wf1hpyr9j-sNs=.2ec5bf8b-023e-4bbd-9692-f74cd6c97f8d@github.com> Message-ID: <5OxpZ63Vhv0CXOwsoLp-aaOmtn6ziBvwJrnn32vm68M=.9041cd8b-25c9-4d9b-9ca1-eec79ce2b0a9@github.com> On Tue, 20 Oct 2020 10:10:12 GMT, Stefan Karlsson wrote: > Some debuggers don't work well with many threads, and/or incompletely restricts the number of used CPUs to one. > > This flag is intended as a catch-all for HotSpot developers (not available in product builds) to allow us to more easily use those debuggers. > > Currently, the proposal is to let the flag fix a few things: > 1) Turn down the number of JVM threads > 2) Turn off NUMA > 3) Force processor_id() to return 0 instead of values above processor_count() > > (1) is purely ergonomics: gdb, rr, valgrind is faster and seems to work much better with fewer threads. The values would still be overridable by devs. > > (2) and (3) deals with the fact that some debuggers change the reported processor count, but don't change the processor ids returned by sched_getcpu. This causes problems for ZGC and NUMA, that both assumes that they can rely on os::processor_id() < os::processor_count(). > > The current proposed flag name is -XX:+LimitedCPUsDebugging. I'm not entirely happy with that name, but I been able to find a better name. > > An alternative to having one flag, is to split this into two flags, and maybe that would solve the naming problem. 
However, the usability aspects will be worse. > > If we can't find a suitable name, I rather introduce a flag called: > -XX:DebuggerWorkarounds or -XX:DebuggerWorkaround1 > > Any suggestions / opinions? I really do want to at least fix the (2, 3) problem, because I keep having to add this to every single branch I'm working on. This pull request has now been integrated. Changeset: ae72b528 Author: Stefan Karlsson URL: https://git.openjdk.java.net/jdk/commit/ae72b528 Stats: 51 lines in 4 files changed: 48 ins; 0 del; 3 mod 8255047: Add HotSpot UseDebuggerErgo flags Reviewed-by: dcubed, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/763 From rrich at openjdk.java.net Thu Oct 22 08:45:18 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Thu, 22 Oct 2020 08:45:18 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: References: Message-ID: <1Z_x_ChC5sDgWgl7FPcgZMwuxi2g3yNlbQPb5Ah93BY=.b07aeb23-cd2c-4628-bca8-1c9530520688@github.com> On Thu, 22 Oct 2020 08:14:47 GMT, Richard Reingruber wrote: >> @reinrich did I mess something up when merging this in? > > Stack frames are counted beginning at 0. The top frame has depth 0. So object deoptimization happens in the top frame. > > Still the used method is not optimal because it assumes that objects of frames within the given depth are accessed and their escape state is changed. But potentially caller methods optimized on the escape state therefore it searches for caller frames passing ArgEscape objects and deoptimizes these too. With ForceEarlyReturn no objects are accessed but it is so uncommon that I did not bother optimizing this. Should I? @robehn you haven't messed up. Hope I havn't either. 
I've tested

==============================
Test summary
==============================
   TEST                                              TOTAL  PASS  FAIL ERROR
   jtreg:test/hotspot/jtreg:hotspot_serviceability     197   197     0     0
   jtreg:test/jdk:jdk_svc                             1176  1176     0     0
   jtreg:test/jdk:jdk_jdi                              174   174     0     0
   jtreg:test/hotspot/jtreg:vmTestbase_nsk_jdi        1141  1141     0     0
   jtreg:test/hotspot/jtreg:vmTestbase_nsk_jvmti       648   648     0     0
   jtreg:test/hotspot/jtreg:vmTestbase_nsk_jdwp        113   113     0     0
==============================
TEST SUCCESS

jdk_jdi now includes jdk/com/sun/jdi/EATests.java which tests PopFrame/ForceEarlyReturn with object reallocation with and without reallocation failures.

-------------

PR: https://git.openjdk.java.net/jdk/pull/729

From rrich at openjdk.java.net Thu Oct 22 08:53:13 2020
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Thu, 22 Oct 2020 08:53:13 GMT
Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3]
In-Reply-To:
References:
Message-ID:

On Thu, 22 Oct 2020 08:23:38 GMT, Richard Reingruber wrote:

>> @reinrich did I mess something up when merging this in?
>
> Depth 1 means top frame and its caller. In UpdateForPopTopFrameClosure::doit() line 1606(?) the 2 top frames are deoptimized. Reallocating objects while a frame pop request is processed does not work if reallocation fails therefore we use an EscapeBarrier to eagerly reallocate objects beforehand.

@fisk for PopFrame the top frame needs to be deoptimized (if compiled) to be able to actually remove it when the thread is resumed. Its caller needs to be deoptimized to be able to restart the call. For ForceEarlyReturn it is not necessary to restart. The target can return to a compiled caller and continue executing compiled code. So the caller frame is not deoptimized.

@robehn nothing is messed up. Thanks again for doing it.
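
The depth convention explained above (depth 0 is the top frame only, depth 1 is the top frame plus its caller) can be sketched with a toy stack. This is an illustration under stated assumptions, not HotSpot code: `ToyFrame`, `deoptimize_up_to_depth` and the other names are hypothetical, and the model ignores the extra search for callers passing ArgEscape objects that was mentioned earlier in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy frame; NOT HotSpot's frame or vframe.
struct ToyFrame {
  bool compiled = false;
  bool deoptimized = false;
};

// frames[0] is the top frame. Marks compiled frames at depths 0..depth
// (inclusive) as deoptimized and returns how many frames were covered.
inline std::size_t deoptimize_up_to_depth(std::vector<ToyFrame>& frames,
                                          std::size_t depth) {
  std::size_t covered = 0;
  for (std::size_t d = 0; d <= depth && d < frames.size(); ++d) {
    if (frames[d].compiled) {
      frames[d].deoptimized = true;
    }
    ++covered;
  }
  return covered;
}

// Builds a stack of n compiled frames for the demo below.
inline std::vector<ToyFrame> make_compiled_stack(std::size_t n) {
  std::vector<ToyFrame> frames(n);
  for (ToyFrame& f : frames) {
    f.compiled = true;
  }
  return frames;
}

inline void demo_depths() {
  // ForceEarlyReturn-like case: depth 0 covers the top frame only.
  std::vector<ToyFrame> early_return = make_compiled_stack(3);
  assert(deoptimize_up_to_depth(early_return, 0) == 1);
  assert(early_return[0].deoptimized && !early_return[1].deoptimized);

  // PopFrame-like case: depth 1 covers the top frame and its caller.
  std::vector<ToyFrame> pop_frame = make_compiled_stack(3);
  assert(deoptimize_up_to_depth(pop_frame, 1) == 2);
  assert(pop_frame[1].deoptimized && !pop_frame[2].deoptimized);
}
```

Read this way, the two call sites questioned in the review are consistent: a depth argument of N covers N + 1 frames.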
------------- PR: https://git.openjdk.java.net/jdk/pull/729 From stefank at openjdk.java.net Thu Oct 22 08:57:13 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 22 Oct 2020 08:57:13 GMT Subject: RFR: 8237363 remove oop iterate verification In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 07:44:50 GMT, Aleksey Shipilev wrote: > This has two minor drawbacks for GC implementations that verify oops with their own asserts (like Shenandoah): they would call into `CollectedHeap::is_in` twice (once from shared code assert, and once from their own), and then also fail with non-rich assert (in the shared code) when something goes wrong. Of course, that can be mitigated by calling into `_raw` versions. Yes, those are the trade-offs. Do you consider this a blocker? I personally wouldn't mind completely removing this verification from the shared code, and let all the GCs do their own oop verification. But others have had the opposite opinion. ------------- PR: https://git.openjdk.java.net/jdk/pull/797 From shade at openjdk.java.net Thu Oct 22 09:06:16 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 22 Oct 2020 09:06:16 GMT Subject: RFR: 8237363: Remove automatic is in heap verification in OopIterateClosure In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 08:54:27 GMT, Stefan Karlsson wrote: > > This has two minor drawbacks for GC implementations that verify oops with their own asserts (like Shenandoah): they would call into `CollectedHeap::is_in` twice (once from shared code assert, and once from their own), and then also fail with non-rich assert (in the shared code) when something goes wrong. Of course, that can be mitigated by calling into `_raw` versions. > > Yes, those are the trade-offs. Do you consider this a blocker? Not really a blocker, just mentioning the follow-up work that might need to be done. 
On one hand, we almost never omit asserts on performance grounds, but on the other hand, this adds an assert on a rather frequent path. I don't mind having the extra safety there, and then letting callers (e.g. GCs) use `_raw` versions to make `fastdebug` testing a tad faster. >> src/hotspot/share/memory/filemap.cpp line 1740: >> >>> 1738: narrowOop n = CompressedOops::narrow_oop_cast(offset); >>> 1739: if (with_current_oop_encoding_mode) { >>> 1740: return cast_from_oop<address>(CompressedOops::decode_raw_not_null(n)); >> >> Why does this line skip verification now? > > This code is used to set up the CDS archive. At that point in time, the heap isn't mapped yet, and there's no heap region at the suggested offset, so the new assert fails. > HeapRegion::is_in_reserved (this=0x0, p=0x7bfe00000) > HeapRegion::is_in (this=0x0, p=0x7bfe00000) > G1CollectedHeap::is_in (this=0x7ffff003a100, p=0x7bfe00000) > CompressedOops::decode_not_null (v=(unknown: -134479872)) > FileMapInfo::decode_start_address (this=0x7ffff05d3e60, spc=0x7ffff05d4000, with_current_oop_encoding_mode=true) > FileMapInfo::start_address_as_decoded_with_current_oop_encoding_mode > FileMapInfo::get_heap_regions_range_with_current_oop_encoding_mode > FileMapInfo::map_heap_regions_impl > FileMapInfo::map_heap_regions > MetaspaceShared::map_archives > MetaspaceShared::initialize_runtime_shared_and_meta_spaces () > Metaspace::global_initialize > universe_init () > init_globals () > Threads::create_vm > > Note that the heap region is null (this=0x0). This also means that the returned oop is not actually a valid oop at all, and this can sort of be seen by the immediate cast to address. Ew. That seems to imply that coops decoding is now tied to heap initialization for these asserts to work. I am pleasantly surprised it only fails in one place! I guess it is fine.
------------- PR: https://git.openjdk.java.net/jdk/pull/797 From stefank at openjdk.java.net Thu Oct 22 09:06:16 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 22 Oct 2020 09:06:16 GMT Subject: RFR: 8237363: Remove automatic is in heap verification in OopIterateClosure In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 09:01:23 GMT, Aleksey Shipilev wrote: > > > This has two minor drawbacks for GC implementations that verify oops with their own asserts (like Shenandoah): they would call into `CollectedHeap::is_in` twice (once from shared code assert, and once from their own), and then also fail with non-rich assert (in the shared code) when something goes wrong. Of course, that can be mitigated by calling into `_raw` versions. > > > > > > Yes, those are the trade-offs. Do you consider this a blocker? > > Not really a blocker, just mentioning the follow-up work that might need to be done. > > On one hand, we almost never omit asserts on performance grounds, but on the other hand, this adds assert on a rather frequent path. I don't mind having an extra safety there, and then let callers (e.g. GCs) to use `_raw` versions to make `fastdebug` testing a tad faster. OK. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/797 From shade at openjdk.java.net Thu Oct 22 09:27:18 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 22 Oct 2020 09:27:18 GMT Subject: RFR: 8142984: Zero: fast accessors should handle both getters and setters Message-ID: <0skRfs7hB88JHFy53lVD0Fvt-JlF2HWGTC05qMgHidA=.346c67ce-af20-41ba-b4fa-a24e8ca6c0e2@github.com> It started as removing the TODO item in `abstractInterpreter.cpp`. Zero is the only implementation that treats `accessor` to mean `getter`, which makes the awkward choice in the entry selection. 
After going back and forth (including trying to remove the fast accessor methods altogether in [JDK-8255066](https://bugs.openjdk.java.net/browse/JDK-8255066)), I settled on implementing the fast Zero `setter`-s too, plus renaming and whipping the existing `getter` code in shape. The end result seems to be more straight-forward than it was before. On the plus side, it improves `make bootcycle-images` in release mode from ~47m40s to ~46m50s, because we are saving time doing the `normal_entry` for setters. The "normal", non-Zero template interpreter is not affected, because it does not have any specializations for `accessor`, `getter` or `setter`, and instead just doing the normal entry. Testing: - [x] Linux x86_64 {fastdebug, release} Zero `make bootcycle-images` - [x] Linux aarch64 {fastdebug, release} Zero `make bootcycle-images` - [x] Linux x86_64 Zero release jcstress - [x] Linux aarch64 Zero release jcstress ------------- Commit messages: - 8142984: Zero: fast accessors should handle both getters and setters Changes: https://git.openjdk.java.net/jdk/pull/728/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=728&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8142984 Stats: 202 lines in 7 files changed: 97 ins; 38 del; 67 mod Patch: https://git.openjdk.java.net/jdk/pull/728.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/728/head:pull/728 PR: https://git.openjdk.java.net/jdk/pull/728 From eosterlund at openjdk.java.net Thu Oct 22 10:07:18 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 22 Oct 2020 10:07:18 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: <1Z_x_ChC5sDgWgl7FPcgZMwuxi2g3yNlbQPb5Ah93BY=.b07aeb23-cd2c-4628-bca8-1c9530520688@github.com> References: <1Z_x_ChC5sDgWgl7FPcgZMwuxi2g3yNlbQPb5Ah93BY=.b07aeb23-cd2c-4628-bca8-1c9530520688@github.com> Message-ID: On Thu, 22 Oct 2020 08:42:40 GMT, Richard Reingruber wrote: >> Stack frames 
are counted beginning at 0. The top frame has depth 0. So object deoptimization happens in the top frame. >> >> Still the used method is not optimal because it assumes that objects of frames within the given depth are accessed and their escape state is changed. But potentially caller methods optimized on the escape state therefore it searches for caller frames passing ArgEscape objects and deoptimizes these too. With ForceEarlyReturn no objects are accessed but it is so uncommon that I did not bother optimizing this. Should I? > > @robehn you haven't messed up. Hope I havn't either. I've tested > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg:hotspot_serviceability 197 197 0 0 > jtreg:test/jdk:jdk_svc 1176 1176 0 0 > jtreg:test/jdk:jdk_jdi 174 174 0 0 > jtreg:test/hotspot/jtreg:vmTestbase_nsk_jdi 1141 1141 0 0 > jtreg:test/hotspot/jtreg:vmTestbase_nsk_jvmti 648 648 0 0 > jtreg:test/hotspot/jtreg:vmTestbase_nsk_jdwp 113 113 0 0 > ============================== > TEST SUCCESS > jdk_jdi now includes jdk/com/sun/jdi/EATests.java which tests PopFrame/ForceEarlyReturn with object reallocation with and without reallocation failures. Ah. I see now the loop uses <= instead of <. So my reasoning was right but off by 1. Passing in 0 really means deopt 1 frame. Which means everything is fine and working the way I expect it to. ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From eosterlund at openjdk.java.net Thu Oct 22 10:07:18 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 22 Oct 2020 10:07:18 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: References: Message-ID: On Wed, 21 Oct 2020 08:40:47 GMT, Robbin Ehn wrote: >> The main point of this change-set is to make it easier to implement S/R on top of handshakes. 
>> Which is a prerequisite for removing _suspend_flag (which duplicates the handshake functionality). >> But we also remove some complicated S/R methods. >> >> We basically just put in everything in the handshake closure, so the diff just looks much worse than what it is. >> >> TraceSuspendDebugBits have an ifdef, but in both cases it now just returns. >> But I was unsure if I should remove now or when is_ext_suspend_completed() is removed. >> >> Passes multiple t1-5 runs, locally it passes many jck:vm/nsk_jvmti/nsk_jdi/jdk-jdi runs. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: > > - Fixed merge miss > - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended > - Merge fix from Richard > - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended > - Removed TraceSuspendDebugBits > - Removed unused method is_ext_suspend_completed_with_lock > - Utilize handshakes instead of is_thread_fully_suspended Looks good. Awesome fix IMO. ------------- Marked as reviewed by eosterlund (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/729 From rehn at openjdk.java.net Thu Oct 22 10:17:26 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 22 Oct 2020 10:17:26 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v4] In-Reply-To: References: Message-ID: > The main point of this change-set is to make it easier to implement S/R on top of handshakes. > Which is a prerequisite for removing _suspend_flag (which duplicates the handshake functionality). > But we also remove some complicated S/R methods. > > We basically just put in everything in the handshake closure, so the diff just looks much worse than what it is. > > TraceSuspendDebugBits have an ifdef, but in both cases it now just returns. > But I was unsure if I should remove now or when is_ext_suspend_completed() is removed. 
> > Passes multiple t1-5 runs, locally it passes many jck:vm/nsk_jvmti/nsk_jdi/jdk-jdi runs. Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: - Merge - Review updates - Fixed merge miss - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended - Merge fix from Richard - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended - Removed TraceSuspendDebugBits - Removed unused method is_ext_suspend_completed_with_lock - Utilize handshakes instead of is_thread_fully_suspended ------------- Changes: https://git.openjdk.java.net/jdk/pull/729/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=729&range=03 Stats: 611 lines in 6 files changed: 174 ins; 376 del; 61 mod Patch: https://git.openjdk.java.net/jdk/pull/729.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/729/head:pull/729 PR: https://git.openjdk.java.net/jdk/pull/729 From rehn at openjdk.java.net Thu Oct 22 10:31:16 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 22 Oct 2020 10:31:16 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: References: <1Z_x_ChC5sDgWgl7FPcgZMwuxi2g3yNlbQPb5Ah93BY=.b07aeb23-cd2c-4628-bca8-1c9530520688@github.com> Message-ID: On Thu, 22 Oct 2020 10:04:01 GMT, Erik Österlund wrote: >> @robehn you haven't messed up. Hope I haven't either.
I've tested >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >> jtreg:test/hotspot/jtreg:hotspot_serviceability 197 197 0 0 >> jtreg:test/jdk:jdk_svc 1176 1176 0 0 >> jtreg:test/jdk:jdk_jdi 174 174 0 0 >> jtreg:test/hotspot/jtreg:vmTestbase_nsk_jdi 1141 1141 0 0 >> jtreg:test/hotspot/jtreg:vmTestbase_nsk_jvmti 648 648 0 0 >> jtreg:test/hotspot/jtreg:vmTestbase_nsk_jdwp 113 113 0 0 >> ============================== >> TEST SUCCESS >> jdk_jdi now includes jdk/com/sun/jdi/EATests.java which tests PopFrame/ForceEarlyReturn with object reallocation with and without reallocation failures. > > Ah. I see now the loop uses <= instead of <. So my reasoning was right but off by 1. Passing in 0 really means deopt 1 frame. Which means everything is fine and working the way I expect it to. Great ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From rehn at openjdk.java.net Thu Oct 22 10:31:17 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 22 Oct 2020 10:31:17 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 08:50:54 GMT, Richard Reingruber wrote: >> Depth 1 means top frame and its caller. In UpdateForPopTopFrameClosure::doit() line 1606(?) the 2 top frames are deoptimized. Reallocating objects while a frame pop request is processed does not work if reallocation fails therefore we use an EscapeBarrier to eagerly reallocate objects beforehand. > > @fisk for PopFrame the top frame needs to be deoptimized (if compiled) to be able to actually remove it when the thread is resumed. Its caller needs to be deoptimized to be able restart the call. For ForceEarlyReturn it is not necessary to restart. The target can return to a compiled caller and continue executing compiled code. So the caller frame is not deoptimized. > > @robehn nothing is messed up. Thanks again for doing it. Great! 
------------- PR: https://git.openjdk.java.net/jdk/pull/729 From redestad at openjdk.java.net Thu Oct 22 10:41:18 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Thu, 22 Oct 2020 10:41:18 GMT Subject: RFR: 8255231: Avoid upcalls when initializing the statSampler Message-ID: Current implementation of the statSampler does upcalls to System.getProperty to collect values for a number of properties that are all provided by the VM itself. And since the sampling starts before any user code run then no property can have changed. I suggest refactoring the code so that no upcalls are made normally - while asserting this invariant holds using assert-only upcalls. This is a small startup optimization - reducing the startup sequence by approx. 300k instructions and 70k branches in my linux-x64 setup. ------------- Commit messages: - Merge branch 'master' into com_ns - Improve comments - typo - Missing definition - Extract the shorthand java.version from VersionProps and use it in StatSampler - Improve assert - Assert on missing value - Re-arrange assertions to flow TRAPS through ok - Re-arrange assertions to flow TRAPS through ok - Remove TRAPS from assert-only method - ... 
and 6 more: https://git.openjdk.java.net/jdk/compare/cc50c8d4...7b922c2a Changes: https://git.openjdk.java.net/jdk/pull/802/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=802&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255231 Stats: 126 lines in 8 files changed: 38 ins; 39 del; 49 mod Patch: https://git.openjdk.java.net/jdk/pull/802.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/802/head:pull/802 PR: https://git.openjdk.java.net/jdk/pull/802 From njian at openjdk.java.net Thu Oct 22 10:44:12 2020 From: njian at openjdk.java.net (Ningsheng Jian) Date: Thu, 22 Oct 2020 10:44:12 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v9] In-Reply-To: References: Message-ID: <8Ryyxuf5P2D6WNyj4riYCTgN0U6WLrLpBmxhNbnmPpQ=.b2ed5660-99d0-49d1-83e0-8b2de518d7b8@github.com> On Wed, 21 Oct 2020 12:13:27 GMT, Jatin Bhateja wrote: >> Summary: >> >> 1) Partial in-lining technique avoids call overhead penalty for small array copy operations with size less than 32 bytes. >> 2) At runtime, a conditional check based on copy length either calls an array-copy stub or executes an optimized instruction sequence using AVX-512 masked instructions emitted at the call site. >> 3) New runtime flag ArrayCopyPartialInlineSize=0/32(default)/64 bytes determines the maximum size for partial in-lining. >> 4) Based on the perf results seen in benchmarks currently partial in-lining is performed only for arraycopy involving sub-word types (bool/byte/char/short). Once PR-61 gets integrated we can extend this patch to cover all the primitive types. 
>> >> Performance Results: >> System : CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz >> Micros : test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java >> ArrayCopyPartialInlineSize : 32 >> >> JMH | Block Size | Baseline (ns/op) | Partial Inling (ns/op) | Gain >> -- | -- | -- | -- | -- >> ArrayCopyAligned.testByte | 1 | 5.417 | 2.696 | 2.009272997 >> ArrayCopyAligned.testByte | 3 | 5.494 | 2.702 | 2.03330866 >> ArrayCopyAligned.testByte | 5 | 5.417 | 2.637 | 2.05422829 >> ArrayCopyAligned.testByte | 10 | 5.343 | 2.703 | 1.976692564 >> ArrayCopyAligned.testByte | 20 | 5.837 | 2.636 | 2.214339909 >> ArrayCopyAligned.testByte | 70 | 5.86 | 6 | 0.976666667 >> ArrayCopyAligned.testByte | 150 | 6.766 | 6.906 | 0.979727773 >> ArrayCopyAligned.testByte | 300 | 7.605 | 7.952 | 0.956363179 >> ArrayCopyAligned.testByte | 600 | 11.989 | 12.007 | 0.998500874 >> ArrayCopyAligned.testByte | 1200 | 16.447 | 16.585 | 0.991679228 >> ArrayCopyAligned.testChar | 1 | 5.02 | 2.828 | 1.775106082 >> ArrayCopyAligned.testChar | 3 | 5.129 | 2.762 | 1.85698769 >> ArrayCopyAligned.testChar | 5 | 5.041 | 2.762 | 1.82512672 >> ArrayCopyAligned.testChar | 10 | 5.716 | 2.762 | 2.069514844 >> ArrayCopyAligned.testChar | 20 | 5.111 | 5.399 | 0.946656788 >> ArrayCopyAligned.testChar | 70 | 6.271 | 6.242 | 1.004645947 >> ArrayCopyAligned.testChar | 150 | 7.45 | 7.599 | 0.980392157 >> ArrayCopyAligned.testChar | 300 | 9.904 | 10.112 | 0.97943038 >> ArrayCopyAligned.testChar | 600 | 17.131 | 17.167 | 0.997902953 >> ArrayCopyAligned.testChar | 1200 | 29.556 | 29.851 | 0.990117584 >> ArrayCopyUnalignedBoth.testByte | 1 | 5.419 | 2.702 | 2.005551443 >> ArrayCopyUnalignedBoth.testByte | 3 | 5.558 | 2.636 | 2.108497724 >> ArrayCopyUnalignedBoth.testByte | 5 | 5.43 | 2.636 | 2.059939302 >> ArrayCopyUnalignedBoth.testByte | 10 | 5.378 | 2.637 | 2.039438756 >> ArrayCopyUnalignedBoth.testByte | 20 | 5.914 | 2.636 | 2.243550836 >> ArrayCopyUnalignedBoth.testByte | 70 | 5.882 | 5.954 | 
0.987907289 >> ArrayCopyUnalignedBoth.testByte | 150 | 6.784 | 6.88 | 0.986046512 >> ArrayCopyUnalignedBoth.testByte | 300 | 7.635 | 7.968 | 0.958207831 >> ArrayCopyUnalignedBoth.testByte | 600 | 12.226 | 12.129 | 1.007997362 >> ArrayCopyUnalignedBoth.testByte | 1200 | 16.992 | 20.717 | 0.820195974 >> ArrayCopyUnalignedBoth.testChar | 1 | 5.019 | 2.828 | 1.774752475 >> ArrayCopyUnalignedBoth.testChar | 3 | 5.163 | 2.763 | 1.868621064 >> ArrayCopyUnalignedBoth.testChar | 5 | 5.042 | 2.827 | 1.783516095 >> ArrayCopyUnalignedBoth.testChar | 10 | 5.718 | 2.828 | 2.021923621 >> ArrayCopyUnalignedBoth.testChar | 20 | 5.111 | 5.404 | 0.945780903 >> ArrayCopyUnalignedBoth.testChar | 70 | 6.367 | 6.235 | 1.02117081 >> ArrayCopyUnalignedBoth.testChar | 150 | 7.367 | 8.269 | 0.890917886 >> ArrayCopyUnalignedBoth.testChar | 300 | 10.358 | 10.642 | 0.973313287 >> ArrayCopyUnalignedBoth.testChar | 600 | 20.84 | 17.522 | 1.189361945 >> ArrayCopyUnalignedBoth.testChar | 1200 | 31.895 | 31.892 | 1.000094067 >> ArrayCopyUnalignedDst.testByte | 1 | 5.455 | 2.637 | 2.068638604 >> ArrayCopyUnalignedDst.testByte | 3 | 5.562 | 2.702 | 2.058475204 >> ArrayCopyUnalignedDst.testByte | 5 | 5.427 | 2.702 | 2.008512213 >> ArrayCopyUnalignedDst.testByte | 10 | 5.367 | 2.696 | 1.990727003 >> ArrayCopyUnalignedDst.testByte | 20 | 5.839 | 2.637 | 2.214258627 >> ArrayCopyUnalignedDst.testByte | 70 | 5.888 | 5.968 | 0.986595174 >> ArrayCopyUnalignedDst.testByte | 150 | 6.785 | 6.773 | 1.001771741 >> ArrayCopyUnalignedDst.testByte | 300 | 7.606 | 7.972 | 0.954089313 >> ArrayCopyUnalignedDst.testByte | 600 | 11.986 | 21.195 | 0.565510734 >> ArrayCopyUnalignedDst.testByte | 1200 | 16.54 | 16.784 | 0.985462345 >> ArrayCopyUnalignedDst.testChar | 1 | 5.02 | 2.827 | 1.775733994 >> ArrayCopyUnalignedDst.testChar | 3 | 5.131 | 2.762 | 1.857711803 >> ArrayCopyUnalignedDst.testChar | 5 | 5.038 | 2.762 | 1.82404055 >> ArrayCopyUnalignedDst.testChar | 10 | 5.718 | 2.762 | 2.070238957 >> 
ArrayCopyUnalignedDst.testChar | 20 | 5.113 | 5.401 | 0.946676541 >> ArrayCopyUnalignedDst.testChar | 70 | 6.222 | 6.214 | 1.001287416 >> ArrayCopyUnalignedDst.testChar | 150 | 7.367 | 8.125 | 0.906707692 >> ArrayCopyUnalignedDst.testChar | 300 | 10.204 | 10.082 | 1.012100774 >> ArrayCopyUnalignedDst.testChar | 600 | 16.978 | 17.135 | 0.990837467 >> ArrayCopyUnalignedDst.testChar | 1200 | 32.351 | 31.996 | 1.011095137 >> ArrayCopyUnalignedSrc.testByte | 1 | 5.414 | 2.696 | 2.008160237 >> ArrayCopyUnalignedSrc.testByte | 3 | 5.494 | 2.637 | 2.083428138 >> ArrayCopyUnalignedSrc.testByte | 5 | 5.431 | 2.637 | 2.059537353 >> ArrayCopyUnalignedSrc.testByte | 10 | 5.344 | 2.703 | 1.977062523 >> ArrayCopyUnalignedSrc.testByte | 20 | 5.834 | 2.696 | 2.163946588 >> ArrayCopyUnalignedSrc.testByte | 70 | 5.883 | 6.009 | 0.979031453 >> ArrayCopyUnalignedSrc.testByte | 150 | 6.729 | 6.87 | 0.979475983 >> ArrayCopyUnalignedSrc.testByte | 300 | 7.603 | 7.97 | 0.953952321 >> ArrayCopyUnalignedSrc.testByte | 600 | 12.004 | 12.16 | 0.987171053 >> ArrayCopyUnalignedSrc.testByte | 1200 | 16.534 | 16.643 | 0.9934507 >> ArrayCopyUnalignedSrc.testChar | 1 | 5.021 | 2.762 | 1.81788559 >> ArrayCopyUnalignedSrc.testChar | 3 | 5.13 | 2.762 | 1.857349747 >> ArrayCopyUnalignedSrc.testChar | 5 | 5.042 | 2.827 | 1.783516095 >> ArrayCopyUnalignedSrc.testChar | 10 | 5.726 | 2.761 | 2.073886273 >> ArrayCopyUnalignedSrc.testChar | 20 | 5.112 | 5.401 | 0.94649139 >> ArrayCopyUnalignedSrc.testChar | 70 | 6.113 | 6.227 | 0.981692629 >> ArrayCopyUnalignedSrc.testChar | 150 | 7.493 | 7.888 | 0.949923935 >> ArrayCopyUnalignedSrc.testChar | 300 | 10.234 | 10.501 | 0.97457385 >> ArrayCopyUnalignedSrc.testChar | 600 | 17.175 | 17.142 | 1.001925096 >> ArrayCopyUnalignedSrc.testChar | 1200 | 31.926 | 31.987 | 0.998092975 >> >> Detailed Reports: >> Baseline : [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt) >> 
WithOpt : [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt) > > Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 Thanks for the impressive work Jatin! I believe it will also be helpful for our Arm SVE work. I just took a quick look and have some questions. src/hotspot/share/opto/vectornode.cpp line 775: > 773: VectorMaskGenNode* make(int opc, Node* src, const Type* ty, const Type* ety) { > 774: return new VectorMaskGenNode(src, ty, ety); > 775: } These are not used? src/hotspot/share/opto/vectornode.hpp line 835: > 833: static VectorMaskGenNode* make(int opc, Node* src, const Type* ty, const Type* ety); > 834: private: > 835: const Type* _elemType; Will an additional field in the node valid after some optimizations, i.e. clone()? I think I know the ety, but I don't know the usage of ty. If so, do you need to have a new type like what TypeVect does for mask? src/hotspot/share/opto/vectornode.hpp line 826: > 824: class VectorMaskGenNode : public TypeNode { > 825: public: > 826: VectorMaskGenNode(Node* src, const Type* ty, const Type* ety): TypeNode(ty, 2), _elemType(ety) { Sorry, I don't quite understand the arguments here. What does 'src' mean to the mask? 
------------- PR: https://git.openjdk.java.net/jdk/pull/302 From rehn at openjdk.java.net Thu Oct 22 10:47:20 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 22 Oct 2020 10:47:20 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: References: Message-ID: <4WNbKclJp0jLscBp5IV84_z1gRX8TWH1SSWPHnaR12Y=.bd000576-9225-4ea6-9561-091e41bf427e@github.com> On Thu, 22 Oct 2020 10:04:48 GMT, Erik Österlund wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: >> >> - Fixed merge miss >> - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended >> - Merge fix from Richard >> - Merge branch 'master' into 8223312-Utilize-handshakes-instead-of-is_thread_fully_suspended >> - Removed TraceSuspendDebugBits >> - Removed unused method is_ext_suspend_completed_with_lock >> - Utilize handshakes instead of is_thread_fully_suspended > > Looks good. Awesome fix IMO. Passes my local testing: open/test/jdk/com/sun/jdi/EATests.java, nsk_jvmti, nsk_jdi, jdk_jdi, jck:vm. Still running t1-t5 in test system. I will be integrating later today, so the ZGC/EscapeBarrier issue can be resolved (which is semi-dependent on this). Thanks @dholmes-ora, @dcubed-ojdk, @reinrich, @fisk ! ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From redestad at openjdk.java.net Thu Oct 22 11:28:22 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Thu, 22 Oct 2020 11:28:22 GMT Subject: RFR: 8255231: Avoid upcalls when initializing the statSampler [v2] In-Reply-To: References: Message-ID: <9yiGXHEPKlYTvGUxxwzhpswuKuCsG6n434AnGAFYqPQ=.d90219a0-b71f-4550-bd90-44fbfee70e20@github.com> > Current implementation of the statSampler does upcalls to System.getProperty to collect values for a number of properties that are all provided by the VM itself.
And since the sampling starts before any user code run then no property can have changed. > > I suggest refactoring the code so that no upcalls are made normally - while asserting this invariant holds using assert-only upcalls. > > This is a small startup optimization - reducing the startup sequence by approx. 300k instructions and 70k branches in my linux-x64 setup. Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: Revert unrelated changes to perfData ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/802/files - new: https://git.openjdk.java.net/jdk/pull/802/files/7b922c2a..5daedb01 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=802&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=802&range=00-01 Stats: 21 lines in 2 files changed: 17 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/802.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/802/head:pull/802 PR: https://git.openjdk.java.net/jdk/pull/802 From zgu at openjdk.java.net Thu Oct 22 12:23:15 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 22 Oct 2020 12:23:15 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v10] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> <5LY7Ty5Q2S8j9S30uXyeX0EE8AQWQ4BFFd02NYMzVio=.2e423faa-895a-42c9-ad9f-6c85844e1ff8@github.com> Message-ID: On Wed, 21 Oct 2020 20:07:11 GMT, Roman Kennke wrote: >> src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.hpp line 64: >> >>> 62: void load_reference_barrier_not_null(MacroAssembler* masm, Register dst, Address load_addr); >>> 63: void load_reference_barrier_native(MacroAssembler* masm, Register dst, Address load_addr, bool native); >>> 64: >> >> is_native parameter seems weird. Maybe invert to is_weak_ref? >> BTW, I think I am seeing compressed oops in conc-stack-scanning. 
> > I think it's clearer as it is. The motivation for this is that native references are always oops, while weak reference's referents can be oops or narrowOops. Which means that we need to call a different method for native-refs (the always-oops entry point). > The interesting differentiator is native vs. not-native, because one is always-oops, the other can be narrowOop. Weak vs not-weak is not as clear because there can also be weak/phantom native-refs. The weird part is the method name, load_reference_barrier_**native**(), then you have parameter says that it may not be a native load and the parameter does nothing to hint the possible type differential. How about may_narrow_oop? ------------- PR: https://git.openjdk.java.net/jdk/pull/505 From mcimadamore at openjdk.java.net Thu Oct 22 13:37:17 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 22 Oct 2020 13:37:17 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v7] In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 21:31:17 GMT, Paul Sandoz wrote: >> Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits: >> >> - Merge branch 'master' into 8254231_linker >> - Fix incorrect capitalization in one copyright header >> - Update copyright years, and add classpath exception to files that were missing it >> - Use separate constants for native invoker code size >> - Re-add file erroneously deleted (detected as rename) >> - Re-add erroneously removed files >> - Merge branch 'master' into 8254231_linker >> >> - Fix tests >> - Fix more whitespaces >> - Fix whitespaces >> - Remove rejected file >> - ... and 15 more: https://git.openjdk.java.net/jdk/compare/cb6167b2...502bd980 > > src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/CLinker.java line 126: > >> 124: * >> 125: * @param symbol downcall symbol. >> 126: * @param type the method type. 
> > s/method type/carrier type ? Not sure about this one? E.g. in my mental model, I often have seen "carrier type" associated with j.l.Class ? > src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/CLinker.java line 139: > >> 137: * >> 138: * <p>
The returned segment is not thread-confined, and it only features >> 139: * the {@link MemorySegment#CLOSE} access mode. When the returned segment is closed, > > Implying that it is shared? If so might be better to state that directly (with a link), and can be closed explicitly or left until can be collected by the GC? `The returned segment is not thread-confined` ? Since it features CLOSE, it can be closed explicitly - I'm not sure 100% of what additional clarification is required - but I'm happy to make this clearer (I need more info). ------------- PR: https://git.openjdk.java.net/jdk/pull/634 From phedlin at openjdk.java.net Thu Oct 22 14:18:11 2020 From: phedlin at openjdk.java.net (Patric Hedlin) Date: Thu, 22 Oct 2020 14:18:11 GMT Subject: RFR: 8248411: [aarch64] Insufficient error handling when CodeBuffer is exhausted In-Reply-To: References: <02kM0GEZd0Twb-mEJxf0diGjUzXV3hcpOfXclg8gt6A=.d0d3a1af-9641-4179-ba9f-18401c05e6a4@github.com> Message-ID: On Wed, 21 Oct 2020 12:56:26 GMT, Andrew Haley wrote: >> The AArch64 changes are ok. >> <rant> I am not at all keen on the many format-only changes that are included in this patch since they introduce a lot of changed lines for the sole and rather specious benefit of adherence to a questionable orthographic authority. That's especially so with the relatively unscryable (sic) changes in output.cpp that modify declarations of the form `Foo *foo` to `Foo* foo`. One is initially left wondering what has changed only, at penny-drop, to replace that feeling with equal wonder as to why it was worth bothering, especially as there remain many thousands more such editorial opportunities. Meanwhile the substantive signal that constitutes the real patch is lost amid this noise. Of course, you may continue tilting at this windmill if you really wish to.</rant> > >> The AArch64 changes are ok. 
>> I am not at all keen on the many format-only changes that are included in this patch since they introduce a lot of changed lines for the sole and rather specious benefit of adherence to a questionable orthographic authority. That's especially so with the relatively unscryable (sic) changes in output.cpp that modify declarations of the form `Foo *foo` to `Foo* foo`. One is initially left wondering what has changed only, at penny-drop, to replace that feeling with equal wonder as to why it was worth bothering, especially as there remain many thousands more such editorial opportunities. Meanwhile the substantive signal that constitutes the real patch is lost amid this noise. Of course, you may continue tilting at this windmill if you really wish to. > > I agree. I'm happy enough with code being tidied up as we go along, but not with a patch like this one, where it's not even clear that the result is an improvement. Also, it doesn't make sense to "tidy up" code that is nothing to do with the patch. > > Changing `Foo *foo` to `Foo* foo` is simply wrong. The * operator binds to the right, so something like `int* a, b` looks like a and b are pointers; they're not. That's why we write `int *a, b` : we should be writing for the reader. May I suggest the following to lessen the burden of mundane white-space edits. 
![image](https://user-images.githubusercontent.com/37185447/96883432-b3369400-1480-11eb-9bf9-2f1fd9ed9777.png) ------------- PR: https://git.openjdk.java.net/jdk/pull/765 From mcimadamore at openjdk.java.net Thu Oct 22 14:29:14 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 22 Oct 2020 14:29:14 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v9] In-Reply-To: References: Message-ID: <1p5weKaRsQH1q9GBnZFqKPupXb-5hYfeMm-NxaPiPUM=.db7147ad-67e0-4738-9fa7-d1afdabe3705@github.com> On Wed, 21 Oct 2020 16:23:16 GMT, Paul Sandoz wrote: >> Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 27 commits: >> >> - Merge branch 'master' into 8254231_linker >> - Don't use JNI when generating native wrappers >> >> - Merge branch 'master' into 8254231_linker >> - Fix incorrect capitalization in one copyright header >> - Update copyright years, and add classpath exception to files that were missing it >> - Use separate constants for native invoker code size >> - Re-add file erroneously deleted (detected as rename) >> - Re-add erroneously removed files >> - Merge branch 'master' into 8254231_linker >> >> - Fix tests >> - Fix more whitespaces >> - ... and 17 more: https://git.openjdk.java.net/jdk/compare/da97ab5c...8c7b75da > > src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/AbstractNativeScope.java line 120: > >> 118: } >> 119: } >> 120: throw new AssertionError("Cannot get here!"); > > This code is a little confusing, effectively using an exception for control flow, within a retry loop. I recommend performing an explicit bounds check to determine if a new segment of `BLOCK_SIZE` is required from which to slice into. It will also be faster than the exceptional case. 
I hear you - that said, note that doing the bound check is not as trivial as it seems; you have to take into account the value of `sp` and add required alignment padding, and then perform a bound check against that - all logic which would need to be duplicated across the normal and the exceptional cases. Which is why the code settled the way it is. I'll see what I can do. ------------- PR: https://git.openjdk.java.net/jdk/pull/634 From jvernee at openjdk.java.net Thu Oct 22 14:34:19 2020 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Thu, 22 Oct 2020 14:34:19 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v7] In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 21:08:26 GMT, Paul Sandoz wrote: >> Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits: >> >> - Merge branch 'master' into 8254231_linker >> - Fix incorrect capitalization in one copyright header >> - Update copyright years, and add classpath exception to files that were missing it >> - Use separate constants for native invoker code size >> - Re-add file erroneously deleted (detected as rename) >> - Re-add erroneously removed files >> - Merge branch 'master' into 8254231_linker >> >> - Fix tests >> - Fix more whitespaces >> - Fix whitespaces >> - Remove rejected file >> - ... and 15 more: https://git.openjdk.java.net/jdk/compare/cb6167b2...502bd980 > > src/java.base/share/classes/java/lang/invoke/NativeMethodHandle.java line 36: > >> 34: import static java.lang.invoke.MethodHandleStatics.newInternalError; >> 35: >> 36: /** TODO */ > > Is the TODO to make this class public later and adjust the return type of `downcallHandle`? IIRC this was added to silence a javac linter warning. Something should be added here. There is/was no plan to make this class public though. 
> src/java.base/share/classes/java/lang/invoke/NativeMethodHandle.java line 145: > >> 143: */ >> 144: private static class Lazy { >> 145: static Class THIS_CLASS = NativeMethodHandle.class; > > final field? Is this field needed, as `NativeMethodHandle.class` could be used directly, or use a local variable instead in the static code block. Yes, this was a leftover from old code. Can be a local var now, or remove altogether and replaced with `NativeMethodHandle.class` ------------- PR: https://git.openjdk.java.net/jdk/pull/634 From jvernee at openjdk.java.net Thu Oct 22 14:39:18 2020 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Thu, 22 Oct 2020 14:39:18 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v7] In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 21:53:55 GMT, Paul Sandoz wrote: >> Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits: >> >> - Merge branch 'master' into 8254231_linker >> - Fix incorrect capitalization in one copyright header >> - Update copyright years, and add classpath exception to files that were missing it >> - Use separate constants for native invoker code size >> - Re-add file erroneously deleted (detected as rename) >> - Re-add erroneously removed files >> - Merge branch 'master' into 8254231_linker >> >> - Fix tests >> - Fix more whitespaces >> - Fix whitespaces >> - Remove rejected file >> - ... and 15 more: https://git.openjdk.java.net/jdk/compare/cb6167b2...502bd980 > > src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/FunctionDescriptor.java line 60: > >> 58: private FunctionDescriptor(MemoryLayout resLayout, Map attributes, MemoryLayout... argLayouts) { >> 59: this.resLayout = resLayout; >> 60: this.attributes = Collections.unmodifiableMap(attributes); > > Since `attributes` is never exposed directly or indirectly via a set of keys/values/entries there is no need to wrap it. True. 
Though, it might be nice to keep like this as a bit of sanity checking. The map _should not_ be modified after construction. ------------- PR: https://git.openjdk.java.net/jdk/pull/634 From github.com+12972156+pmur at openjdk.java.net Thu Oct 22 14:59:21 2020 From: github.com+12972156+pmur at openjdk.java.net (Paul Murphy) Date: Thu, 22 Oct 2020 14:59:21 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v8] In-Reply-To: References: Message-ID: On Wed, 21 Oct 2020 20:43:30 GMT, CoreyAshford wrote: >> This patch set encompasses the following commits: >> >> - Adds a new HotSpot intrinsic candidate to the java.lang.Base64 class - decodeBlock(), and provides a flexible API for the intrinsic. The API is similar to the existing encodeBlock intrinsic. >> - Adds the code in HotSpot to check and martial the new intrinsic's arguments to the arch-specific intrinsic implementation >> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic. >> - Adds a JMH microbenchmark for both Base64 encoding and encoding. >> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding. > > CoreyAshford has updated the pull request incrementally with one additional commit since the last revision: > > TestBase64.java: remove jdk.test.lib.Utils from @build which was causing Tier3 failures. I took a look at the VSX algo. I haven't looked much beyond it. I had a few questions I've inlined. It does look like a faithful VSX implementation of the linked algo. src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3817: > 3815: __ xxperm(offsets->to_vsr(), offsetLUT, higher_nibble->to_vsr()); > 3816: > 3817: // Find out which elemets are the special case character (isURL ? 
'/' : '-') Trivial nit, s/elemets/elements/ src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3820: > 3818: __ vcmpequb_(eq_special_case_char, input, vec_special_case_char); > 3819: // > 3820: // There's a (63/64)^16 = 77.7% chance that there are no special I think that assumes uniformly randomized data; is that a good assumption? Is it measurably faster to skip around the xxsel instead of doing it unconditionally? src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3858: > 3856: > 3857: // The Base64 characters had no errors, so add the offsets > 3858: __ vaddubm(input, input, offsets); I think this looks like a correct implementation of the algo in VSX. src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3878: > 3876: // | Element | | | | | | | | | > 3877: // +===============+=============+======================+======================+=============+=============+======================+======================+=============+ > 3878: // | after vaddubm | 00||b0:0..5 | 00||b0:6..7||b1:0..3 | 00||b1:4..7||b2:0..1 | 00||b2:2..7 | 00||b3:0..5 | 00||b3:6..7||b4:0..3 | 00||b4:4..7||b5:0..1 | 00||b5:2..7 | An extra line here showing how the 8 6-bit values above get mapped into 6 bytes would greatly help my brain out. (likewise for the 3882: // | vec_0x3fs | 00111111 | 00111111 | 00111111 | 00111111 | 00111111 | 00111111 | 00111111 | 00111111 | > 3883: // +---------------+-------------+----------------------+----------------------+-------------+-------------+----------------------+----------------------+-------------+ > 3884: // | after vpextd | b5:0..7 | b4:0..7 | b3:0..7 | b2:0..7 | b1:0..7 | b0:0..7 | 00000000 | 00000000 | Are these comments correct or am I misunderstanding this? I read the final result as something starting as `b5:2..7 || b4:4..7|| b5:0..1` from vpextd.
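The two questions in this review comment can be checked with a small scalar sketch. This is not the VSX code under review: `no_special_char_probability` and `pack_6bit_values` are hypothetical names that do not exist in HotSpot. The first reproduces the 77.7% figure under the assumption of uniformly random input; the second shows how eight 6-bit values collapse into six bytes. Vector lane order and endianness are deliberately left out.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

// Probability that none of `lanes` independent, uniformly random Base64
// characters is the single special-case character (1 of 64 possibilities).
// The uniformity assumption is exactly what the review question probes.
inline double no_special_char_probability(int lanes) {
    return std::pow(63.0 / 64.0, lanes);
}

// Pack eight 6-bit values (each in 0..63) into six bytes, most significant
// bits first. Within each group of four values v0..v3:
//   byte0 = v0[5:0] v1[5:4]
//   byte1 = v1[3:0] v2[5:2]
//   byte2 = v2[1:0] v3[5:0]
inline std::array<std::uint8_t, 6>
pack_6bit_values(const std::array<std::uint8_t, 8>& v) {
    std::array<std::uint8_t, 6> out{};
    for (int group = 0; group < 2; ++group) {
        const std::uint8_t* in = &v[group * 4];
        for (int i = 0; i < 4; ++i) {
            assert(in[i] < 64 && "input must already be a decoded 6-bit value");
        }
        out[group * 3 + 0] = static_cast<std::uint8_t>((in[0] << 2) | (in[1] >> 4));
        out[group * 3 + 1] = static_cast<std::uint8_t>(((in[1] & 0x0F) << 4) | (in[2] >> 2));
        out[group * 3 + 2] = static_cast<std::uint8_t>(((in[2] & 0x03) << 6) | in[3]);
    }
    return out;
}
```

With the 6-bit values for the Base64 text `TWFu` (19, 22, 5, 46) repeated twice, the sketch yields the bytes of `ManMan`. Real-world input such as text with a skewed character distribution may hit the special-case character more or less often than the uniform estimate suggests.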
------------- PR: https://git.openjdk.java.net/jdk/pull/293 From rehn at openjdk.java.net Thu Oct 22 15:23:27 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 22 Oct 2020 15:23:27 GMT Subject: Integrated: 8223312: Utilize handshakes instead of is_thread_fully_suspended In-Reply-To: References: Message-ID: On Mon, 19 Oct 2020 09:59:34 GMT, Robbin Ehn wrote: > The main point of this change-set is to make it easier to implement S/R on top of handshakes. > Which is a prerequisite for removing _suspend_flag (which duplicates the handshake functionality). > But we also remove some complicated S/R methods. > > We basically just put in everything in the handshake closure, so the diff just looks much worse than what it is. > > TraceSuspendDebugBits have an ifdef, but in both cases it now just returns. > But I was unsure if I should remove now or when is_ext_suspend_completed() is removed. > > Passes multiple t1-5 runs, locally it passes many jck:vm/nsk_jvmti/nsk_jdi/jdk-jdi runs. This pull request has now been integrated. Changeset: 4634dbef Author: Robbin Ehn URL: https://git.openjdk.java.net/jdk/commit/4634dbef Stats: 611 lines in 6 files changed: 174 ins; 376 del; 61 mod 8223312: Utilize handshakes instead of is_thread_fully_suspended Reviewed-by: dholmes, rrich, dcubed, eosterlund ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From rehn at openjdk.java.net Thu Oct 22 15:23:26 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 22 Oct 2020 15:23:26 GMT Subject: RFR: 8223312: Utilize handshakes instead of is_thread_fully_suspended [v3] In-Reply-To: <4WNbKclJp0jLscBp5IV84_z1gRX8TWH1SSWPHnaR12Y=.bd000576-9225-4ea6-9561-091e41bf427e@github.com> References: <4WNbKclJp0jLscBp5IV84_z1gRX8TWH1SSWPHnaR12Y=.bd000576-9225-4ea6-9561-091e41bf427e@github.com> Message-ID: <9Vw23OOV_Gtf1aNVH3jcDccAS97sDwOGOZ7UO9D0m2s=.d79acae3-6cf8-4aab-af7d-a6c137a7809d@github.com> On Thu, 22 Oct 2020 10:44:21 GMT, Robbin Ehn wrote: >> Looks good. Awesome fix IMO. 
> > Passes my local testing: open/test/jdk/com/sun/jdi/EATests.java, nsk_jvmti, nsk_jdi, jdk_jdi, jck:vm. > Still running t1-t5 in test system. > > I will be integrating later today, so the ZGC/EscapeBarrier issue can be resolved (which is semi-dependent on this). > > Thanks @dholmes-ora, @dcubed-ojdk, @reinrich, @fisk ! T1-5 looked good. ------------- PR: https://git.openjdk.java.net/jdk/pull/729 From akozlov at openjdk.java.net Thu Oct 22 15:52:29 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 22 Oct 2020 15:52:29 GMT Subject: RFR: 8255254: Split os::reserve_memory and os::map_memory_to_file interfaces Message-ID: Hi, Please review a change to extract map_memory_to_file interface out of reserve_memory when the latter takes file descriptor. The change should be a pure refactoring without changes in functionality. The only part is disturbing: a comment in original os_posix.cpp:316 seems to refer to else clause and it contradicts to the actual code. ------------- Commit messages: - Split reserve and map interfaces Changes: https://git.openjdk.java.net/jdk/pull/812/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=812&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255254 Stats: 171 lines in 9 files changed: 88 ins; 56 del; 27 mod Patch: https://git.openjdk.java.net/jdk/pull/812.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/812/head:pull/812 PR: https://git.openjdk.java.net/jdk/pull/812 From eosterlund at openjdk.java.net Thu Oct 22 15:54:10 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 22 Oct 2020 15:54:10 GMT Subject: RFR: 8237363: Remove automatic is in heap verification in OopIterateClosure In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 07:30:16 GMT, Stefan Karlsson wrote: > There's verification code in the "oop iterate" framework that asserts that a pointer is "is in the heap". 
This works for most GCs, but ZGC *can* eagerly decommit the old relocation set pages, which means that pointers to the old / from copy of the object could point to memory that is currently not a part of the current heap. > > To combat this in the past I've added a way for oop iterate closures to turn off this verification. However, every single time we add a new closure we have to consider if we can allow this verification check or if we have to remove it. Personally, I think this is a false abstraction and also widens the oop iterate closure interface. I previously proposed a patch that moved the verification code down into the oop iterate closures. It wasn't a huge patch, but I got push-back that it was convenient for other GCs to get this automatic verification, and the review stalled. > > In this new patch I propose a different way to retain the verification. The realization is that most oop iterate closures have to deal with both compressed and non-compressed oops, so the code typically looks like this: > > template <class T> > inline void G1ScanCardClosure::do_oop_work(T* p) { > T o = RawAccess<>::oop_load(p); > if (CompressedOops::is_null(o)) { > return; > } > oop obj = CompressedOops::decode_not_null(o); > > Therefore the suggested new place to put the is_in verification is in the CompressedOops::decode*. This injects the assert into almost all non-ZGC closures, and also to places that don't use the oop iterate closure framework. I think this is a neat workaround, and hope this patch is accepted this time. > > I've tested this patch a few weeks ago, but will rerun the relevant tiers. Looks good. Great work! ------------- Marked as reviewed by eosterlund (Reviewer).
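The placement described in the quoted proposal, with the heap check living inside the decode step rather than in the iteration framework, can be sketched with a toy model. `ToyHeap` and `ToyCompressedOops` are hypothetical stand-ins, not HotSpot's classes, and the 3-bit shift is an assumed object alignment rather than something stated in the thread.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: a contiguous heap described by a base address and a size.
struct ToyHeap {
    std::uintptr_t base;
    std::uintptr_t size;
    bool is_in(std::uintptr_t addr) const {
        return addr >= base && addr - base < size;
    }
};

struct ToyCompressedOops {
    static constexpr unsigned shift = 3;  // assumed 8-byte object alignment

    static bool is_null(std::uint32_t narrow) { return narrow == 0; }

    // Every caller that decodes a narrow reference picks up the heap check,
    // so closures built on top of decode need no verification hook of their own.
    static std::uintptr_t decode_not_null(const ToyHeap& heap, std::uint32_t narrow) {
        assert(!is_null(narrow) && "narrow oop must not be null");
        std::uintptr_t addr = heap.base + (static_cast<std::uintptr_t>(narrow) << shift);
        assert(heap.is_in(addr) && "decoded oop must point into the heap");
        return addr;
    }
};
```

A collector whose closures may legitimately see addresses outside the current heap would simply not route those loads through this decode path, which is the property the proposal relies on.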
PR: https://git.openjdk.java.net/jdk/pull/797 From phedlin at openjdk.java.net Thu Oct 22 16:03:15 2020 From: phedlin at openjdk.java.net (Patric Hedlin) Date: Thu, 22 Oct 2020 16:03:15 GMT Subject: RFR: 8248411: [aarch64] Insufficient error handling when CodeBuffer is exhausted In-Reply-To: References: <02kM0GEZd0Twb-mEJxf0diGjUzXV3hcpOfXclg8gt6A=.d0d3a1af-9641-4179-ba9f-18401c05e6a4@github.com> Message-ID: <-y6v6jtsEw-RF7EQbiAbuLsMBmcSBMRFyqkmhpEVeUE=.acccbef8-f90c-4e34-af60-9ac1dc1ed73b@github.com> On Thu, 22 Oct 2020 14:15:29 GMT, Patric Hedlin wrote: >>> The AArch64 changes are ok. >>> I am not at all keen on the many format-only changes that are included in this patch since they introduce a lot of changed lines for the sole and rather specious benefit of adherence to a questionable orthographic authority. That's especially so with the relatively unscryable (sic) changes in output.cpp that modify declarations of the form `Foo *foo` to `Foo* foo`. One is initially left wondering what has changed only, at penny-drop, to replace that feeling with equal wonder as to why it was worth bothering, especially as there remain many thousands more such editorial opportunities. Meanwhile the substantive signal that constitutes the real patch is lost amid this noise. Of course, you may continue tilting at this windmill if you really wish to. >> >> I agree. I'm happy enough with code being tidied up as we go along, but not with a patch like this one, where it's not even clear that the result is an improvement. Also, it doesn't make sense to "tidy up" code that is nothing to do with the patch. >> >> Changing `Foo *foo` to `Foo* foo` is simply wrong. The * operator binds to the right, so something like `int* a, b` looks like a and b are pointers; they're not. That's why we write `int *a, b` : we should be writing for the reader. > > May I suggest the following to lessen the burden of mundane white-space edits. 
> > ![image](https://user-images.githubusercontent.com/37185447/96883432-b3369400-1480-11eb-9bf9-2f1fd9ed9777.png) So what about the `Foo* p` vs `Foo *p`? In short, I totally disagree. `Foo* p` supports the reading of `p` as Foo-pointer (or int-pointer, or void-pointer) which also emphasises the fact that the base type is vital to the operations available through/on `p`. This view is obscured by `Foo *p` (or `Foo p, *q`), suggesting "a pointer" is a more general concept, free of semantics imposed by its base type. For the same reason, I write `int* f()` not `int *f()`, not ever. And "we" don't write `int* p, q` nor do "we" write `int p, *q` since not only need pointers be initialised but more importantly, it's a truly bad idea to introduce different semantic entities in a single declaration. "We" write code to be read, and understood, by humans. May I also offer the following explanation to why the `*` (pointer-decl) operator binds to the right. Is it perhaps due to the fact that the original K&R C-syntax predates the first official standard by roughly 20 years, syntax picked-up from the B language (which only had one "pointer type", the address of an array, and no type-system), that we use the unary prefix "indirection operator"; `*v` ? In any case, I thank you both for reviewing the code (well, I guess 'aph' didn't actually review) even though it seems to upset the two of you, and despite the fact that I don't find the ranting particularly constructive. Thanks also to Vladimir. 
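For readers following the disagreement, the language rule both sides refer to can be checked at compile time. A minimal sketch, with variable names chosen only for illustration; it takes no position on which spelling is preferable.

```cpp
#include <type_traits>

// The '*' is part of the declarator, not of the type specifier:
// only `a` is a pointer here, `b` is a plain int.
int* a, b;
static_assert(std::is_same<decltype(a), int*>::value, "a is a pointer to int");
static_assert(std::is_same<decltype(b), int>::value, "b is a plain int");

// One declaration per statement avoids the ambiguity wherever the '*' is placed.
int* p = nullptr;
int q = 0;
```

Both spellings, `int* a, b` and `int *a, b`, compile to the same declarations; the argument in the thread is purely about which one reads better.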
------------- PR: https://git.openjdk.java.net/jdk/pull/765 From phedlin at openjdk.java.net Thu Oct 22 16:03:17 2020 From: phedlin at openjdk.java.net (Patric Hedlin) Date: Thu, 22 Oct 2020 16:03:17 GMT Subject: Integrated: 8248411: [aarch64] Insufficient error handling when CodeBuffer is exhausted In-Reply-To: References: Message-ID: On Tue, 20 Oct 2020 13:31:59 GMT, Patric Hedlin wrote: > Trampoline call generation (in the macro-assembler) may run out of CodeBuffer space without the proper error handling, resulting in asserts such as: > # Internal Error (.../open/src/hotspot/share/asm/codeBuffer.hpp:198), pid=845, tid=859 > # assert(allocates2(pc)) failed: relocation addr must be in this section > This update extends the error handling for such error cases to cover all uses of `trampoline_call()`, direct and indirect. Failure registration/recording is retained in the "**aarch64.ad**" file. This pull request has now been integrated. Changeset: f279ddfa Author: Patric Hedlin URL: https://git.openjdk.java.net/jdk/commit/f279ddfa Stats: 148 lines in 5 files changed: 83 ins; 7 del; 58 mod 8248411: [aarch64] Insufficient error handling when CodeBuffer is exhausted Reviewed-by: adinn ------------- PR: https://git.openjdk.java.net/jdk/pull/765 From rkennke at openjdk.java.net Thu Oct 22 16:04:25 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 22 Oct 2020 16:04:25 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). 
Workloads that make heavy use of such weak references will therefore potentially cause significant GC pauses. > > There are 3 main items that contribute to pause time linear in the number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through the subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list'. > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires extending the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. > 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by the Java reference handler thread).
In addition to that, we must ensure that no referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable. > > Testing: hotspot_gc_shenadoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Rename native argument to maybe_narrow_oop for more clarity ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/505/files - new: https://git.openjdk.java.net/jdk/pull/505/files/f2a9bb61..6418428d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=11 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=10-11 Stats: 8 lines in 4 files changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From rkennke at openjdk.java.net Thu Oct 22 16:04:26 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 22 Oct 2020 16:04:26 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v10] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> <5LY7Ty5Q2S8j9S30uXyeX0EE8AQWQ4BFFd02NYMzVio=.2e423faa-895a-42c9-ad9f-6c85844e1ff8@github.com> Message-ID: On Thu, 22 Oct 2020 12:20:24 GMT, Zhengyu Gu wrote: >> I think it's clearer as it is. The motivation for this is that native references are always oops, while weak reference's referents can be oops or narrowOops. Which means that we need to call a different method for native-refs (the always-oops entry point). 
>> The interesting differentiator is native vs. not-native, because one is always-oops, the other can be narrowOop. Weak vs not-weak is not as clear because there can also be weak/phantom native-refs. > > The weird part is the method name, load_reference_barrier_**native**(), then you have parameter says that it may not be a native load and the parameter does nothing to hint the possible type differential. How about may_narrow_oop? Ok I renamed the argument. It is indeed better to decouple the meanings there native <-> narrow/wide. ------------- PR: https://git.openjdk.java.net/jdk/pull/505 From psandoz at openjdk.java.net Thu Oct 22 16:17:20 2020 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Thu, 22 Oct 2020 16:17:20 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v7] In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 13:30:13 GMT, Maurizio Cimadamore wrote: >> src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/CLinker.java line 126: >> >>> 124: * >>> 125: * @param symbol downcall symbol. >>> 126: * @param type the method type. >> >> s/method type/carrier type ? > > Not sure about this one? E.g. in my mental model, I often have seen "carrier type" associated with j.l.Class ? Ah, i see, i find it confusing that "carrier type" is mentioned in the `@throws`, and was assuming it was an alias for method type, did you really mean method type? >> src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/CLinker.java line 139: >> >>> 137: * >>> 138: *

The returned segment is not thread-confined, and it only features >>> 139: * the {@link MemorySegment#CLOSE} access mode. When the returned segment is closed, >> >> Implying that it is shared? If so might be better to state that directly (with a link), and can be closed explicitly or left until can be collected by the GC? > > `The returned segment is not thread-confined` ? Since it features CLOSE, it can be closed explicitly - I'm not sure 100% of what additional clarification is required - but I'm happy to make this clearer (I need more info). Sometimes it's clearer to state the non-negative term i.e. _shared_ which is now something more explicit e.g. > The returned segment is _shared_ [add link?] (not thread-confined) That is really what i was trying to get at, rather than the CLOSE+GC aspects. ------------- PR: https://git.openjdk.java.net/jdk/pull/634 From psandoz at openjdk.java.net Thu Oct 22 16:17:19 2020 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Thu, 22 Oct 2020 16:17:19 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v7] In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 14:31:12 GMT, Jorn Vernee wrote: >> src/java.base/share/classes/java/lang/invoke/NativeMethodHandle.java line 36: >> >>> 34: import static java.lang.invoke.MethodHandleStatics.newInternalError; >>> 35: >>> 36: /** TODO */ >> >> Is the TODO to make this class public later and adjust the return type of `downcallHandle`? > > IIRC this was added to silence a javac linter warning. Something should be added here. There is/was no plan to make this class public though. It's odd the lint warning is triggering on a package private class and private methods. 
Separately, I recommend updating `make/CompileJavaModules.gmk` and adding `-Xdoclint:all/protected` for the module (i recently did this for the vector API see [here](https://github.com/openjdk/jdk/commit/000143504408ac7938e9f493c17c4dbb994045f9#diff-118e609b9974c0ce8af7950711461c7ab4620c9d4f4c99d231f598696f8e05d0) ------------- PR: https://git.openjdk.java.net/jdk/pull/634 From psandoz at openjdk.java.net Thu Oct 22 16:17:21 2020 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Thu, 22 Oct 2020 16:17:21 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v9] In-Reply-To: <1p5weKaRsQH1q9GBnZFqKPupXb-5hYfeMm-NxaPiPUM=.db7147ad-67e0-4738-9fa7-d1afdabe3705@github.com> References: <1p5weKaRsQH1q9GBnZFqKPupXb-5hYfeMm-NxaPiPUM=.db7147ad-67e0-4738-9fa7-d1afdabe3705@github.com> Message-ID: <4LF0LKSTD3ZkvV2kWKDqqeDAxh0mxMaB7eH2hakq7O4=.34d04357-f168-4ae2-b224-212122970e7d@github.com> On Thu, 22 Oct 2020 14:26:37 GMT, Maurizio Cimadamore wrote: >> src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/AbstractNativeScope.java line 120: >> >>> 118: } >>> 119: } >>> 120: throw new AssertionError("Cannot get here!"); >> >> This code is a little confusing, effectively using an exception for control flow, within a retry loop. I recommend performing an explicit bounds check to determine if a new segment of `BLOCK_SIZE` is required from which to slice into. It will also be faster than the exceptional case. > > I hear you - that said, note that doing the bound check is not as trivial as it seems; you have to take into account the value of `sp` and add required alignment padding, and then perform a bound check against that - all logic which would need to be duplicated across the normal and the exceptional cases. Which is why the code settled the way it is. I'll see what I can do. Yeah, i would have probably done the same thing initially. 
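The explicit bounds check discussed in this exchange, including the alignment padding that makes it less trivial than it first looks, can be sketched with a toy bump allocator. `ToyBlock` is hypothetical and is not the `AbstractNativeScope` implementation; plain offsets stand in for segment slices, and `capacity` plays the role of `BLOCK_SIZE`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bump allocator over a single fixed-size block.
struct ToyBlock {
    std::size_t capacity;  // size of the block
    std::size_t sp;        // current bump offset

    explicit ToyBlock(std::size_t cap) : capacity(cap), sp(0) {}

    // Round `offset` up to the next multiple of `alignment` (a power of two).
    static std::size_t align_up(std::size_t offset, std::size_t alignment) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        return (offset + alignment - 1) & ~(alignment - 1);
    }

    // Explicit bounds check: apply the padding first, then compare the padded
    // start plus the requested size against the block capacity.
    bool fits(std::size_t bytes, std::size_t alignment) const {
        std::size_t start = align_up(sp, alignment);
        return start <= capacity && bytes <= capacity - start;
    }

    // Returns the offset of the new slice; the caller must have checked fits().
    std::size_t allocate(std::size_t bytes, std::size_t alignment) {
        assert(fits(bytes, alignment));
        std::size_t start = align_up(sp, alignment);
        sp = start + bytes;
        return start;
    }
};
```

A caller would test `fits()` and switch to a fresh block when it returns false, instead of catching an out-of-bounds exception from the slicing operation. Keeping the padding computation in one helper is one way to avoid duplicating that logic between the normal and the new-block paths.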
------------- PR: https://git.openjdk.java.net/jdk/pull/634 From mcimadamore at openjdk.java.net Thu Oct 22 16:34:27 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 22 Oct 2020 16:34:27 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v10] In-Reply-To: References: Message-ID: > This patch contains the changes associated with the first incubation round of the foreign linker access API (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and associated pull request [3]). > > The main goal of this API is to provide a way to call native functions from Java code without the need of intermediate JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. I suggest reading the writeup [4] I put together a few weeks ago, which illustrates what the foreign linker support is, and how it should be used by clients. > > Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews. For this reason, I'm attaching a webrev which contains only the differences between this PR and the memory access PR. I will be periodically uploading new webrevs, as new iterations come out, to try and make the life of reviewers as simple as possible. > > A big thanks to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the hotspot changes you see here, and without their help, the foreign linker support wouldn't be what it is today. As usual, a big thanks to Paul Sandoz, who provided many insights (often by trying the bits first hand).
> > Thanks > Maurizio > > Webrev: > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev > > Javadoc: > > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html > > Specdiff (relative to [3]): > > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254232 > > > > ### API Changes > > The API changes are actually rather slim: > > * `LibraryLookup` > * This class allows clients to lookup symbols in native libraries; the interface is fairly simple; you can load a library by name, or absolute path, and then lookup symbols on that library. > * `FunctionDescriptor` > * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory layouts for the function arguments/return type. A function descriptor is used to describe the signature of a native function. > * `CLinker` > * This is the real star of the show. A `CLinker` has two main methods: `downcallHandle` and `upcallStub`; the first takes a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a `MethodHandle` instance which can be used to call the target native symbol. The second takes an existing method handle, and a `FunctionDescriptor` and returns a new `MemorySegment` corresponding to a code stub allocated by the VM which acts as a trampoline from native code to the user-provided method handle. This is very useful for implementing upcalls. > * This class also contains the various layout constants that should be used by clients when describing native signatures (e.g. `C_LONG` and friends); these layouts contain additional ABI classfication information (in the form of layout attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the native call to take place. > * Finally, this class provides some helper functions e.g. 
so that clients can convert Java strings into C strings and back. > * `NativeScope` > * This is an helper class which allows clients to group together logically related allocations; that is, rather than allocating separate memory segments using separate *try-with-resource* constructs, a `NativeScope` allows clients to use a _single_ block, and allocate all the required segments there. This is not only an usability boost, but also a performance boost, since not all allocation requests will be turned into `malloc` calls. > * `MemorySegment` > * Only one method added here - namely `handoff(NativeScope)` which allows a segment to be transferred onto an existing native scope. > > ### Safety > > The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle. For instance, the description of the native signature might be wrong (e.g. have too many arguments) - and the runtime has, in the general case, no way to detect such mismatches. For these reasons, obtaining a `CLinker` instance is a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as it's the case for other restricted method in the foreign memory API). > > ### Implementation changes > > The Java changes associated with `LibraryLookup` are relative straightforward; the only interesting thing to note here is that library loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same restrictions which apply to JNI library loading (e.g. same library cannot be loaded by different classloaders). > > As for `NativeScope` the changes are again relatively straightforward; it is an API which sits neatly on top of the foreign meory access API, providing some kind of allocation service which shares the same underlying memory segment(s), and turns an allocation request into a segment slice, which is a much less expensive operation. 
`NativeScope` comes in two variants: there are native scopes for which the allocation size is known a priori, and native scopes which can grow - these two schemes are implemented by two separate subclasses of `AbstractNativeScopeImpl`. > > Of course the bulk of the changes are to support the `CLinker` downcall/upcall routines. These changes cut pretty deep into the JVM; I'll briefly summarize the goal of some of this changes - for further details, Jorn has put together a detailed writeup which explains the rationale behind the VM support, with some references to the code [5]. > > The main idea behind foreign linker is to infer, given a Java method type (expressed as a `MethodType` instance) and the description of the signature of a native function (expressed as a `FunctionDescriptor` instance) a _recipe_ that can be used to turn a Java call into the corresponding native call targeting the requested native function. > > This inference scheme can be defined in a pretty straightforward fashion by looking at the various ABI specifications (for instance, see [6] for the SysV ABI, which is the one used on Linux/Mac). The various `CallArranger` classes, of which we have a flavor for each supported platform, do exactly that kind of inference. > > For the inference process to work, we need to attach extra information to memory layouts; it is no longer sufficient to know e.g. that a layout is 32/64 bits - we need to know whether it is meant to represent a floating point value, or an integral value; this knowledge is required because floating points are passed in different registers by most ABIs. For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which contain the required classification attributes (e.g. a `Clinker.TypeKind` enum value). The runtime extracts this attribute, and performs classification accordingly. 
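
To make the classification step described above more concrete, here is a small, hedged sketch of how scalar arguments could be assigned to storage under the SysV x86-64 calling convention (integer arguments in `rdi`, `rsi`, `rdx`, `rcx`, `r8`, `r9`; floating point arguments in `xmm0` to `xmm7`; the rest in 8-byte stack slots). The class `SysVClassifierSketch` and the enum `ArgClass` are hypothetical stand-ins, not the `CallArranger` or `CLinker.TypeKind` from the patch, and structs, varargs and return values are deliberately ignored.

```java
/** Hypothetical illustration of SysV x86-64 scalar argument assignment; NOT the patch's CallArranger. */
final class SysVClassifierSketch {
    /** Stand-in for the classification attribute carried by a layout. */
    enum ArgClass { INTEGER, FLOAT }

    // Argument registers defined by the SysV x86-64 ABI, in assignment order.
    static final String[] INT_REGS = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"};
    static final String[] FLOAT_REGS =
            {"xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7"};

    /** Returns, for each argument, the register name or a "stack+offset" slot. */
    static String[] assign(ArgClass... args) {
        String[] storage = new String[args.length];
        int nextInt = 0;
        int nextFloat = 0;
        long stackOffset = 0;
        for (int i = 0; i < args.length; i++) {
            if (args[i] == ArgClass.INTEGER && nextInt < INT_REGS.length) {
                storage[i] = INT_REGS[nextInt++];
            } else if (args[i] == ArgClass.FLOAT && nextFloat < FLOAT_REGS.length) {
                storage[i] = FLOAT_REGS[nextFloat++];
            } else {
                // Registers of the required class are exhausted: spill to the stack.
                storage[i] = "stack+" + stackOffset;
                stackOffset += 8;
            }
        }
        return storage;
    }

    public static void main(String[] args) {
        String[] storage = assign(ArgClass.INTEGER, ArgClass.FLOAT, ArgClass.INTEGER);
        System.out.println(String.join(", ", storage));
    }
}
```

For a signature such as `(long, double, long)` this yields `rdi`, `xmm0`, `rsi`, which shows why knowing only the size of a layout is not enough: a 64-bit integer and a 64-bit double end up in different register files.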
> 
> A native call is decomposed into a sequence of basic, primitive operations, called `Binding` (see the great javadoc on the `Binding.java` class for more info). There are many such bindings - for instance the `Move` binding is used to move a value into a specific machine register/stack slot. So, the main job of the various `CallArranger` classes is to determine, given a Java `MethodType` and `FunctionDescriptor`, what is the set of bindings associated with the downcall/upcall.
> 
> At the heart of the foreign linker support is the `ProgrammableInvoker` class. This class effectively generates a `MethodHandle` which follows the steps described by the various bindings obtained by `CallArranger`. There are actually various strategies to interpret these bindings - listed below:
> 
> * basic interpreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see `BindingInterpreter`), except for the `Move` bindings. For these bindings, the move is implemented by allocating a *buffer* (whose size is ABI specific) and by moving all the lowered values into positions within this buffer. The buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them in their expected registers/stack slots (note that each position in the buffer corresponds to a different register). This is the most general invocation mode, the most "customizable" one, but also the slowest - since for every call there is some extra allocation which takes place.
> 
> * specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based interpreter, we generate a method handle chain which effectively interprets all the bindings (again, except `Move` ones).
> 
> * intrinsified mode; this is typically used in combination with the specialized interpreted mode described above (although it can also be used with the Java-based binding interpreter). The goal here is to remove the buffer allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (e.g. the set of bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field), then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go for an intermediate buffer. This gives us back performance that is on par with JNI.
> 
> For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there. We plan to add support for intrinsified modes there as well, which should considerably boost performance (probably well beyond what JNI can offer at the moment, since the upcall support in JNI is not very well optimized).
> 
> Again, for further reading on the internals of the foreign linker support, please refer to [5].
> 
> #### Test changes
> 
> Many new tests have been added to validate the foreign linker support; we have high level tests (see `StdLibTest`) which aim at testing the linker from the perspective of code that clients could write. But we also have deeper combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI implementation. There are also some great tests (see the `callarranger` folder) which test the various `CallArranger`s for all the possible platforms; these tests adopt more of a white-box approach - that is, instead of treating the linker machinery as a black box and verifying that the support works by checking that the native call returned the results we expected, these tests aim at checking that the set of bindings generated by the call arranger is correct. This also means that we can test the classification logic for Windows, Mac and Linux regardless of the platform we're executing on.
> 
> Some additional microbenchmarks have been added to compare the performance of downcall/upcall with JNI.
> 
> [1] - https://openjdk.java.net/jeps/389
> [2] - https://openjdk.java.net/jeps/393
> [3] - https://git.openjdk.java.net/jdk/pull/548
> [4] - https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md
> [5] - http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html

Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 39 commits:

 - Merge pull request #4 from JornVernee/Missing_ResourceMarks

   Add missing resource marks before parsing descriptors
 - Add missing resource marks before parsing descriptors
 - Fix CLinker javadoc
 - Simplify AbstractNativeScope::allocate
 - Fix most review comments

   Fix Graal intrinsics test failure
 - And more copyright fixes
 - Fix more copyright headers
 - Fix copyright of AbstractNativeScope
 - Fix aarch issues
 - Remove spurious include
 - ... and 29 more: https://git.openjdk.java.net/jdk/compare/cc50c8d4...21f50872

-------------

Changes: https://git.openjdk.java.net/jdk/pull/634/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=09
  Stats: 75773 lines in 270 files changed: 72888 ins; 1608 del; 1277 mod
  Patch: https://git.openjdk.java.net/jdk/pull/634.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/634/head:pull/634

PR: https://git.openjdk.java.net/jdk/pull/634

From mcimadamore at openjdk.java.net  Thu Oct 22 16:34:28 2020
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Thu, 22 Oct 2020 16:34:28 GMT
Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v7]
In-Reply-To:
References:
Message-ID:

On Wed, 21 Oct 2020 19:05:42 GMT, Paul Sandoz wrote:

>> Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase.
>> The pull request now contains 25 commits:
>> 
>>  - Merge branch 'master' into 8254231_linker
>>  - Fix incorrect capitalization in one copyright header
>>  - Update copyright years, and add classpath exception to files that were missing it
>>  - Use separate constants for native invoker code size
>>  - Re-add file erroneously deleted (detected as rename)
>>  - Re-add erroneously removed files
>>  - Merge branch 'master' into 8254231_linker
>>  - Fix tests
>>  - Fix more whitespaces
>>  - Fix whitespaces
>>  - Remove rejected file
>>  - ... and 15 more: https://git.openjdk.java.net/jdk/compare/cb6167b2...502bd980
> 
> Some of this is familiar to me from reviews in the `panama-foreign` repository, but less so than the memory API. I focused on the Java code, and ignored changes that are common with the memory API PR.
> 
> If it helps, I can provide a PR in the `panama-foreign` repository addressing editorial comments to the linker API.

I've just uploaded a new iteration which addresses most of @PaulSandoz's comments on the API/Java impl.

-------------

PR: https://git.openjdk.java.net/jdk/pull/634

From mcimadamore at openjdk.java.net  Thu Oct 22 16:34:28 2020
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Thu, 22 Oct 2020 16:34:28 GMT
Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v7]
In-Reply-To:
References:
Message-ID: <903tobsjq6ao1evfv2CXzPE9jLfRWcVXrghBViPU4eo=.01d1f2d9-6c6e-456c-950b-632dd6619e1b@github.com>

On Thu, 22 Oct 2020 15:46:33 GMT, Paul Sandoz wrote:

>> IIRC this was added to silence a javac linter warning. Something should be added here. There is/was no plan to make this class public though.
> 
> It's odd the lint warning is triggering on a package private class and private methods. Separately, I recommend updating `make/CompileJavaModules.gmk` and adding `-Xdoclint:all/protected` for the module (I recently did this for the Vector API; see [here](https://github.com/openjdk/jdk/commit/000143504408ac7938e9f493c17c4dbb994045f9#diff-118e609b9974c0ce8af7950711461c7ab4620c9d4f4c99d231f598696f8e05d0)).

There's no lint warning for private methods; a fix is coming. As for makefile changes, I'd prefer to postpone them until after integration, or add them to the foreign memory access PR, since that's not specific to this PR.

-------------

PR: https://git.openjdk.java.net/jdk/pull/634

From mcimadamore at openjdk.java.net  Thu Oct 22 16:34:29 2020
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Thu, 22 Oct 2020 16:34:29 GMT
Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v9]
In-Reply-To:
References:
Message-ID: <729-YxJZTBRSEYbheYgTYKRo2FpSlyaPCrCSfI0i67Y=.5445d130-fa23-46f3-b8ab-86ba8fafeaee@github.com>

On Wed, 21 Oct 2020 17:42:53 GMT, Paul Sandoz wrote:

> Some design considerations, to consider later maybe.
> 
> The IR representation could be simplified to use record classes (which should be exiting preview in 16), implementing a Binding interface. The interpreter and specializer (compiler) could be separate if need be, operating on a sequence of instructions that just hold the data. Pattern matching could be used on the binding instances. It may be simpler and more efficient if the compiler generated explicit byte code rather than using MH combinators.

Good suggestions - I think this is good to consider post-integration.
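
As a rough illustration of the record-based design suggested above, the following sketch models bindings as data-only records and keeps the interpreter separate, using `instanceof` pattern matching. It assumes Java 16 or later. The names `BindingRecordsSketch`, `Dup`, `Move` and `interpret` are hypothetical and much simpler than the real `Binding` hierarchy in the patch; in particular, storages are plain strings here.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical record-based binding IR with a separate interpreter; NOT the patch's Binding classes. */
final class BindingRecordsSketch {
    /** Instructions only hold data; they do not know how they are executed. */
    interface Binding {}

    /** Duplicates the value on top of the operand stack. */
    record Dup() implements Binding {}

    /** Pops the top value and moves it into the named storage (register or stack slot). */
    record Move(String storage) implements Binding {}

    /** Interprets the bindings for a single non-null argument and returns storage -> value. */
    static Map<String, Object> interpret(List<Binding> bindings, Object argument) {
        Deque<Object> stack = new ArrayDeque<>();
        stack.push(argument);
        Map<String, Object> storages = new LinkedHashMap<>();
        for (Binding binding : bindings) {
            if (binding instanceof Dup) {
                stack.push(stack.peek());
            } else if (binding instanceof Move move) {
                storages.put(move.storage(), stack.pop());
            } else {
                throw new IllegalArgumentException("Unknown binding: " + binding);
            }
        }
        return storages;
    }

    public static void main(String[] args) {
        List<Binding> recipe = List.of(new Dup(), new Move("rdi"), new Move("xmm0"));
        System.out.println(interpret(recipe, 42L));
    }
}
```

Because the records carry no behavior, a second consumer (for example a specializer that emits bytecode or method handles) could walk the same `List<Binding>` without touching the interpreter; that separation is the point of the suggestion, not the particular instructions chosen here.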
-------------

PR: https://git.openjdk.java.net/jdk/pull/634

From aph at openjdk.java.net  Thu Oct 22 16:39:14 2020
From: aph at openjdk.java.net (Andrew Haley)
Date: Thu, 22 Oct 2020 16:39:14 GMT
Subject: RFR: 8248411: [aarch64] Insufficient error handling when CodeBuffer is exhausted
In-Reply-To: <-y6v6jtsEw-RF7EQbiAbuLsMBmcSBMRFyqkmhpEVeUE=.acccbef8-f90c-4e34-af60-9ac1dc1ed73b@github.com>
References: <02kM0GEZd0Twb-mEJxf0diGjUzXV3hcpOfXclg8gt6A=.d0d3a1af-9641-4179-ba9f-18401c05e6a4@github.com> <-y6v6jtsEw-RF7EQbiAbuLsMBmcSBMRFyqkmhpEVeUE=.acccbef8-f90c-4e34-af60-9ac1dc1ed73b@github.com>
Message-ID:

On Thu, 22 Oct 2020 15:54:58 GMT, Patric Hedlin wrote:

> In any case, I thank you both for reviewing the code (well, I guess 'aph' didn't actually review) even though it seems to upset the two of you, and despite the fact that I don't find the ranting particularly constructive.

We're not upset, at least I'm not, but I don't think we should change existing code based on preference. In other words, despite the fact that I have a strong preference, I don't go round changing instances of `int* p` to `int *p` when I see them, and I don't think you should either. HotSpot doesn't have a history of micro-managing code layout, and that's a good thing. One consequence of this is minor local variations.

Also, churn is bad, and in this case we have a bunch of edits to a file which has nothing to do with the bug being fixed. It's taken me a while to be sure, but I can find no changes to opto/output.cpp except layout. This complicates the diffs for the reviewer and makes future diff searches harder.

Sure, if layout is unusual to the point of being misleading, we should change it.

> May I also offer the following explanation to why the * (pointer-decl) operator binds to the right. Is it perhaps due to the fact that the original K&R C-syntax predates the first official standard by roughly 20 years?

The reason for the syntax is that Ritchie had the idea that "declaration follows use".

http://www.gotw.ca/publications/c_family_interview.htm

-------------

PR: https://git.openjdk.java.net/jdk/pull/765

From kvn at openjdk.java.net  Thu Oct 22 16:56:16 2020
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Thu, 22 Oct 2020 16:56:16 GMT
Subject: RFR: 8255208: CodeStrings passed to Disassembler::decode are ignored
In-Reply-To:
References:
Message-ID:

On Wed, 21 Oct 2020 23:50:51 GMT, Claes Redestad wrote:

> CodeStrings passed directly to Disassembler::decode are wrongly ignored.
> 
> This patch started out as a cleanup to clean out CodeStrings, but I realized as I was about to remove the suspiciously unused CodeStrings in the disassembler that them being unused was likely a bug.
> 
> For example, -XX:+PrintInterpreter (on debug builds) included messages which can help digesting the output (if nothing else by emitting text hooks to the place in the source code where the asm is generated):
> 
>     0x00007f1f7093becb: je 0x00007f1f7093bee5
>     ;; call_VM_base: heap base corrupted? <<< omitted
>     0x00007f1f7093bed1: mov $0x7f1f90c7ecb8,%rdi
>     0x00007f1f7093bedb: and $0xfffffffffffffff0,%rsp
>     0x00007f1f7093bedf: callq 0x00007f1f9046c0a0 = MacroAssembler::debug64(char*, long, long*)
> 
> While PrintInterpreter is the only case that appears directly affected, restoring this capability seems useful in general.
> 
> The cleaning up of the code also has some nice side-effects such as reducing the size of a CodeBuffer from 432 to 408 bytes and marginally improving the static size of the JVM (as measured on linux-x64)

very nice clean up

-------------

Marked as reviewed by kvn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/788

From mcimadamore at openjdk.java.net  Thu Oct 22 16:57:30 2020
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Thu, 22 Oct 2020 16:57:30 GMT
Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v11]
In-Reply-To:
References:
Message-ID:

Maurizio Cimadamore has updated the pull request incrementally with three additional commits since the last revision:

 - Merge pull request #2 from JornVernee/No_JNI_Comments_v2

   Address Coleen's review comments on "Don't use JNI when generating native wrappers"
 - Merge branch '8254231_linker' into No_JNI_Comments_v2
 - Review comments:
   - Use TempNewSymbol
   - JVM_* -> JNI_*
   - rename parseXX -> parse_x_x
   - remove ref to global function using '::'
   - make 2 functions static, instead of ForeignGlobals class members

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/634/files
  - new: https://git.openjdk.java.net/jdk/pull/634/files/21f50872..0c892293

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=10
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=09-10

  Stats: 64 lines in 12 files changed: 6 ins; 20 del; 38 mod
  Patch: https://git.openjdk.java.net/jdk/pull/634.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/634/head:pull/634

PR: https://git.openjdk.java.net/jdk/pull/634

From jvernee at openjdk.java.net  Thu Oct 22 16:57:31 2020
From: jvernee at openjdk.java.net (Jorn Vernee)
Date: Thu, 22 Oct 2020 16:57:31 GMT
Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v7]
In-Reply-To:
References:
Message-ID:

On Thu, 22 Oct 2020 16:31:00 GMT, Maurizio Cimadamore wrote:

>> Some of this is familiar to me from reviews in the `panama-foreign` repository, but less so than the memory API. I focused on the Java code, and ignored changes that are common with the memory API PR.
>> >> If it helps in can provide a PR in the `panama-foreign` repository addressing editorial comments to the linker API. > > I've just uploaded a new iteration which addresses most of @PaulSandoz comments on the API/Java impl. The most recent changes address Coleen's review comments on: #1 The changes I've made are: - Use TempNewSymbol when creating a symbol in field_offset & find_InstanceKlass - change JVM_* macros -> JNI_* - rename parseABIDescritpor & parseBufferLayout -> parse_abi_descriptor & parse_buffer_layout - remove ref to global function using '::', by merging the 2 functions together - make 2 functions static, instead of ForeignGlobals class members ------------- PR: https://git.openjdk.java.net/jdk/pull/634 From redestad at openjdk.java.net Thu Oct 22 17:03:12 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Thu, 22 Oct 2020 17:03:12 GMT Subject: RFR: 8255208: CodeStrings passed to Disassembler::decode are ignored In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 16:53:25 GMT, Vladimir Kozlov wrote: > very nice clean up Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/788 From zgu at openjdk.java.net Thu Oct 22 17:04:18 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 22 Oct 2020 17:04:18 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: On Thu, 22 Oct 2020 16:04:25 GMT, Roman Kennke wrote: >> Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). Workloads that make heavvy use of such weak references will therefore potentially cause significant GC pauses. 
>> >> There are 3 main items that contribute to pause time linear to number of references, or worse: >> - We need to scan and consider each reference on the various 'discovered' lists. >> - We need to mark through subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size. >> - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' >> >> The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. >> >> The solution to this is two-fold: >> 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires to extend the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. >> 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no referents become resurrected by accessing Reference.get() on it. 
In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable. >> >> Testing: hotspot_gc_shenandoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Rename native argument to maybe_narrow_oop for more clarity Marked as reviewed by zgu (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/505 From mcimadamore at openjdk.java.net Thu Oct 22 17:04:34 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 22 Oct 2020 17:04:34 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v12] In-Reply-To: References: Message-ID: <2IKx6cpc-IGP3jZtr0s2I14BWM6ptyFD26szPl3b1ng=.9d956b98-dfe6-4a45-a371-bf86923214fb@github.com> > This patch contains the changes associated with the first incubation round of the foreign linker access API incubation > (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and associated pull request [3]). > > The main goal of this API is to provide a way to call native functions from Java code without the need of intermediate JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. I suggest reading the writeup [4] I put together a few weeks ago, which illustrates what the foreign linker support is, and how it should be used by clients. > > Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews. For this reason, I'm attaching a webrev which contains only the differences between this PR and the memory access PR.
I will be periodically uploading new webrevs, as new iterations come out, to try and make the life of reviewers as simple as possible. > > A big thanks to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the hotspot changes you see here, and without their help, the foreign linker support wouldn't be what it is today. As usual, a big thanks to Paul Sandoz, who provided many insights (often by trying the bits first hand). > > Thanks > Maurizio > > Webrev: > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev > > Javadoc: > > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html > > Specdiff (relative to [3]): > > http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254232 > > > > ### API Changes > > The API changes are actually rather slim: > > * `LibraryLookup` > * This class allows clients to look up symbols in native libraries; the interface is fairly simple; you can load a library by name, or absolute path, and then look up symbols on that library. > * `FunctionDescriptor` > * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory layouts for the function arguments/return type. A function descriptor is used to describe the signature of a native function. > * `CLinker` > * This is the real star of the show. A `CLinker` has two main methods: `downcallHandle` and `upcallStub`; the first takes a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a `MethodHandle` instance which can be used to call the target native symbol. The second takes an existing method handle, and a `FunctionDescriptor` and returns a new `MemorySegment` corresponding to a code stub allocated by the VM which acts as a trampoline from native code to the user-provided method handle. This is very useful for implementing upcalls.
> * This class also contains the various layout constants that should be used by clients when describing native signatures (e.g. `C_LONG` and friends); these layouts contain additional ABI classification information (in the form of layout attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the native call to take place. > * Finally, this class provides some helper functions e.g. so that clients can convert Java strings into C strings and back. > * `NativeScope` > * This is a helper class which allows clients to group together logically related allocations; that is, rather than allocating separate memory segments using separate *try-with-resource* constructs, a `NativeScope` allows clients to use a _single_ block, and allocate all the required segments there. This is not only a usability boost, but also a performance boost, since not all allocation requests will be turned into `malloc` calls. > * `MemorySegment` > * Only one method added here - namely `handoff(NativeScope)` which allows a segment to be transferred onto an existing native scope. > > ### Safety > > The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle. For instance, the description of the native signature might be wrong (e.g. have too many arguments) - and the runtime has, in the general case, no way to detect such mismatches. For these reasons, obtaining a `CLinker` instance is a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as is the case for other restricted methods in the foreign memory API). > > ### Implementation changes > > The Java changes associated with `LibraryLookup` are relatively straightforward; the only interesting thing to note here is that library loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same restrictions which apply to JNI library loading (e.g.
same library cannot be loaded by different classloaders). > > As for `NativeScope` the changes are again relatively straightforward; it is an API which sits neatly on top of the foreign memory access API, providing some kind of allocation service which shares the same underlying memory segment(s), and turns an allocation request into a segment slice, which is a much less expensive operation. `NativeScope` comes in two variants: there are native scopes for which the allocation size is known a priori, and native scopes which can grow - these two schemes are implemented by two separate subclasses of `AbstractNativeScopeImpl`. > > Of course the bulk of the changes are to support the `CLinker` downcall/upcall routines. These changes cut pretty deep into the JVM; I'll briefly summarize the goal of some of these changes - for further details, Jorn has put together a detailed writeup which explains the rationale behind the VM support, with some references to the code [5]. > > The main idea behind foreign linker is to infer, given a Java method type (expressed as a `MethodType` instance) and the description of the signature of a native function (expressed as a `FunctionDescriptor` instance) a _recipe_ that can be used to turn a Java call into the corresponding native call targeting the requested native function. > > This inference scheme can be defined in a pretty straightforward fashion by looking at the various ABI specifications (for instance, see [6] for the SysV ABI, which is the one used on Linux/Mac). The various `CallArranger` classes, of which we have a flavor for each supported platform, do exactly that kind of inference. > > For the inference process to work, we need to attach extra information to memory layouts; it is no longer sufficient to know e.g.
that a layout is 32/64 bits - we need to know whether it is meant to represent a floating point value, or an integral value; this knowledge is required because floating points are passed in different registers by most ABIs. For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which contain the required classification attributes (e.g. a `CLinker.TypeKind` enum value). The runtime extracts this attribute, and performs classification accordingly. > > A native call is decomposed into a sequence of basic, primitive operations, called `Binding` (see the great javadoc on the `Binding.java` class for more info). There are many such bindings - for instance the `Move` binding is used to move a value into a specific machine register/stack slot. So, the main job of the various `CallArranger` classes is to determine, given a Java `MethodType` and `FunctionDescriptor` what is the set of bindings associated with the downcall/upcall. > > At the heart of the foreign linker support is the `ProgrammableInvoker` class. This class effectively generates a `MethodHandle` which follows the steps described by the various bindings obtained by `CallArranger`. There are actually various strategies to interpret these bindings - listed below: > > * basic interpreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see `BindingInterpreter`), except for the `Move` bindings. For these bindings, the move is implemented by allocating a *buffer* (whose size is ABI specific) and by moving all the lowered values into positions within this buffer. The buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them in their expected registers/stack slots (note that each position in the buffer corresponds to a different register).
This is the most general invocation mode, the most "customizable" one, but also the slowest - since for every call there is some extra allocation which takes place. > > * specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based interpreter, we generate a method handle chain which effectively interprets all the bindings (again, except `Move` ones). > > * intrinsified mode; this is typically used in combination with the specialized interpreted mode described above (although it can also be used with the Java-based binding interpreter). The goal here is to remove the buffer allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (e.g. the set of bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field), then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go for an intermediate buffer. This gives us back performance that is on par with JNI. > > For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there. We plan to add support for intrinsified modes there as well, which should considerably boost performance (probably well beyond what JNI can offer at the moment, since the upcall support in JNI is not very well optimized). > > Again, for more readings on the internals of the foreign linker support, please refer to [5]. > > #### Test changes > > Many new tests have been added to validate the foreign linker support; we have high level tests (see `StdLibTest`) which aim at testing the linker from the perspective of code that clients could write. But we also have deeper combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI implementation.
There are also some great tests (see the `callarranger` folder) which test the various `CallArranger`s for all the possible platforms; these tests adopt more of a white-box approach - that is, instead of treating the linker machinery as a black box and verifying that the support works by checking that the native call returned the results we expected, these tests aim at checking that the set of bindings generated by the call arranger is correct. This also means that we can test the classification logic for Windows, Mac and Linux regardless of the platform we're executing on. > > Some additional microbenchmarks have been added to compare the performance of downcall/upcall with JNI. > > [1] - https://openjdk.java.net/jeps/389 > [2] - https://openjdk.java.net/jeps/393 > [3] - https://git.openjdk.java.net/jdk/pull/548 > [4] - https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md > [5] - http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Fix whitespaces ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/634/files - new: https://git.openjdk.java.net/jdk/pull/634/files/0c892293..2c2d2a70 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=11 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=10-11 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/634.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/634/head:pull/634 PR: https://git.openjdk.java.net/jdk/pull/634 From gziemski at openjdk.java.net Thu Oct 22 17:31:18 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Thu, 22 Oct 2020 17:31:18 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) Message-ID: hi all, Please review this fix for POSIX platforms, which addresses a time out
that occurs while handling a crash with UseOSErrorReporting ON. The timeout was caused by the crash handling code looping infinitely because the code incorrectly assumed that the signal handlers were reset to their defaults, while what really was happening was that the code was resetting the signal handlers to our default signal handlers. To avoid similar confusion in the future I did the following: - renamed the VMError::reset_signal_handlers() to VMError::rearm_signal_handlers() - introduced a new API VMError::clear_signal_handlers() which is implemented in PosixSignals PosixSignals::clear_signal_handlers() is where the actual fix is done and it simply resets all signal handlers to their system defaults. A similar problem occurs on Windows, with the only difference being that before a process times out (takes 2 minutes) it runs out of stack space in about 250 loops, so that's the only reason it doesn't linger for that long. Windows issue is tracked separately by https://bugs.openjdk.java.net/browse/JDK-8250782 ------------- Commit messages: - reset signal handlers to their system defaults if handling crash with UseOSErrorReporting Changes: https://git.openjdk.java.net/jdk/pull/813/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=813&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8250637 Stats: 39 lines in 6 files changed: 30 ins; 0 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/813.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/813/head:pull/813 PR: https://git.openjdk.java.net/jdk/pull/813 From mdoerr at openjdk.java.net Thu Oct 22 18:13:23 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Thu, 22 Oct 2020 18:13:23 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v8] In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 14:01:05 GMT, Paul Murphy wrote: >> CoreyAshford has updated the pull request incrementally with one additional commit since the last revision: >> >>
TestBase64.java: remove jdk.test.lib.Utils from @build which was causing Tier3 failures. > > src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3820: > >> 3818: __ vcmpequb_(eq_special_case_char, input, vec_special_case_char); >> 3819: // >> 3820: // There's a (63/64)^16 = 77.7% chance that there are no special > > I think that assumes uniformly randomized data, is that a good assumption? Is it measurably faster to skip around the xxsel instead of doing it unconditionally? Thanks for this question. I also stumbled over it when reviewing. I guess a branch which gets mispredicted in ~22% of the cases leads to a big performance loss. (In addition, the branch target is not aligned.) ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From github.com+51754783+coreyashford at openjdk.java.net Thu Oct 22 19:51:20 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Thu, 22 Oct 2020 19:51:20 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v8] In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 14:00:49 GMT, Paul Murphy wrote: >> CoreyAshford has updated the pull request incrementally with one additional commit since the last revision: >> >> TestBase64.java: remove jdk.test.lib.Utils from @build which was causing Tier3 failures. > > src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3817: > >> 3815: __ xxperm(offsets->to_vsr(), offsetLUT, higher_nibble->to_vsr()); >> 3816: >> 3817: // Find out which elemets are the special case character (isURL ? '/' : '-') > > Trivial nit, s/elemets/elements/ Will fix. 
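The (63/64)^16 = 77.7% figure discussed in this review thread can be checked with a few lines of C++. This is only a sketch of the arithmetic under the same assumption the reviewers question, namely that each of the 16 lanes holds an independent, uniformly distributed 6-bit symbol; the function names are made up for illustration and are not part of the stub generator.

```cpp
#include <cassert>
#include <cmath>

// Probability that none of `lanes` independent, uniformly distributed
// 6-bit symbols equals one particular special-case value (1 out of 64).
// Hypothetical helper for illustration only.
double prob_no_special(int lanes) {
    assert(lanes >= 0);
    return std::pow(63.0 / 64.0, lanes);
}

// Complement: probability that at least one lane holds the special value,
// i.e. how often a "skip the xxsel" branch would not be taken.
double prob_at_least_one_special(int lanes) {
    return 1.0 - prob_no_special(lanes);
}
```

For 16 lanes, `prob_no_special(16)` is about 0.777 and `prob_at_least_one_special(16)` about 0.223, which matches the 77.7% in the code comment and the "~22%" quoted in the review. Real Base64 input is not guaranteed to be uniform, so these numbers are only a baseline for the benchmark mentioned in the thread.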
------------- PR: https://git.openjdk.java.net/jdk/pull/293 From redestad at openjdk.java.net Thu Oct 22 20:10:14 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Thu, 22 Oct 2020 20:10:14 GMT Subject: RFR: 8255271: Avoid generating duplicate interpreter entries for subword types Message-ID: Top-of-stack optimizations are generally missing for byte, char, short and boolean, and for several types of entry points we already avoid generating entries and instead redirect to the int variant. Consolidate the code for this and examine whether it can be extended to more or all types of entries. ------------- Commit messages: - Revert accidental changes in templateInterpreterGenerator_x86 - Introduce EntryPoint(ailfdv) and simplify - Merge branch 'master' into interpreter_init_opts - typos in assert - Gardening - Add missing increment - Coalesce more entry points - Allow legitimate uses of pop/push(b-stos) - Avoid generating TOS entries for subword types - Inline locator_address Changes: https://git.openjdk.java.net/jdk/pull/816/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=816&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255271 Stats: 75 lines in 5 files changed: 32 ins; 27 del; 16 mod Patch: https://git.openjdk.java.net/jdk/pull/816.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/816/head:pull/816 PR: https://git.openjdk.java.net/jdk/pull/816 From stuefe at openjdk.java.net Thu Oct 22 20:16:15 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 22 Oct 2020 20:16:15 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 16:40:43 GMT, Gerard Ziemski wrote: > hi all, > > Please review this fix for POSIX platforms, which addresses a time out that occurs while handling a crash with UseOSErrorReporting ON > > The timeout was caused by the crash handling code, looping infinitely, because it incorrectly assumed that the signal handlers were
reset to their defaults, while what really was happening was that the code was resetting the signal handlers to our default signal handler. > > To avoid similar confusion in the future I did the following: > > - renamed the VMError::reset_signal_handlers() to VMError:: rearm_signal_handlers() > - introduced a new API VMError::clear_signal_handlers() which is implemented in PosixSignals > > PosixSignals::clear_signal_handlers() is where the actual fix is done and it simply resets all signal handlers to their system defaults. > > A similar problem occurs on Windows, with the only difference being that before a process times out (takes 2 minutes) it runs out of stack space in about 250 loops, so that's the only reason it doesn't linger for that long. Windows issue is tracked separately by https://bugs.openjdk.java.net/browse/JDK-8250782 Hi Gerard, I have general concerns about the usefulness of this switch, see the comments in the JBS issue. Beyond that, some remarks below. Cheers, Thomas src/hotspot/os/posix/vmError_posix.cpp line 141: > 139: } > 140: > 141: void VMError::rearm_signal_handlers() { I think "re-arm" is a misnomer. Nothing is re-armed if by "armed" you mean "a signal handler was installed". There was one installed before and there gets a different one installed now. I agree that "reset_signal_handlers" is just plain wrong, and has bugged me too. May I suggest "install_secondary_signal_handlers()" ? Or possibly "install_secondary_crash_handlers" which would have the added benefit of removing the term "signal" out of the shared name space, which is confusing on windows. src/hotspot/os/posix/signals_posix.cpp line 1425: > 1423: for (int i = 0; i < NSIG + 1; i++) { > 1424: sigaction(i, &defaulthandler, NULL); > 1425: } Some issues: - 0 is not a valid signal number - NSIG is obscure; may be ridiculously large (e.g. AIX I believe 1024) and still may not contain all signals for some platforms (e.g. real time signals on BSD (?) 
are not included) - You now flatten here all signal handlers across the board. What if some third party app had installed their own. I would only reset those which had been originally installed for hotspot usage (beware ReduceSignalUsage). Which brings me to the point that if we fail to re-raise the original error signal, now we run with signal handlers completely disabled :( ... see my original concerns. ------------- Changes requested by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/813 From stuefe at openjdk.java.net Thu Oct 22 20:33:19 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 22 Oct 2020 20:33:19 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 20:12:55 GMT, Thomas Stuefe wrote: >> hi all, >> >> Please review this fix for POSIX platforms, which addresses a time out that occurs while handling a crash with UseOSErrorReporting ON >> >> The timeout was caused by the crash handling code, looping infinitely, because it incorrectly assumed that the signal handlers were reset to their defaults, while what really was happening was that the code was resetting the signal handlers to our default signal handler. >> >> To avoid similar confusion in the future I did the following: >> >> - renamed the VMError::reset_signal_handlers() to VMError::rearm_signal_handlers() >> - introduced a new API VMError::clear_signal_handlers() which is implemented in PosixSignals >> >> PosixSignals::clear_signal_handlers() is where the actual fix is done and it simply resets all signal handlers to their system defaults. >> >> A similar problem occurs on Windows, with the only difference being that before a process times out (takes 2 minutes) it runs out of stack space in about 250 loops, so that's the only reason it doesn't linger for that long.
Windows issue is tracked separately by https://bugs.openjdk.java.net/browse/JDK-8250782 > > src/hotspot/os/posix/signals_posix.cpp line 1425: > >> 1423: for (int i = 0; i < NSIG + 1; i++) { >> 1424: sigaction(i, &defaulthandler, NULL); >> 1425: } > > Some issues: > - 0 is not a valid signal number > - NSIG is obscure; may be ridiculously large (e.g. AIX I believe 1024) and still may not contain all signals for some platforms (e.g. real time signals on BSD (?) are not included) > - You now flatten here all signal handlers across the board. What if some third party app had installed their own. I would only reset those which had been originally installed for hotspot usage (beware ReduceSignalUsage). > > Which brings me to the point that if we fail to re-raise the original error signal, now we run with signal handlers completely disabled :( ... see my original concerns. Thinking further, we probably want to just restore the default handler for the error signal that happened. And we also want to do this only for synchronous error signals (segv, sigill, sigbus, sigfpe), and possibly only if they are "real" (had not been sent by kill or pthread_kill). ------------- PR: https://git.openjdk.java.net/jdk/pull/813 From github.com+51754783+coreyashford at openjdk.java.net Thu Oct 22 20:43:15 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Thu, 22 Oct 2020 20:43:15 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v8] In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 18:10:02 GMT, Martin Doerr wrote: >> src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3820: >> >>> 3818: __ vcmpequb_(eq_special_case_char, input, vec_special_case_char); >>> 3819: // >>> 3820: // There's a (63/64)^16 = 77.7% chance that there are no special >> >> I think that assumes uniformly randomized data, is that a good assumption? Is it measurably faster to skip around the xxsel instead of doing it unconditionally? > > Thanks for this question.
I also stumbled over it when reviewing. I guess a branch which gets mispredicted in ~22% of the cases leads to a big performance loss. (In addition, the branch target is not aligned.) Yes, it assumes uniformly random data, but also recall that the unencoded data bytes get shifted by 2, 4, 6 bits into the encoded bytes, which I'm guessing would tend to make the data somewhat more uniform, even if the source data has low entropy. That said, I didn't actually benchmark it. I will do that to make sure there is a gain, and if there isn't I will remove the conditional branch. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From iklam at openjdk.java.net Thu Oct 22 20:48:17 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 22 Oct 2020 20:48:17 GMT Subject: RFR: 8255208: CodeStrings passed to Disassembler::decode are ignored In-Reply-To: References: Message-ID: On Wed, 21 Oct 2020 23:50:51 GMT, Claes Redestad wrote: > CodeStrings passed directly to Disassembler::decode are wrongly ignored. > > This patch started out as a cleanup to clean out CodeStrings, but I realized as I was about to remove the suspiciously unused CodeStrings in the disassembler that them being unused was likely a bug. > > For example, -XX:+PrintInterpreter (on debug builds) included messages which can help digesting the output (if not else by emitting text hooks to the place in the source code where the asm is generated): > > 0x00007f1f7093becb: je 0x00007f1f7093bee5 > ;; call_VM_base: heap base corrupted? <<< omitted > 0x00007f1f7093bed1: mov $0x7f1f90c7ecb8,%rdi > 0x00007f1f7093bedb: and $0xfffffffffffffff0,%rsp > 0x00007f1f7093bedf: callq 0x00007f1f9046c0a0 = MacroAssembler::debug64(char*, long, long*) > > While PrintInterpreter is the only case that appears directly affected, restoring this capability seems useful in general. 
> > The cleaning up of the code also has some nice side-effects such as reducing the size of a CodeBuffer from 432 to 408 bytes and marginally improving the static size of the JVM (as measured on linux-x64) LGTM ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/788 From iklam at openjdk.java.net Thu Oct 22 20:52:11 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 22 Oct 2020 20:52:11 GMT Subject: RFR: 8255271: Avoid generating duplicate interpreter entries for subword types In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 18:19:59 GMT, Claes Redestad wrote: > Top-of-stack optimizations are generally missing for byte, char, short and boolean, and for several types of entry points we already avoid generating entries and instead redirect to the int variant. > > This patch removes generation of "specialized" variants for byte, char, short and boolean more thoroughly after verifying they all generate the same code as their int specialization. This slightly reduces overhead of generating the interpreter and the size thereof. LGTM ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/816 From github.com+51754783+coreyashford at openjdk.java.net Thu Oct 22 21:58:14 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Thu, 22 Oct 2020 21:58:14 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v8] In-Reply-To: References: Message-ID: <_ddbVr7GqBFrju0E3eJ1PHH306bzPxNDyMiXEceBvSE=.d7db184f-0b82-480d-bea3-85e6a342385f@github.com> On Thu, 22 Oct 2020 14:32:40 GMT, Paul Murphy wrote: >> CoreyAshford has updated the pull request incrementally with one additional commit since the last revision: >> >> TestBase64.java: remove jdk.test.lib.Utils from @build which was causing Tier3 failures. 
> > src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3884: > >> 3882: // | vec_0x3fs | 00111111 | 00111111 | 00111111 | 00111111 | 00111111 | 00111111 | 00111111 | 00111111 | >> 3883: // +---------------+-------------+----------------------+----------------------+-------------+-------------+----------------------+----------------------+-------------+ >> 3884: // | after vpextd | b5:0..7 | b4:0..7 | b3:0..7 | b2:0..7 | b1:0..7 | b0:0..7 | 00000000 | 00000000 | > > Are these comments correct or am I misunderstanding this? I read the final result as something starting as `b5:2..7 || b4:4..7|| b5:0..1` from vpextd. Because the bytes are displayed e15..e8, instead of the other way around, it's hard to follow. As an example, consider just the last four bytes of the table, but displayed in the reverse order: 00||b0:0..5 00||b0:6..7||b1:0..3 00||b1:4..7||b2:0..1 00||b2:2..7 After vpextd with bit select pattern 00111111 for all bytes: b0:0..5||b0:6..7 b1:0..3||b1:4..7 b2:0..1||b2:2..7 = b0:0..7 b1:0..7 b2:0..7 Should I reverse the order of this table with a comment at the top, to explain the reason for the reversal? It seems like a good idea. ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From github.com+51754783+coreyashford at openjdk.java.net Thu Oct 22 22:09:17 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Thu, 22 Oct 2020 22:09:17 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v8] In-Reply-To: References: Message-ID: <_JR-e3ZsRFwvZCR7ws34z5jLjp2kJQ1bu4gyl0RG1XU=.ec3040cf-8147-4dcd-b87d-4fd9be4eb59e@github.com> On Thu, 22 Oct 2020 14:24:34 GMT, Paul Murphy wrote: >> CoreyAshford has updated the pull request incrementally with one additional commit since the last revision: >> >> TestBase64.java: remove jdk.test.lib.Utils from @build which was causing Tier3 failures.
>
> src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3878:
>
>> 3876: // | Element | | | | | | | | |
>> 3877: // +===============+=============+======================+======================+=============+=============+======================+======================+=============+
>> 3878: // | after vaddubm | 00||b0:0..5 | 00||b0:6..7||b1:0..3 | 00||b1:4..7||b2:0..1 | 00||b2:2..7 | 00||b3:0..5 | 00||b3:6..7||b4:0..3 | 00||b4:4..7||b5:0..1 | 00||b5:2..7 |
>
> An extra line here showing how the 8 6-bit values above get mapped into 6 bytes would greatly help my brain out. (likewise for the

References: <4cLDGQwOXZqws-dr9QulUFnxYJ646Y_ObUzfjEfSfD4=.4f10936d-286c-406e-b72d-4849e3d4a5a6@github.com> Message-ID:

On Mon, 19 Oct 2020 12:12:51 GMT, Markus Grönlund wrote:

> Greetings,
>
> `VM_Version_Ext::max_qualified_cpu_freq_from_brand_string()` attempts to extract the CPU frequency, by inspecting the CPU brand string (as per the document "Intel® Processor Identification and the CPUID Instruction. Application note 845 May 2012").
>
> There is a bug with the current implementation, because it is naive in using the following construct:
>
> const char* Hz_location = strchr(brand_string, 'H');
>
> This likely works for most CPU models / brands, but not when the brand string is for example of the following form:
>
> "Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz"
>
> The 'H' in "9850H" will be matched, but since there is no 'z' in the next position, the code will fall through and report a frequency of 0.
>
> Testing:
> - [x] jdk_jfr
> - [x] debug verification for brand string "Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz"
>
> Comment: The doc link is stale and therefore removed. No stable links for the Application note 845 versions could be located, hence the doc is here referenced by name instead.
>
> Thanks
> Markus

Marked as reviewed by dholmes (Reviewer).
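[Illustration] The failure mode described in the quoted report can be reproduced with a few lines of standalone C++. This is a minimal sketch and not the HotSpot code: `naive_hz_offset` mimics the `strchr(brand_string, 'H')` lookup from the report, while `find_hz_offset` is a hypothetical helper that searches for the two-character sequence "Hz" instead. Both function names and the -1 "not found" convention are assumptions made for this example; the integrated fix may well be structured differently.

```cpp
#include <cstring>

// Mimics the naive lookup: only the first 'H' in the string is examined.
// Returns the offset of "Hz" if that first 'H' is followed by 'z', else -1.
int naive_hz_offset(const char* brand_string) {
  const char* h = std::strchr(brand_string, 'H');
  if (h != nullptr && h[1] == 'z') {
    return static_cast<int>(h - brand_string);
  }
  return -1;  // e.g. the 'H' of "i7-9850H" is followed by a space
}

// Hypothetical alternative: search for the unit suffix "Hz" as a whole.
// Returns the offset of the first "Hz", or -1 if there is none.
int find_hz_offset(const char* brand_string) {
  const char* hz = std::strstr(brand_string, "Hz");
  return hz != nullptr ? static_cast<int>(hz - brand_string) : -1;
}
```

For the brand string "Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz" the naive variant gives up at the model suffix, whereas the substring search lands on the trailing "Hz"; for strings without an earlier 'H' the two agree.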
-------------

PR: https://git.openjdk.java.net/jdk/pull/736

From iklam at openjdk.java.net Fri Oct 23 06:52:46 2020
From: iklam at openjdk.java.net (Ioi Lam)
Date: Fri, 23 Oct 2020 06:52:46 GMT
Subject: RFR: 8255285: Move JVMFlag origins into a new enum JVMFlagOrigin
Message-ID:

Many JVM functions take a `JVMFlag::Flags` parameter to indicate the origin of the flag -- i.e., "who is setting this flag". E.g., in arguments.hpp:

    static bool parse_argument(const char* arg, JVMFlag::Flags origin);

However, `JVMFlag::Flags` contains many other bits that are unrelated to the origin. We should add a new enum `JVMFlagOrigin` that has only the valid values for the origin. This makes it possible to do more type-safety checks at C++ compilation time.

This patch also renamed the confusing bit `JVMFlag::ORIG_COMMAND_LINE` to `WAS_SET_IN_COMMAND_LINE` and added documentation, so that it won't be confused with `JVMFlagOrigin::COMMAND_LINE`.

-------------

Commit messages:
 - fixed whitespaces
 - jvmflagorigin

Changes: https://git.openjdk.java.net/jdk/pull/823/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=823&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8255285
  Stats: 173 lines in 19 files changed: 49 ins; 14 del; 110 mod
  Patch: https://git.openjdk.java.net/jdk/pull/823.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/823/head:pull/823

PR: https://git.openjdk.java.net/jdk/pull/823

From tschatzl at openjdk.java.net Fri Oct 23 07:18:58 2020
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Fri, 23 Oct 2020 07:18:58 GMT
Subject: RFR: 8255232: G1: Make G1BiasedMappedArray freeable
Message-ID: <966v_gu-j37Do5XhKCgwhCu0DDdwyKPQ04oUQkvzEIs=.96808fbd-a650-448e-96af-58a2bf7b9c2d@github.com>

Hi all,

can I have reviews for this change that makes G1BiasedMappedArray freeable?

Previously all G1BiasedMappedArray were created as unfreeable i.e. assigned to static variables.
However with JDK-8253600 I need one such biased map for the full collector which is created and deleted during full GC. So the biased array should also be freed as necessary to avoid a memory leak. The alternative would be to statically allocate that map anyway and provide it to the current G1FullCollector instance, but I do not think the single malloc call is perf sensitive compared to full collector work and there is much point in doing something more complicated at this time. In the future I hope that the young gen collector will also be extracted from G1CollectedHeap with the same need. If/when allocation of these helper data structures becomes a problem I would suggest looking into this again. One option then could be using some ResoureArea for these things in the future. For this change there should be no change in behavior at all. Testing: tier1-5 Thanks, Thomas ------------- Commit messages: - Initial import Changes: https://git.openjdk.java.net/jdk/pull/808/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=808&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255232 Stats: 43 lines in 4 files changed: 35 ins; 2 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/808.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/808/head:pull/808 PR: https://git.openjdk.java.net/jdk/pull/808 From redestad at openjdk.java.net Fri Oct 23 07:34:35 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Fri, 23 Oct 2020 07:34:35 GMT Subject: RFR: 8255208: CodeStrings passed to Disassembler::decode are ignored In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 20:45:36 GMT, Ioi Lam wrote: >> CodeStrings passed directly to Disassembler::decode are wrongly ignored. >> >> This patch started out as a cleanup to clean out CodeStrings, but I realized as I was about to remove the suspiciously unused CodeStrings in the disassembler that them being unused was likely a bug. 
>> >> For example, -XX:+PrintInterpreter (on debug builds) included messages which can help digesting the output (if not else by emitting text hooks to the place in the source code where the asm is generated): >> >> 0x00007f1f7093becb: je 0x00007f1f7093bee5 >> ;; call_VM_base: heap base corrupted? <<< omitted >> 0x00007f1f7093bed1: mov $0x7f1f90c7ecb8,%rdi >> 0x00007f1f7093bedb: and $0xfffffffffffffff0,%rsp >> 0x00007f1f7093bedf: callq 0x00007f1f9046c0a0 = MacroAssembler::debug64(char*, long, long*) >> >> While PrintInterpreter is the only case that appears directly affected, restoring this capability seems useful in general. >> >> The cleaning up of the code also has some nice side-effects such as reducing the size of a CodeBuffer from 432 to 408 bytes and marginally improving the static size of the JVM (as measured on linux-x64) > > LGTM @vnkozlov @iklam - thank you for reviewing! ------------- PR: https://git.openjdk.java.net/jdk/pull/788 From david.holmes at oracle.com Fri Oct 23 07:34:45 2020 From: david.holmes at oracle.com (David Holmes) Date: Fri, 23 Oct 2020 17:34:45 +1000 Subject: RFR: 8255285: Move JVMFlag origins into a new enum JVMFlagOrigin In-Reply-To: References: Message-ID: <41616155-a0cf-d74e-9463-5f2a133a5025@oracle.com> Hi Ioi, On 23/10/2020 4:52 pm, Ioi Lam wrote: > Many JVM function take an `JVMFlag::Flags` parameter to indicate the origin of the flag -- i.e., "who is setting this flag". E.g., in arguments.hpp: > > static bool parse_argument(const char* arg, JVMFlag::Flags origin); > > However, `JVMFlag::Flags` contains many other bits that are unrelated to the origin. We should add a new enum `JVMFlagOrigin` that has only the valid values for the origin. This makes it possible to do more type-safety checks at C++ compilation time. > > This patch also renamed the confusing bit `JVMFlag::ORIG_COMMAND_LINE` to `WAS_SET_IN_COMMAND_LINE` and added documentation, so that it won't be confused with `JVMFlagOrigin::COMMAND_LINE`. 
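[Illustration] The type-safety argument in the quoted description can be sketched in standalone C++. This is not the patch itself: `FlagBits` and `FlagOrigin` are hypothetical stand-ins for `JVMFlag::Flags` and `JVMFlagOrigin`, every numeric value is invented, and the "sticky" treatment of `WAS_SET_IN_COMMAND_LINE` is an assumption of the sketch rather than a statement about HotSpot's behaviour.

```cpp
#include <type_traits>

// Hypothetical stand-in for a combined bit set such as JVMFlag::Flags.
// All values are invented for this sketch.
enum FlagBits : unsigned {
  ORIGIN_MASK             = 0xF,      // low bits: who set the flag last
  KIND_DIAGNOSTIC         = 1u << 4,  // a bit that has nothing to do with origin
  WAS_SET_IN_COMMAND_LINE = 1u << 5   // assumed sticky bit, also not an origin
};

// Hypothetical stand-in for JVMFlagOrigin: only origins are representable.
enum class FlagOrigin : unsigned {
  DEFAULT      = 0,
  COMMAND_LINE = 1,
  ERGONOMIC    = 2
};

// Replaces the origin bits; under the sketch's assumption, a command-line
// setting is additionally remembered in the sticky bit.
unsigned set_origin(unsigned flags, FlagOrigin origin) {
  flags = (flags & ~static_cast<unsigned>(ORIGIN_MASK)) |
          static_cast<unsigned>(origin);
  if (origin == FlagOrigin::COMMAND_LINE) {
    flags |= WAS_SET_IN_COMMAND_LINE;
  }
  return flags;
}

// Extracts the most recent origin from the combined bits.
FlagOrigin get_origin(unsigned flags) {
  return static_cast<FlagOrigin>(flags & ORIGIN_MASK);
}

// Reports whether the sticky command-line bit is set.
bool was_set_on_command_line(unsigned flags) {
  return (flags & WAS_SET_IN_COMMAND_LINE) != 0;
}

// The compile-time benefit: unrelated bits cannot be passed as an origin.
static_assert(!std::is_convertible<FlagBits, FlagOrigin>::value,
              "a raw flag bit must not convert implicitly to an origin");
```

With a scoped enum as the parameter type, a call such as `set_origin(flags, KIND_DIAGNOSTIC)` is rejected by the compiler, which is the kind of check the description refers to.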
I'm still confused :) Why are we reporting "command line" for a flag that was ergonomically set, or vice-versa? Surely a flag is either set via the command-line or via ergonomics but not both ?? I was under the assumption that ergonomics should not touch a flag explicitly set on the command-line as that defeats the purpose of setting it. > enum class JVMFlagOrigin Why not define this as enum class Origin inside class JVMFlag, so that it is then referred to as JVMFlag::Origin? > static const JVMFlagOrigin DEFAULT = JVMFlagOrigin::DEFAULT; Why is this needed?? To avoid re-typing JVMFlagOrigin? Cheers, David ----- > ------------- > > Commit messages: > - fixed whitespaces > - jvmflagorigin > > Changes: https://git.openjdk.java.net/jdk/pull/823/files > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=823&range=00 > Issue: https://bugs.openjdk.java.net/browse/JDK-8255285 > Stats: 173 lines in 19 files changed: 49 ins; 14 del; 110 mod > Patch: https://git.openjdk.java.net/jdk/pull/823.diff > Fetch: git fetch https://git.openjdk.java.net/jdk pull/823/head:pull/823 > > PR: https://git.openjdk.java.net/jdk/pull/823 > From redestad at openjdk.java.net Fri Oct 23 07:34:36 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Fri, 23 Oct 2020 07:34:36 GMT Subject: Integrated: 8255208: CodeStrings passed to Disassembler::decode are ignored In-Reply-To: References: Message-ID: <_dmAgmfNG3-aYZcjdW06oHU2FavsQkATHuAFQFY7xfU=.b4536e2f-0cae-4b03-8039-b34e690553fb@github.com> On Wed, 21 Oct 2020 23:50:51 GMT, Claes Redestad wrote: > CodeStrings passed directly to Disassembler::decode are wrongly ignored. > > This patch started out as a cleanup to clean out CodeStrings, but I realized as I was about to remove the suspiciously unused CodeStrings in the disassembler that them being unused was likely a bug. 
> > For example, -XX:+PrintInterpreter (on debug builds) included messages which can help digesting the output (if not else by emitting text hooks to the place in the source code where the asm is generated): > > 0x00007f1f7093becb: je 0x00007f1f7093bee5 > ;; call_VM_base: heap base corrupted? <<< omitted > 0x00007f1f7093bed1: mov $0x7f1f90c7ecb8,%rdi > 0x00007f1f7093bedb: and $0xfffffffffffffff0,%rsp > 0x00007f1f7093bedf: callq 0x00007f1f9046c0a0 = MacroAssembler::debug64(char*, long, long*) > > While PrintInterpreter is the only case that appears directly affected, restoring this capability seems useful in general. > > The cleaning up of the code also has some nice side-effects such as reducing the size of a CodeBuffer from 432 to 408 bytes and marginally improving the static size of the JVM (as measured on linux-x64) This pull request has now been integrated. Changeset: c1524c59 Author: Claes Redestad URL: https://git.openjdk.java.net/jdk/commit/c1524c59 Stats: 264 lines in 11 files changed: 60 ins; 146 del; 58 mod 8255208: CodeStrings passed to Disassembler::decode are ignored Reviewed-by: kvn, iklam ------------- PR: https://git.openjdk.java.net/jdk/pull/788 From tschatzl at openjdk.java.net Fri Oct 23 07:54:48 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 23 Oct 2020 07:54:48 GMT Subject: RFR: 8253600: G1: Fully support pinned regions for full gc Message-ID: Hi all, can I get reviews for this change that implements "proper" support for pinned regions in the G1 full collector? By proper I mean that at the end of gc, pinned regions contain the correct TAMS and bitmap markings under the TAMS so that dead objects within them are supported? Currently all (pinned) regions have their TAMS set to bottom() and their bitmap above TAMS cleared (at least logically :) ). This works as long objects within these regions can't be dead as it is the case now: - humongous regions are either live or fully reclaimed. 
- all other pinned regions are archive regions at the moment that are always treated as fully live (and do not contain dead objects). This change is a requirement for fixing JDK-8253081 as some earlier change made it possible to have dead objects within open archive regions. It also enables supporting removal of gclocker use for g1, i.e. using region pinning. Based on the PR#808 (https://github.com/openjdk/jdk/pull/808). Testing: tier1-8, testing with prototype for region pinning, testing with prototype for JDK-8253081. Performance testing: no regressions Some comments for questions that might come up during review: - how does this work with the bitmaps now: - at start of full gc the next bitmap is cleared - full gc marks the next bitmap - for all pinned regions, keep TAMS and top() (*), otherwise set TAMS to bottom - swap bitmaps - clear next bitmap for next marking (*) this means that from a usage POV pinned regions are considered full. This is inaccurate, but sufficient: full gc clears all remembered sets anyway, so we do not need that information for gc efficiency purposes anyway to evacuate later. The next marking before old gen evacuation will update it to the correct values anyway. G1 does not support allocation into "holes" in pinned regions that can be open archive only at this time too, so there is no need to be more exact. - use of a region attribute table for phase 2+ only: compared to before we need fast access to information whether a given reference goes into a pinned region (as opposed to an archive region) wrt to adjusting that pointer to avoid doing work for these references. Phase 1 marking could have used this information for the do-we-need-to-preserve-the-mark check too: however this would have required g1 to add an extra another pass over all regions to update that. This seemed slower than just checking this information "more slowly" for the objects that need mark preservation. 
Tests showed that this is the case for <0.00% (yeah, these references that need mark preservation are rounding errors in cases it matters) of overall references, so I did not add that pass. (Additionally g1 full gc is a last-ditch effort, and while marking takes a significant time, it does not completely dominate it). I.e. the second clause in the condition of this hunk is intentionally slower than could be: @@ -52,7 +52,9 @@ inline bool G1FullGCMarker::mark_object(oop obj) { // Marked by us, preserve if needed. markWord mark = obj->mark(); if (obj->mark_must_be_preserved(mark) && // It is not necessary to preserve marks for objects in pinned regions because // we do not change their headers (i.e. forward them). !G1CollectedHeap::heap()->heap_region_containing(obj)->is_pinned()) { preserved_stack()->push(obj, mark); } - there is no code yet that checks for empty pinned regions yet. Only JDK-8253081 introduces that because still all contents of all archive regions are live forever. Also please note that the 51b297b change is from the #808 change. Thanks, Thomas ------------- Commit messages: - Initial import - Initial import Changes: https://git.openjdk.java.net/jdk/pull/824/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=824&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253600 Stats: 285 lines in 23 files changed: 175 ins; 53 del; 57 mod Patch: https://git.openjdk.java.net/jdk/pull/824.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/824/head:pull/824 PR: https://git.openjdk.java.net/jdk/pull/824 From redestad at openjdk.java.net Fri Oct 23 08:03:51 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Fri, 23 Oct 2020 08:03:51 GMT Subject: RFR: 8255299: Drop explicit zeroing at instantiation of Atomic* objects In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 20:46:15 GMT, ?????? ??????? 
wrote: > As discussed in https://github.com/openjdk/jdk/pull/510 there is never a reason to explicitly instantiate any instance of `Atomic*` class with its default value, i.e. `new AtomicInteger(0)` could be replaced with `new AtomicInteger()` which is faster: > @State(Scope.Thread) > @OutputTimeUnit(TimeUnit.NANOSECONDS) > @BenchmarkMode(value = Mode.AverageTime) > public class AtomicBenchmark { > @Benchmark > public Object defaultValue() { > return new AtomicInteger(); > } > @Benchmark > public Object explicitValue() { > return new AtomicInteger(0); > } > } > THis benchmark demonstrates that `explicitValue()` is much slower: > Benchmark Mode Cnt Score Error Units > AtomicBenchmark.defaultValue avgt 30 4.778 ? 0.403 ns/op > AtomicBenchmark.explicitValue avgt 30 11.846 ? 0.273 ns/op > So meanwhile https://bugs.openjdk.java.net/browse/JDK-8145948 is still in progress we could trivially replace explicit zeroing with default constructors gaining some performance benefit with no risk. > > I've tested the changes locally, both tier1 and tier 2 are ok. > > Could one create an issue for tracking this? Filed [8255299](https://bugs.openjdk.java.net/browse/JDK-8255299) for this. Prefix the name of the PR with "8255299: " and it should pass checks. ------------- PR: https://git.openjdk.java.net/jdk/pull/818 From github.com+10835776+stsypanov at openjdk.java.net Fri Oct 23 08:03:51 2020 From: github.com+10835776+stsypanov at openjdk.java.net (=?UTF-8?B?0KHQtdGA0LPQtdC5?= =?UTF-8?B?IA==?= =?UTF-8?B?0KbRi9C/0LDQvdC+0LI=?=) Date: Fri, 23 Oct 2020 08:03:51 GMT Subject: RFR: 8255299: Drop explicit zeroing at instantiation of Atomic* objects Message-ID: As discussed in https://github.com/openjdk/jdk/pull/510 there is never a reason to explicitly instantiate any instance of `Atomic*` class with its default value, i.e. 
`new AtomicInteger(0)` could be replaced with `new AtomicInteger()` which is faster:

    @State(Scope.Thread)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @BenchmarkMode(value = Mode.AverageTime)
    public class AtomicBenchmark {
      @Benchmark
      public Object defaultValue() {
        return new AtomicInteger();
      }
      @Benchmark
      public Object explicitValue() {
        return new AtomicInteger(0);
      }
    }

This benchmark demonstrates that `explicitValue()` is much slower:

    Benchmark                      Mode  Cnt   Score   Error  Units
    AtomicBenchmark.defaultValue   avgt   30   4.778 ± 0.403  ns/op
    AtomicBenchmark.explicitValue  avgt   30  11.846 ± 0.273  ns/op

So while https://bugs.openjdk.java.net/browse/JDK-8145948 is still in progress, we could trivially replace explicit zeroing with default constructors, gaining some performance benefit with no risk.

I've tested the changes locally, both tier1 and tier 2 are ok.

Could someone create an issue for tracking this?

-------------

Commit messages:
 - 8255299: Drop explicit zeroing at instantiation of Atomic* objects

Changes: https://git.openjdk.java.net/jdk/pull/818/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=818&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8255299
  Stats: 19 lines in 17 files changed: 0 ins; 3 del; 16 mod
  Patch: https://git.openjdk.java.net/jdk/pull/818.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/818/head:pull/818

PR: https://git.openjdk.java.net/jdk/pull/818

From redestad at openjdk.java.net Fri Oct 23 08:08:35 2020
From: redestad at openjdk.java.net (Claes Redestad)
Date: Fri, 23 Oct 2020 08:08:35 GMT
Subject: RFR: 8255299: Drop explicit zeroing at instantiation of Atomic* objects
In-Reply-To: References: Message-ID:

On Thu, 22 Oct 2020 20:46:15 GMT, Сергей Цыпанов wrote:

> As discussed in https://github.com/openjdk/jdk/pull/510 there is never a reason to explicitly instantiate any instance of `Atomic*` class with its default value, i.e.
`new AtomicInteger(0)` could be replaced with `new AtomicInteger()` which is faster: > @State(Scope.Thread) > @OutputTimeUnit(TimeUnit.NANOSECONDS) > @BenchmarkMode(value = Mode.AverageTime) > public class AtomicBenchmark { > @Benchmark > public Object defaultValue() { > return new AtomicInteger(); > } > @Benchmark > public Object explicitValue() { > return new AtomicInteger(0); > } > } > THis benchmark demonstrates that `explicitValue()` is much slower: > Benchmark Mode Cnt Score Error Units > AtomicBenchmark.defaultValue avgt 30 4.778 ? 0.403 ns/op > AtomicBenchmark.explicitValue avgt 30 11.846 ? 0.273 ns/op > So meanwhile https://bugs.openjdk.java.net/browse/JDK-8145948 is still in progress we could trivially replace explicit zeroing with default constructors gaining some performance benefit with no risk. > > I've tested the changes locally, both tier1 and tier 2 are ok. > > Could one create an issue for tracking this? Marked as reviewed by redestad (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/818 From serb at openjdk.java.net Fri Oct 23 08:17:40 2020 From: serb at openjdk.java.net (Sergey Bylokhov) Date: Fri, 23 Oct 2020 08:17:40 GMT Subject: RFR: 8255299: Drop explicit zeroing at instantiation of Atomic* objects In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 20:46:15 GMT, ?????? ??????? wrote: > As discussed in https://github.com/openjdk/jdk/pull/510 there is never a reason to explicitly instantiate any instance of `Atomic*` class with its default value, i.e. 
`new AtomicInteger(0)` could be replaced with `new AtomicInteger()` which is faster: > @State(Scope.Thread) > @OutputTimeUnit(TimeUnit.NANOSECONDS) > @BenchmarkMode(value = Mode.AverageTime) > public class AtomicBenchmark { > @Benchmark > public Object defaultValue() { > return new AtomicInteger(); > } > @Benchmark > public Object explicitValue() { > return new AtomicInteger(0); > } > } > THis benchmark demonstrates that `explicitValue()` is much slower: > Benchmark Mode Cnt Score Error Units > AtomicBenchmark.defaultValue avgt 30 4.778 ? 0.403 ns/op > AtomicBenchmark.explicitValue avgt 30 11.846 ? 0.273 ns/op > So meanwhile https://bugs.openjdk.java.net/browse/JDK-8145948 is still in progress we could trivially replace explicit zeroing with default constructors gaining some performance benefit with no risk. > > I've tested the changes locally, both tier1 and tier 2 are ok. > > Could one create an issue for tracking this? The changes in src/java.desktop looks fine. ------------- Marked as reviewed by serb (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/818 From github.com+4146708+a74nh at openjdk.java.net Fri Oct 23 08:50:52 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Fri, 23 Oct 2020 08:50:52 GMT Subject: RFR: 8221554: aarch64 cross-modifying code [v6] In-Reply-To: References: Message-ID: <1nKnthro_DGnukTjVKanK4y3FQwiQyGfJmtwc_Qm3Ik=.08f3cee7-8c4a-4f04-8073-d02e30774600@github.com> > The AArch64 port uses maybe_isb in places where an ISB might be required > because the code may have safepointed. These maybe_isbs are very conservative > and are used in many places are used when a safepoint has not happened. > > cross_modify_fence was added in common code to place a barrier in all the > places after a safepoint has occurred. All the uses of it are in common code, > yet it remains unimplemented on AArch64. 
> > This set of patches implements cross_modify_fence for AArch64 and reconsiders > every uses of maybe_isb, discarding many of them. In addition, it introduces > a new diagnostic option, which when enabled on AArch64 tests the correct > usage of the barriers. > > Advantage of this patch is threefold: > * Reducing the number of ISBs - giving a theoretical performance improvement. > * Use of common code instead of backend specific code. > * Additional test diagnostic options > > Patch 1: Split cross_modify_fence > ================================= > This is simply refactoring work split out to simplify the other two patches. > > instruction_fence() is provided by each target and simply places > a fence for the instruction stream. > > cross_modify_fence() is now a member of JavaThread and just calls > instruction_fence. This function will be extended in Patch 3. > > Patch 2: Use cross_modify_fence instead of maybe_isb > ==================================================== > > The [n] References refer to the comments for cross_modify_fence in > thread.hpp. > > This is all the existing uses of maybe_isb in the AArch64 target: > > 1) Instances of Java code calling a VM function > * This encapsulates the changes to: > ** MacroAssembler::call_VM_leaf_base() > ** generate_fast_get_int_field0() > ** stubGenerator_aarch64 generate_throw_exception() > ** sharedRuntime_aarch64 generate_handler_blob() > ** SharedRuntime::generate_resolve_blob() > ** C1 LIR_Assembler::rt_call > ** C1 StubAssembler::call_RT(): used by Used by generate_exception_throw, > generate_handle_exception, generate_code_for. > ** OptoRuntime::generate_exception_blob() > * Any changes will be caught due to calls to [2] or [3] by the VM function. > * Any calls that do not call [2] or [3] do not require an ISB. > * This patch is more optimal for these cases. 
> > 2) Instances of Java code calling a JNI function > * This encapsulates the changes to: > ** SharedRuntime::generate_native_wrapper() > ** TemplateInterpreterGenerator::generate_native_entry() > * A safepoint still in progress after the call with be caught by [4]. > * An ISB is still required for the case where there was a safepoint > but it completed during the call. This happens if the code doesn't > branch on safepoint_in_progress > * In the SharedRuntime version, the two possible calls to > reguard_yellow_pages and complete_monitor_unlocking_C are after the thread > goes back into it's original state, so are covered by [2] and [3], the > same as a normal VM call. > * This patch is only more optimal for the two post-JNI calls. > > 3) Patching functions > * This encapsulates the changes to: > ** patch_callers_callsite() (called by gen_c2i_adapter()) > * This results in code being patched, but does not safepoint > * Therefore an ISB is required. > * This patch introduces no change here. > > 4) C1 MacroAssembler::emit_static_call_stub() > * Calls ISB (not maybe_isb) > * By design, the patching doesn't require that the up-to-date > destination is required for proper functioning. > * However, the ISB makes it most likely that the new destination will > be picked up. > * This patch introduces no change here. > > Patch 3: Add cross modify fence verification > ============================================ > > The VerifyCrossModifyFence diagnostic flag enables confirmation to the correct > usage of instruction barriers. It can safely be enabled on any Java run. > > Enabling it will cause the following: > > * Once all threads have been brought to a safepoint, each thread will be > marked. > > * On a cross_modify_fence and safepoint_fence the mark for that thread > will be cleared. > > * On entry to a method and in a safepoint poll, then the thread is checked. > If it is marked, then the code will error. 
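[Illustration] The mark / clear / check protocol listed for Patch 3 can be modelled in a few lines of standalone C++. This is a toy model under stated assumptions, not the HotSpot implementation: `ToyThread`, `mark_all_at_safepoint`, `fence` and `check_ok` are hypothetical names, and the real diagnostic reports an error where this sketch merely returns `false`.

```cpp
#include <vector>

// Toy stand-in for a Java thread; the single field is the per-thread mark.
struct ToyThread {
  bool requires_fence = false;  // set once a safepoint has been reached
};

// Step 1: once all threads have been brought to a safepoint, mark each one.
void mark_all_at_safepoint(std::vector<ToyThread>& threads) {
  for (ToyThread& t : threads) {
    t.requires_fence = true;
  }
}

// Step 2: a cross_modify_fence or safepoint_fence clears the calling
// thread's mark (the actual instruction barrier is omitted here).
void fence(ToyThread& t) {
  t.requires_fence = false;
}

// Step 3: on method entry or in a safepoint poll the mark is checked.
// Returns true when the thread may proceed; false models the error case.
bool check_ok(const ToyThread& t) {
  return !t.requires_fence;
}
```

A thread that resumes after a safepoint without passing through `fence` fails `check_ok`, which corresponds to the missing-barrier situation the diagnostic flag is meant to expose.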
Alan Hayward has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits: - Merge master Change-Id: I97df4e7686699478f0f89451ec0a3537d38cfd6d - Merge master Change-Id: I5e1715fdb11305191fe7bf86cbfb7a6da446b3dc - Remove inlasm_isb define Change-Id: I2d0ef8a78292dac875f3f65d2253981cdb7a497a - AArch64: Add cross modify fence verification - AArch64: Use cross_modify_fence instead of maybe_isb - Split cross_modify_fence ------------- Changes: https://git.openjdk.java.net/jdk/pull/428/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=428&range=05 Stats: 170 lines in 25 files changed: 123 ins; 8 del; 39 mod Patch: https://git.openjdk.java.net/jdk/pull/428.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/428/head:pull/428 PR: https://git.openjdk.java.net/jdk/pull/428 From mgronlun at openjdk.java.net Fri Oct 23 09:07:38 2020 From: mgronlun at openjdk.java.net (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Fri, 23 Oct 2020 09:07:38 GMT Subject: Integrated: 8249675: x86: frequency extraction from cpu brand string is incomplete In-Reply-To: <4cLDGQwOXZqws-dr9QulUFnxYJ646Y_ObUzfjEfSfD4=.4f10936d-286c-406e-b72d-4849e3d4a5a6@github.com> References: <4cLDGQwOXZqws-dr9QulUFnxYJ646Y_ObUzfjEfSfD4=.4f10936d-286c-406e-b72d-4849e3d4a5a6@github.com> Message-ID: On Mon, 19 Oct 2020 12:12:51 GMT, Markus Gr?nlund wrote: > Greetings, > > `VM_Version_Ext::max_qualified_cpu_freq_from_brand_string()` attempts to extract the CPU frequency, by inspecting the CPU brand string (as per the document "Intel? Processor Identification and the CPUID Instruction. Application note 845 May 2012"). 
> > There is a bug with the current implementation, because it is naive in using the following construct: > > const char* Hz_location = strchr(brand_string, 'H'); > > This likely works for most CPU models / brands, but not when the brand string is for example of the following form: > > "Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz" > > The 'H' in "9850H" will be matched, but since there is no 'z' in the next position, the code will fall through and report a frequency of 0. > > Testing: > - [x] jdk_jfr > - [x] debug verification for brand string "Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz" > > Comment: The doc link is stale and therefore removed. No stable links for the Application note 845 versions could be located, hence the doc is here referenced by name instead. > > Thanks > Markus This pull request has now been integrated. Changeset: 63ce304e Author: Markus Gr?nlund URL: https://git.openjdk.java.net/jdk/commit/63ce304e Stats: 50 lines in 2 files changed: 6 ins; 12 del; 32 mod 8249675: x86: frequency extraction from cpu brand string is incomplete Reviewed-by: egahlin, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/736 From dfuchs at openjdk.java.net Fri Oct 23 09:14:43 2020 From: dfuchs at openjdk.java.net (Daniel Fuchs) Date: Fri, 23 Oct 2020 09:14:43 GMT Subject: RFR: 8255299: Drop explicit zeroing at instantiation of Atomic* objects In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 20:46:15 GMT, ?????? ??????? wrote: > As discussed in https://github.com/openjdk/jdk/pull/510 there is never a reason to explicitly instantiate any instance of `Atomic*` class with its default value, i.e. 
`new AtomicInteger(0)` could be replaced with `new AtomicInteger()` which is faster: > @State(Scope.Thread) > @OutputTimeUnit(TimeUnit.NANOSECONDS) > @BenchmarkMode(value = Mode.AverageTime) > public class AtomicBenchmark { > @Benchmark > public Object defaultValue() { > return new AtomicInteger(); > } > @Benchmark > public Object explicitValue() { > return new AtomicInteger(0); > } > } > THis benchmark demonstrates that `explicitValue()` is much slower: > Benchmark Mode Cnt Score Error Units > AtomicBenchmark.defaultValue avgt 30 4.778 ? 0.403 ns/op > AtomicBenchmark.explicitValue avgt 30 11.846 ? 0.273 ns/op > So meanwhile https://bugs.openjdk.java.net/browse/JDK-8145948 is still in progress we could trivially replace explicit zeroing with default constructors gaining some performance benefit with no risk. > > I've tested the changes locally, both tier1 and tier 2 are ok. > > Could one create an issue for tracking this? src/java.base/share/classes/sun/net/ResourceManager.java line 65: > 63: } catch (NumberFormatException e) {} > 64: maxSockets = defmax; > 65: numSockets = new AtomicInteger(); Changes in sun/net look good to me. ------------- PR: https://git.openjdk.java.net/jdk/pull/818 From dfuchs at openjdk.java.net Fri Oct 23 09:17:36 2020 From: dfuchs at openjdk.java.net (Daniel Fuchs) Date: Fri, 23 Oct 2020 09:17:36 GMT Subject: RFR: 8255299: Drop explicit zeroing at instantiation of Atomic* objects In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 08:15:00 GMT, Sergey Bylokhov wrote: >> As discussed in https://github.com/openjdk/jdk/pull/510 there is never a reason to explicitly instantiate any instance of `Atomic*` class with its default value, i.e. 
`new AtomicInteger(0)` could be replaced with `new AtomicInteger()` which is faster: >> @State(Scope.Thread) >> @OutputTimeUnit(TimeUnit.NANOSECONDS) >> @BenchmarkMode(value = Mode.AverageTime) >> public class AtomicBenchmark { >> @Benchmark >> public Object defaultValue() { >> return new AtomicInteger(); >> } >> @Benchmark >> public Object explicitValue() { >> return new AtomicInteger(0); >> } >> } >> THis benchmark demonstrates that `explicitValue()` is much slower: >> Benchmark Mode Cnt Score Error Units >> AtomicBenchmark.defaultValue avgt 30 4.778 ? 0.403 ns/op >> AtomicBenchmark.explicitValue avgt 30 11.846 ? 0.273 ns/op >> So meanwhile https://bugs.openjdk.java.net/browse/JDK-8145948 is still in progress we could trivially replace explicit zeroing with default constructors gaining some performance benefit with no risk. >> >> I've tested the changes locally, both tier1 and tier 2 are ok. >> >> Could one create an issue for tracking this? > > The changes in src/java.desktop looks fine. Changes to `java.logging` and `java.net.http` also look good to me. ------------- PR: https://git.openjdk.java.net/jdk/pull/818 From adinn at openjdk.java.net Fri Oct 23 09:33:37 2020 From: adinn at openjdk.java.net (Andrew Dinn) Date: Fri, 23 Oct 2020 09:33:37 GMT Subject: RFR: 8248411: [aarch64] Insufficient error handling when CodeBuffer is exhausted In-Reply-To: References: <02kM0GEZd0Twb-mEJxf0diGjUzXV3hcpOfXclg8gt6A=.d0d3a1af-9641-4179-ba9f-18401c05e6a4@github.com> <-y6v6jtsEw-RF7EQbiAbuLsMBmcSBMRFyqkmhpEVeUE=.acccbef8-f90c-4e34-af60-9ac1dc1ed73b@github.com> Message-ID: On Thu, 22 Oct 2020 16:36:30 GMT, Andrew Haley wrote: >> So what about the `Foo* p` vs `Foo *p`? In short, I totally disagree. `Foo* p` supports the reading of `p` as Foo-pointer (or int-pointer, or void-pointer) which also emphasises the fact that the base type is vital to the operations available through/on `p`. 
This view is obscured by `Foo *p` (or `Foo p, *q`), suggesting "a pointer" is a more general concept, free of semantics imposed by its base type. For the same reason, I write `int* f()` not `int *f()`, not ever. And "we" don't write `int* p, q` nor do "we" write `int p, *q` since not only need pointers be initialised but more importantly, it's a truly bad idea to introduce different semantic entities in a single declaration. "We" write code to be read, and understood, by humans. >> >> May I also offer the following explanation to why the `*` (pointer-decl) operator binds to the right. Is it perhaps due to the fact that the original K&R C-syntax predates the first official standard by roughly 20 years, syntax picked-up from the B language (which only had one "pointer type", the address of an array, and no type-system), that we use the unary prefix "indirection operator"; `*v` ? >> >> In any case, I thank you both for reviewing the code (well, I guess 'aph' didn't actually review) even though it seems to upset the two of you, and despite the fact that I don't find the ranting particularly constructive. >> >> Thanks also to Vladimir. > >> In any case, I thank you both for reviewing the code (well, I guess 'aph' didn't actually review) even though it seems to upset the two of you, and despite the fact that I don't find the ranting particularly constructive. > > We're not upset, at least I'm not, but I don't think we should change existing code based on preference. In other words, despite the fact that I have a strong preference, I don't go round changing instances of `int* p` to `int *p` when I see them, and I don't think you should either. HotSpot doesn't have a history of micro-managing code layout, and that's a good thing. One consequence of this is minor local variations. > > Also, churn is bad, and in this case we have a bunch of edits to a file which has nothing to do with the bug being fixed. 
It's taken me a while to be sure, but I can find no changes to opto/output.cpp except layout. This complicates the diffs for the reviewer and makes later diff searches harder for everyone else. Sure, if layout is unusual to the point of being misleading, we should change it. > >> May I also offer the following explanation to why the * (pointer-decl) operator binds to the right. Is it perhaps due to the fact that the original K&R C-syntax predates the first official standard by roughly 20 years > > The reason for the syntax is that Ritchie had the idea that "declaration follows use". http://www.gotw.ca/publications/c_family_interview.htm Hi Patrick. You mistake my intention as well as Andrew's. I also am far from upset. The gist of my 'rant' was actually an argument, not an emotional reaction; the rant tags were merely offered by way of levity. I guess I actually need to reiterate the argument straight to address the concern both Andrew and I share. My purpose was not to crusade against your preferred orthography with a complementary jihad (n.b. I am as agnostic about asterisk placement as I am about which end of a boiled egg to attack). The point I thought I had rendered clear (but clearly not clear enough) is that orthographic crusades to correct perceived errors across the code base are worse than futile. Offences to partisan pedantry aside, they serve to introduce noise into the code base, the result being to mask the all-important signal i.e. what changes were actually needed to fix an issue. That outcome is a red flag for any reviewer/maintainer not just because we have to decipher that signal at review time, but also because it is problematic 1) at backport time and 2) when debugging, diagnosing and fixing problems in the same or closely related parts of the code base and backporting their patches. Of course, you are correct when you point out that this noise can be filtered in the GitHub interface.
However, that does not remove the noise from the code nor does it mean such redress is available from all the other tools -- including and most especially the human eye -- that have to process the code base when performing the review and subsequent review and maintenance tasks. Webrevs, file version diffs, git blame reports, conflict-free patch backports all work better and require less human intervention if what are essentially irrelevant changes are avoided. That's advice that has been offered many times by other reviewers and that I repeat based on experience not ideology. I consider it far more valuable, albeit more boringly prosaic, than any conclusions one might arrive at debating the history of and intention behind K&R's syntactic choices. ------------- PR: https://git.openjdk.java.net/jdk/pull/765 From fyang at openjdk.java.net Fri Oct 23 09:40:53 2020 From: fyang at openjdk.java.net (Fei Yang) Date: Fri, 23 Oct 2020 09:40:53 GMT Subject: RFR: 8255287: aarch64: fix SVE patterns for vector shift count Message-ID: SVE patterns for vector shift count cannot be matched due to bad matching rules. Also code gen is not correct in certain cases for vlslS_imm and vlsrS_imm. Please refer to JDK-8255287 for details. Patch passed tier1 tests using QEMU system emulator which supports SVE.
------------- Commit messages: - 8255287: aarch64: fix SVE patterns for vector shift count Changes: https://git.openjdk.java.net/jdk/pull/827/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=827&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255287 Stats: 149 lines in 7 files changed: 109 ins; 0 del; 40 mod Patch: https://git.openjdk.java.net/jdk/pull/827.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/827/head:pull/827 PR: https://git.openjdk.java.net/jdk/pull/827 From eosterlund at openjdk.java.net Fri Oct 23 09:51:47 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 23 Oct 2020 09:51:47 GMT Subject: RFR: 8255233: InterpreterRuntime::at_unwind should be a JRT_LEAF Message-ID: InterpreterRuntime::at_unwind is called at the very beginning of remove_activation(), to notify concurrent stack processing that a frame is about to be unwound. It is currently a JRT_ENTRY, because it needs a last_Java_frame to see what frame is about to get unwound. However, there are special return paths used by JVMTI pop frame, that checks if the caller frame is deoptimized, then calls a special path that removes the top activation, assuming that does not enter the deopt handler. The new JRT_ENTRY makes that reasoning invalid. Therefore, we need this to be a JRT_LEAF, that sets a last Java frame, to make everyone happy. This patch performs that change. I have run tier 1-5 testing, and manually tested: while true; do make test JTREG="RETAIN=all" TEST=open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn002 TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews" ; done Before the fix it crashes ~1/15 runs with a bad oop. After the fix, it doesn't crash. I have run it more times than my tmux buffer fits (for a day), and it does not fail any more with this fix. 
Unfortunately, my testing on AArch64 has been stalled for a day, so I have sent out this PR without the testing of those bits being finished. I won't push until I get the results back, of course. But I am expecting that to be fine, as there is nothing special going on there and it compiles. Will post a comment when the complete results have arrived. ------------- Commit messages: - 8255233: InterpreterRuntime::at_unwind should be a JRT_LEAF Changes: https://git.openjdk.java.net/jdk/pull/828/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=828&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255233 Stats: 7 lines in 3 files changed: 4 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/828.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/828/head:pull/828 PR: https://git.openjdk.java.net/jdk/pull/828 From github.com+10835776+stsypanov at openjdk.java.net Fri Oct 23 09:59:41 2020 From: github.com+10835776+stsypanov at openjdk.java.net (=?UTF-8?B?0KHQtdGA0LPQtdC5?= =?UTF-8?B?IA==?= =?UTF-8?B?0KbRi9C/0LDQvdC+0LI=?=) Date: Fri, 23 Oct 2020 09:59:41 GMT Subject: RFR: 8255299: Drop explicit zeroing at instantiation of Atomic* objects In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 09:12:15 GMT, Daniel Fuchs wrote: >> As discussed in https://github.com/openjdk/jdk/pull/510 there is never a reason to explicitly instantiate any instance of `Atomic*` class with its default value, i.e. 
`new AtomicInteger(0)` could be replaced with `new AtomicInteger()`, which is faster: >> @State(Scope.Thread) >> @OutputTimeUnit(TimeUnit.NANOSECONDS) >> @BenchmarkMode(value = Mode.AverageTime) >> public class AtomicBenchmark { >> @Benchmark >> public Object defaultValue() { >> return new AtomicInteger(); >> } >> @Benchmark >> public Object explicitValue() { >> return new AtomicInteger(0); >> } >> } >> This benchmark demonstrates that `explicitValue()` is much slower: >> Benchmark Mode Cnt Score Error Units >> AtomicBenchmark.defaultValue avgt 30 4.778 ± 0.403 ns/op >> AtomicBenchmark.explicitValue avgt 30 11.846 ± 0.273 ns/op >> So while https://bugs.openjdk.java.net/browse/JDK-8145948 is still in progress, we could trivially replace explicit zeroing with default constructors, gaining some performance benefit with no risk. >> >> I've tested the changes locally, both tier1 and tier2 are ok. >> >> Could someone create an issue for tracking this? > > src/java.base/share/classes/sun/net/ResourceManager.java line 65: > >> 63: } catch (NumberFormatException e) {} >> 64: maxSockets = defmax; >> 65: numSockets = new AtomicInteger(); > > Changes in sun/net look good to me. @dfuch Could you then sponsor this PR? ------------- PR: https://git.openjdk.java.net/jdk/pull/818 From adinn at openjdk.java.net Fri Oct 23 10:08:37 2020 From: adinn at openjdk.java.net (Andrew Dinn) Date: Fri, 23 Oct 2020 10:08:37 GMT Subject: RFR: 8255287: aarch64: fix SVE patterns for vector shift count In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 09:35:11 GMT, Fei Yang wrote: > SVE patterns for vector shift count cannot be matched due to bad matching rules. > Also code gen is not correct in certain cases for vlslS_imm and vlsrS_imm. > Please refer to JDK-8255287 for details. > Patch passed tier1 tests using QEMU system emulator which supports SVE. These changes look ok. ------------- Marked as reviewed by adinn (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/827 From eosterlund at openjdk.java.net Fri Oct 23 10:30:49 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 23 Oct 2020 10:30:49 GMT Subject: RFR: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 10:25:43 GMT, Erik Österlund wrote: > The escape barrier reallocates scalarized objects potentially deep into the stack of a remote thread. Each allocation can safepoint, causing referenced frames to be invalid. Some sprinklings were added that deal with that, but I believe it was subsequently broken with the integration of the new vector API, that has its own new deoptimization code that did not know about this. Not surprisingly, the integration of the new vector API had no idea about this subtlety, and allocates an object, and then reads an object deep from the stack of a remote thread (using an escape barrier). I suppose the issue is that all these 3 things were integrated at almost the same time. The problematic code sequence is in VectorSupport::allocate_vector() in vectorSupport.cpp, which is called from Deoptimization::realloc_objects(). It first allocates an oop (possibly safepointing), and then reads a vector oop from the stack. This is usually fine, but not through the escape barrier, with concurrent stack scanning. While I have not seen any crashes yet, I can see from code inspection, that there is no way that this works correctly. > > In order to make this less fragile for future changes, we should really have a RAII object that keeps the target thread's stack of the escape barrier, stable and processed, across safepoints. This patch fixes that. Then it becomes much easier to reason about its correctness, compared to hoping the various hooks are applied after each safepoint.
> > With this new robustness fix, the thread running the escape barrier, keeps the target thread stack processed, straight through safepoints on the requesting thread, making it easy and intuitive to understand why this works correctly. The RAII object basically just has to cover the code block that pokes at the remote stack and goes in and out of safepoints, arbitrarily. Arguably, this escape barrier doesn't need to be blazingly fast, and can afford keeping stacks sane through its operation. @reinrich Since you wrote the escape barrier, I thought I'd ping you in case you are interested. This makes the mechanism more robust and reliable, w.r.t. concurrent stack processing. No issue observed, but looks wrong with the new vector API deoptimization code that was integrated recently. This makes it more future-proof and easy to reason about. ------------- PR: https://git.openjdk.java.net/jdk/pull/832 From eosterlund at openjdk.java.net Fri Oct 23 10:30:49 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 23 Oct 2020 10:30:49 GMT Subject: RFR: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing Message-ID: The escape barrier reallocates scalarized objects potentially deep into the stack of a remote thread. Each allocation can safepoint, causing referenced frames to be invalid. Some sprinklings were added that deal with that, but I believe it was subsequently broken with the integration of the new vector API, that has its own new deoptimization code that did not know about this. Not surprisingly, the integration of the new vector API had no idea about this subtlety, and allocates an object, and then reads an object deep from the stack of a remote thread (using an escape barrier). I suppose the issue is that all these 3 things were integrated at almost the same time. 
The problematic code sequence is in VectorSupport::allocate_vector() in vectorSupport.cpp, which is called from Deoptimization::realloc_objects(). It first allocates an oop (possibly safepointing), and then reads a vector oop from the stack. This is usually fine, but not through the escape barrier, with concurrent stack scanning. While I have not seen any crashes yet, I can see from code inspection, that there is no way that this works correctly. In order to make this less fragile for future changes, we should really have a RAII object that keeps the target thread's stack of the escape barrier, stable and processed, across safepoints. This patch fixes that. Then it becomes much easier to reason about its correctness, compared to hoping the various hooks are applied after each safepoint. With this new robustness fix, the thread running the escape barrier, keeps the target thread stack processed, straight through safepoints on the requesting thread, making it easy and intuitive to understand why this works correctly. The RAII object basically just has to cover the code block that pokes at the remote stack and goes in and out of safepoints, arbitrarily. Arguably, this escape barrier doesn't need to be blazingly fast, and can afford keeping stacks sane through its operation.
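The scoped-guard idea described above can be illustrated outside HotSpot. The patch itself is C++; what follows is only a toy Java analogue built on try-with-resources, and every name in it (`EscapeBarrierSketch`, `TargetStack`, `KeepStackProcessed`, `safepoint`) is hypothetical rather than taken from the patch. It models a safepoint as something that invalidates the "processed" state of a remote stack unless a guard is currently active.

```java
// Toy model only: none of these types exist in HotSpot.
final class EscapeBarrierSketch {

    /** Hypothetical stand-in for the stack of a remote (target) thread. */
    static final class TargetStack {
        private int keepers;        // number of active guards
        private boolean processed;  // true while frames are safe to inspect

        void process() { processed = true; }

        // A safepoint invalidates the processed state; if a guard is active,
        // the stack is processed again before anyone can look at it.
        void safepoint() {
            processed = false;
            if (keepers > 0) {
                process();
            }
        }

        boolean isProcessed() { return processed; }
    }

    /** Scoped guard: the stack stays processed for as long as the guard is open. */
    static final class KeepStackProcessed implements AutoCloseable {
        private final TargetStack stack;

        KeepStackProcessed(TargetStack stack) {
            this.stack = stack;
            stack.keepers++;
            stack.process();
        }

        @Override
        public void close() {
            stack.keepers--;
        }
    }

    // "Allocate (may safepoint), then read the remote stack", under a guard.
    static boolean reallocUnderGuard(TargetStack stack, int allocations) {
        boolean alwaysProcessed = true;
        try (KeepStackProcessed guard = new KeepStackProcessed(stack)) {
            for (int i = 0; i < allocations; i++) {
                stack.safepoint();                       // allocation may safepoint
                alwaysProcessed &= stack.isProcessed();  // read of the remote stack
            }
        }
        return alwaysProcessed;
    }

    // Same sequence, processing the stack once up front and hoping for the best.
    static boolean reallocWithoutGuard(TargetStack stack, int allocations) {
        boolean alwaysProcessed = true;
        stack.process();
        for (int i = 0; i < allocations; i++) {
            stack.safepoint();
            alwaysProcessed &= stack.isProcessed();
        }
        return alwaysProcessed;
    }

    public static void main(String[] args) {
        System.out.println("guarded:   " + reallocUnderGuard(new TargetStack(), 3));
        System.out.println("unguarded: " + reallocWithoutGuard(new TargetStack(), 3));
    }
}
```

In this model the guarded variant always observes a processed stack, while the unguarded one does not survive its first safepoint; the real mechanism is of course considerably more involved.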
------------- Commit messages: - 8255243: Reinforce escape barrier interactions with ZGC conc stack Changes: https://git.openjdk.java.net/jdk/pull/832/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=832&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255243 Stats: 149 lines in 10 files changed: 137 ins; 9 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/832.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/832/head:pull/832 PR: https://git.openjdk.java.net/jdk/pull/832 From eosterlund at openjdk.java.net Fri Oct 23 10:41:38 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 23 Oct 2020 10:41:38 GMT Subject: RFR: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 10:28:21 GMT, Erik Österlund wrote: >> The escape barrier reallocates scalarized objects potentially deep into the stack of a remote thread. Each allocation can safepoint, causing referenced frames to be invalid. Some sprinklings were added that deal with that, but I believe it was subsequently broken with the integration of the new vector API, that has its own new deoptimization code that did not know about this. Not surprisingly, the integration of the new vector API had no idea about this subtlety, and allocates an object, and then reads an object deep from the stack of a remote thread (using an escape barrier). I suppose the issue is that all these 3 things were integrated at almost the same time. The problematic code sequence is in VectorSupport::allocate_vector() in vectorSupport.cpp, which is called from Deoptimization::realloc_objects(). It first allocates an oop (possibly safepointing), and then reads a vector oop from the stack. This is usually fine, but not through the escape barrier, with concurrent stack scanning. While I have not seen any crashes yet, I can see from code inspection, that there is no way that this works correctly.
>> >> In order to make this less fragile for future changes, we should really have a RAII object that keeps the target thread's stack of the escape barrier, stable and processed, across safepoints. This patch fixes that. Then it becomes much easier to reason about its correctness, compared to hoping the various hooks are applied after each safepoint. >> >> With this new robustness fix, the thread running the escape barrier, keeps the target thread stack processed, straight through safepoints on the requesting thread, making it easy and intuitive to understand why this works correctly. The RAII object basically just has to cover the code block that pokes at the remote stack and goes in and out of safepoints, arbitrarily. Arguably, this escape barrier doesn't need to be blazingly fast, and can afford keeping stacks sane through its operation. > > @reinrich Since you wrote the escape barrier, I thought I'd ping you in case you are interested. This makes the mechanism more robust and reliable, w.r.t. concurrent stack processing. No issue observed, but looks wrong with the new vector API deoptimization code that was integrated recently. This makes it more future-proof and easy to reason about. Forgot to mention, but I tested this from tier1-5, to sanity check that the new solution doesn't introduce any new issues. 
------------- PR: https://git.openjdk.java.net/jdk/pull/832 From ihse at openjdk.java.net Fri Oct 23 11:04:41 2020 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Fri, 23 Oct 2020 11:04:41 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v12] In-Reply-To: <2IKx6cpc-IGP3jZtr0s2I14BWM6ptyFD26szPl3b1ng=.9d956b98-dfe6-4a45-a371-bf86923214fb@github.com> References: <2IKx6cpc-IGP3jZtr0s2I14BWM6ptyFD26szPl3b1ng=.9d956b98-dfe6-4a45-a371-bf86923214fb@github.com> Message-ID: On Thu, 22 Oct 2020 17:04:34 GMT, Maurizio Cimadamore wrote: >> This patch contains the changes associated with the first incubation round of the foreign linker access API >> (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and associated pull request [3]). >> >> The main goal of this API is to provide a way to call native functions from Java code without the need of intermediate JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. I suggest reading the writeup [4] I put together a few weeks ago, which illustrates what the foreign linker support is, and how it should be used by clients. >> >> Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews. For this reason, I'm attaching a webrev which contains only the differences between this PR and the memory access PR. I will be periodically uploading new webrevs, as new iterations come out, to try and make the life of reviewers as simple as possible. >> >> A big thanks to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the HotSpot changes you see here, and without their help, the foreign linker support wouldn't be what it is today. As usual, a big thanks to Paul Sandoz, who provided many insights (often by trying the bits first hand).
>> >> Thanks >> Maurizio >> >> Webrev: >> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev >> >> Javadoc: >> >> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html >> >> Specdiff (relative to [3]): >> >> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html >> >> CSR: >> >> https://bugs.openjdk.java.net/browse/JDK-8254232 >> >> >> >> ### API Changes >> >> The API changes are actually rather slim: >> >> * `LibraryLookup` >> * This class allows clients to look up symbols in native libraries; the interface is fairly simple; you can load a library by name, or absolute path, and then look up symbols on that library. >> * `FunctionDescriptor` >> * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory layouts for the function arguments/return type. A function descriptor is used to describe the signature of a native function. >> * `CLinker` >> * This is the real star of the show. A `CLinker` has two main methods: `downcallHandle` and `upcallStub`; the first takes a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a `MethodHandle` instance which can be used to call the target native symbol. The second takes an existing method handle, and a `FunctionDescriptor` and returns a new `MemorySegment` corresponding to a code stub allocated by the VM which acts as a trampoline from native code to the user-provided method handle. This is very useful for implementing upcalls. >> * This class also contains the various layout constants that should be used by clients when describing native signatures (e.g. `C_LONG` and friends); these layouts contain additional ABI classification information (in the form of layout attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the native call to take place.
>> * Finally, this class provides some helper functions e.g. so that clients can convert Java strings into C strings and back. >> * `NativeScope` >> * This is a helper class which allows clients to group together logically related allocations; that is, rather than allocating separate memory segments using separate *try-with-resources* constructs, a `NativeScope` allows clients to use a _single_ block, and allocate all the required segments there. This is not only a usability boost, but also a performance boost, since not all allocation requests will be turned into `malloc` calls. >> * `MemorySegment` >> * Only one method added here - namely `handoff(NativeScope)` which allows a segment to be transferred onto an existing native scope. >> >> ### Safety >> >> The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle. For instance, the description of the native signature might be wrong (e.g. have too many arguments) - and the runtime has, in the general case, no way to detect such mismatches. For these reasons, obtaining a `CLinker` instance is a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as is the case for other restricted methods in the foreign memory API). >> >> ### Implementation changes >> >> The Java changes associated with `LibraryLookup` are relatively straightforward; the only interesting thing to note here is that library loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same restrictions which apply to JNI library loading (e.g. the same library cannot be loaded by different classloaders).
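The `NativeScope` idea from the API list above (one backing block, with each allocation request served as a slice of it instead of a separate `malloc`) can be sketched in plain Java without the incubator module. This is not the `jdk.incubator.foreign` API: `SliceScope`, `allocate` and `allocatedBytes` are hypothetical names, the backing store is an ordinary direct `ByteBuffer`, and only the bounded (size known up front) variant is modelled.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of a bounded scope; not the incubator NativeScope.
final class SliceScope implements AutoCloseable {
    private final ByteBuffer block;  // the single backing allocation
    private int offset;              // next free byte within the block
    private boolean open = true;

    SliceScope(int capacityBytes) {
        this.block = ByteBuffer.allocateDirect(capacityBytes);
    }

    /** Returns a slice of the backing block; alignment must be a power of two. */
    ByteBuffer allocate(int bytes, int alignment) {
        if (!open) {
            throw new IllegalStateException("scope is closed");
        }
        if (bytes < 0 || alignment <= 0 || Integer.bitCount(alignment) != 1) {
            throw new IllegalArgumentException("bad size or alignment");
        }
        // Round the current offset up to the requested alignment
        // (alignment is relative to the start of the block in this sketch).
        int start = (offset + alignment - 1) & -alignment;
        if ((long) start + bytes > block.capacity()) {
            throw new IllegalStateException("scope exhausted");
        }
        offset = start + bytes;
        ByteBuffer view = block.duplicate();
        view.position(start);
        view.limit(start + bytes);
        return view.slice();  // shares memory with the block, no new allocation
    }

    int allocatedBytes() {
        return offset;
    }

    @Override
    public void close() {
        // The real API would release native memory here; this sketch only
        // refuses further allocations and leaves the buffer to the GC.
        open = false;
    }

    public static void main(String[] args) {
        try (SliceScope scope = new SliceScope(64)) {
            ByteBuffer a = scope.allocate(3, 1);
            ByteBuffer b = scope.allocate(8, 8);
            System.out.println(a.capacity() + " " + b.capacity() + " " + scope.allocatedBytes());
        }
    }
}
```

All slices live and die with the one `try` block, which is the usability point made above; the performance point is that `allocate` is just pointer arithmetic on an existing block.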
>> >> As for `NativeScope` the changes are again relatively straightforward; it is an API which sits neatly on top of the foreign memory access API, providing some kind of allocation service which shares the same underlying memory segment(s), and turns an allocation request into a segment slice, which is a much less expensive operation. `NativeScope` comes in two variants: there are native scopes for which the allocation size is known a priori, and native scopes which can grow - these two schemes are implemented by two separate subclasses of `AbstractNativeScopeImpl`. >> >> Of course the bulk of the changes are to support the `CLinker` downcall/upcall routines. These changes cut pretty deep into the JVM; I'll briefly summarize the goal of some of these changes - for further details, Jorn has put together a detailed writeup which explains the rationale behind the VM support, with some references to the code [5]. >> >> The main idea behind foreign linker is to infer, given a Java method type (expressed as a `MethodType` instance) and the description of the signature of a native function (expressed as a `FunctionDescriptor` instance) a _recipe_ that can be used to turn a Java call into the corresponding native call targeting the requested native function. >> >> This inference scheme can be defined in a pretty straightforward fashion by looking at the various ABI specifications (for instance, see [6] for the SysV ABI, which is the one used on Linux/Mac). The various `CallArranger` classes, of which we have a flavor for each supported platform, do exactly that kind of inference. >> >> For the inference process to work, we need to attach extra information to memory layouts; it is no longer sufficient to know e.g. that a layout is 32/64 bits - we need to know whether it is meant to represent a floating point value, or an integral value; this knowledge is required because floating point values are passed in different registers by most ABIs.
For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which contain the required classification attributes (e.g. a `CLinker.TypeKind` enum value). The runtime extracts this attribute, and performs classification accordingly. >> >> A native call is decomposed into a sequence of basic, primitive operations, called `Binding` (see the great javadoc on the `Binding.java` class for more info). There are many such bindings - for instance the `Move` binding is used to move a value into a specific machine register/stack slot. So, the main job of the various `CallArranger` classes is to determine, given a Java `MethodType` and `FunctionDescriptor` what is the set of bindings associated with the downcall/upcall. >> >> At the heart of the foreign linker support is the `ProgrammableInvoker` class. This class effectively generates a `MethodHandle` which follows the steps described by the various bindings obtained by `CallArranger`. There are actually various strategies to interpret these bindings - listed below: >> >> * basic interpreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see `BindingInterpreter`), except for the `Move` bindings. For these bindings, the move is implemented by allocating a *buffer* (whose size is ABI specific) and by moving all the lowered values into positions within this buffer. The buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them in their expected registers/stack slots (note that each position in the buffer corresponds to a different register). This is the most general invocation mode, the more "customizable" one, but also the slowest - since for every call there is some extra allocation which takes place.
>> >> * specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based interpreter, we generate a method handle chain which effectively interprets all the bindings (again, except `Move` ones). >> >> * intrinsified mode; this is typically used in combination with the specialized interpreted mode described above (although it can also be used with the Java-based binding interpreter). The goal here is to remove the buffer allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (e.g. the set of bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field), then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go for an intermediate buffer. This gives us back performance that is on par with JNI. >> >> For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there. We plan to add support for intrinsified modes there as well, which should considerably boost performance (probably well beyond what JNI can offer at the moment, since the upcall support in JNI is not very well optimized). >> >> Again, for more reading on the internals of the foreign linker support, please refer to [5]. >> >> #### Test changes >> >> Many new tests have been added to validate the foreign linker support; we have high level tests (see `StdLibTest`) which aim at testing the linker from the perspective of code that clients could write. But we also have deeper combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI implementation.
There are also some great tests (see the `callarranger` folder) which test the various `CallArranger`s for all the possible platforms; these tests adopt more of a white-box approach - that is, instead of treating the linker machinery as a black box and verifying that the support works by checking that the native call returned the results we expected, these tests aim at checking that the set of bindings generated by the call arranger is correct. This also means that we can test the classification logic for Windows, Mac and Linux regardless of the platform we're executing on. >> >> Some additional microbenchmarks have been added to compare the performance of downcall/upcall with JNI. >> >> [1] - https://openjdk.java.net/jeps/389 >> [2] - https://openjdk.java.net/jeps/393 >> [3] - https://git.openjdk.java.net/jdk/pull/548 >> [4] - https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md >> [5] - http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html > > Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: > > Fix whitespaces Changes requested by ihse (Reviewer). make/modules/java.base/gensrc/GensrcScopedMemoryAccess.gmk line 148: > 146: > 147: $(DEST): $(BUILD_TOOLS_JDK) $(SCOPED_MEMORY_ACCESS_TEMPLATE) $(SCOPED_MEMORY_ACCESS_BIN_TEMPLATE) > 148: $(MKDIR) -p $(SCOPED_MEMORY_ACCESS_GENSRC_DIR) Please use `$(call MakeDir, $(SCOPED_MEMORY_ACCESS_GENSRC_DIR))` instead. make/modules/java.base/gensrc/GensrcScopedMemoryAccess.gmk line 34: > 32: SCOPED_MEMORY_ACCESS_TEMPLATE := $(SCOPED_MEMORY_ACCESS_SRC_DIR)/X-ScopedMemoryAccess.java.template > 33: SCOPED_MEMORY_ACCESS_BIN_TEMPLATE := $(SCOPED_MEMORY_ACCESS_SRC_DIR)/X-ScopedMemoryAccess-bin.java.template > 34: DEST := $(SCOPED_MEMORY_ACCESS_GENSRC_DIR)/ScopedMemoryAccess.java `DEST` is a very generic and not really informative name.
Maybe `SCOPED_MEMORY_ACCESS_GENSRC_DEST` to fit in with the rest of the names? And/or, maybe, to cut down on the excessive length, shorten `SCOPED_MEMORY_ACCESS` to `SMA` in all variables. make/modules/java.base/gensrc/GensrcScopedMemoryAccess.gmk line 26: > 24: # > 25: > 26: GENSRC_SCOPED_MEMORY_ACCESS := This variable does not seem to be used. A left-over from previous iterations? Also, please cut down a bit on the consecutive empty lines. ------------- PR: https://git.openjdk.java.net/jdk/pull/634 From jbhateja at openjdk.java.net Fri Oct 23 12:05:49 2020 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Fri, 23 Oct 2020 12:05:49 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v10] In-Reply-To: References: Message-ID: <8Gz1zaDPTixfCBIfO8-_CxIXUNvweZkas3FZq0voghA=.ee27fdf6-49a2-4e88-8ecf-a07d86414a9f@github.com> > Summary: > > 1) Partial in-lining technique avoids call overhead penalty for small array copy operations with size less than 32 bytes. > 2) At runtime, a conditional check based on copy length either calls an array-copy stub or executes an optimized instruction sequence using AVX-512 masked instructions emitted at the call site. > 3) New runtime flag ArrayCopyPartialInlineSize=0/32(default)/64 bytes determines the maximum size for partial in-lining. > 4) Based on the perf results seen in benchmarks currently partial in-lining is performed only for arraycopy involving sub-word types (bool/byte/char/short). Once PR-61 gets integrated we can extend this patch to cover all the primitive types. 
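The length-based dispatch in points 1) to 3) above can be pictured at the Java level. The following is only a conceptual sketch, not what C2 emits: the class and method names are hypothetical, the plain loop stands in for the AVX-512 masked load/store sequence, `System.arraycopy` stands in for the array-copy stub, and treating exactly 32 bytes as still "small" is an assumption based on the flag being described as a maximum size.

```java
// Conceptual model of partial inlining for byte[] copies; names are hypothetical.
final class PartialInlineSketch {
    // Assumed to mirror the default of ArrayCopyPartialInlineSize (32 bytes).
    static final int PARTIAL_INLINE_BYTES = 32;

    /** Copies the range and returns true if the short "inlined" path was taken. */
    static boolean copy(byte[] src, int srcPos, byte[] dst, int dstPos, int length) {
        if (length < 0 || srcPos < 0 || dstPos < 0
                || srcPos > src.length - length || dstPos > dst.length - length) {
            throw new IndexOutOfBoundsException();
        }
        if (length <= PARTIAL_INLINE_BYTES) {
            // Short path: stand-in for the masked vector copy emitted at the call site.
            if (src == dst && srcPos < dstPos) {
                // Overlapping forward move: copy backwards to keep arraycopy semantics.
                for (int i = length - 1; i >= 0; i--) {
                    dst[dstPos + i] = src[srcPos + i];
                }
            } else {
                for (int i = 0; i < length; i++) {
                    dst[dstPos + i] = src[srcPos + i];
                }
            }
            return true;
        }
        // Long path: stand-in for the call to the array-copy stub.
        System.arraycopy(src, srcPos, dst, dstPos, length);
        return false;
    }

    public static void main(String[] args) {
        byte[] small = {1, 2, 3, 4, 5};
        byte[] out = new byte[5];
        System.out.println("short path: " + copy(small, 0, out, 0, small.length));
        System.out.println("short path: " + copy(new byte[70], 0, new byte[70], 0, 70));
    }
}
```

The benefit claimed in the summary comes from skipping the call on the short path; in this sketch that path is an ordinary loop, so it demonstrates the control flow only, not the speed-up.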
>
> Performance Results:
> System : CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz
> Micros : test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java
> ArrayCopyPartialInlineSize : 32
>
> JMH | Block Size | Baseline (ns/op) | Partial Inlining (ns/op) | Gain
> -- | -- | -- | -- | --
> ArrayCopyAligned.testByte | 1 | 5.417 | 2.696 | 2.009272997
> ArrayCopyAligned.testByte | 3 | 5.494 | 2.702 | 2.03330866
> ArrayCopyAligned.testByte | 5 | 5.417 | 2.637 | 2.05422829
> ArrayCopyAligned.testByte | 10 | 5.343 | 2.703 | 1.976692564
> ArrayCopyAligned.testByte | 20 | 5.837 | 2.636 | 2.214339909
> ArrayCopyAligned.testByte | 70 | 5.86 | 6 | 0.976666667
> ArrayCopyAligned.testByte | 150 | 6.766 | 6.906 | 0.979727773
> ArrayCopyAligned.testByte | 300 | 7.605 | 7.952 | 0.956363179
> ArrayCopyAligned.testByte | 600 | 11.989 | 12.007 | 0.998500874
> ArrayCopyAligned.testByte | 1200 | 16.447 | 16.585 | 0.991679228
> ArrayCopyAligned.testChar | 1 | 5.02 | 2.828 | 1.775106082
> ArrayCopyAligned.testChar | 3 | 5.129 | 2.762 | 1.85698769
> ArrayCopyAligned.testChar | 5 | 5.041 | 2.762 | 1.82512672
> ArrayCopyAligned.testChar | 10 | 5.716 | 2.762 | 2.069514844
> ArrayCopyAligned.testChar | 20 | 5.111 | 5.399 | 0.946656788
> ArrayCopyAligned.testChar | 70 | 6.271 | 6.242 | 1.004645947
> ArrayCopyAligned.testChar | 150 | 7.45 | 7.599 | 0.980392157
> ArrayCopyAligned.testChar | 300 | 9.904 | 10.112 | 0.97943038
> ArrayCopyAligned.testChar | 600 | 17.131 | 17.167 | 0.997902953
> ArrayCopyAligned.testChar | 1200 | 29.556 | 29.851 | 0.990117584
> ArrayCopyUnalignedBoth.testByte | 1 | 5.419 | 2.702 | 2.005551443
> ArrayCopyUnalignedBoth.testByte | 3 | 5.558 | 2.636 | 2.108497724
> ArrayCopyUnalignedBoth.testByte | 5 | 5.43 | 2.636 | 2.059939302
> ArrayCopyUnalignedBoth.testByte | 10 | 5.378 | 2.637 | 2.039438756
> ArrayCopyUnalignedBoth.testByte | 20 | 5.914 | 2.636 | 2.243550836
> ArrayCopyUnalignedBoth.testByte | 70 | 5.882 | 5.954 | 0.987907289
> ArrayCopyUnalignedBoth.testByte | 150 | 6.784 | 6.88 | 0.986046512
> ArrayCopyUnalignedBoth.testByte | 300 | 7.635 | 7.968 | 0.958207831
> ArrayCopyUnalignedBoth.testByte | 600 | 12.226 | 12.129 | 1.007997362
> ArrayCopyUnalignedBoth.testByte | 1200 | 16.992 | 20.717 | 0.820195974
> ArrayCopyUnalignedBoth.testChar | 1 | 5.019 | 2.828 | 1.774752475
> ArrayCopyUnalignedBoth.testChar | 3 | 5.163 | 2.763 | 1.868621064
> ArrayCopyUnalignedBoth.testChar | 5 | 5.042 | 2.827 | 1.783516095
> ArrayCopyUnalignedBoth.testChar | 10 | 5.718 | 2.828 | 2.021923621
> ArrayCopyUnalignedBoth.testChar | 20 | 5.111 | 5.404 | 0.945780903
> ArrayCopyUnalignedBoth.testChar | 70 | 6.367 | 6.235 | 1.02117081
> ArrayCopyUnalignedBoth.testChar | 150 | 7.367 | 8.269 | 0.890917886
> ArrayCopyUnalignedBoth.testChar | 300 | 10.358 | 10.642 | 0.973313287
> ArrayCopyUnalignedBoth.testChar | 600 | 20.84 | 17.522 | 1.189361945
> ArrayCopyUnalignedBoth.testChar | 1200 | 31.895 | 31.892 | 1.000094067
> ArrayCopyUnalignedDst.testByte | 1 | 5.455 | 2.637 | 2.068638604
> ArrayCopyUnalignedDst.testByte | 3 | 5.562 | 2.702 | 2.058475204
> ArrayCopyUnalignedDst.testByte | 5 | 5.427 | 2.702 | 2.008512213
> ArrayCopyUnalignedDst.testByte | 10 | 5.367 | 2.696 | 1.990727003
> ArrayCopyUnalignedDst.testByte | 20 | 5.839 | 2.637 | 2.214258627
> ArrayCopyUnalignedDst.testByte | 70 | 5.888 | 5.968 | 0.986595174
> ArrayCopyUnalignedDst.testByte | 150 | 6.785 | 6.773 | 1.001771741
> ArrayCopyUnalignedDst.testByte | 300 | 7.606 | 7.972 | 0.954089313
> ArrayCopyUnalignedDst.testByte | 600 | 11.986 | 21.195 | 0.565510734
> ArrayCopyUnalignedDst.testByte | 1200 | 16.54 | 16.784 | 0.985462345
> ArrayCopyUnalignedDst.testChar | 1 | 5.02 | 2.827 | 1.775733994
> ArrayCopyUnalignedDst.testChar | 3 | 5.131 | 2.762 | 1.857711803
> ArrayCopyUnalignedDst.testChar | 5 | 5.038 | 2.762 | 1.82404055
> ArrayCopyUnalignedDst.testChar | 10 | 5.718 | 2.762 | 2.070238957
> ArrayCopyUnalignedDst.testChar | 20 | 5.113 | 5.401 | 0.946676541
> ArrayCopyUnalignedDst.testChar | 70 | 6.222 | 6.214 | 1.001287416
> ArrayCopyUnalignedDst.testChar | 150 | 7.367 | 8.125 | 0.906707692
> ArrayCopyUnalignedDst.testChar | 300 | 10.204 | 10.082 | 1.012100774
> ArrayCopyUnalignedDst.testChar | 600 | 16.978 | 17.135 | 0.990837467
> ArrayCopyUnalignedDst.testChar | 1200 | 32.351 | 31.996 | 1.011095137
> ArrayCopyUnalignedSrc.testByte | 1 | 5.414 | 2.696 | 2.008160237
> ArrayCopyUnalignedSrc.testByte | 3 | 5.494 | 2.637 | 2.083428138
> ArrayCopyUnalignedSrc.testByte | 5 | 5.431 | 2.637 | 2.059537353
> ArrayCopyUnalignedSrc.testByte | 10 | 5.344 | 2.703 | 1.977062523
> ArrayCopyUnalignedSrc.testByte | 20 | 5.834 | 2.696 | 2.163946588
> ArrayCopyUnalignedSrc.testByte | 70 | 5.883 | 6.009 | 0.979031453
> ArrayCopyUnalignedSrc.testByte | 150 | 6.729 | 6.87 | 0.979475983
> ArrayCopyUnalignedSrc.testByte | 300 | 7.603 | 7.97 | 0.953952321
> ArrayCopyUnalignedSrc.testByte | 600 | 12.004 | 12.16 | 0.987171053
> ArrayCopyUnalignedSrc.testByte | 1200 | 16.534 | 16.643 | 0.9934507
> ArrayCopyUnalignedSrc.testChar | 1 | 5.021 | 2.762 | 1.81788559
> ArrayCopyUnalignedSrc.testChar | 3 | 5.13 | 2.762 | 1.857349747
> ArrayCopyUnalignedSrc.testChar | 5 | 5.042 | 2.827 | 1.783516095
> ArrayCopyUnalignedSrc.testChar | 10 | 5.726 | 2.761 | 2.073886273
> ArrayCopyUnalignedSrc.testChar | 20 | 5.112 | 5.401 | 0.94649139
> ArrayCopyUnalignedSrc.testChar | 70 | 6.113 | 6.227 | 0.981692629
> ArrayCopyUnalignedSrc.testChar | 150 | 7.493 | 7.888 | 0.949923935
> ArrayCopyUnalignedSrc.testChar | 300 | 10.234 | 10.501 | 0.97457385
> ArrayCopyUnalignedSrc.testChar | 600 | 17.175 | 17.142 | 1.001925096
> ArrayCopyUnalignedSrc.testChar | 1200 | 31.926 | 31.987 | 0.998092975
>
> Detailed Reports:
> Baseline : [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt)
> WithOpt :
[http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt) Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: - JDK-8252848 : Replacing generic assembler routine evmovdqu with macro assembly routine calling type specific leaf level assembly functions. - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 - JDK-8252848 : Review comments resolution. - Merge remote-tracking branch 'upstream' into JDK-8252848 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 - Replacing explicit type checks with existing type checking routines - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 - 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions. 
------------- Changes: https://git.openjdk.java.net/jdk/pull/302/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=302&range=09 Stats: 549 lines in 27 files changed: 499 ins; 23 del; 27 mod Patch: https://git.openjdk.java.net/jdk/pull/302.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/302/head:pull/302 PR: https://git.openjdk.java.net/jdk/pull/302 From jbhateja at openjdk.java.net Fri Oct 23 12:05:51 2020 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Fri, 23 Oct 2020 12:05:51 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v9] In-Reply-To: <8Ryyxuf5P2D6WNyj4riYCTgN0U6WLrLpBmxhNbnmPpQ=.b2ed5660-99d0-49d1-83e0-8b2de518d7b8@github.com> References: <8Ryyxuf5P2D6WNyj4riYCTgN0U6WLrLpBmxhNbnmPpQ=.b2ed5660-99d0-49d1-83e0-8b2de518d7b8@github.com> Message-ID: On Thu, 22 Oct 2020 09:48:51 GMT, Ningsheng Jian wrote: >> Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. > > src/hotspot/share/opto/vectornode.hpp line 835: > >> 833: static VectorMaskGenNode* make(int opc, Node* src, const Type* ty, const Type* ety); >> 834: private: >> 835: const Type* _elemType; > > Will an additional field in the node valid after some optimizations, i.e. clone()? I think I know the ety, but I don't know the usage of ty. If so, do you need to have a new type like what TypeVect does for mask? As currently there is no support for mask registers in RA, for X86 long ideal type is sufficient for a mask producing node (def operand is a mask register) ; But for complete support returning Op_RegVMask as an ideal_reg() type for masked Ideal node should do the trick without creating an explicit new ideal Type for mask generating nodes. Spill sizes and number of slots may be different for X86 and ARM (SVE). 
Shallow copy during Node::clone should be sufficient here since encapsulated element type will be preserved. > src/hotspot/share/opto/vectornode.cpp line 775: > >> 773: VectorMaskGenNode* make(int opc, Node* src, const Type* ty, const Type* ety) { >> 774: return new VectorMaskGenNode(src, ty, ety); >> 775: } > > These are not used? This is a just a helper routine not used currently though. > src/hotspot/share/opto/vectornode.hpp line 826: > >> 824: class VectorMaskGenNode : public TypeNode { >> 825: public: >> 826: VectorMaskGenNode(Node* src, const Type* ty, const Type* ety): TypeNode(ty, 2), _elemType(ety) { > > Sorry, I don't quite understand the arguments here. What does 'src' mean to the mask? ty -> Node type , long in this case since for X86 mask register is 64 bit wide. ety -> Mask element type, currently used during LoadVectorMasked/StoreVectorMasked idealization to compute the block sizes for constant masks and replace masked vector operations with non-masked if block size is equal to vector size. Src has been replaced by a better name "length" used for mask computation. ------------- PR: https://git.openjdk.java.net/jdk/pull/302 From mdoerr at openjdk.java.net Fri Oct 23 12:13:49 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Fri, 23 Oct 2020 12:13:49 GMT Subject: RFR: 8255340: [s390] build failure after JDK-8255208 Message-ID: <5-PWZE5B-5kF1XTkF1uwvup2HuGHQy78n2N1b4pgA_o=.302b95bb-53b6-462a-8416-bf8cc22cb9c8@github.com> vm_version_s390.cpp:839:52: error: no matching function for call to 'Disassembler::decode(CodeBuffer*, u_char*&, u_char*&, outputStream*&)' This function was removed. I'm using the one without the CodeBuffer* argument (like on PPC64). 
------------- Commit messages: - 8255340: [s390] build failure after JDK-8255208 Changes: https://git.openjdk.java.net/jdk/pull/835/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=835&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255340 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/835.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/835/head:pull/835 PR: https://git.openjdk.java.net/jdk/pull/835 From shade at openjdk.java.net Fri Oct 23 12:25:37 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 23 Oct 2020 12:25:37 GMT Subject: RFR: 8255340: [s390] build failure after JDK-8255208 In-Reply-To: <5-PWZE5B-5kF1XTkF1uwvup2HuGHQy78n2N1b4pgA_o=.302b95bb-53b6-462a-8416-bf8cc22cb9c8@github.com> References: <5-PWZE5B-5kF1XTkF1uwvup2HuGHQy78n2N1b4pgA_o=.302b95bb-53b6-462a-8416-bf8cc22cb9c8@github.com> Message-ID: On Fri, 23 Oct 2020 12:07:45 GMT, Martin Doerr wrote: > vm_version_s390.cpp:839:52: error: no matching function for call to 'Disassembler::decode(CodeBuffer*, u_char*&, u_char*&, outputStream*&)' > This function was removed. I'm using the one without the CodeBuffer* argument (like on PPC64). Looks fine to me. ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/835 From mdoerr at openjdk.java.net Fri Oct 23 12:25:37 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Fri, 23 Oct 2020 12:25:37 GMT Subject: RFR: 8255340: [s390] build failure after JDK-8255208 In-Reply-To: References: <5-PWZE5B-5kF1XTkF1uwvup2HuGHQy78n2N1b4pgA_o=.302b95bb-53b6-462a-8416-bf8cc22cb9c8@github.com> Message-ID: On Fri, 23 Oct 2020 12:17:26 GMT, Aleksey Shipilev wrote: >> vm_version_s390.cpp:839:52: error: no matching function for call to 'Disassembler::decode(CodeBuffer*, u_char*&, u_char*&, outputStream*&)' >> This function was removed. I'm using the one without the CodeBuffer* argument (like on PPC64). > > Looks fine to me. 
Thanks for the review! ------------- PR: https://git.openjdk.java.net/jdk/pull/835 From mdoerr at openjdk.java.net Fri Oct 23 12:25:38 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Fri, 23 Oct 2020 12:25:38 GMT Subject: Integrated: 8255340: [s390] build failure after JDK-8255208 In-Reply-To: <5-PWZE5B-5kF1XTkF1uwvup2HuGHQy78n2N1b4pgA_o=.302b95bb-53b6-462a-8416-bf8cc22cb9c8@github.com> References: <5-PWZE5B-5kF1XTkF1uwvup2HuGHQy78n2N1b4pgA_o=.302b95bb-53b6-462a-8416-bf8cc22cb9c8@github.com> Message-ID: On Fri, 23 Oct 2020 12:07:45 GMT, Martin Doerr wrote: > vm_version_s390.cpp:839:52: error: no matching function for call to 'Disassembler::decode(CodeBuffer*, u_char*&, u_char*&, outputStream*&)' > This function was removed. I'm using the one without the CodeBuffer* argument (like on PPC64). This pull request has now been integrated. Changeset: 12daf2b6 Author: Martin Doerr URL: https://git.openjdk.java.net/jdk/commit/12daf2b6 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8255340: [s390] build failure after JDK-8255208 Reviewed-by: shade ------------- PR: https://git.openjdk.java.net/jdk/pull/835 From coleenp at openjdk.java.net Fri Oct 23 13:11:41 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 23 Oct 2020 13:11:41 GMT Subject: RFR: 8255271: Avoid generating duplicate interpreter entries for subword types In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 18:19:59 GMT, Claes Redestad wrote: > Top-of-stack optimizations are generally missing for byte, char, short and boolean, and for several types of entry points we already avoid generating entries and instead redirect to the int variant. > > This patch removes generation of "specialized" variants for byte, char, short and boolean more thoroughly after verifying they all generate the same code as their int specialization. This slightly reduces overhead of generating the interpreter and the size thereof. Looks good to me. 
src/hotspot/share/interpreter/templateInterpreter.cpp line 112:

> 110: _entry[ztos] = ientry;
> 111: _entry[ctos] = ientry;
> 112: _entry[stos] = ientry;

next patch is to get rid of these entry points since I think nothing branches to them.

-------------

Marked as reviewed by coleenp (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/816

From fyang at openjdk.java.net Fri Oct 23 13:16:35 2020
From: fyang at openjdk.java.net (Fei Yang)
Date: Fri, 23 Oct 2020 13:16:35 GMT
Subject: RFR: 8255287: aarch64: fix SVE patterns for vector shift count
In-Reply-To:
References:
Message-ID:

On Fri, 23 Oct 2020 10:05:31 GMT, Andrew Dinn wrote:

>> SVE patterns for vector shift count cannot be matched due to bad matching rules.
>> Also code gen is not correct in certain cases for vlslS_imm and vlsrS_imm.
>> Please refer to JDK-8255287 for details.
>> Patch passed tier1 tests using QEMU system emulator which supports SVE.
>
> These changes look ok.

Thanks for the quick review :-) @adinn

-------------

PR: https://git.openjdk.java.net/jdk/pull/827

From rrich at openjdk.java.net Fri Oct 23 13:21:38 2020
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Fri, 23 Oct 2020 13:21:38 GMT
Subject: RFR: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing
In-Reply-To:
References:
Message-ID: <3NyYY7GWmEuNZIVsQctAsrM0N6oR0RepvdQMdBDw0k4=.ea03bdd0-6cc6-4423-8d7c-d11f1c7d47b0@github.com>

On Fri, 23 Oct 2020 10:39:10 GMT, Erik Österlund wrote:

>> @reinrich Since you wrote the escape barrier, I thought I'd ping you in case you are interested. This makes the mechanism more robust and reliable, w.r.t. concurrent stack processing. No issue observed, but looks wrong with the new vector API deoptimization code that was integrated recently. This makes it more future-proof and easy to reason about.
>
> Forgot to mention, but I tested this from tier1-5, to sanity check that the new solution doesn't introduce any new issues.

@fisk thanks for pinging.
I remember reading about a new vector API a while ago and noticed that it potentially interferes with escape barriers but then I lost track. I'll look at this. ------------- PR: https://git.openjdk.java.net/jdk/pull/832 From fyang at openjdk.java.net Fri Oct 23 13:21:41 2020 From: fyang at openjdk.java.net (Fei Yang) Date: Fri, 23 Oct 2020 13:21:41 GMT Subject: Integrated: 8255287: aarch64: fix SVE patterns for vector shift count In-Reply-To: References: Message-ID: <4JzgyHd1mK4S3onigtOC9s8nhwFaFTS0xpswG1rAyNM=.4192a337-a851-4f77-8d17-18f8bb3fbb56@github.com> On Fri, 23 Oct 2020 09:35:11 GMT, Fei Yang wrote: > SVE patterns for vector shift count cannot be matched due to bad matching rules. > Also code gen is not correct in certain cases for vlslS_imm and vlsrS_imm. > Please refer to JDK-8255287 for details. > Patch passed tier1 tests using QEMU system emulator which supports SVE. This pull request has now been integrated. Changeset: 5ec1b80c Author: Fei Yang URL: https://git.openjdk.java.net/jdk/commit/5ec1b80c Stats: 149 lines in 7 files changed: 109 ins; 0 del; 40 mod 8255287: aarch64: fix SVE patterns for vector shift count Co-authored-by: Yanhong Zhu Reviewed-by: adinn ------------- PR: https://git.openjdk.java.net/jdk/pull/827 From redestad at openjdk.java.net Fri Oct 23 13:28:37 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Fri, 23 Oct 2020 13:28:37 GMT Subject: RFR: 8255271: Avoid generating duplicate interpreter entries for subword types In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 13:08:27 GMT, Coleen Phillimore wrote: >> Top-of-stack optimizations are generally missing for byte, char, short and boolean, and for several types of entry points we already avoid generating entries and instead redirect to the int variant. >> >> This patch removes generation of "specialized" variants for byte, char, short and boolean more thoroughly after verifying they all generate the same code as their int specialization. 
This slightly reduces overhead of generating the interpreter and the size thereof. > > src/hotspot/share/interpreter/templateInterpreter.cpp line 112: > >> 110: _entry[ztos] = ientry; >> 111: _entry[ctos] = ientry; >> 112: _entry[stos] = ientry; > > next patch is to get rid of these entry points since I think nothing branches to them. Thanks for reviewing. I'll file an RFE to keep at this. I didn't want to keep pulling at this thread for this RFE since it seems like quite a chunk of code will be unravelled here. ------------- PR: https://git.openjdk.java.net/jdk/pull/816 From dcubed at openjdk.java.net Fri Oct 23 13:57:35 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 23 Oct 2020 13:57:35 GMT Subject: RFR: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing In-Reply-To: <3NyYY7GWmEuNZIVsQctAsrM0N6oR0RepvdQMdBDw0k4=.ea03bdd0-6cc6-4423-8d7c-d11f1c7d47b0@github.com> References: <3NyYY7GWmEuNZIVsQctAsrM0N6oR0RepvdQMdBDw0k4=.ea03bdd0-6cc6-4423-8d7c-d11f1c7d47b0@github.com> Message-ID: On Fri, 23 Oct 2020 13:18:53 GMT, Richard Reingruber wrote: >> Forgot to mention, but I tested this from tier1-5, to sanity check that the new solution doesn't introduce any new issues. > > @fisk thanks for pinging. I remember reading about a new vector API a while ago and noticed that it potentially interferes with escape barriers but then I lost track. I'll look at this. Thanks for adding the tier testing info. Sounds like there is no need for new tests since this is "just" a RAII object to reinforce existing code interactions, but please confirm that... 
------------- PR: https://git.openjdk.java.net/jdk/pull/832 From mcimadamore at openjdk.java.net Fri Oct 23 14:09:39 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Fri, 23 Oct 2020 14:09:39 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v12] In-Reply-To: References: <2IKx6cpc-IGP3jZtr0s2I14BWM6ptyFD26szPl3b1ng=.9d956b98-dfe6-4a45-a371-bf86923214fb@github.com> Message-ID: On Fri, 23 Oct 2020 11:02:11 GMT, Magnus Ihse Bursie wrote: >> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix whitespaces > > Changes requested by ihse (Reviewer). @magicus the files you commented on are not part of this PR, but they are introduced as part of: https://git.openjdk.java.net/jdk/pull/548 (you seemed to have approved the changes there - but it's also likely that this PR doesn't include the latest changes in that PR). Sorry for the confusion - but please do report any comment you have on the build changes on that PR! ------------- PR: https://git.openjdk.java.net/jdk/pull/634 From tschatzl at openjdk.java.net Fri Oct 23 15:29:41 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 23 Oct 2020 15:29:41 GMT Subject: RFR: 8255298: Remove SurvivorAlignmentInBytes functionality Message-ID: Hi all, can I have reviews to remove the SurvivorAlignmentInBytes functionality? It has not been in use for a long time if ever, and can be removed. Searching the web also indicates that apart from the usual lists of all options and CRs it is never mentioned. SurvivorAlignmentInBytes is an experimental option so no further process is required. 
Testing: tier1-5 Thanks, Thomas ------------- Commit messages: - Initial commit Changes: https://git.openjdk.java.net/jdk/pull/838/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=838&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255298 Stats: 1420 lines in 24 files changed: 0 ins; 1414 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/838.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/838/head:pull/838 PR: https://git.openjdk.java.net/jdk/pull/838 From redestad at openjdk.java.net Fri Oct 23 15:41:37 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Fri, 23 Oct 2020 15:41:37 GMT Subject: Integrated: 8255271: Avoid generating duplicate interpreter entries for subword types In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 18:19:59 GMT, Claes Redestad wrote: > Top-of-stack optimizations are generally missing for byte, char, short and boolean, and for several types of entry points we already avoid generating entries and instead redirect to the int variant. > > This patch removes generation of "specialized" variants for byte, char, short and boolean more thoroughly after verifying they all generate the same code as their int specialization. This slightly reduces overhead of generating the interpreter and the size thereof. This pull request has now been integrated. 
Changeset: cc861134 Author: Claes Redestad URL: https://git.openjdk.java.net/jdk/commit/cc861134 Stats: 75 lines in 5 files changed: 32 ins; 27 del; 16 mod 8255271: Avoid generating duplicate interpreter entries for subword types Reviewed-by: iklam, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/816 From shade at openjdk.java.net Fri Oct 23 15:55:39 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 23 Oct 2020 15:55:39 GMT Subject: RFR: 8255298: Remove SurvivorAlignmentInBytes functionality In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 15:16:57 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews to remove the SurvivorAlignmentInBytes functionality? It has not been in use for a long time if ever, and can be removed. Searching the web also indicates that apart from the usual lists of all options and CRs it is never mentioned. > > SurvivorAlignmentInBytes is an experimental option so no further process is required. > > Testing: tier1-5 > > Thanks, > Thomas Nifty cleanup! One nit I see: src/hotspot/share/gc/parallel/psPromotionLAB.inline.hpp line 36: > 34: // assert(_state != flushed, "Sanity"); > 35: HeapWord* obj = top(); > 36: if (obj == NULL) { `obj` cannot be `NULL` now? Thus the following null-check is redundant? ------------- Changes requested by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/838 From gziemski at openjdk.java.net Fri Oct 23 16:19:40 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Fri, 23 Oct 2020 16:19:40 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 20:30:54 GMT, Thomas Stuefe wrote: >> src/hotspot/os/posix/signals_posix.cpp line 1425: >> >>> 1423: for (int i = 0; i < NSIG + 1; i++) { >>> 1424: sigaction(i, &defaulthandler, NULL); >>> 1425: } >> >> Some issues: >> - 0 is not a valid signal number >> - NSIG is obscure; may be ridiculously large (e.g. 
AIX I believe 1024) and still may not contain all signals for some platforms (e.g. real time signals on BSD (?) are not included)
>> - You now flatten here all signal handlers across the board. What if some third party app had installed their own? I would only reset those which had been originally installed for hotspot usage (beware ReduceSignalUsage).
>>
>> Which brings me to the point that if we fail to re-raise the original error signal, now we run with signal handlers completely disabled :( ... see my original concerns.
>
> Thinking further, we probably want to just restore the default handler for the error signal that happened. And we also want to do this only for synchronous error signals (segv, sigill, sigbus, sigfpe), and possibly only if they are "real" (had not been sent by kill or pthread_kill).

> Some issues:
>
> * 0 is not a valid signal number

Right, I used to start at 1, but then copied the "for loop" code from another part of existing code for "NSIG" and forgot to fix the start value. I will fix that.

> * NSIG is obscure; may be ridiculously large (e.g. AIX I believe 1024) and still may not contain all signals for some platforms (e.g. real time signals on BSD (?) are not included)

1. The fix will work for any NSIG value, why would its value be relevant here? That's the value we use anywhere else, so if it's wrong here, then it's wrong everywhere else and we have a bigger issue to deal with here than just this fix.
2. We are not accounting for real time signals on BSD anywhere else, so why do we need to do it here?

> * You now flatten here all signal handlers across the board. What if some third party app had installed their own? I would only reset those which had been originally installed for hotspot usage (beware ReduceSignalUsage).

The purpose of the "UseOSErrorReporting" flag is to let the process die, so that it can be handled by the OS. If we let third party app signal handlers jump in, we can't guarantee that "UseOSErrorReporting" does its job.
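To make the two positions in this exchange concrete, here is a minimal POSIX sketch of both variants: resetting every handler to the default (the approach under review, but starting at signal 1) and resetting only one signal (the narrower alternative suggested in the review). The function names `reset_all_handlers_to_default` and `reset_handler_to_default` are hypothetical and this is not the code from `signals_posix.cpp`; it also assumes `NSIG` is provided by `<signal.h>`, which holds on Linux and macOS but is not guaranteed by POSIX.

```cpp
#include <cassert>
#include <signal.h>

// Reset a single signal to its default disposition.
// Returns false if the OS rejects the request (e.g. SIGKILL or SIGSTOP).
inline bool reset_handler_to_default(int sig) {
  assert(sig > 0 && sig < NSIG);  // 0 is not a valid signal number
  struct sigaction dfl = {};
  sigemptyset(&dfl.sa_mask);
  dfl.sa_flags = 0;
  dfl.sa_handler = SIG_DFL;
  return sigaction(sig, &dfl, nullptr) == 0;
}

// Reset every signal in [1, NSIG) to its default disposition.
// Failures for signals that cannot be changed are ignored (best effort).
// Returns the number of signals that were actually reset.
inline int reset_all_handlers_to_default() {
  int reset_count = 0;
  for (int sig = 1; sig < NSIG; ++sig) {
    if (reset_handler_to_default(sig)) {
      ++reset_count;
    }
  }
  return reset_count;
}
```

In the narrower variant, a crash path would call `reset_handler_to_default(sig)` for the signal being handled and then re-raise it, leaving third-party handlers for other signals in place; the blanket variant trades that for simplicity.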
> Which brings me to the point that if we fail to re-raise the original error signal, now we run with signal handlers completely disabled :( ... see my original concerns.

I will loosely quote what David Holmes likes to say about signal handlers and hope I get it right: "this is a best effort, nothing is guaranteed in signal handlers". By that I mean yes, you are right that we can't guarantee correct behavior, but at this point of process lifecycle we caught the crash and produced hs_err log, anything else is just great if it works and no big loss if things go wrong now (how bad exactly can things go at this point?). From my experience, however, it works.

-------------

PR: https://git.openjdk.java.net/jdk/pull/813

From gziemski at openjdk.java.net Fri Oct 23 16:23:38 2020
From: gziemski at openjdk.java.net (Gerard Ziemski)
Date: Fri, 23 Oct 2020 16:23:38 GMT
Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux)
In-Reply-To:
References:
Message-ID: <2ZmsAgqfhGGvOGL5GNHWPKOHPnEFlhr5iNOLswmPXKA=.cacdb3bf-d3b3-40fa-8ead-95e803546fcf@github.com>

On Fri, 23 Oct 2020 16:17:21 GMT, Gerard Ziemski wrote:

>> Thinking further, we probably want to just restore the default handler for the error signal that happened. And we also want to do this only for synchronous error signals (segv, sigill, sigbus, sigfpe), and possibly only if they are "real" (had not been sent by kill or pthread_kill).
>
>> Some issues:
>>
>> * 0 is not a valid signal number
>
> Right, I used to start at 1, but then copied the "for loop" code from another part of existing code for "NSIG" and forgot to fix the start value. I will fix that.
>
>> * NSIG is obscure; may be ridiculously large (e.g. AIX I believe 1024) and still may not contain all signals for some platforms (e.g. real time signals on BSD (?) are not included)
>
> 1. The fix will work for any NSIG value, why would its value be relevant here?
That's the value we use anywhere else, so if it's wrong here, then it's wrong everywhere else and we have a bigger issue to deal with here than just this fix.
> 2. We are not accounting for real time signals on BSD anywhere else, so why do we need to do it here?
>
>> * You now flatten here all signal handlers across the board. What if some third party app had installed their own? I would only reset those which had been originally installed for hotspot usage (beware ReduceSignalUsage).
>
> The purpose of the "UseOSErrorReporting" flag is to let the process die, so that it can be handled by the OS. If we let third party app signal handlers jump in, we can't guarantee that "UseOSErrorReporting" does its job.
>
>> Which brings me to the point that if we fail to re-raise the original error signal, now we run with signal handlers completely disabled :( ... see my original concerns.
>
> I will loosely quote what David Holmes likes to say about signal handlers and hope I get it right: "this is a best effort, nothing is guaranteed in signal handlers". By that I mean yes, you are right that we can't guarantee correct behavior, but at this point of process lifecycle we caught the crash and produced hs_err log, anything else is just great if it works and no big loss if things go wrong now (how bad exactly can things go at this point?). From my experience, however, it works.

> Thinking further, we probably want to just restore the default handler for the error signal that happened. And we also want to do this only for synchronous error signals (segv, sigill, sigbus, sigfpe), and possibly only if they are "real" (had not been sent by kill or pthread_kill).

Handling "UseOSErrorReporting" is the last thing we ask the process to do before letting it crash and get caught by the OS. I'm not sure we need to finesse here with which signals we want to flatten.
This keeps the code simple, though I will add a comment to "PosixSignals::clear_signal_handlers()" describing the assumptions and expectations. ------------- PR: https://git.openjdk.java.net/jdk/pull/813 From gziemski at openjdk.java.net Fri Oct 23 16:27:36 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Fri, 23 Oct 2020 16:27:36 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 19:52:04 GMT, Thomas Stuefe wrote: > I think "re-arm" is a misnomer. Nothing is re-armed if by "armed" you mean "a signal handler was installed". There was one installed before and there gets a different one installed now. > > I agree that "reset_signal_handlers" is just plain wrong, and has bugged me too. May I suggest "install_secondary_signal_handlers()" ? Or possibly "install_secondary_crash_handlers" which would have the added benefit of removing the term "signal" out of the shared name space, which is confusing on windows. I like "install_secondary_crash_handlers" ------------- PR: https://git.openjdk.java.net/jdk/pull/813 From tschatzl at openjdk.java.net Fri Oct 23 17:26:50 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 23 Oct 2020 17:26:50 GMT Subject: RFR: 8255298: Remove SurvivorAlignmentInBytes functionality [v2] In-Reply-To: References: Message-ID: > Hi all, > > can I have reviews to remove the SurvivorAlignmentInBytes functionality? It has not been in use for a long time if ever, and can be removed. Searching the web also indicates that apart from the usual lists of all options and CRs it is never mentioned. > > SurvivorAlignmentInBytes is an experimental option so no further process is required. 
> > Testing: tier1-5 > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: shade review ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/838/files - new: https://git.openjdk.java.net/jdk/pull/838/files/b1407f0b..878cac21 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=838&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=838&range=00-01 Stats: 5 lines in 2 files changed: 0 ins; 5 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/838.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/838/head:pull/838 PR: https://git.openjdk.java.net/jdk/pull/838 From shade at openjdk.java.net Fri Oct 23 17:26:51 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 23 Oct 2020 17:26:51 GMT Subject: RFR: 8255298: Remove SurvivorAlignmentInBytes functionality [v2] In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 17:23:42 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews to remove the SurvivorAlignmentInBytes functionality? It has not been in use for a long time if ever, and can be removed. Searching the web also indicates that apart from the usual lists of all options and CRs it is never mentioned. >> >> SurvivorAlignmentInBytes is an experimental option so no further process is required. >> >> Testing: tier1-5 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > shade review These look fine to me. ------------- Marked as reviewed by shade (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/838 From tschatzl at openjdk.java.net Fri Oct 23 17:26:52 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 23 Oct 2020 17:26:52 GMT Subject: RFR: 8255298: Remove SurvivorAlignmentInBytes functionality [v2] In-Reply-To: References: Message-ID: <5Gj8DiyNTQx622mxSunid3votfrGMGzS7twZg9DifQY=.b8386dcc-7ebe-45cc-8b90-15d8a82c5b70@github.com> On Fri, 23 Oct 2020 15:51:57 GMT, Aleksey Shipilev wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> shade review > > Nifty cleanup! One nit I see: Done, thanks for catching this. ------------- PR: https://git.openjdk.java.net/jdk/pull/838 From eosterlund at openjdk.java.net Fri Oct 23 17:51:38 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Fri, 23 Oct 2020 17:51:38 GMT Subject: RFR: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing In-Reply-To: References: <3NyYY7GWmEuNZIVsQctAsrM0N6oR0RepvdQMdBDw0k4=.ea03bdd0-6cc6-4423-8d7c-d11f1c7d47b0@github.com> Message-ID: On Fri, 23 Oct 2020 13:54:38 GMT, Daniel D. Daugherty wrote: > Thanks for adding the tier testing info. Sounds like there is no need for new > > tests since this is "just" a RAII object to reinforce existing code interactions, > > but please confirm that... Exactly; confirmed. ------------- PR: https://git.openjdk.java.net/jdk/pull/832 From dfuchs at openjdk.java.net Fri Oct 23 18:33:35 2020 From: dfuchs at openjdk.java.net (Daniel Fuchs) Date: Fri, 23 Oct 2020 18:33:35 GMT Subject: RFR: 8255299: Drop explicit zeroing at instantiation of Atomic* objects In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 09:14:48 GMT, Daniel Fuchs wrote: >> The changes in src/java.desktop looks fine. > > Changes to `java.logging` and `java.net.http` also look good to me. Hi Sergey, I'll give it some testing and sponsor it next week unless someone else steps up.
best regards, -- daniel ------------- PR: https://git.openjdk.java.net/jdk/pull/818 From hseigel at openjdk.java.net Fri Oct 23 18:50:44 2020 From: hseigel at openjdk.java.net (Harold Seigel) Date: Fri, 23 Oct 2020 18:50:44 GMT Subject: RFR: 8238263: Create at-requires mechanism for containers Message-ID: Please review this change to add an @requires mechanism called "jdk.containerized" to help mark tests that are incompatible with containers. Users would add "@requires jdk.containerized != true" to the incompatible tests and then use "make test ... OPTIONS=-Djdk.containerized=true" or "bash jib.sh mach5 -- remote-build-and-test ... --test-make-args JTREG=OPTIONS=-Djdk.containerized=true" to exclude those tests when testing with containers. ------------- Commit messages: - 8238263: Create at-requires mechanism for containers Changes: https://git.openjdk.java.net/jdk/pull/844/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=844&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8238263 Stats: 4 lines in 2 files changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/844.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/844/head:pull/844 PR: https://git.openjdk.java.net/jdk/pull/844 From bob.vandette at oracle.com Fri Oct 23 19:12:25 2020 From: bob.vandette at oracle.com (Bob Vandette) Date: Fri, 23 Oct 2020 15:12:25 -0400 Subject: RFR: 8238263: Create at-requires mechanism for containers In-Reply-To: References: Message-ID: <9A1F573B-E6DD-4168-A89D-2425A9BDF045@oracle.com> I wonder if it makes sense to add this option to open/test/jtreg-ext/requires/VMProps.java and have it automatically enable this option based on an environment variable so we don't have to remember the cryptic command line sequence. It's too bad we can't automatically detect that we are running in a container. Bob.
> On Oct 23, 2020, at 2:50 PM, Harold Seigel wrote: > > Please review this change to add an @requires mechanism called "jdk.containerized" to help mark tests that are incompatible with containers. Users would add "@requires jdk.containerized != true" to the incompatible tests and then use "make test ... OPTIONS=-Djdk.containerized=true" or "bash jib.sh mach5 -- remote-build-and-test ... --test-make-args JTREG=OPTIONS=-Djdk.containerized=true" to exclude those tests when testing with containers. > > ------------- > > Commit messages: > - 8238263: Create at-requires mechanism for containers > > Changes: https://git.openjdk.java.net/jdk/pull/844/files > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=844&range=00 > Issue: https://bugs.openjdk.java.net/browse/JDK-8238263 > Stats: 4 lines in 2 files changed: 2 ins; 0 del; 2 mod > Patch: https://git.openjdk.java.net/jdk/pull/844.diff > Fetch: git fetch https://git.openjdk.java.net/jdk pull/844/head:pull/844 > > PR: https://git.openjdk.java.net/jdk/pull/844 From iignatyev at openjdk.java.net Fri Oct 23 19:20:34 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 23 Oct 2020 19:20:34 GMT Subject: RFR: 8238263: Create at-requires mechanism for containers In-Reply-To: References: Message-ID: <43YvmzXR5TzDgPqfcAJDyfpOd7f_ur6ibKfh7AGf4LU=.40b88d2c-b29b-49d4-8d7f-120b7ade6cc5@github.com> On Fri, 23 Oct 2020 18:44:54 GMT, Harold Seigel wrote: > Please review this change to add an @requires mechanism called "jdk.containerized" to help mark tests that are incompatible with containers. Users would add "@requires jdk.containerized != true" to the incompatible tests and then use "make test ... OPTIONS=-Djdk.containerized=true" or "bash jib.sh mach5 -- remote-build-and-test ... --test-make-args JTREG=OPTIONS=-Djdk.containerized=true" to exclude those tests when testing with containers. Hi Harold, I actually still think that having a separate ProblemList (as we do for graal, zgc, Xcomp, aot) is a better solution. 
why did you choose `@requires` over it? -- Igor ------------- PR: https://git.openjdk.java.net/jdk/pull/844 From harold.seigel at oracle.com Fri Oct 23 19:22:49 2020 From: harold.seigel at oracle.com (Harold Seigel) Date: Fri, 23 Oct 2020 15:22:49 -0400 Subject: RFR: 8238263: Create at-requires mechanism for containers In-Reply-To: <9A1F573B-E6DD-4168-A89D-2425A9BDF045@oracle.com> References: <9A1F573B-E6DD-4168-A89D-2425A9BDF045@oracle.com> Message-ID: <38650f41-c1e8-ffb1-711f-c870f37df6e2@oracle.com> Hi Bob, Thanks for looking at this. I'll look into basing the option on an environment variable. Thanks, Harold On 10/23/2020 3:12 PM, Bob Vandette wrote: > I wonder if it makes sense to add this option to open/test/jtreg-ext/requires/VMProps.java and have it > automatically enable this option based on an environment variable so we don't have to remember the > cryptic command line sequence. > > It's too bad we can't automatically detect that we are running in a container. > > Bob. > > >> On Oct 23, 2020, at 2:50 PM, Harold Seigel wrote: >> >> Please review this change to add an @requires mechanism called "jdk.containerized" to help mark tests that are incompatible with containers. Users would add "@requires jdk.containerized != true" to the incompatible tests and then use "make test ... OPTIONS=-Djdk.containerized=true" or "bash jib.sh mach5 -- remote-build-and-test ... --test-make-args JTREG=OPTIONS=-Djdk.containerized=true" to exclude those tests when testing with containers.
>> >> ------------- >> >> Commit messages: >> - 8238263: Create at-requires mechanism for containers >> >> Changes: https://git.openjdk.java.net/jdk/pull/844/files >> Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=844&range=00 >> Issue: https://bugs.openjdk.java.net/browse/JDK-8238263 >> Stats: 4 lines in 2 files changed: 2 ins; 0 del; 2 mod >> Patch: https://git.openjdk.java.net/jdk/pull/844.diff >> Fetch: git fetch https://git.openjdk.java.net/jdk pull/844/head:pull/844 >> >> PR: https://git.openjdk.java.net/jdk/pull/844 From hseigel at openjdk.java.net Fri Oct 23 19:28:36 2020 From: hseigel at openjdk.java.net (Harold Seigel) Date: Fri, 23 Oct 2020 19:28:36 GMT Subject: RFR: 8238263: Create at-requires mechanism for containers In-Reply-To: <43YvmzXR5TzDgPqfcAJDyfpOd7f_ur6ibKfh7AGf4LU=.40b88d2c-b29b-49d4-8d7f-120b7ade6cc5@github.com> References: <43YvmzXR5TzDgPqfcAJDyfpOd7f_ur6ibKfh7AGf4LU=.40b88d2c-b29b-49d4-8d7f-120b7ade6cc5@github.com> Message-ID: On Fri, 23 Oct 2020 19:17:40 GMT, Igor Ignatyev wrote: >> Please review this change to add an @requires mechanism called "jdk.containerized" to help mark tests that are incompatible with containers. Users would add "@requires jdk.containerized != true" to the incompatible tests and then use "make test ... OPTIONS=-Djdk.containerized=true" or "bash jib.sh mach5 -- remote-build-and-test ... --test-make-args JTREG=OPTIONS=-Djdk.containerized=true" to exclude those tests when testing with containers. > > Hi Harold, > > I actually still think that having a separate ProblemList (as we do for graal, zgc, Xcomp, aot) is a better solution. why did you choose `@requires` over it? > > -- Igor Hi Igor, I think it depends on whether the tests will be permanently or temporarily excluded from running with containers. I thought this mechanism would be to permanently exclude the tests. That's why I used @requires. 
Thanks, Harold ------------- PR: https://git.openjdk.java.net/jdk/pull/844 From iignatyev at openjdk.java.net Fri Oct 23 19:57:35 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 23 Oct 2020 19:57:35 GMT Subject: RFR: 8238263: Create at-requires mechanism for containers In-Reply-To: References: <43YvmzXR5TzDgPqfcAJDyfpOd7f_ur6ibKfh7AGf4LU=.40b88d2c-b29b-49d4-8d7f-120b7ade6cc5@github.com> Message-ID: On Fri, 23 Oct 2020 19:25:27 GMT, Harold Seigel wrote: > I think it depends on whether the tests will be permanently or temporarily excluded from running with containers. I thought this mechanism would be to permanently exclude the tests. That's why I used `@requires`. I see, if this is for permanent exclusion then yes, I agree that `@requires` is a better choice. >> enable this option based on an environment variable so we don't have to remember the cryptic command line sequence. > I'll look into basing the option on an environment variable. One will still need to pass an environment variable to jtreg, and hence will need to remember some sort of "cryptic command line sequence". A solution for that might be to default `jdk.containerized` to `false` in `VMProps.java`, so that only _containerized_ runs will have to set it. Btw, I'm not sure that `jdk.containerized` is the best name for this property, as _containerization_ is more of an environmental characteristic than that of the jdk. How about something like `env.containerized` or `testenv.containerized`?
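
A minimal sketch of the defaulting idea above, assuming a hypothetical helper rather than the actual `VMProps.java` code: an explicit `-Djdk.containerized=...` value wins, otherwise an environment variable is consulted, otherwise the property defaults to `false`. The class name, the method name and the `TEST_CONTAINERIZED` variable are illustrative only, and whether jtreg passes such a variable through to the process that evaluates the properties is not verified here.

```java
import java.util.Map;

// Hypothetical helper; not part of the JDK test library.
class ContainerizedProp {
    // Hypothetical environment variable name, used only for illustration.
    static final String ENV_NAME = "TEST_CONTAINERIZED";

    // Returns "true" or "false" for the jdk.containerized property.
    // Precedence: explicit system property, then environment, then "false".
    static String resolve(String sysPropValue, Map<String, String> env) {
        if (sysPropValue != null) {
            // Anything other than "true" (case-insensitive) counts as false.
            return String.valueOf(Boolean.parseBoolean(sysPropValue));
        }
        String fromEnv = env.get(ENV_NAME);
        if (fromEnv != null) {
            boolean on = fromEnv.equals("1") || fromEnv.equalsIgnoreCase("true");
            return String.valueOf(on);
        }
        // Default: non-containerized runs do not have to set anything.
        return "false";
    }

    public static void main(String[] args) {
        String value = resolve(System.getProperty("jdk.containerized"), System.getenv());
        System.out.println("jdk.containerized=" + value);
    }
}
```

With a default like this, a test marked `@requires jdk.containerized != true` would still be selected in an ordinary run, and only containerized runs would need to set the property or the variable.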
------------- PR: https://git.openjdk.java.net/jdk/pull/844 From bob.vandette at oracle.com Fri Oct 23 20:03:11 2020 From: bob.vandette at oracle.com (Bob Vandette) Date: Fri, 23 Oct 2020 16:03:11 -0400 Subject: RFR: 8238263: Create at-requires mechanism for containers In-Reply-To: References: <43YvmzXR5TzDgPqfcAJDyfpOd7f_ur6ibKfh7AGf4LU=.40b88d2c-b29b-49d4-8d7f-120b7ade6cc5@github.com> Message-ID: <9DF4B853-2FFD-4E3E-AE3F-1B5B3FA6FEC6@oracle.com> > On Oct 23, 2020, at 3:57 PM, Igor Ignatyev wrote: > > On Fri, 23 Oct 2020 19:25:27 GMT, Harold Seigel wrote: > >> I think it depends on whether the tests will be permanently or temporarily excluded from running with containers. I thought this mechanism would be to permanently exclude the tests. That's why I used `@requires`. > > I see, if this is for permanent exclusion then yes I agree that `@requires` is a better choice. > >>> enable this option based on an environment variable so we don't have to remember the > cryptic command line sequence. >> I'll look into basing the option on an environment variable. > > one will still need to pass an environment variable to jtreg, and hence will need to remember some sort of "cryptic command line sequence". a solution for that might be to default `jdk.containerized` to `false` in `VMProps.java` and when only _containerized_ runs will have to set it up. Why? Environment variables are inherited. For developers running jtreg, all they need to do is export the variable. % export JAVA_CONTAINERIZED=1 % bash % echo $JAVA_CONTAINERIZED % 1 Bob. > > > btw, I'm not sure that `jdk.containerized` is the best name for this property as _containerization_ is more of an environmental characteristic than that of jdk. how about smth like `env.containerized` or `testenv.containerized`?
> > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/844 From iignatyev at openjdk.java.net Fri Oct 23 20:10:34 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 23 Oct 2020 20:10:34 GMT Subject: RFR: 8238263: Create at-requires mechanism for containers In-Reply-To: References: <43YvmzXR5TzDgPqfcAJDyfpOd7f_ur6ibKfh7AGf4LU=.40b88d2c-b29b-49d4-8d7f-120b7ade6cc5@github.com> Message-ID: On Fri, 23 Oct 2020 19:54:40 GMT, Igor Ignatyev wrote: >> Hi Igor, >> I think it depends on whether the tests will be permanently or temporarily excluded from running with containers. I thought this mechanism would be to permanently exclude the tests. That's why I used @requires. >> Thanks, Harold > >> I think it depends on whether the tests will be permanently or temporarily excluded from running with containers. I thought this mechanism would be to permanently exclude the tests. That's why I used `@requires`. > > I see, if this is for permanent exclusion then yes I agree that `@requires` is a better choice. > >>> enable this option based on an environment variable so we don?t have to remember the > cryptic command line sequence. >> I'll look into basing the option on an environment variable. > > one will still need to pass an environment variable to jtreg, and hence will need to remember some sort of "cryptic command line sequence". a solution for that might be to default `jdk.containerized` to `false` in `VMProps.java` and when only _containerized_ runs will have to set it up. > > > btw, I'm not sure that `jdk.containerized` is the best name for this property as _containerization_ is more of an environmental characteristic than that of jdk. how about smth like `env.containerized` or `testenv.containerized`? > > one will still need to pass an environment variable to jtreg, and hence will need to remember some sort of "cryptic command line sequence". 
a solution for that might be to default `jdk.containerized` to `false` in `VMProps.java` and when only _containerized_ runs will have to set it up. > > Why? Environment variables are inherited. For developers running jtreg, all they need to do is export the variable. > > % export JAVA_CONTAINERIZED=1 > % bash > % echo $JAVA_CONTAINERIZED > % 1 b/c jtreg strips most of the environment variables, they might still be defined in the process which runs `VMProps` though (I have never checked that) ------------- PR: https://git.openjdk.java.net/jdk/pull/844 From iklam at openjdk.java.net Sat Oct 24 05:16:38 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Sat, 24 Oct 2020 05:16:38 GMT Subject: RFR: 8255231: Avoid upcalls when initializing the statSampler [v2] In-Reply-To: <9yiGXHEPKlYTvGUxxwzhpswuKuCsG6n434AnGAFYqPQ=.d90219a0-b71f-4550-bd90-44fbfee70e20@github.com> References: <9yiGXHEPKlYTvGUxxwzhpswuKuCsG6n434AnGAFYqPQ=.d90219a0-b71f-4550-bd90-44fbfee70e20@github.com> Message-ID: On Thu, 22 Oct 2020 11:28:22 GMT, Claes Redestad wrote: >> Current implementation of the statSampler does upcalls to System.getProperty to collect values for a number of properties that are all provided by the VM itself. And since the sampling starts before any user code run then no property can have changed. >> >> I suggest refactoring the code so that no upcalls are made normally - while asserting this invariant holds using assert-only upcalls. >> >> This is a small startup optimization - reducing the startup sequence by approx. 300k instructions and 70k branches in my linux-x64 setup. > > Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: > > Revert unrelated changes to perfData Looks good to me. A nice clean up. 
src/hotspot/share/runtime/statSampler.cpp line 217: > 215: */ > 216: > 217: // stable interface, supported counters in the JAVA_PROPERTY name space * The list of System Properties that have corresponding PerfData * string instrumentation created by retrieving the named property's * value from System.getProperty() and unconditionally creating a * PerfStringConstant object initialized to the retrieved value. This * is not an exhaustive list of Java properties with corresponding string * instrumentation as the create_system_property_instrumentation() method * creates other property based instrumentation conditionally. I found the above comment unreadable. Is it possible to clarify it? ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/802 From rrich at openjdk.java.net Sat Oct 24 07:37:35 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Sat, 24 Oct 2020 07:37:35 GMT Subject: RFR: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 10:25:43 GMT, Erik Österlund wrote: > The escape barrier reallocates scalarized objects potentially deep into the stack of a remote thread. Each allocation can safepoint, causing referenced frames to be invalid. Some sprinklings were added that deal with that, but I believe it was subsequently broken with the integration of the new vector API, that has its own new deoptimization code that did not know about this. Not surprisingly, the integration of the new vector API had no idea about this subtlety, and allocates an object, and then reads an object deep from the stack of a remote thread (using an escape barrier). I suppose the issue is that all these 3 things were integrated at almost the same time. The problematic code sequence is in VectorSupport::allocate_vector() in vectorSupport.cpp, which is called from Deoptimization::realloc_objects().
It first allocates an oop (possibly safepointing), and then reads a vector oop from the stack. This is usually fine, but not through the escape barrier, with concurrent stack scanning. While I have not seen any crashes yet, I can see from code inspection, that there is no way that this works correctly. > > In order to make this less fragile for future changes, we should really have a RAII object that keeps the target thread's stack of the escape barrier, stable and processed, across safepoints. This patch fixes that. Then it becomes much easier to reason about its correctness, compared to hoping the various hooks are applied after each safepoint. > > With this new robustness fix, the thread running the escape barrier, keeps the target thread stack processed, straight through safepoints on the requesting thread, making it easy and intuitive to understand why this works correctly. The RAII object basically just has to cover the code block that pokes at the remote stack and goes in and out of safepoints, arbitrarily. Arguably, this escape barrier doesn't need to be blazingly fast, and can afford keeping stacks sane through its operation. I'm really glad you caught that one! And I like the abstraction provided by KeepStackGCProcessedMark. There is one execution path you missed coming from `VM_GetOrSetLocal::deoptimize_objects(javaVFrame* jvf)`. This code should probably be moved into EscapeBarrier. `EscapeBarrier::deoptimize_objects(int depth)` could be changed to be more generic `EscapeBarrier::deoptimize_objects(int from_depth, int to_depth)`. VM_GetOrSetLocal::doit_prologue() could call eb.deoptimize_objects(_depth, _depth) then. That would be better but maybe not yet really good... Thanks again, Richard.
src/hotspot/share/runtime/safepointMechanism.cpp line 92: > 90: // 2) After a thread races with the disarming of the global poll and transitions from native/blocked > 91: // 3) Before the handshake code is run > 92: StackWatermarkSet::on_safepoint(thread); start_processing in the comment above should be renamed too. src/hotspot/share/runtime/keepStackGCProcessed.cpp line 47: > 45: } > 46: StackWatermark* their_watermark = StackWatermarkSet::get(jt, StackWatermarkKind::gc); > 47: our_watermark->link_watermark(their_watermark); Assert our_watermark->_linked_watermark == NULL to avoid unintentional nesting? ------------- Changes requested by rrich (Committer). PR: https://git.openjdk.java.net/jdk/pull/832 From stuefe at openjdk.java.net Sat Oct 24 10:25:35 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 24 Oct 2020 10:25:35 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) In-Reply-To: <2ZmsAgqfhGGvOGL5GNHWPKOHPnEFlhr5iNOLswmPXKA=.cacdb3bf-d3b3-40fa-8ead-95e803546fcf@github.com> References: <2ZmsAgqfhGGvOGL5GNHWPKOHPnEFlhr5iNOLswmPXKA=.cacdb3bf-d3b3-40fa-8ead-95e803546fcf@github.com> Message-ID: On Fri, 23 Oct 2020 16:20:54 GMT, Gerard Ziemski wrote: >>> Some issues: >>> >>> * 0 is not a valid signal number >> >> Right, I used to start at 1, but then copied the "for loop" code from another part of existing code for "NSIG" and forgot to fix the start value" I will fix that. >> >>> * NSIG is obscure; may be ridiculously large (e.g. AIX I believe 1024) and still may not contain all signals for some platforms (e.g. real time signals on BSD (?) are not included) >> >> 1. The fix will work for any NSIG value, why would its value be relevant here? That's the value we use anywhere else, so if it's wrong here, then it's wrong everywhere else and we have a bigger issue to deal here than just this fix. >> 2. We are not accounting for real time signals on BSD anywhere else, so why do we need to do it here? 
>> >>> * You now flatten here all signal handlers across the board. What if some third party app had installed their own. I would only reset those which had been originally installed for hotspot usage (beware ReduceSignalUsage). >> >> The purpose of the "UseOSErrorReporting" flag is to let the process die, so that it can be handled by the OS. If we let third party app signal handlers jump in, we can't guarantee that "UseOSErrorReporting" does its job. >> >>> Which brings me to the point that if we fail to re-raise the original error signal, now we run with signal handlers completely disabled :( ... see my original concerns. >> >> I will loosely quote what David Holmes likes to say about signal handlers and hope I get it right: "this is a best effort, nothing is guaranteed in signal handlers". By that I mean yes, you are right that we can't guarantee correct behavior, but at this point of process lifecycle we caught the crash and produced hs_err log, anything else is just great if it works and no big loss if things go wrong now (how bad exactly can things go at this point?). From my experience, however, it works. > >> Thinking further, we probably want to just the default handler for the error signal that happened. And we also want to do this only for synchronous error signals (segv, sigill, sigbus, sigfpe), and possibly only if they are "real" (had not been sent by kill or pthread_kill). > > Handling "UseOSErrorReporting" is the last thing we ask the process to do before letting it crash and get caught by the OS. I'm not sure we need to finesse here with which signals we want to flatten. This keeps the code simple, though I will add a comment to "PosixSignals::clear_signal_handlers()" describing the assumptions and expectations. 
(resuming the discussion, which had been split between here and the JBS comment section): To summarize my concerns stated in the JBS: I think returning from fatal error signal handling in the hope of retriggering the same fault can be a bit dangerous. If all we risked were more colorful crashes I'd see no issue here. We are already crashing, so what. But to my mind the worst thing that can happen is not a crash or a hang but data corruption. That is highly unlikely but cannot be ruled out. First off, Posix states clearly that this is undefined behavior. Let's ignore this :-) and continue. Spurious crashes are not that rare. Consider this example: GC has a bug which temporarily leaves an oop in an invalid state. Our thread triggers a SEGV and enters signal handling. We write the hs-err file. Maybe three other threads also trigger SEGVs in the meantime. They enter error handling and are now indefinitely parked, never to return. Now we return. The original signal has been consumed and taken from the signal queue. The first issue I see is that we may now first process all deferred signals, before resuming normal work and returning to the original pc to re-trigger the original fault. Most signals we block while in the handler. When returning we unblock, and these signals then are processed. IIRC Posix does not state when this happens, but it may very well be we do this right now. Even if this does not cause problems it prolongs the time even further before we get to retrigger the original fault. The concurrently running GC may have fixed the issue in the meantime. The oop is now valid, we do not re-crash. Now we are in some invalid state but continue to run. Also three of our threads have stopped working and are stuck in VMError. My worry with all this is that this undefined VM state can cause application problems in ways a normal crash cannot.
One of the more benign problems of failing to re-trigger the fault could be just more confusion: since the handlers are now all set to default, the next polling page access or stack bangs will tear down the VM. Your OS reporters will show a VM having crashed at places which are not real crashes. I know this may all sound contrived. I probably can't dissuade you from doing it. No-one has to set the switch after all. ------ The following remarks are the actual review: 1) Your patch does not work for asserts, which makes kind of sense. UseOSErrorReporting will just cause the VM to continue after an assert. Easy to reproduce with `java -XX:ErrorHandlerTest=1 -XX:+UseOSErrorReporting` - we now just loop. 2) We should not do this for any signal other than synchronous error signals (`SIGSEGV`, `SIGILL`, `SIGFPE`, `SIGBUS`) where we can be sure that we re-crash right at the return pc. 3) Also, these signals should not have been raised with `kill()`/`raise()`/etc. In other words, they should be real faults. 4) We should not remove signal handlers for all signals. That may expose us to more problems. If the intent is to retrigger our fault, we only should install the default signal handler for that one fault. ------- I also tried your patch on my linux box. When I trigger a crash - either sending a signal with kill or generating this signal with `-XX:ErrorHandlerTest=xx` - with UseOSErrorReporting, I get a hs-err file, then a core dump, but the core seems to have nothing to do with the crash. Regardless of the original signal, it always shows a SIGSEGV in some assert which to me indicates an assertion polling page access.
------------- PR: https://git.openjdk.java.net/jdk/pull/813 From redestad at openjdk.java.net Sat Oct 24 12:29:05 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Sat, 24 Oct 2020 12:29:05 GMT Subject: RFR: 8255231: Avoid upcalls when initializing the statSampler [v3] In-Reply-To: References: Message-ID: > Current implementation of the statSampler does upcalls to System.getProperty to collect values for a number of properties that are all provided by the VM itself. And since the sampling starts before any user code run then no property can have changed. > > I suggest refactoring the code so that no upcalls are made normally - while asserting this invariant holds using assert-only upcalls. > > This is a small startup optimization - reducing the startup sequence by approx. 300k instructions and 70k branches in my linux-x64 setup. Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 19 additional commits since the last revision: - Refactor to remove stable_java_property_counters and clarify comments - Merge branch 'master' into com_ns - Revert unrelated changes to perfData - Merge branch 'master' into com_ns - Improve comments - typo - Missing definition - Extract the shorthand java.version from VersionProps and use it in StatSampler - Improve assert - Assert on missing value - ... 
and 9 more: https://git.openjdk.java.net/jdk/compare/2d0f01eb...6e220227 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/802/files - new: https://git.openjdk.java.net/jdk/pull/802/files/5daedb01..6e220227 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=802&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=802&range=01-02 Stats: 3039 lines in 175 files changed: 1182 ins; 1239 del; 618 mod Patch: https://git.openjdk.java.net/jdk/pull/802.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/802/head:pull/802 PR: https://git.openjdk.java.net/jdk/pull/802 From redestad at openjdk.java.net Sat Oct 24 12:29:06 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Sat, 24 Oct 2020 12:29:06 GMT Subject: RFR: 8255231: Avoid upcalls when initializing the statSampler [v2] In-Reply-To: References: <9yiGXHEPKlYTvGUxxwzhpswuKuCsG6n434AnGAFYqPQ=.d90219a0-b71f-4550-bd90-44fbfee70e20@github.com> Message-ID: On Sat, 24 Oct 2020 05:13:44 GMT, Ioi Lam wrote: >> Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert unrelated changes to perfData > > src/hotspot/share/runtime/statSampler.cpp line 217: > >> 215: */ >> 216: >> 217: // stable interface, supported counters in the JAVA_PROPERTY name space > > * The list of System Properties that have corresponding PerfData > * string instrumentation created by retrieving the named property's > * value from System.getProperty() and unconditionally creating a > * PerfStringConstant object initialized to the retrieved value. This > * is not an exhaustive list of Java properties with corresponding string > * instrumentation as the create_system_property_instrumentation() method > * creates other property based instrumentation conditionally. > I found the above comment unreadable. Is it possible to clarify it? I reworked it a bit to get rid of the static array and keep comments more coherent. 
I think it reads a bit better now. ------------- PR: https://git.openjdk.java.net/jdk/pull/802 From kim.barrett at oracle.com Sat Oct 24 18:55:42 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Sat, 24 Oct 2020 14:55:42 -0400 Subject: Result: New HotSpot Group Member: Erik Österlund Message-ID: <237B6F32-57CF-487E-A1DA-978B19225570@oracle.com> The vote for Erik Österlund [1] is now closed. Yes: 18 Veto: 0 Abstain: 0 [There were also two ineligible yes votes, not included above.] According to the Bylaws definition of Lazy Consensus, this is sufficient to approve the nomination. Kim Barrett [1] https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-October/044331.html From kbarrett at openjdk.java.net Sat Oct 24 19:25:37 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sat, 24 Oct 2020 19:25:37 GMT Subject: RFR: 8188055: (ref) Add Reference::refersTo predicate [v6] In-Reply-To: References: <9x0zaxknpYXGIvHun1CkLP0lEC8NQmPTnANxQKjhHF8=.907bdb15-2e2e-4f84-8fe4-ea4ed50534cd@github.com> <3JzF7OkemZ-Lxc4jZgdEh3qNDzW8wF7ITeq-s7_TOlo=.11e4e40b-b775-47cf-9862-735fbc61ffd3@github.com> <3kV3qhFRXBadf7Tol9n0Yomud_ndV_T_p7ShUfk4eVE=.d7151a63-0066-4020-b0ef-bae0d03dc133@github.com> Message-ID: On Fri, 16 Oct 2020 19:22:16 GMT, Mandy Chung wrote: >> I just want to note that if you have a `Reference ref` at hand, you cannot just do: >> Reference r = (Reference) ref; >> ...since those generic types are not related. You have to do something like: >> >> @SuppressWarnings({"unchecked", "rawtypes"}) >> Reference r = (Reference) ref; >> which is very unfortunate. Comparing this method with, for example, `Collection.contains(Object element)`, you can see that the Collection API has made a decision to not bother with T here. That was also due to keeping old code compatible when migrating from pre-generics Java to generified Collection, but as @dfuch noted, we have a migration story here too.
We will be migrating `obj == ref.get()` to `ref.refersTo(obj)` ... Mind you that this is a boolean expression fragment which might be written inline, surrounded by other parts of an expression. So you'll be forced to split that into an assignment with @SuppressWarnings and an expression, or you will have to force the whole expression or method to @SuppressWarnings. I don't know if type "safety" is worth it here. > > Reference instances should not be leaked, and so I don't see it being very common that a caller of `Reference::get` does not know the referent's type. It also depends on the `refersTo` check against `null` vs an object. Any known use case would be helpful if any (some existing code that wants to call `refersTo` to compare a `Reference` of raw type with an object of unknown type). > > FWIW, when converting a few uses of `Reference::get` to `refersTo` in the JDK, there is only one case (an `equals(Object o)` method) that needs the cast. > > http://cr.openjdk.java.net/~mchung/jdk15/webrevs/8188055/jdk-use-refersTo/index.html Is there a consensus on this issue? It's not clear whether Daniel and Peter have agreed with Mandy's responses or have just not yet responded with further discussion. ------------- PR: https://git.openjdk.java.net/jdk/pull/498 From github.com+51754783+coreyashford at openjdk.java.net Sat Oct 24 21:41:36 2020 From: github.com+51754783+coreyashford at openjdk.java.net (CoreyAshford) Date: Sat, 24 Oct 2020 21:41:36 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v8] In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 20:40:00 GMT, CoreyAshford wrote: >> Thanks for this question. I also stumbled over it when reviewing. I guess a branch which gets mispredicted in ~22% of the cases leads to a big performance loss. (In addition, the branch target is not aligned.)
> > Yes, it assumes uniformly random data, but also recall that the unencoded data bytes get shifted by 2, 4, or 6 bits into the encoded bytes, which I'm guessing would tend to make the data somewhat more uniform, even if the source data has low entropy. > > That said, I didn't actually benchmark it. I will do that to make sure there is a gain, and if there isn't I will remove the conditional branch. > I took a look at the VSX algo. I haven't looked much beyond it. I had a few questions I've inlined. It does look like a faithful VSX implementation of the linked algo. I neglected to thank you for reviewing this code! I realize there's quite a time commitment required to review this properly, and because of that I was having difficulty finding a second reviewer for the PPC64 portion. Just to set expectations, I will be on vacation next week, so further commits won't be posted until the following week, but I will address all of your great feedback. Thanks again! ------------- PR: https://git.openjdk.java.net/jdk/pull/293 From plevart at openjdk.java.net Sat Oct 24 22:25:36 2020 From: plevart at openjdk.java.net (Peter Levart) Date: Sat, 24 Oct 2020 22:25:36 GMT Subject: RFR: 8188055: (ref) Add Reference::refersTo predicate [v6] In-Reply-To: References: <9x0zaxknpYXGIvHun1CkLP0lEC8NQmPTnANxQKjhHF8=.907bdb15-2e2e-4f84-8fe4-ea4ed50534cd@github.com> <3JzF7OkemZ-Lxc4jZgdEh3qNDzW8wF7ITeq-s7_TOlo=.11e4e40b-b775-47cf-9862-735fbc61ffd3@github.com> <3kV3qhFRXBadf7Tol9n0Yomud_ndV_T_p7ShUfk4eVE=.d7151a63-0066-4020-b0ef-bae0d03dc133@github.com> Message-ID: On Fri, 16 Oct 2020 19:22:16 GMT, Mandy Chung wrote: >> I just want to note that if you have a `Reference ref` at hand, you cannot just do: >> Reference r = (Reference) ref; >> ...since those generic types are not related. You have to do something like: >> >> @SuppressWarnings({"unchecked", "rawtypes"}) >> Reference r = (Reference) ref; >> which is very unfortunate.
Comparing this method with, for example, `Collection.contains(Object element)`, you can see that the Collection API has made a decision to not bother with T here. That was also due to keeping old code compatible when migrating from pre-generics Java to generified Collection, but as @dfuch noted, we have a migration story here too. We will be migrating `obj == ref.get()` to `ref.refersTo(obj)` ... Mind you that this is a boolean expression fragment which might be written inline, surrounded by other parts of an expression. So you'll be forced to split that into an assignment with @SuppressWarnings and an expression, or you will have to force the whole expression or method to @SuppressWarnings. I don't know if type "safety" is worth it here. > > Reference instances should not be leaked, and so I don't see it being very common that a caller of `Reference::get` does not know the referent's type. It also depends on the `refersTo` check against `null` vs an object. Any known use case would be helpful if any (some existing code that wants to call `refersTo` to compare a `Reference` of raw type with an object of unknown type). > > FWIW, when converting a few uses of `Reference::get` to `refersTo` in the JDK, there is only one case (an `equals(Object o)` method) that needs the cast. > > http://cr.openjdk.java.net/~mchung/jdk15/webrevs/8188055/jdk-use-refersTo/index.html @mlchung I don't have many known use cases, but how about WeakHashMap.containsKey(Object key) for example? Currently `WeakHashMap.Entry extends WeakReference` but it would be more type safe if it extended `WeakReference`. In that case an `entry.refersTo(key)` would not work... ------------- PR: https://git.openjdk.java.net/jdk/pull/498 From prr at openjdk.java.net Sat Oct 24 23:14:34 2020 From: prr at openjdk.java.net (Phil Race) Date: Sat, 24 Oct 2020 23:14:34 GMT Subject: RFR: 8255299: Drop explicit zeroing at instantiation of Atomic* objects In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 20:46:15 GMT, Сергей Цыпанов
wrote: > As discussed in https://github.com/openjdk/jdk/pull/510 there is never a reason to explicitly instantiate an instance of an `Atomic*` class with its default value, i.e. `new AtomicInteger(0)` could be replaced with `new AtomicInteger()` which is faster: > @State(Scope.Thread) > @OutputTimeUnit(TimeUnit.NANOSECONDS) > @BenchmarkMode(value = Mode.AverageTime) > public class AtomicBenchmark { > @Benchmark > public Object defaultValue() { > return new AtomicInteger(); > } > @Benchmark > public Object explicitValue() { > return new AtomicInteger(0); > } > } > This benchmark demonstrates that `explicitValue()` is much slower: > Benchmark Mode Cnt Score Error Units > AtomicBenchmark.defaultValue avgt 30 4.778 ± 0.403 ns/op > AtomicBenchmark.explicitValue avgt 30 11.846 ± 0.273 ns/op > So while https://bugs.openjdk.java.net/browse/JDK-8145948 is still in progress, we could trivially replace explicit zeroing with default constructors, gaining some performance benefit with no risk. > > I've tested the changes locally, both tier1 and tier 2 are ok. > > Could someone create an issue for tracking this? client changes are fine ------------- Marked as reviewed by prr (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/818 From dholmes at openjdk.java.net Mon Oct 26 04:35:37 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 26 Oct 2020 04:35:37 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 16:40:43 GMT, Gerard Ziemski wrote: > hi all, > > Please review this fix for POSIX platforms, which addresses a timeout that occurs while handling a crash with UseOSErrorReporting ON > > The timeout was caused by the crash handling code, looping infinitely, because it incorrectly assumed that the signal handlers were reset to their defaults, while what really was happening was that the code was resetting the signal handlers to our default signal handler.
> > To avoid similar confusion in the future I did the following: > > - renamed the VMError::reset_signal_handlers() to VMError::rearm_signal_handlers() > - introduced a new API VMError::clear_signal_handlers() which is implemented in PosixSignals > > PosixSignals::clear_signal_handlers() is where the actual fix is done and it simply resets all signal handlers to their system defaults. > > A similar problem occurs on Windows, with the only difference being that before a process times out (takes 2 minutes) it runs out of stack space in about 250 loops, so that's the only reason it doesn't linger for that long. The Windows issue is tracked separately by https://bugs.openjdk.java.net/browse/JDK-8250782 > > Note: The expectation for "UseOSErrorReporting" is for the OS to catch the crashed process and to produce its own crash log (in addition to HotSpot creating the hs_err log file) - see https://bugs.openjdk.java.net/browse/JDK-8237727 for relevant discussion. It does not affect whether a core dump is written or not (that is controlled by CreateCoredumpOnCrash) Hi Gerard, I think we have a fundamental problem here that UseOSErrorReporting was only ever intended for use on Windows. It simply allows VMError::report_and_die to return instead of actually making the VM "die". For Windows this means we can continue to propagate the Windows exception and thus allow Windows Error Reporting (WER) to take over. Whether this actually works correctly or not is a different matter. For non-Windows there is no pre-established alternative code path for report_and_die() returning. In the bug report you write: > On Mac/Linux it would look more like this: > > #1 catch signal in our handler > #2 generate hs_err log > #3 turn off our signal handler > #4 continue the process normally, allowing it to crash again in the same spot, with the same signal being generated > To me you are now inventing what UseOSErrorReporting should mean on non-Windows, and I don't agree with it.
I don't think it should mean that we re-crash using the "default" signal response and consider that as using "OS error reporting". To me that is just not valid, especially when we cannot return from a signal handling context in many cases without incurring undefined behaviour. To me #4 is not a valid expectation as we have no way to know what will happen next if the signal handler returns. It would also be wrong to just continue execution after an assertion or guarantee fails. I'm assuming that the motivation here is that on macOS if we use the default signal handling modes then macOS will do its own error reporting? If so I would suggest that the right response may be to return from report_and_die (on macOS only) and then deliberately crash after restoring the default handler. Obviously that will change which "crash" the OS reports but that is likely to happen anyway as you cannot guarantee how you will crash after trying to continue (and this goes beyond our general "best effort" approaches in signal handling.) Beyond that I share Thomas's concerns about making sweeping changes to installed signal handlers. So my preferred approaches here would be: 1. Make UseOSErrorReporting Windows only; or 2. Make UseOSErrorReporting Windows and macOS only. Then on macOS do a targeted crash after report_and_die() returns. Thanks, David ------------- PR: https://git.openjdk.java.net/jdk/pull/813 From iklam at openjdk.java.net Mon Oct 26 04:48:49 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 26 Oct 2020 04:48:49 GMT Subject: RFR: 8255231: Avoid upcalls when initializing the statSampler [v3] In-Reply-To: References: Message-ID: <_5ePLWAl86aSWBWNe9NoCgWn9ue7hmMXG1qL4hVBxRI=.0cb6cac5-c2e7-4a76-ae47-2797a248a649@github.com> On Sat, 24 Oct 2020 12:29:05 GMT, Claes Redestad wrote: >> Current implementation of the statSampler does upcalls to System.getProperty to collect values for a number of properties that are all provided by the VM itself. 
And since the sampling starts before any user code runs, no property can have changed. >> >> I suggest refactoring the code so that no upcalls are made normally - while asserting this invariant holds using assert-only upcalls. >> >> This is a small startup optimization - reducing the startup sequence by approx. 300k instructions and 70k branches in my linux-x64 setup. > > Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 19 additional commits since the last revision: > > - Refactor to remove stable_java_property_counters and clarify comments > - Merge branch 'master' into com_ns > - Revert unrelated changes to perfData > - Merge branch 'master' into com_ns > - Improve comments > - typo > - Missing definition > - Extract the shorthand java.version from VersionProps and use it in StatSampler > - Improve assert > - Assert on missing value > - ... and 9 more: https://git.openjdk.java.net/jdk/compare/ec51640f...6e220227 LGTM ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/802 From dholmes at openjdk.java.net Mon Oct 26 04:52:37 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 26 Oct 2020 04:52:37 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 16:40:43 GMT, Gerard Ziemski wrote: > hi all, > > Please review this fix for POSIX platforms, which addresses a timeout that occurs while handling a crash with UseOSErrorReporting ON > > The timeout was caused by the crash handling code, looping infinitely, because it incorrectly assumed that the signal handlers were reset to their defaults, while what really was happening was that the code was resetting the signal handlers to our default signal handler.
> > To avoid similar confusion in the future I did the following: > > - renamed the VMError::reset_signal_handlers() to VMError::rearm_signal_handlers() > - introduced a new API VMError::clear_signal_handlers() which is implemented in PosixSignals > > PosixSignals::clear_signal_handlers() is where the actual fix is done and it simply resets all signal handlers to their system defaults. > > A similar problem occurs on Windows, with the only difference being that before a process times out (takes 2 minutes) it runs out of stack space in about 250 loops, so that's the only reason it doesn't linger for that long. The Windows issue is tracked separately by https://bugs.openjdk.java.net/browse/JDK-8250782 > > Note: The expectation for "UseOSErrorReporting" is for the OS to catch the crashed process and to produce its own crash log (in addition to HotSpot creating the hs_err log file) - see https://bugs.openjdk.java.net/browse/JDK-8237727 for relevant discussion. It does not affect whether a core dump is written or not (that is controlled by CreateCoredumpOnCrash) Changing review status to "Request changes". ------------- Changes requested by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/813 From dholmes at openjdk.java.net Mon Oct 26 04:52:38 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 26 Oct 2020 04:52:38 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 04:33:03 GMT, David Holmes wrote: >> hi all, >> >> Please review this fix for POSIX platforms, which addresses a timeout that occurs while handling a crash with UseOSErrorReporting ON >> >> The timeout was caused by the crash handling code, looping infinitely, because it incorrectly assumed that the signal handlers were reset to their defaults, while what really was happening was that the code was resetting the signal handlers to our default signal handler.
>> >> To avoid similar confusion in the future I did the following: >> >> - renamed the VMError::reset_signal_handlers() to VMError::rearm_signal_handlers() >> - introduced a new API VMError::clear_signal_handlers() which is implemented in PosixSignals >> >> PosixSignals::clear_signal_handlers() is where the actual fix is done and it simply resets all signal handlers to their system defaults. >> >> A similar problem occurs on Windows, with the only difference being that before a process times out (takes 2 minutes) it runs out of stack space in about 250 loops, so that's the only reason it doesn't linger for that long. The Windows issue is tracked separately by https://bugs.openjdk.java.net/browse/JDK-8250782 >> >> Note: The expectation for "UseOSErrorReporting" is for the OS to catch the crashed process and to produce its own crash log (in addition to HotSpot creating the hs_err log file) - see https://bugs.openjdk.java.net/browse/JDK-8237727 for relevant discussion. It does not affect whether a core dump is written or not (that is controlled by CreateCoredumpOnCrash) > > Hi Gerard, > > I think we have a fundamental problem here that UseOSErrorReporting was only ever intended for use on Windows. It simply allows VMError::report_and_die to return instead of actually making the VM "die". For Windows this means we can continue to propagate the Windows exception and thus allow Windows Error Reporting (WER) to take over. Whether this actually works correctly or not is a different matter. > > For non-Windows there is no pre-established alternative code path for report_and_die() returning.
> > In the bug report you write: > >> On Mac/Linux it would look more like this: >> >> #1 catch signal in our handler >> #2 generate hs_err log >> #3 turn off our signal handler >> #4 continue the process normally, allowing it to crash again in the same spot, with the same signal being generated >> > > To me you are now inventing what UseOSErrorReporting should mean on non-Windows, and I don't agree with it. I don't think it should mean that we re-crash using the "default" signal response and consider that as using "OS error reporting". To me that is just not valid, especially when we cannot return from a signal handling context in many cases without incurring undefined behaviour. To me #4 is not a valid expectation as we have no way to know what will happen next if the signal handler returns. It would also be wrong to just continue execution after an assertion or guarantee fails. > > I'm assuming that the motivation here is that on macOS if we use the default signal handling modes then macOS will do its own error reporting? If so I would suggest that the right response may be to return from report_and_die (on macOS only) and then deliberately crash after restoring the default handler. Obviously that will change which "crash" the OS reports but that is likely to happen anyway as you cannot guarantee how you will crash after trying to continue (and this goes beyond our general "best effort" approaches in signal handling.) > > Beyond that I share Thomas's concerns about making sweeping changes to installed signal handlers. > > So my preferred approaches here would be: > > 1. Make UseOSErrorReporting Windows only; or > 2. Make UseOSErrorReporting Windows and macOS only. Then on macOS do a targeted crash after report_and_die() returns. 
> > Thanks, > David From: https://bugs.openjdk.java.net/browse/JDK-6227246 "Iimplemented Windows-only flag -XX+UseOSErrorReporting which allows us instead of running of our crash handler and dying, forward exception handling to the OS in case of actual crash. " but there were issues with the integration of the fix: "Some of the changes to this fix weren't integrated or were merged out by mistake." and we ended up with a shared flag. I can see a comment in the original putback: "Make UseOSErrorReporting platform independant so linux can use someday and because used from os independant code." which is why this ended up not being Windows-only even though it only worked in a meaningful way on Windows. ------------- PR: https://git.openjdk.java.net/jdk/pull/813 From dholmes at openjdk.java.net Mon Oct 26 06:34:48 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 26 Oct 2020 06:34:48 GMT Subject: RFR: 8255231: Avoid upcalls when initializing the statSampler [v3] In-Reply-To: References: Message-ID: On Sat, 24 Oct 2020 12:29:05 GMT, Claes Redestad wrote: >> Current implementation of the statSampler does upcalls to System.getProperty to collect values for a number of properties that are all provided by the VM itself. And since the sampling starts before any user code run then no property can have changed. >> >> I suggest refactoring the code so that no upcalls are made normally - while asserting this invariant holds using assert-only upcalls. >> >> This is a small startup optimization - reducing the startup sequence by approx. 300k instructions and 70k branches in my linux-x64 setup. > > Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 19 additional commits since the last revision: > > - Refactor to remove stable_java_property_counters and clarify comments > - Merge branch 'master' into com_ns > - Revert unrelated changes to perfData > - Merge branch 'master' into com_ns > - Improve comments > - typo > - Missing definition > - Extract the shorthand java.version from VersionProps and use it in StatSampler > - Improve assert > - Assert on missing value > - ... and 9 more: https://git.openjdk.java.net/jdk/compare/f77461b0...6e220227 So just to be clear, the statSampler was doing a System.getProperty() upcall, for a property that was previously set by the VM, where the value was initially obtained via a read of the fields in java.lang.VersionProps and stored in the VM in either JDK_Version of VM_version. So now you just use that JDK_Version/VM_Version value directly. And to allow that you had to add in the reading of the "short version" java.version property. This seems okay in principle. I have a couple of nits: - can we refer to JDK_Version::java_version instead of "short_version"? I see where the short comes from in VersionProps, but as it represents the java.version value I think we can just use that in the VM for clarity. - are we concerned that when we store these values in JDK_Version/VM_Version we are artificially constraining their length? - now that java.version is being read from VersionProps can you please add a comment to that effect in the template: + // This field is read by HotSpot private static final String java_version = "@@VERSION_SHORT@@"; Thanks, David ------------- Changes requested by dholmes (Reviewer). 
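For readers following along, the pattern under review (use the value the VM already holds, and keep the slower lookup only as a debug-time check) can be sketched in Java. This is only an analogy: the real change is in HotSpot C++, and the class `StartupProps` and its method `javaVersion` are hypothetical names made up for this illustration, not part of the JDK or of the patch.

```java
import java.util.Objects;

// Hypothetical illustration only; not HotSpot code and not part of the patch.
public class StartupProps {
    // Captured once when the class is initialized, standing in for a value
    // the VM already knows before any user code runs.
    private static final String CACHED_JAVA_VERSION =
            System.getProperty("java.version");

    // Normal path: return the cached value without a fresh property lookup.
    static String javaVersion() {
        // Assert-only lookup: this expression is evaluated only when
        // assertions are enabled (-ea), mirroring the "assert-only upcall" idea.
        assert Objects.equals(CACHED_JAVA_VERSION,
                              System.getProperty("java.version"))
                : "java.version changed after startup";
        return CACHED_JAVA_VERSION;
    }

    public static void main(String[] args) {
        System.out.println(javaVersion());
    }
}
```

Run with `java -ea StartupProps` to exercise the check; without `-ea` only the cached read remains on the normal path.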
PR: https://git.openjdk.java.net/jdk/pull/802 From ayang at openjdk.java.net Mon Oct 26 07:44:39 2020 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Mon, 26 Oct 2020 07:44:39 GMT Subject: RFR: 8255298: Remove SurvivorAlignmentInBytes functionality [v2] In-Reply-To: References: Message-ID: <4f0U-SmMRo72_dOV8EbLqnship2hTGMhfkcX9EszAXY=.873d3a45-825b-4e7e-b4f3-f1663aa3eec8@github.com> On Fri, 23 Oct 2020 17:26:50 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews to remove the SurvivorAlignmentInBytes functionality? It has not been in use for a long time if ever, and can be removed. Searching the web also indicates that apart from the usual lists of all options and CRs it is never mentioned. >> >> SurvivorAlignmentInBytes is an experimental option so no further process is required. >> >> Testing: tier1-5 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > shade review Marked as reviewed by ayang (Author). ------------- PR: https://git.openjdk.java.net/jdk/pull/838 From kbarrett at openjdk.java.net Mon Oct 26 08:27:35 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 26 Oct 2020 08:27:35 GMT Subject: RFR: 8255298: Remove SurvivorAlignmentInBytes functionality [v2] In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 17:26:50 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews to remove the SurvivorAlignmentInBytes functionality? It has not been in use for a long time if ever, and can be removed. Searching the web also indicates that apart from the usual lists of all options and CRs it is never mentioned. >> >> SurvivorAlignmentInBytes is an experimental option so no further process is required. 
>> >> Testing: tier1-5 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > shade review src/hotspot/share/gc/parallel/psPromotionLAB.inline.hpp line 43: > 41: return obj; > 42: } else { > 43: set_top(obj); I think we don't need this set_top anymore. ------------- PR: https://git.openjdk.java.net/jdk/pull/838 From eosterlund at openjdk.java.net Mon Oct 26 08:38:50 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Mon, 26 Oct 2020 08:38:50 GMT Subject: RFR: 8255233: InterpreterRuntime::at_unwind should be a JRT_LEAF [v2] In-Reply-To: References: Message-ID: > InterpreterRuntime::at_unwind is called at the very beginning of remove_activation(), to notify concurrent stack processing that a frame is about to be unwound. It is currently a JRT_ENTRY, because it needs a last_Java_frame to see what frame is about to get unwound. > > However, there are special return paths used by JVMTI pop frame, that check if the caller frame is deoptimized, then call a special path that removes the top activation, assuming that does not enter the deopt handler. The new JRT_ENTRY makes that reasoning invalid. > > Therefore, we need this to be a JRT_LEAF, that sets a last Java frame, to make everyone happy. This patch performs that change. > > I have run tier 1-5 testing, and manually tested: > > while true; do make test JTREG="RETAIN=all" TEST=open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn002 TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews" ; done > > Before the fix it crashes ~1/15 runs with a bad oop. After the fix, it doesn't crash. I have run it more times than my tmux buffer fits (for a day), and it does not fail any more with this fix.
> > Unfortunately, my testing on AArch64 has been stalled for a day, so I have sent out this PR without the testing of those bits being finished. I won't push until I get the results back, of course. But I am expecting that to be fine, as there is nothing special going on there and it compiles. Will post a comment when the complete results have arrived. Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: address cast ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/828/files - new: https://git.openjdk.java.net/jdk/pull/828/files/6c6a6ce4..cc3929d1 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=828&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=828&range=00-01 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/828.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/828/head:pull/828 PR: https://git.openjdk.java.net/jdk/pull/828 From eosterlund at openjdk.java.net Mon Oct 26 08:41:36 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Mon, 26 Oct 2020 08:41:36 GMT Subject: RFR: 8255233: InterpreterRuntime::at_unwind should be a JRT_LEAF In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 09:45:11 GMT, Erik Österlund wrote: > InterpreterRuntime::at_unwind is called at the very beginning of remove_activation(), to notify concurrent stack processing that a frame is about to be unwound. It is currently a JRT_ENTRY, because it needs a last_Java_frame to see what frame is about to get unwound. > > However, there are special return paths used by JVMTI pop frame, that check if the caller frame is deoptimized, then call a special path that removes the top activation, assuming that does not enter the deopt handler. The new JRT_ENTRY makes that reasoning invalid. > > Therefore, we need this to be a JRT_LEAF, that sets a last Java frame, to make everyone happy.
This patch performs that change. > > I have run tier 1-5 testing, and manually tested: > > while true; do make test JTREG="RETAIN=all" TEST=open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn002 TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews" ; done > > Before the fix it crashes ~1/15 runs with a bad oop. After the fix, it doesn't crash. I have run it more times than my tmux buffer fits (for a day), and it does not fail any more with this fix. > > Unfortunately, my testing on AArch64 has been stalled for a day, so I have sent out this PR without the testing of those bits being finished. I won't push until I get the results back, of course. But I am expecting that to be fine, as there is nothing special going on there and it compiles. Will post a comment when the complete results have arrived. The delayed AArch64 test results are in: looking good as expected. However the github bot complained about x86 32 bit needing a cast, so hopefully fixed that now. I guess we will see what the bot has to say. Thank you bot for finding that. ------------- PR: https://git.openjdk.java.net/jdk/pull/828 From ihse at openjdk.java.net Mon Oct 26 09:14:41 2020 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Mon, 26 Oct 2020 09:14:41 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v13] In-Reply-To: References: Message-ID: On Mon, 19 Oct 2020 10:34:32 GMT, Maurizio Cimadamore wrote: >> This patch contains the changes associated with the third incubation round of the foreign memory access API incubation (see JEP 393 [1]). 
This iteration focuses on improving the usability of the API in 3 main ways: >> >> * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from multiple threads >> * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee that the memory will be deallocated, eventually >> * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class has been added, which defines several useful dereference routines; these are really just thin wrappers around memory access var handles, but they make the barrier to entry for using this API somewhat lower. >> >> A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit of dereference. >> >> This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done by calling `MemoryAddress::asSegmentRestricted`). >> >> A list of the API, implementation and test changes is provided below.
If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be happy to point at existing discussions, and/or to provide the feedback required. >> >> A big thanks to Erik Osterlund, Vladimir Ivanov and David Holmes, without whom the work on shared memory segment would not have been possible; also I'd like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. >> >> Thanks >> Maurizio >> >> Javadoc: >> >> http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html >> >> Specdiff: >> >> http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html >> >> CSR: >> >> https://bugs.openjdk.java.net/browse/JDK-8254163 >> >> >> >> ### API Changes >> >> * `MemorySegment` >> * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) >> * added a no-arg factory for a native restricted segment representing entire native heap >> * rename `withOwnerThread` to `handoff` >> * add new `share` method, to create shared segments >> * add new `registerCleaner` method, to register a segment against a cleaner >> * add more helpers to create arrays from a segment e.g. `toIntArray` >> * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) >> * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) >> * `MemoryAddress` >> * drop `segment` accessor >> * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative to a given segment >> * `MemoryAccess` >> * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a carrier is one of the usual suspects (a Java primitive, minus `boolean`); the access mode can be simple (e.g.
access base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte offset, or a high-level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. `getByteAtOffset` vs `getByteAtIndex`). >> * `MemoryHandles` >> * drop `withOffset` combinator >> * drop `withStride` combinator >> * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which it is easy to derive all the other handles using plain var handle combinators. >> * `Addressable` >> * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. >> * `MemoryLayouts` >> * A new layout, for machine addresses, has been added to the mix. >> >> >> >> ### Implementation changes >> >> There are two main things to discuss here: support for shared segments, and the general simplification of the memory access var handle support. >> >> #### Shared segments >> >> The support for shared segments cuts in pretty deep in the VM. Support for shared segments is notoriously hard to achieve, at least in a way that guarantees optimal access performance. This is caused by the fact that, if a segment is shared, it would be possible for a thread to close it while another is accessing it. >> >> After considering several options (see [3]), we zeroed in on an approach which is inspired by a happy idea that Andrew Haley had (and that he reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently to a close operation.
For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and the access itself (otherwise it would be possible for the accessing thread to stop just right before an unsafe call). It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. >> >> Sadly, none of these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). >> >> The question is, then, once we detect that a thread is accessing the very segment we're about to close, what should happen? We first experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread is accessing the segment being closed. >> >> As written in the javadoc, this doesn't mean that clients should just catch and try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should be treated as such. >> >> In terms of gritty implementation, we needed to centralize memory access routines in a single place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in addition, also provided a liveness check.
This way we could mark all these routines with the special `@Scoped` annotation, which tells the VM that something important is going on. >> >> To achieve this, we created a new (autogenerated) class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during access (which is important when registering segments against cleaners). >> >> Of course, to make memory access safe, memory access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead of unsafe, so that a liveness check can be triggered (in case a scope is present). >> >> `ScopedMemoryAccess` has a `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed successfully. >> >> The implementation of `MemoryScope` (now significantly simplified from what we had before), has two implementations, one for confined segments and one for shared segments; the main difference between the two is what happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. >> >> #### Memory access var handles overhaul >> >> The key realization here was that if all memory access var handles took a coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle form. 
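To make the shared-segment close protocol described under "Shared segments" more concrete, here is a minimal C++ sketch of the `ALIVE` / `CLOSING` / `CLOSED` transitions. It is only an illustration under stated assumptions: `SharedScopeSketch`, `is_alive`, `check_access` and the `handshake` callback are hypothetical names, the callback stands in for the thread-local handshake, and a failed close is reported by returning `false` rather than by throwing as the real API does.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical model of the three states named in the description.
enum class State { Alive, Closing, Closed };

// Hypothetical stand-in for a shared scope; not the JDK implementation.
class SharedScopeSketch {
  std::atomic<State> _state{State::Alive};

 public:
  // As described above, a scope in the CLOSING state still reports alive.
  bool is_alive() const { return _state.load() != State::Closed; }

  // The liveness check performed on memory access only passes when ALIVE.
  bool check_access() const { return _state.load() == State::Alive; }

  // `handshake` is assumed to return true when no thread was found in the
  // middle of a scoped access. On failure the state reverts to ALIVE.
  bool close(const std::function<bool()>& handshake) {
    State expected = State::Alive;
    if (!_state.compare_exchange_strong(expected, State::Closing)) {
      return false;  // already closing or closed
    }
    bool ok = handshake();
    _state.store(ok ? State::Closed : State::Alive);
    return ok;
  }
};
```

The point of the intermediate state is that new accesses are refused while the handshake runs, yet the scope can still fall back to `ALIVE` if another thread is caught mid-access.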
>> >> This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that e.g. additional offset is injected into a base memory access var handle. >> >> This also helped in simplifying the implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level access on the innards of the memory access var handle. All that code is now gone. >> >> #### Test changes >> >> Not much to see here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared segment case. >> >> [1] - https://openjdk.java.net/jeps/393 >> [2] - https://openjdk.java.net/jeps/389 >> [3] - https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html >> [4] - https://openjdk.java.net/jeps/312 > > Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: > > Address CSR comments Changes requested by ihse (Reviewer). make/modules/java.base/gensrc/GensrcScopedMemoryAccess.gmk line 148: > 146: > 147: $(DEST): $(BUILD_TOOLS_JDK) $(SCOPED_MEMORY_ACCESS_TEMPLATE) $(SCOPED_MEMORY_ACCESS_BIN_TEMPLATE) > 148: $(MKDIR) -p $(SCOPED_MEMORY_ACCESS_GENSRC_DIR) Please use `$(call MakeDir, $(SCOPED_MEMORY_ACCESS_GENSRC_DIR))` instead. 
make/modules/java.base/gensrc/GensrcScopedMemoryAccess.gmk line 34: > 32: SCOPED_MEMORY_ACCESS_TEMPLATE := $(SCOPED_MEMORY_ACCESS_SRC_DIR)/X-ScopedMemoryAccess.java.template > 33: SCOPED_MEMORY_ACCESS_BIN_TEMPLATE := $(SCOPED_MEMORY_ACCESS_SRC_DIR)/X-ScopedMemoryAccess-bin.java.template > 34: DEST := $(SCOPED_MEMORY_ACCESS_GENSRC_DIR)/ScopedMemoryAccess.java `DEST` is a very generic and not really informative name. Maybe `SCOPED_MEMORY_ACCESS_GENSRC_DEST` to fit in with the rest of the names? And/or, maybe, to cut down on the excessive length, shorten `SCOPED_MEMORY_ACCESS` to `SMA` in all variables. make/modules/java.base/gensrc/GensrcScopedMemoryAccess.gmk line 26: > 24: # > 25: > 26: GENSRC_SCOPED_MEMORY_ACCESS := This variable does not seem to be used. A left-over from previous iterations? Also, please cut down a bit on the consecutive empty lines. ------------- PR: https://git.openjdk.java.net/jdk/pull/548 From ihse at openjdk.java.net Mon Oct 26 09:19:37 2020 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Mon, 26 Oct 2020 09:19:37 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v12] In-Reply-To: <2IKx6cpc-IGP3jZtr0s2I14BWM6ptyFD26szPl3b1ng=.9d956b98-dfe6-4a45-a371-bf86923214fb@github.com> References: <2IKx6cpc-IGP3jZtr0s2I14BWM6ptyFD26szPl3b1ng=.9d956b98-dfe6-4a45-a371-bf86923214fb@github.com> Message-ID: On Thu, 22 Oct 2020 17:04:34 GMT, Maurizio Cimadamore wrote: >> This patch contains the changes associated with the first incubation round of the foreign linker access API incubation >> (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and associated pull request [3]). >> >> The main goal of this API is to provide a way to call native functions from Java code without the need of intermediate JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. 
I suggest reading the writeup [4] I put together a few weeks ago, which illustrates what the foreign linker support is, and how it should be used by clients. >> >> Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews. For this reason, I'm attaching a webrev which contains only the differences between this PR and the memory access PR. I will be periodically uploading new webrevs, as new iterations come out, to try and make the life of reviewers as simple as possible. >> >> A big thanks to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the hotspot changes you see here, and without their help, the foreign linker support wouldn't be what it is today. As usual, a big thanks to Paul Sandoz, who provided many insights (often by trying the bits first hand). >> >> Thanks >> Maurizio >> >> Webrev: >> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev >> >> Javadoc: >> >> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html >> >> Specdiff (relative to [3]): >> >> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html >> >> CSR: >> >> https://bugs.openjdk.java.net/browse/JDK-8254232 >> >> >> >> ### API Changes >> >> The API changes are actually rather slim: >> >> * `LibraryLookup` >> * This class allows clients to look up symbols in native libraries; the interface is fairly simple; you can load a library by name, or absolute path, and then look up symbols on that library. >> * `FunctionDescriptor` >> * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory layouts for the function arguments/return type. A function descriptor is used to describe the signature of a native function. >> * `CLinker` >> * This is the real star of the show.
A `CLinker` has two main methods: `downcallHandle` and `upcallStub`; the first takes a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a `MethodHandle` instance which can be used to call the target native symbol. The second takes an existing method handle, and a `FunctionDescriptor` and returns a new `MemorySegment` corresponding to a code stub allocated by the VM which acts as a trampoline from native code to the user-provided method handle. This is very useful for implementing upcalls. >> * This class also contains the various layout constants that should be used by clients when describing native signatures (e.g. `C_LONG` and friends); these layouts contain additional ABI classification information (in the form of layout attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the native call to take place. >> * Finally, this class provides some helper functions e.g. so that clients can convert Java strings into C strings and back. >> * `NativeScope` >> * This is a helper class which allows clients to group together logically related allocations; that is, rather than allocating separate memory segments using separate *try-with-resource* constructs, a `NativeScope` allows clients to use a _single_ block, and allocate all the required segments there. This is not only a usability boost, but also a performance boost, since not all allocation requests will be turned into `malloc` calls. >> * `MemorySegment` >> * Only one method added here - namely `handoff(NativeScope)` which allows a segment to be transferred onto an existing native scope. >> >> ### Safety >> >> The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle. For instance, the description of the native signature might be wrong (e.g. have too many arguments) - and the runtime has, in the general case, no way to detect such mismatches.
For these reasons, obtaining a `CLinker` instance is a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as is the case for other restricted methods in the foreign memory API). >> >> ### Implementation changes >> >> The Java changes associated with `LibraryLookup` are relatively straightforward; the only interesting thing to note here is that library loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same restrictions which apply to JNI library loading (e.g. same library cannot be loaded by different classloaders). >> >> As for `NativeScope` the changes are again relatively straightforward; it is an API which sits neatly on top of the foreign memory access API, providing some kind of allocation service which shares the same underlying memory segment(s), and turns an allocation request into a segment slice, which is a much less expensive operation. `NativeScope` comes in two variants: there are native scopes for which the allocation size is known a priori, and native scopes which can grow - these two schemes are implemented by two separate subclasses of `AbstractNativeScopeImpl`. >> >> Of course the bulk of the changes are to support the `CLinker` downcall/upcall routines. These changes cut pretty deep into the JVM; I'll briefly summarize the goal of some of these changes - for further details, Jorn has put together a detailed writeup which explains the rationale behind the VM support, with some references to the code [5]. >> >> The main idea behind foreign linker is to infer, given a Java method type (expressed as a `MethodType` instance) and the description of the signature of a native function (expressed as a `FunctionDescriptor` instance) a _recipe_ that can be used to turn a Java call into the corresponding native call targeting the requested native function.
>> >> This inference scheme can be defined in a pretty straightforward fashion by looking at the various ABI specifications (for instance, see [6] for the SysV ABI, which is the one used on Linux/Mac). The various `CallArranger` classes, of which we have a flavor for each supported platform, do exactly that kind of inference. >> >> For the inference process to work, we need to attach extra information to memory layouts; it is no longer sufficient to know e.g. that a layout is 32/64 bits - we need to know whether it is meant to represent a floating point value, or an integral value; this knowledge is required because floating points are passed in different registers by most ABIs. For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which contain the required classification attributes (e.g. a `CLinker.TypeKind` enum value). The runtime extracts this attribute, and performs classification accordingly. >> >> A native call is decomposed into a sequence of basic, primitive operations, called `Binding` (see the great javadoc on the `Binding.java` class for more info). There are many such bindings - for instance the `Move` binding is used to move a value into a specific machine register/stack slot. So, the main job of the various `CallArranger` classes is to determine, given a Java `MethodType` and `FunctionDescriptor` what is the set of bindings associated with the downcall/upcall. >> >> At the heart of the foreign linker support is the `ProgrammableInvoker` class. This class effectively generates a `MethodHandle` which follows the steps described by the various bindings obtained by `CallArranger`. There are actually various strategies to interpret these bindings - listed below: >> >> * basic interpreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see `BindingInterpreter`), except for the `Move` bindings.
For these bindings, the move is implemented by allocating a *buffer* (whose size is ABI specific) and by moving all the lowered values into positions within this buffer. The buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them in their expected registers/stack slots (note that each position in the buffer corresponds to a different register). This is the most general invocation mode, the more "customizable" one, but also the slowest - since for every call there is some extra allocation which takes place. >> >> * specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based interpreter, we generate a method handle chain which effectively interprets all the bindings (again, except `Move` ones). >> >> * intrinsified mode; this is typically used in combination with the specialized interpreted mode described above (although it can also be used with the Java-based binding interpreter). The goal here is to remove the buffer allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (e.g. the set of bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field), then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go for an intermediate buffer. This gives us back performance that is on par with JNI. >> >> For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there. We plan to add support for intrinsified modes there as well, which should considerably boost performance (probably well beyond what JNI can offer at the moment, since the upcall support in JNI is not very well optimized). >> >> Again, for more readings on the internals of the foreign linker support, please refer to [5].
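As a rough illustration of the basic interpreted mode described above, the following C++ sketch runs a small recipe on an operand stack and writes the results into a buffer whose positions stand for registers. This is not the JDK's `BindingInterpreter`: the operators `PushArg`, `MaskLow32` and `MoveToSlot`, the `Step` struct and the `interpret` function are hypothetical names chosen only to show the shape of the technique.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical operators; the real Binding set is richer and differs.
enum class Op { PushArg, MaskLow32, MoveToSlot };

// One step of a recipe: an operator plus an index (argument or buffer slot).
struct Step {
  Op op;
  int operand;
};

// Interprets `recipe` over `args` with an operand stack. Every MoveToSlot
// writes into a freshly allocated buffer in which each position is assumed
// to correspond to one register or stack slot.
std::vector<std::int64_t> interpret(const std::vector<Step>& recipe,
                                    const std::vector<std::int64_t>& args,
                                    int slot_count) {
  std::vector<std::int64_t> buffer(slot_count, 0);  // per-call allocation
  std::vector<std::int64_t> stack;
  for (const Step& s : recipe) {
    switch (s.op) {
      case Op::PushArg:  // load a Java-side argument onto the stack
        stack.push_back(args.at(s.operand));
        break;
      case Op::MaskLow32:  // stand-in for any value conversion step
        assert(!stack.empty());
        stack.back() &= INT64_C(0xFFFFFFFF);
        break;
      case Op::MoveToSlot:  // pop the value into its buffer position
        assert(!stack.empty());
        buffer.at(s.operand) = stack.back();
        stack.pop_back();
        break;
    }
  }
  return buffer;
}
```

The per-call `buffer` allocation in this sketch is the cost that the intrinsified mode is described as removing when the recipe is constant.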
>> >> #### Test changes >> >> Many new tests have been added to validate the foreign linker support; we have high level tests (see `StdLibTest`) which aim at testing the linker from the perspective of code that clients could write. But we also have deeper combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI implementation. There are also some great tests (see the `callarranger` folder) which test the various `CallArranger`s for all the possible platforms; these tests adopt more of a white-box approach - that is, instead of treating the linker machinery as a black box and verifying that the support works by checking that the native call returned the results we expected, these tests aim at checking that the set of bindings generated by the call arranger is correct. This also means that we can test the classification logic for Windows, Mac and Linux regardless of the platform we're executing on. >> >> Some additional microbenchmarks have been added to compare the performance of downcall/upcall with JNI. >> >> [1] - https://openjdk.java.net/jeps/389 >> [2] - https://openjdk.java.net/jeps/393 >> [3] - https://git.openjdk.java.net/jdk/pull/548 >> [4] - https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md >> [5] - http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html > > Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: > > Fix whitespaces Marked as reviewed by ihse (Reviewer).
------------- PR: https://git.openjdk.java.net/jdk/pull/634 From ihse at openjdk.java.net Mon Oct 26 09:19:37 2020 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Mon, 26 Oct 2020 09:19:37 GMT Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v12] In-Reply-To: References: <2IKx6cpc-IGP3jZtr0s2I14BWM6ptyFD26szPl3b1ng=.9d956b98-dfe6-4a45-a371-bf86923214fb@github.com> Message-ID: On Fri, 23 Oct 2020 14:06:22 GMT, Maurizio Cimadamore wrote: >> Changes requested by ihse (Reviewer). > > @magicus the files you commented on are not part of this PR, but they are introduced as part of: > https://git.openjdk.java.net/jdk/pull/548 > (you seemed to have approved the changes there - but it's also likely that this PR doesn't include the latest changes in that PR). Sorry for the confusion - but please do report any comment you have on the build changes on that PR! @mcimadamore I'm sorry too for the confusion. :) I must have been in a bit of a hurry when approving it on the other PR. I've now moved my comments there. I don't think there's any way for me to "un-review" this change, so I'll mark it as accepted, even though I don't have anything to say about it (so that I'm not blocking a push). I'll ask the Skara guys if there's a better way to deal with this. Also, in the future, if you are creating a PR which Skara believes has changes in the build system, but it "really" does not, please remove the `build` label, and I won't even see the PR to come bothering you again!
;-) ------------- PR: https://git.openjdk.java.net/jdk/pull/634 From shade at openjdk.java.net Mon Oct 26 09:50:18 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 26 Oct 2020 09:50:18 GMT Subject: RFR: 8255389: ConcurrentHashTable::NoOp omits return in non-void return method Message-ID: Static analysis complains there is a non-void return method without a return statement: struct NoOp { void operator()(VALUE*) {} const VALUE& operator()() {} // <--- here void operator()(bool, VALUE*) {} } noOp; AFAICS, this is UB, and we have seen cases like these break compilers in other places. Not in this case, though, because `noOp` is only used as the default functor in `remove`. Still, it would be good to remove that risky definition, so that it is not used accidentally. ------------- Commit messages: - 8255389: ConcurrentHashTable::NoOp omits return in non-void return method Changes: https://git.openjdk.java.net/jdk/pull/863/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=863&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255389 Stats: 4 lines in 1 file changed: 3 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/863.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/863/head:pull/863 PR: https://git.openjdk.java.net/jdk/pull/863 From ihse at openjdk.java.net Mon Oct 26 11:19:10 2020 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Mon, 26 Oct 2020 11:19:10 GMT Subject: RFR: 8253757: Add LLVM-based backend for hsdis In-Reply-To: <2S00ucaPGiAQLeLOejt1kfXeYEc7ctEPeRCIcq1N0N8=.dbf1ea7a-8de4-48a5-8759-03495e3e3c08@github.com> References: <91erxiMDb4ftvSomuJYHPi9SX-v8Z2VLD2qEwCbz5tk=.b9ed01b5-f0e0-4ed7-9c1a-b06bc0e64640@github.com> <8Eqswd7tsVaGEXHdKDncXqKpW2tBsSeuY0PV6aTB9_c=.a6cf4957-9d31-4e89-bf44-e7b7852205d5@github.com> <2S00ucaPGiAQLeLOejt1kfXeYEc7ctEPeRCIcq1N0N8=.dbf1ea7a-8de4-48a5-8759-03495e3e3c08@github.com> Message-ID: On Thu, 8 Oct 2020 20:40:50 GMT, Xin Liu wrote: >> @navyxliu >> >>> @luhenry I 
tried to build it with LLVM10.0.1 >>> on my x86_64, ubuntu, I ran into a small problem. here is how I build. >>> $make ARCH=amd64 CC=/opt/llvm/bin/clang CXX=/opt/llvm/bin/clang++ LLVM=/opt/llvm/ >>> >>> I can't meet this condition because Makefile defines LIBOS_linux. >>> >>> #elif defined(LIBOS_Linux) && defined(LIBARCH_amd64) >>> return "x86_64-pc-linux-gnu"; >>> >>> Actually, Makefile assigns OS to windows/linux/aix/macosx (all lower case)and then >>> CPPFLAGS += -DLIBOS_$(OS) -DLIBOS="$(OS)" -DLIBARCH_$(LIBARCH) -DLIBARCH="$(LIBARCH)" -DLIB_EXT="$(LIB_EXT)" >> >> Interestingly, I did it this way because on my machine `LIBOS_Linux` would get defined instead of `LIBOS_linux`. I tried on WSL which might explain the difference. Could you please share more details on what environment you are using? >> >>> In hsdis.cpp, native_target_triple needs to match whatever Makefile defined. With that fix, I generate llvm version hsdis-amd64.so and it works flawlessly >> >> I'm not sure I understand what you mean. Are you saying we should define the native target triple based on the variables in the Makefile? >> >> A difficulty I ran into is that there is not always a 1-to-1 mapping between the autoconf/gcc target triple and the LLVM one. For example. you pass `x86_64-gnu-linux` to the OpenJDK's `configure` script, but the equivalent target triple for LLVM is `x86_64-pc-linux-gnu`. >> >> Since my plan isn't to use LLVM as the default for all platforms, and because there aren't that many combinations of target OS/ARCH, I am taking the approach of hardcoding the combinations we care about in `hsdis.cpp`. > >> @navyxliu >> >> > @luhenry I tried to build it with LLVM10.0.1 >> > on my x86_64, ubuntu, I ran into a small problem. here is how I build. >> > $make ARCH=amd64 CC=/opt/llvm/bin/clang CXX=/opt/llvm/bin/clang++ LLVM=/opt/llvm/ >> > I can't meet this condition because Makefile defines LIBOS_linux. 
>> > #elif defined(LIBOS_Linux) && defined(LIBARCH_amd64) >> > return "x86_64-pc-linux-gnu"; >> > Actually, Makefile assigns OS to windows/linux/aix/macosx (all lower case)and then >> > CPPFLAGS += -DLIBOS_$(OS) -DLIBOS="$(OS)" -DLIBARCH_$(LIBARCH) -DLIBARCH="$(LIBARCH)" -DLIB_EXT="$(LIB_EXT)" >> >> Interestingly, I did it this way because on my machine `LIBOS_Linux` would get defined instead of `LIBOS_linux`. I tried on WSL which might explain the difference. Could you please share more details on what environment you are using? >> > I am using ubuntu 18.04. > > `OS = $(shell uname)` does initialize OS=Linux in the first place, but later OS is set to "linux" at line 88 of https://openjdk.github.io/cr/?repo=jdk&pr=392&range=05#new-0 > > At line 186, -DLIBOS_linux -DLIBOS="linux" ... It doesn't match line 564 of https://openjdk.github.io/cr/?repo=jdk&pr=392&range=05#new-2 > > in my understanding, C/C++ macros are all case sensitive. I got #error "unknown platform" because of Linux/linux discrepancy. > >> > In hsdis.cpp, native_target_triple needs to match whatever Makefile defined. With that fix, I generate llvm version hsdis-amd64.so and it works flawlessly >> >> I'm not sure I understand what you mean. Are you saying we should define the native target triple based on the variables in the Makefile? >> >> A difficulty I ran into is that there is not always a 1-to-1 mapping between the autoconf/gcc target triple and the LLVM one. For example. you pass `x86_64-gnu-linux` to the OpenJDK's `configure` script, but the equivalent target triple for LLVM is `x86_64-pc-linux-gnu`. >> >> Since my plan isn't to use LLVM as the default for all platforms, and because there aren't that many combinations of target OS/ARCH, I am taking the approach of hardcoding the combinations we care about in `hsdis.cpp`. Since I found it close to impossible to review the changes when I could not get a diff with the changes done to hsdis.c/cpp, I created a webrev which shows these changes. 
I made this by renaming hsdis.cpp back to hsdis.c, and then webrev could match it up. It is available here: http://cr.openjdk.java.net/~ihse/hsdis-llvm-backend-diff-webrev/ ------------- PR: https://git.openjdk.java.net/jdk/pull/392 From rrich at openjdk.java.net Mon Oct 26 11:29:10 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Mon, 26 Oct 2020 11:29:10 GMT Subject: RFR: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing In-Reply-To: References: Message-ID: <_J8bYqJcDa-5BvnEtZkdc3zIY21IfgEXTYvSSWy7znY=.074f43c7-ceac-4bb2-907e-b141d273dc4e@github.com> On Sat, 24 Oct 2020 07:34:57 GMT, Richard Reingruber wrote: >> The escape barrier reallocates scalarized objects potentially deep into the stack of a remote thread. Each allocation can safepoint, causing referenced frames to be invalid. Some sprinklings were added that deal with that, but I believe it was subsequently broken with the integration of the new vector API, that has its own new deoptimization code that did not know about this. Not surprisingly, the integration of the new vector API had no idea about this subtlety, and allocates an object, and then reads an object deep from the stack of a remote thread (using an escape barrier). I suppose the issue is that all these 3 things were integrated at almost the same time. The problematic code sequence is in VectorSupport::allocate_vector() in vectorSupport.cpp, which is called from Deoptimization::realloc_objects(). It first allocates an oop (possibly safepointing), and then reads a vector oop from the stack. This is usually fine, but not through the escape barrier, with concurrent stack scanning. While I have not seen any crashes yet, I can see from code inspection, that there is no way that this works correctly. >> >> In order to make this less fragile for future changes, we should really have a RAII object that keeps the target thread's stack of the escape barrier, stable and processed, across safepoints.
This patch fixes that. Then it becomes much easier to reason about its correctness, compared to hoping the various hooks are applied after each safepoint. >> >> With this new robustness fix, the thread running the escape barrier, keeps the target thread stack processed, straight through safepoints on the requesting thread, making it easy and intuitive to understand why this works correctly. The RAII object basically just has to cover the code block that pokes at the remote stack and goes in and out of safepoints, arbitrarily. Arguably, this escape barrier doesn't need to be blazingly fast, and can afford keeping stacks sane through its operation. > > I'm really glad you caught that one! And I like the abstraction provided by KeepStackGCProcessedMark. > > There is one execution path you missed coming from `VM_GetOrSetLocal::deoptimize_objects(javaVFrame* jvf)`. This code should probably be moved into EscapeBarrier. `EscapeBarrier::deoptimize_objects(int depth)` could be changed to be more generic `EscapeBarrier::deoptimize_objects(int from_depth, int to_depth)`. VM_GetOrSetLocal::doit_prologue() could call eb.deoptimize_objects(_depth, _depth) then. That would be better but maybe not yet really good... > > Update: I see now that there is also a stackwalk in `VM_GetOrSetLocal::doit_prologue()` which needs to be taken care of with regard to concurrent stack processing. I'd like to try to refactor this. Will propose a patch. > > Thanks again, Richard. Hi Erik, the last commit in https://github.com/reinrich/jdk/commits/pr-832-with-better-encapsulation would be the refactoring I would like to do. It removes the code not compliant with concurrent thread stack processing from VM_GetOrSetLocal::doit_prologue(). Instead EscapeBarrier::deoptimize_objects(int d1, int d2) is called. You added already a KeepStackGCProcessedMark to that method and I changed it to accept a range [d1, d2] of frames do the object deoptimization for. 
I'm not sure how to handle this from a process point of view. Can the refactoring be done within this change? Should a new item or subtask be created for it. I'd be glad if you could give an advice on that. Thanks, Richard. ------------- PR: https://git.openjdk.java.net/jdk/pull/832 From ihse at openjdk.java.net Mon Oct 26 11:40:11 2020 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Mon, 26 Oct 2020 11:40:11 GMT Subject: RFR: 8253757: Add LLVM-based backend for hsdis In-Reply-To: References: <91erxiMDb4ftvSomuJYHPi9SX-v8Z2VLD2qEwCbz5tk=.b9ed01b5-f0e0-4ed7-9c1a-b06bc0e64640@github.com> <8Eqswd7tsVaGEXHdKDncXqKpW2tBsSeuY0PV6aTB9_c=.a6cf4957-9d31-4e89-bf44-e7b7852205d5@github.com> <2S00ucaPGiAQLeLOejt1kfXeYEc7ctEPeRCIcq1N0N8=.dbf1ea7a-8de4-48a5-8759-03495e3e3c08@github.com> Message-ID: On Mon, 26 Oct 2020 11:16:28 GMT, Magnus Ihse Bursie wrote: >>> @navyxliu >>> >>> > @luhenry I tried to build it with LLVM10.0.1 >>> > on my x86_64, ubuntu, I ran into a small problem. here is how I build. >>> > $make ARCH=amd64 CC=/opt/llvm/bin/clang CXX=/opt/llvm/bin/clang++ LLVM=/opt/llvm/ >>> > I can't meet this condition because Makefile defines LIBOS_linux. >>> > #elif defined(LIBOS_Linux) && defined(LIBARCH_amd64) >>> > return "x86_64-pc-linux-gnu"; >>> > Actually, Makefile assigns OS to windows/linux/aix/macosx (all lower case)and then >>> > CPPFLAGS += -DLIBOS_$(OS) -DLIBOS="$(OS)" -DLIBARCH_$(LIBARCH) -DLIBARCH="$(LIBARCH)" -DLIB_EXT="$(LIB_EXT)" >>> >>> Interestingly, I did it this way because on my machine `LIBOS_Linux` would get defined instead of `LIBOS_linux`. I tried on WSL which might explain the difference. Could you please share more details on what environment you are using? >>> >> I am using ubuntu 18.04. >> >> `OS = $(shell uname)` does initialize OS=Linux in the first place, but later OS is set to "linux" at line 88 of https://openjdk.github.io/cr/?repo=jdk&pr=392&range=05#new-0 >> >> At line 186, -DLIBOS_linux -DLIBOS="linux" ... 
It doesn't match line 564 of https://openjdk.github.io/cr/?repo=jdk&pr=392&range=05#new-2 >> >> in my understanding, C/C++ macros are all case sensitive. I got #error "unknown platform" because of Linux/linux discrepancy. >> >>> > In hsdis.cpp, native_target_triple needs to match whatever Makefile defined. With that fix, I generate llvm version hsdis-amd64.so and it works flawlessly >>> >>> I'm not sure I understand what you mean. Are you saying we should define the native target triple based on the variables in the Makefile? >>> >>> A difficulty I ran into is that there is not always a 1-to-1 mapping between the autoconf/gcc target triple and the LLVM one. For example. you pass `x86_64-gnu-linux` to the OpenJDK's `configure` script, but the equivalent target triple for LLVM is `x86_64-pc-linux-gnu`. >>> >>> Since my plan isn't to use LLVM as the default for all platforms, and because there aren't that many combinations of target OS/ARCH, I am taking the approach of hardcoding the combinations we care about in `hsdis.cpp`. > > Since I found it close to impossible to review the changes when I could not get a diff with the changes done to hsdis.c/cpp, I created a webrev which shows these changes. I made this by renaming hsdis.cpp back to hsdis.c, and then webrev could match it up. It is available here: > > http://cr.openjdk.java.net/~ihse/hsdis-llvm-backend-diff-webrev/ Some notes (perhaps most to myself) about how this ties into the existing hsdis implementation, and with JDK-8188073 (Capstone porting). When printing disassembly from hotspot, the current solution tries to locate and load the hsdis library, which prints disassembly using bfd. The reason for using the separate library approach is, as far as I can understand, perhaps a mix of both incompatible licensing for bfd, and a wish to not burden the jvm library with additional bloat which is needed only for debugging. 
The Capstone approach, in the prototype patch presented by Jorn in the issue, is to create a new capstonedis library, and dispatch to it instead of hsdis. The approach used in this patch is to refactor the existing hsdis library into an abstract base class for hsdis backends, with two concrete implementations, one for bfd and one for llvm. Unfortunately, I think the resulting code in hsdis.cpp in this patch is hard to read and understand. ------------- PR: https://git.openjdk.java.net/jdk/pull/392 From ihse at openjdk.java.net Mon Oct 26 11:44:13 2020 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Mon, 26 Oct 2020 11:44:13 GMT Subject: RFR: 8253757: Add LLVM-based backend for hsdis In-Reply-To: References: <91erxiMDb4ftvSomuJYHPi9SX-v8Z2VLD2qEwCbz5tk=.b9ed01b5-f0e0-4ed7-9c1a-b06bc0e64640@github.com> <8Eqswd7tsVaGEXHdKDncXqKpW2tBsSeuY0PV6aTB9_c=.a6cf4957-9d31-4e89-bf44-e7b7852205d5@github.com> <2S00ucaPGiAQLeLOejt1kfXeYEc7ctEPeRCIcq1N0N8=.dbf1ea7a-8de4-48a5-8759-03495e3e3c08@github.com> Message-ID: <9oXnHULCd76_J69CKMVVZl3FfDte1pnt38y06LVV4Sg=.26a4ab2c-5ff7-4e2f-9428-0d8cd931d243@github.com> On Mon, 26 Oct 2020 11:37:52 GMT, Magnus Ihse Bursie wrote: >> Since I found it close to impossible to review the changes when I could not get a diff with the changes done to hsdis.c/cpp, I created a webrev which shows these changes. I made this by renaming hsdis.cpp back to hsdis.c, and then webrev could match it up. It is available here: >> >> http://cr.openjdk.java.net/~ihse/hsdis-llvm-backend-diff-webrev/ > > Some notes (perhaps most to myself) about how this ties into the existing hsdis implementation, and with JDK-8188073 (Capstone porting). > > When printing disassembly from hotspot, the current solution tries to locate and load the hsdis library, which prints disassembly using bfd. 
The reason for using the separate library approach is, as far as I can understand, perhaps a mix of both incompatible licensing for bfd, and a wish to not burden the jvm library with additional bloat which is needed only for debugging. > > The Capstone approach, in the prototype patch presented by Jorn in the issue, is to create a new capstonedis library, and dispatch to it instead of hsdis. > > The approach used in this patch is to refactor the existing hsdis library into an abstract base class for hsdis backends, with two concrete implementations, one for bfd and one for llvm. > > Unfortunately, I think the resulting code in hsdis.cpp in this patch is hard to read and understand. I think a proper solution to both this and the Capstone implementation is to provide a proper framework for selecting the hsdis backend as a first step, and refactor the existing bfd implementation as the first such backend. After that, we can add llvm and capstone as alternative hsdis backend implementations. ------------- PR: https://git.openjdk.java.net/jdk/pull/392 From kbarrett at openjdk.java.net Mon Oct 26 11:44:15 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 26 Oct 2020 11:44:15 GMT Subject: RFR: 8255389: ConcurrentHashTable::NoOp omits return in non-void return method In-Reply-To: References: Message-ID: <3tPr3TNpxB6tJnl5DV5e1_2sc1B2TX247yTHaemH6yE=.91fb91e8-4f10-4d8f-b1d8-5b1b52b71bcb@github.com> On Mon, 26 Oct 2020 09:41:12 GMT, Aleksey Shipilev wrote: > Static analysis complains there is a non-void return method without a return statement: > > struct NoOp { > void operator()(VALUE*) {} > const VALUE& operator()() {} // <--- here > void operator()(bool, VALUE*) {} > } noOp; > > AFAICS, this is UB, and we have seen cases like these break compilers in other places. Not in this case, though, because `noOp` is only used as the default functor in `remove`, which does not use this getter-like definition. 
Still, it would be good to remove that risky definition, so that it is not used accidentally. Changes requested by kbarrett (Reviewer). src/hotspot/share/utilities/concurrentHashTable.hpp line 197: > 195: // Note it only accepts the VALUE, and does not define methods with > 196: // non-void VALUE returns. Doing so would require defining the neutral > 197: // value for VALUE. I don't think this comment is needed. What's needed is a description of the DELETE_FN in the declaration of remove(), but that kind of template parameter requirements documentation is entirely missing from ConcurrentHashTable (and most other templates in HotSpot). src/hotspot/share/utilities/concurrentHashTable.hpp line 200: > 198: struct NoOp { > 199: void operator()(VALUE*) {} > 200: void operator()(bool, VALUE*) {} The two argument call operator also appears to be unused. src/hotspot/share/utilities/concurrentHashTable.hpp line 198: > 196: // non-void VALUE returns. Doing so would require defining the neutral > 197: // value for VALUE. > 198: struct NoOp { A better name for this class would be something like IgnoreValue. src/hotspot/share/utilities/concurrentHashTable.hpp line 201: > 199: void operator()(VALUE*) {} > 200: void operator()(bool, VALUE*) {} > 201: } noOp; I don't think the noOp object is useful. I think it would be clearer to just use a constructed object of this type in the one place it's used, i.e. `NoOp()` or (if name is changed) `IgnoreValue()`. 
------------- PR: https://git.openjdk.java.net/jdk/pull/863 From kbarrett at openjdk.java.net Mon Oct 26 11:44:16 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 26 Oct 2020 11:44:16 GMT Subject: RFR: 8255389: ConcurrentHashTable::NoOp omits return in non-void return method In-Reply-To: <3tPr3TNpxB6tJnl5DV5e1_2sc1B2TX247yTHaemH6yE=.91fb91e8-4f10-4d8f-b1d8-5b1b52b71bcb@github.com> References: <3tPr3TNpxB6tJnl5DV5e1_2sc1B2TX247yTHaemH6yE=.91fb91e8-4f10-4d8f-b1d8-5b1b52b71bcb@github.com> Message-ID: On Mon, 26 Oct 2020 11:39:09 GMT, Kim Barrett wrote: >> Static analysis complains there is a non-void return method without a return statement: >> >> struct NoOp { >> void operator()(VALUE*) {} >> const VALUE& operator()() {} // <--- here >> void operator()(bool, VALUE*) {} >> } noOp; >> >> AFAICS, this is UB, and we have seen cases like these break compilers in other places. Not in this case, though, because `noOp` is only used as the default functor in `remove`, which does not use this getter-like definition. Still, it would be good to remove that risky definition, so that it is not used accidentally. > > src/hotspot/share/utilities/concurrentHashTable.hpp line 201: > >> 199: void operator()(VALUE*) {} >> 200: void operator()(bool, VALUE*) {} >> 201: } noOp; > > I don't think the noOp object is useful. I think it would be clearer to just use a constructed object of this type in the one place it's used, i.e. `NoOp()` or (if name is changed) `IgnoreValue()`. If we were using lambdas we could delete this class and just use `[](VALUE*) {}` in the one place it's being used. 
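The two alternatives raised in this review can be sketched side by side. This is a minimal illustration only: `ToyTable`, `remove_with_lambda`, and the `int` key are hypothetical stand-ins, not HotSpot's ConcurrentHashTable, and the callback is taken by value here (an assumption of the sketch) so that a temporary functor or a lambda can be passed directly.

```cpp
#include <cstddef>
#include <unordered_map>

// Toy stand-in for a hash table with a delete callback. Hypothetical:
// this is not HotSpot's ConcurrentHashTable and shares none of its internals.
template <typename VALUE>
class ToyTable {
  std::unordered_map<int, VALUE> _map;

 public:
  // Only the one-argument call operator that remove() needs. There is no
  // value-returning operator()(), so nothing can fall off the end of a
  // non-void function.
  struct IgnoreValue {
    void operator()(VALUE*) {}
  };

  void insert(int key, const VALUE& value) { _map[key] = value; }

  std::size_t size() const { return _map.size(); }

  // DELETE_FN must be callable as del_f(VALUE*); any result is discarded.
  // Taken by value (an assumption of this sketch) so temporaries bind.
  template <typename DELETE_FN>
  bool remove(int key, DELETE_FN del_f) {
    auto it = _map.find(key);
    if (it == _map.end()) {
      return false;
    }
    del_f(&it->second);  // hand the value to the callback before erasing
    _map.erase(it);
    return true;
  }

  // Option 1: construct the functor at its single use, no named noOp object.
  bool remove(int key) { return remove(key, IgnoreValue()); }

  // Option 2: same behaviour with a lambda, which makes IgnoreValue unnecessary.
  bool remove_with_lambda(int key) { return remove(key, [](VALUE*) {}); }
};
```

Either form keeps the default path free of any getter-like overload; the lambda variant only applies if lambdas are acceptable in the code base in question.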
------------- PR: https://git.openjdk.java.net/jdk/pull/863 From rehn at openjdk.java.net Mon Oct 26 12:45:11 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 26 Oct 2020 12:45:11 GMT Subject: RFR: 8255389: ConcurrentHashTable::NoOp omits return in non-void return method In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 09:41:12 GMT, Aleksey Shipilev wrote: > Static analysis complains there is a non-void return method without a return statement: > > struct NoOp { > void operator()(VALUE*) {} > const VALUE& operator()() {} // <--- here > void operator()(bool, VALUE*) {} > } noOp; > > AFAICS, this is UB, and we have seen cases like these break compilers in other places. Not in this case, though, because `noOp` is only used as the default functor in `remove`, which does not use this getter-like definition. Still, it would be good to remove that risky definition, so that it is not used accidentally. This fix is fine by me. Kim's suggestions seems more like enhancements. ------------- Marked as reviewed by rehn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/863 From redestad at openjdk.java.net Mon Oct 26 12:49:43 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Mon, 26 Oct 2020 12:49:43 GMT Subject: RFR: 8255231: Avoid upcalls when initializing the statSampler [v4] In-Reply-To: References: Message-ID: > Current implementation of the statSampler does upcalls to System.getProperty to collect values for a number of properties that are all provided by the VM itself. And since the sampling starts before any user code run then no property can have changed. > > I suggest refactoring the code so that no upcalls are made normally - while asserting this invariant holds using assert-only upcalls. > > This is a small startup optimization - reducing the startup sequence by approx. 300k instructions and 70k branches in my linux-x64 setup. Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 21 additional commits since the last revision: - Address review comments from David Holmes - Merge branch 'master' into com_ns - Refactor to remove stable_java_property_counters and clarify comments - Merge branch 'master' into com_ns - Revert unrelated changes to perfData - Merge branch 'master' into com_ns - Improve comments - typo - Missing definition - Extract the shorthand java.version from VersionProps and use it in StatSampler - ... and 11 more: https://git.openjdk.java.net/jdk/compare/56309bc2...8572159f ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/802/files - new: https://git.openjdk.java.net/jdk/pull/802/files/6e220227..8572159f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=802&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=802&range=02-03 Stats: 172 lines in 14 files changed: 99 ins; 38 del; 35 mod Patch: https://git.openjdk.java.net/jdk/pull/802.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/802/head:pull/802 PR: https://git.openjdk.java.net/jdk/pull/802 From kbarrett at openjdk.java.net Mon Oct 26 13:03:08 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 26 Oct 2020 13:03:08 GMT Subject: RFR: 8255389: ConcurrentHashTable::NoOp omits return in non-void return method In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 12:42:50 GMT, Robbin Ehn wrote: > This fix is fine by me. > Kim's suggestions seems more like enhancements. So is the proposed change, since nothing seems to actually be broken currently. If we're going to touch it at all, let's touch it once and be done with it. 
------------- PR: https://git.openjdk.java.net/jdk/pull/863 From rehn at openjdk.java.net Mon Oct 26 13:40:09 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 26 Oct 2020 13:40:09 GMT Subject: RFR: 8255389: ConcurrentHashTable::NoOp omits return in non-void return method In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 13:00:49 GMT, Kim Barrett wrote: > > This fix is fine by me. > > Kim's suggestions seems more like enhancements. > > So is the proposed change, since nothing seems to actually be broken currently. If we're going to touch it at all, let's touch it once and be done with it. Any improvements are appreciated of course. But if he don't have time and this simple fix helps him, I have no issue with that. If you feel you have time you can do the touching instead of him. ------------- PR: https://git.openjdk.java.net/jdk/pull/863 From shade at openjdk.java.net Mon Oct 26 13:46:12 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 26 Oct 2020 13:46:12 GMT Subject: RFR: 8255389: ConcurrentHashTable::NoOp omits return in non-void return method In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 13:36:56 GMT, Robbin Ehn wrote: > Any improvements are appreciated of course. > But if he don't have time and this simple fix helps him, I have no issue with that. > If you feel you have time you can do the touching instead of him. This potential problem was found during the routine code inspection. As stated in the original description, it is not a problem currently, so I would try and implement Kim's suggestions to nail everything in one swing. 
------------- PR: https://git.openjdk.java.net/jdk/pull/863 From hseigel at openjdk.java.net Mon Oct 26 13:52:10 2020 From: hseigel at openjdk.java.net (Harold Seigel) Date: Mon, 26 Oct 2020 13:52:10 GMT Subject: RFR: 8238263: Create at-requires mechanism for containers In-Reply-To: References: <43YvmzXR5TzDgPqfcAJDyfpOd7f_ur6ibKfh7AGf4LU=.40b88d2c-b29b-49d4-8d7f-120b7ade6cc5@github.com> Message-ID: On Fri, 23 Oct 2020 20:08:01 GMT, Igor Ignatyev wrote: >>> I think it depends on whether the tests will be permanently or temporarily excluded from running with containers. I thought this mechanism would be to permanently exclude the tests. That's why I used `@requires`. >> >> I see, if this is for permanent exclusion then yes I agree that `@requires` is a better choice. >> >>>> enable this option based on an environment variable so we don't have to remember the cryptic command line sequence. >>> I'll look into basing the option on an environment variable. >> >> one will still need to pass an environment variable to jtreg, and hence will need to remember some sort of "cryptic command line sequence". a solution for that might be to default `jdk.containerized` to `false` in `VMProps.java` and when only _containerized_ runs will have to set it up. >> >> >> btw, I'm not sure that `jdk.containerized` is the best name for this property as _containerization_ is more of an environmental characteristic than that of jdk. how about smth like `env.containerized` or `testenv.containerized`? > >> > one will still need to pass an environment variable to jtreg, and hence will need to remember some sort of "cryptic command line sequence". a solution for that might be to default `jdk.containerized` to `false` in `VMProps.java` and when only _containerized_ runs will have to set it up. >> >> Why? Environment variables are inherited. For developers running jtreg, all they need to do is export the variable.
>> >> % export JAVA_CONTAINERIZED=1 >> % bash >> % echo $JAVA_CONTAINERIZED >> % 1 > > b/c jtreg strips most of the environment variables, they might still be defined in the process which runs `VMProps` though (I have never checked that) Defining an environment variable works when running JTReg from the command line. But, mach5 does not pass environment variable settings to its JTReg test runs. Some mach5 special command args would still be needed. ------------- PR: https://git.openjdk.java.net/jdk/pull/844 From stuefe at openjdk.java.net Mon Oct 26 14:19:13 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 26 Oct 2020 14:19:13 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 04:49:24 GMT, David Holmes wrote: >> hi all, >> >> Please review this fix for POSIX platforms, which addresses a time out that occurs while handling a crash with UseOSErrorReporting ON >> >> The timeout was caused by the crash handling code, looping infinitively, because it incorrectly assumed that the signal handlers were reset to their defaults, while what really was happening was that the code was resetting the signal handlers to our default signal handler. >> >> To avoid similar confusion in the future I did the following: >> >> - renamed the VMError::reset_signal_handlers() to VMError:: rearm_signal_handlers() >> - introduced a new API VMError::clear_signal_handlers() which is implemented in PosixSignals >> >> PosixSignals::clear_signal_handlers() is where the actual fix is done and it simply resets all signal handlers to their system defaults. >> >> A similar problem occurs on Windows, with the only difference being that before a process times out (takes 2 minutes) it runs out of stack space in about 250 loops, so that's the only reason it doesn't linger for that long. 
Windows issue is tracked separately by https://bugs.openjdk.java.net/browse/JDK-8250782 >> >> Note: The expectation for "UseOSErrorReporting" is for the OS to catch the crashed process and to produce its own crash log (in addition to Hotspot creating hs_err log file) - see https://bugs.openjdk.java.net/browse/JDK-8237727 for relevant discussion. It does not affect whether core dump is written or not (that is controlled by CreateCoredumpOnCrash) > > Changing review status to "Request changes". > > So my preferred approaches here would be: > > 1. Make UseOSErrorReporting Windows only; or > 2. Make UseOSErrorReporting Windows and macOS only. Then on macOS do a targeted crash after report_and_die() returns. > I like (2). It is sure to preserve the stack of the crashing thread. Not perfect, but maybe it's close to what Gerard likes to see on MacOS. Only remark, this gets very close to what we do already, since os::abort() calls ::abort() which raises SIGABRT... but according to Gerard abort() does not seem to get noticed by MacOS crash handling. So artificially triggering a fault may be better. ..Thomas > Thanks, > David ------------- PR: https://git.openjdk.java.net/jdk/pull/813 From iignatyev at openjdk.java.net Mon Oct 26 14:28:11 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Mon, 26 Oct 2020 14:28:11 GMT Subject: RFR: 8238263: Create at-requires mechanism for containers In-Reply-To: References: <43YvmzXR5TzDgPqfcAJDyfpOd7f_ur6ibKfh7AGf4LU=.40b88d2c-b29b-49d4-8d7f-120b7ade6cc5@github.com> Message-ID: On Mon, 26 Oct 2020 13:49:26 GMT, Harold Seigel wrote: > Defining an environment variable works when running JTReg from the command line. But, mach5 does not pass environment variable settings to its JTReg test runs. Some mach5 special command args would still be needed. right, yet given you also need to explicitly tell mach5 that you want to run testing within docker, that's not a huge problem. this is assuming we default env.
variable to `false`, `make` propagates this env. variable to `jtreg` and `jtreg` propagates it to the JVM which runs `VMProps` class. ------------- PR: https://git.openjdk.java.net/jdk/pull/844 From stefank at openjdk.java.net Mon Oct 26 14:35:13 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 26 Oct 2020 14:35:13 GMT Subject: RFR: 8255254: Split os::reserve_memory and os::map_memory_to_file interfaces In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 15:40:19 GMT, Anton Kozlov wrote: > Hi, > > Please review a change to extract map_memory_to_file interface out of reserve_memory when the latter takes file descriptor. > > The change should be a pure refactoring without changes in functionality. The only part is disturbing: a comment in original os_posix.cpp:316 seems to refer to else clause and it contradicts to the actual code. I think this looks good. Just a few minor comments. src/hotspot/os/posix/os_posix.cpp line 343: > 341: // mmap but it also may System V shared memory which cannot be uncommitted as a whole, so > 342: // chopping off and unmapping excess bits back and front (see below) would not work. > 343: char* extra_base = os::reserve_memory(extra_size); It seems like this belonged to the fd != -1 clause. It previously talked about the problems of using os::reserve_memory, and then used reserve_mmapped_memory instead in the fd != -1 case. It's not obvious to me that this comment belongs here. src/hotspot/os/posix/os_posix.cpp line 362: > 360: // After we have an aligned address, we can replace anonymous mapping with file mapping > 361: if (replace_existing_mapping_with_file_mapping(aligned_base, size, file_desc) == NULL) { > 362: vm_exit_during_initialization(err_msg("Error in mapping Java heap at the given filesystem directory")); There shouldn't be a need to use err_msg for plain strings. 
src/hotspot/os/windows/os_windows.cpp line 3176: > 3174: > 3175: char* os::reserve_memory_aligned(size_t size, size_t alignment) { > 3176: return map_or_reserve_memory_aligned(size, alignment, -1/*file_desc*/); Could you add a space between -1 and /* ? src/hotspot/share/runtime/os.cpp line 1742: > 1740: } > 1741: return result; > 1742: } It's a bit unfortunate that the two functions behave differently w.r.t. NMT. They have the same name, but only one reports to NMT. I'd personally like to transition the code base so that all os::_memory functions report to NMT, and if you don't want that you'd use pd_ or internal functions. I think there's a pre-existing bug where a call to os::map_memory_to_file misses the NMT reporting. This is a recurring problem in the os:: layer. Since this is already problematic before your changes, we can investigate this as a separate JBS bug. ------------- PR: https://git.openjdk.java.net/jdk/pull/812 From eosterlund at openjdk.java.net Mon Oct 26 14:49:10 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Mon, 26 Oct 2020 14:49:10 GMT Subject: RFR: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing In-Reply-To: <_J8bYqJcDa-5BvnEtZkdc3zIY21IfgEXTYvSSWy7znY=.074f43c7-ceac-4bb2-907e-b141d273dc4e@github.com> References: <_J8bYqJcDa-5BvnEtZkdc3zIY21IfgEXTYvSSWy7znY=.074f43c7-ceac-4bb2-907e-b141d273dc4e@github.com> Message-ID: <5tKw_M8ud42YEtE-k_93YPjTbpC13BRPT20afBbInbA=.039581b3-7c53-42fa-947f-d672d2192202@github.com> On Mon, 26 Oct 2020 11:26:40 GMT, Richard Reingruber wrote: > Hi Erik, the last commit in https://github.com/reinrich/jdk/commits/pr-832-with-better-encapsulation would be the refactoring I would like to do. It removes the code not compliant with concurrent thread stack processing from VM_GetOrSetLocal::doit_prologue(). Instead EscapeBarrier::deoptimize_objects(int d1, int d2) is called.
You added already a KeepStackGCProcessedMark to that method and I changed it to accept a range [d1, d2] of frames do the object deoptimization for. > > I'm not sure how to handle this from a process point of view. Can the refactoring be done within this change? Should a new item or subtask be created for it. I'd be glad if you could give an advice on that. > > Thanks, Richard. If you are okay with it, I can add your refactorings into this change, and add you as a co-author of the change. Sounds good? Thanks, ------------- PR: https://git.openjdk.java.net/jdk/pull/832 From redestad at openjdk.java.net Mon Oct 26 14:58:20 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Mon, 26 Oct 2020 14:58:20 GMT Subject: RFR: 8255397: x86: coalesce reference and int entry points into vtos bytecodes Message-ID: On x86 - both 32- and 64-bit - the code laid out for transitioning into a vtos bytecode when having a reference and int top-of-stack state is semantically identical, and can be coalesced. This patch removes a short jump in some cases which is marginally beneficial when interpreting, while measurably reducing overhead of generating the interpreter itself.
------------- Commit messages: - x86: coalesce some ptr and int entry points Changes: https://git.openjdk.java.net/jdk/pull/865/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=865&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255397 Stats: 4 lines in 1 file changed: 0 ins; 3 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/865.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/865/head:pull/865 PR: https://git.openjdk.java.net/jdk/pull/865 From rrich at openjdk.java.net Mon Oct 26 15:22:13 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Mon, 26 Oct 2020 15:22:13 GMT Subject: RFR: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing In-Reply-To: <5tKw_M8ud42YEtE-k_93YPjTbpC13BRPT20afBbInbA=.039581b3-7c53-42fa-947f-d672d2192202@github.com> References: <_J8bYqJcDa-5BvnEtZkdc3zIY21IfgEXTYvSSWy7znY=.074f43c7-ceac-4bb2-907e-b141d273dc4e@github.com> <5tKw_M8ud42YEtE-k_93YPjTbpC13BRPT20afBbInbA=.039581b3-7c53-42fa-947f-d672d2192202@github.com> Message-ID: On Mon, 26 Oct 2020 14:46:24 GMT, Erik Österlund wrote: >> Hi Erik, the last commit in https://github.com/reinrich/jdk/commits/pr-832-with-better-encapsulation would be the refactoring I would like to do. It removes the code not compliant with concurrent thread stack processing from VM_GetOrSetLocal::doit_prologue(). Instead EscapeBarrier::deoptimize_objects(int d1, int d2) is called. You added already a KeepStackGCProcessedMark to that method and I changed it to accept a range [d1, d2] of frames do the object deoptimization for. >> >> I'm not sure how to handle this from a process point of view. Can the refactoring be done within this change? Should a new item or subtask be created for it. I'd be glad if you could give an advice on that. >> >> Thanks, Richard. > >> Hi Erik, the last commit in https://github.com/reinrich/jdk/commits/pr-832-with-better-encapsulation would be the refactoring I would like to do.
It removes the code not compliant with concurrent thread stack processing from VM_GetOrSetLocal::doit_prologue(). Instead EscapeBarrier::deoptimize_objects(int d1, int d2) is called. You added already a KeepStackGCProcessedMark to that method and I changed it to accept a range [d1, d2] of frames do the object deoptimization for. >> >> I'm not sure how to handle this from a process point of view. Can the refactoring be done within this change? Should a new item or subtask be created for it. I'd be glad if you could give an advice on that. >> >> Thanks, Richard. > > If you are okay with it, I can add your refactorings into this change, and add you as a co-author of the change. Sounds good? > > Thanks, It does sound good indeed to me if you don't mind doing that. Thanks! I have run the tests dedicated to EscapeBarriers with ZGC enabled and also the DeoptimizeObjectsALot stress testing. I will run some more serviceability tests and my teams CI testing until tomorrow. ------------- PR: https://git.openjdk.java.net/jdk/pull/832 From pliden at openjdk.java.net Mon Oct 26 15:23:12 2020 From: pliden at openjdk.java.net (Per Liden) Date: Mon, 26 Oct 2020 15:23:12 GMT Subject: RFR: 8237363: Remove automatic is in heap verification in OopIterateClosure In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 07:30:16 GMT, Stefan Karlsson wrote: > There's verification code in the "oop iterate" framework that asserts that a pointer is "is in the heap". This works for most GCs, but ZGC *can* eagerly decommit the old relocation set pages, which means that pointers to the old / from copy of the object could point to memory that is currently not a part of the current heap. > > To combat this in the past I've added a way for oop iterate closures to turn off this verification. However, every single time we add a new closure we have to consider if we can allow this verification check or if we have to remove it. 
Personally, I think this is a false abstraction and also widens the oop iterate closure interface. I previously proposed a patch that moved the verification code down into the oop iterate closures. It wasn't a huge patch, but I got push-back that it was convenient for other GCs to get this automatic verification, and the review stalled. > > In this new patch I propose a different way to retain the verification. The realization is that most oop iterate closures have to deal with both compressed and non-compressed oops, so the code typically looks like this: > > template <class T> > inline void G1ScanCardClosure::do_oop_work(T* p) { > T o = RawAccess<>::oop_load(p); > if (CompressedOops::is_null(o)) { > return; > } > oop obj = CompressedOops::decode_not_null(o); > > Therefore the suggested new place to put the is_in verification is in the CompressedOops::decode*. This injects the assert into almost all non-ZGC closures, and also to places that don't use the oop iterate closure framework. I think this is a neat workaround, and hope this patch is accepted this time. > > I tested this patch a few weeks ago, but will rerun the relevant tiers. Marked as reviewed by pliden (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/797 From gziemski at openjdk.java.net Mon Oct 26 15:35:14 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Mon, 26 Oct 2020 15:35:14 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 04:33:03 GMT, David Holmes wrote: > Hi Gerard, > > I think we have a fundamental problem here that UseOSErrorReporting was only ever intended for use on Windows. It simply allows VMError::report_and_die to return instead of actually making the VM "die". For Windows this means we can continue to propagate the Windows exception and thus allow Windows Error Reporting (WER) to take over. Whether this actually works correctly or not is a different matter.
> > For non-Windows there is no pre-established alternative code path for report_and_die() returning. > > In the bug report you write: > > > On Mac/Linux it would look more like this: > > #1 catch signal in our handler > > #2 generate hs_err log > > #3 turn off our signal handler > > #4 continue the process normally, allowing it to crash again in the same spot, with the same signal being generated > > To me you are now inventing what UseOSErrorReporting should mean on non-Windows, and I don't agree with it. I don't think it should mean that we re-crash using the "default" signal response and consider that as using "OS error reporting". To me that is just not valid, especially when we cannot return from a signal handling context in many cases without incurring undefined behaviour. To me #4 is not a valid expectation as we have no way to know what will happen next if the signal handler returns. It would also be wrong to just continue execution after an assertion or guarantee fails. > > I'm assuming that the motivation here is that on macOS if we use the default signal handling modes then macOS will do its own error reporting? If so I would suggest that the right response may be to return from report_and_die (on macOS only) and then deliberately crash after restoring the default handler. Obviously that will change which "crash" the OS reports but that is likely to happen anyway as you cannot guarantee how you will crash after trying to continue (and this goes beyond our general "best effort" approaches in signal handling.) > > Beyond that I share Thomas's concerns about making sweeping changes to installed signal handlers. > > So my preferred approaches here would be: > > 1. Make UseOSErrorReporting Windows only; or > 2. Make UseOSErrorReporting Windows and macOS only. Then on macOS do a targeted crash after report_and_die() returns. hi David, Many thanks for the review and finding the background info on the history of this issue. 
How we do things when a user turns ON the "UseOSErrorReporting" flag is just an implementation detail. On Windows we forward the crash to the OS to handle it, but just because in this fix we "just" turn off our signal handlers, reset them to SIG_DFL and return to let it crash again, does not mean it's not a meaningful way to forward it to OS, if that's how the OS wants it - please see this technical note from Apple https://developer.apple.com/forums/thread/113742 where Apple suggest the way to let the macOS handle the crash is to: "unregister your signal handler (set it to SIG_DFL) and then return. This will cause the crashed process to continue execution, crash again, and generate a crash report via the Apple crash reporter." That's how Apple suggest we do it for Mac. I can limit the scope of this fix to just macOS here, like I was planning it for JDK-8237727 and worry about Linux in a different issue. ------------- PR: https://git.openjdk.java.net/jdk/pull/813 From stefank at openjdk.java.net Mon Oct 26 17:30:23 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 26 Oct 2020 17:30:23 GMT Subject: Integrated: 8237363: Remove automatic is in heap verification in OopIterateClosure In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 07:30:16 GMT, Stefan Karlsson wrote: > There's verification code in the "oop iterate" framework that asserts that a pointer is "is in the heap". This works for most GCs, but ZGC *can* eagerly decommit the old relocation set pages, which means that pointers to the old / from copy of the object could point to memory that is currently not a part of the current heap. > > To combat this in the past I've added a way for oop iterate closures to turn off this verification. However, every single time we add a new closure we have to consider if we can allow this verification check or if we have to remove it. Personally, I think this is a false abstraction and also widens the oop iterate closure interface. 
I previously proposed a patch that moved the verification code down into the oop iterate closures. It wasn't a huge patch, but I got push-back that it was convenient for other GCs to get this automatic verification, and the review stalled. > > In this new patch I propose a different way to retain the verification. The realization is that most oop iterate closures have to deal with both compressed and non-compressed oops, so the code typically looks like this: > > template <class T> > inline void G1ScanCardClosure::do_oop_work(T* p) { > T o = RawAccess<>::oop_load(p); > if (CompressedOops::is_null(o)) { > return; > } > oop obj = CompressedOops::decode_not_null(o); > > Therefore the suggested new place to put the is_in verification is in the CompressedOops::decode*. This injects the assert into almost all non-ZGC closures, and also to places that don't use the oop iterate closure framework. I think this is a neat workaround, and hope this patch is accepted this time. > > I tested this patch a few weeks ago, but will rerun the relevant tiers. This pull request has now been integrated. Changeset: 6666dcbe Author: Stefan Karlsson URL: https://git.openjdk.java.net/jdk/commit/6666dcbe Stats: 117 lines in 17 files changed: 25 ins; 83 del; 9 mod 8237363: Remove automatic is in heap verification in OopIterateClosure Reviewed-by: eosterlund, pliden ------------- PR: https://git.openjdk.java.net/jdk/pull/797 From hseigel at openjdk.java.net Mon Oct 26 18:13:29 2020 From: hseigel at openjdk.java.net (Harold Seigel) Date: Mon, 26 Oct 2020 18:13:29 GMT Subject: RFR: 8238263: Create at-requires mechanism for containers [v2] In-Reply-To: References: Message-ID: > Please review this change to add an @requires mechanism called "jdk.containerized" to help mark tests that are incompatible with containers. Users would add "@requires jdk.containerized != true" to the incompatible tests and then use "make test ... OPTIONS=-Djdk.containerized=true" or "bash jib.sh mach5 -- remote-build-and-test ...
--test-make-args JTREG=OPTIONS=-Djdk.containerized=true" to exclude those tests when testing with containers. Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: 8238263: Create at-requires mechanism for containers ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/844/files - new: https://git.openjdk.java.net/jdk/pull/844/files/9ffe9ad7..7a5ae052 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=844&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=844&range=00-01 Stats: 6 lines in 1 file changed: 6 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/844.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/844/head:pull/844 PR: https://git.openjdk.java.net/jdk/pull/844 From hseigel at openjdk.java.net Mon Oct 26 18:17:16 2020 From: hseigel at openjdk.java.net (Harold Seigel) Date: Mon, 26 Oct 2020 18:17:16 GMT Subject: RFR: 8238263: Create at-requires mechanism for containers In-Reply-To: References: <43YvmzXR5TzDgPqfcAJDyfpOd7f_ur6ibKfh7AGf4LU=.40b88d2c-b29b-49d4-8d7f-120b7ade6cc5@github.com> Message-ID: On Mon, 26 Oct 2020 14:25:14 GMT, Igor Ignatyev wrote: >> Defining an environment variable works when running JTReg from the command line. But, mach5 does not pass environment variable settings to its JTReg test runs. Some mach5 special command args would still be needed. > >> Defining an environment variable works when running JTReg from the command line. But, mach5 does not pass environment variable settings to its JTReg test runs. Some mach5 special command args would still be needed. > > right, yet given you also need to explicitly say mach5 that you want to run testing within docker, that's not a huge problem. this is assuming we default env. variable to `false`, `make` propagates this env. variable to `jtreg` and `jtreg` propagates it to the JVM which runs `VMProps` class. 
Please review this updated webrev that adds environment variable TEST_JDK_CONTAINERIZED for setting @requires jdk.containerized. When TEST_JDK_CONTAINERIZED is unset, jdk.containerized is false. ------------- PR: https://git.openjdk.java.net/jdk/pull/844 From shade at openjdk.java.net Mon Oct 26 19:07:32 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 26 Oct 2020 19:07:32 GMT Subject: RFR: 8255389: ConcurrentHashTable::NoOp omits return in non-void return method [v2] In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 13:00:49 GMT, Kim Barrett wrote: >> This fix is fine by me. >> Kim's suggestions seems more like enhancements. > >> This fix is fine by me. >> Kim's suggestions seems more like enhancements. > > So is the proposed change, since nothing seems to actually be broken currently. If we're going to touch it at all, let's touch it once and be done with it. @kimbarrett, does new version look better to you? ------------- PR: https://git.openjdk.java.net/jdk/pull/863 From shade at openjdk.java.net Mon Oct 26 19:07:32 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 26 Oct 2020 19:07:32 GMT Subject: RFR: 8255389: ConcurrentHashTable::NoOp omits return in non-void return method [v2] In-Reply-To: References: Message-ID: > Static analysis complains there is a non-void return method without a return statement: > > struct NoOp { > void operator()(VALUE*) {} > const VALUE& operator()() {} // <--- here > void operator()(bool, VALUE*) {} > } noOp; > > AFAICS, this is UB, and we have seen cases like these break compilers in other places. Not in this case, though, because `noOp` is only used as the default functor in `remove`, which does not use this getter-like definition. Still, it would be good to remove that risky definition, so that it is not used accidentally. 
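The risk described above can be avoided by giving the default functor only the overload that `remove` actually calls. The following is a minimal sketch under stated assumptions: `TinyTable`, its vector storage, and `count_removed` are hypothetical stand-ins for illustration and are not the ConcurrentHashTable code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a table of VALUEs (not HotSpot code).
template <typename VALUE>
class TinyTable {
  std::vector<VALUE> _values;

  // Removes the first element equal to 'v', calling del_f on it before erasing.
  template <typename DELETE_FUNC>
  bool internal_remove(const VALUE& v, DELETE_FUNC& del_f) {
    for (std::size_t i = 0; i < _values.size(); i++) {
      if (_values[i] == v) {
        del_f(&_values[i]);
        _values.erase(_values.begin() + i);
        return true;
      }
    }
    return false;
  }

 public:
  void insert(const VALUE& v) { _values.push_back(v); }

  // Variant where the caller supplies the delete functor.
  template <typename DELETE_FUNC>
  bool remove(const VALUE& v, DELETE_FUNC& del_f) {
    return internal_remove(v, del_f);
  }

  // Variant without a delete functor: the local no-op type declares only the
  // void overload, so there is no value-returning operator without a return.
  bool remove(const VALUE& v) {
    struct {
      void operator()(VALUE*) {}
    } ignore_del_f;
    return internal_remove(v, ignore_del_f);
  }
};

// Small driver: counts how many of two removals succeed.
inline int count_removed() {
  TinyTable<int> t;
  t.insert(1);
  t.insert(2);
  t.insert(3);
  int removed = 0;
  if (t.remove(2)) removed++;  // present in the table
  if (t.remove(7)) removed++;  // absent from the table
  return removed;
}
```

Because the no-op type is local to `remove`, it cannot be picked up accidentally by code that expects a getter-like functor.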
Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Implement suggestions from review ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/863/files - new: https://git.openjdk.java.net/jdk/pull/863/files/e1c6727d..1154fea6 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=863&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=863&range=00-01 Stats: 9 lines in 1 file changed: 1 ins; 4 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/863.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/863/head:pull/863 PR: https://git.openjdk.java.net/jdk/pull/863 From github.com+12972156+pmur at openjdk.java.net Mon Oct 26 19:47:21 2020 From: github.com+12972156+pmur at openjdk.java.net (Paul Murphy) Date: Mon, 26 Oct 2020 19:47:21 GMT Subject: RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v8] In-Reply-To: <_JR-e3ZsRFwvZCR7ws34z5jLjp2kJQ1bu4gyl0RG1XU=.ec3040cf-8147-4dcd-b87d-4fd9be4eb59e@github.com> References: <_JR-e3ZsRFwvZCR7ws34z5jLjp2kJQ1bu4gyl0RG1XU=.ec3040cf-8147-4dcd-b87d-4fd9be4eb59e@github.com> Message-ID: <-7PHVafzbyMukuWngsX5bdLvJPubN2KzjMWM2lrQnCs=.a278b608-3e1c-4126-9791-efe18a5d8d5e@github.com> On Thu, 22 Oct 2020 22:06:11 GMT, CoreyAshford wrote: >> src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3878: >> >>> 3876: // | Element | | | | | | | | | >>> 3877: // +===============+=============+======================+======================+=============+=============+======================+======================+=============+ >>> 3878: // | after vaddubm | 00||b0:0..5 | 00||b0:6..7||b1:0..3 | 00||b1:4..7||b2:0..1 | 00||b2:2..7 | 00||b3:0..5 | 00||b3:6..7||b4:0..3 | 00||b4:4..7||b5:0..1 | 00||b5:2..7 | >> >> An extra line here showing how the 8 6-bit values above get mapping into 6 bytes greatly help my brain out. (likewise for the > Just to make sure I understand, you're not asking for a change here, is that right? 
I think the first line should also express the initial layout of the 6-bit values similar to the linked algo. I think changing this comment to add an extra line that describes the bits as they leave `vaddubm` would be helpful to understand the demangling here. (e.g. the `00aaaaaa 00bbbbbb 00cccccc 00dddddd` comments in the linked paper) >> src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3884: >> >>> 3882: // | vec_0x3fs | 00111111 | 00111111 | 00111111 | 00111111 | 00111111 | 00111111 | 00111111 | 00111111 | >>> 3883: // +---------------+-------------+----------------------+----------------------+-------------+-------------+----------------------+----------------------+-------------+ >>> 3884: // | after vpextd | b5:0..7 | b4:0..7 | b3:0..7 | b2:0..7 | b1:0..7 | b0:0..7 | 00000000 | 00000000 | >> >> Are these comments correct or am I misunderstanding this? I read the final result as something starting as `b5:2..7 || b4:4..7|| b5:0..1` from vpextd. > > Because the bytes are displayed e15..e8, instead of the other way around, it's hard to follow. As an example, consider just the last four bytes of the table, but displayed in the reverse order: > > 00||b0:0..5 00||b0:6..7||b1:0..3 00||b1:4..7||b2:0..1 00||b2:2..7 > > After vpextd with bit select pattern 00111111 for all bytes: > > b0:0..5||b0:6..7 b1:0..3||b1:4..7 b2:0..1||b2:2..7 > = > b0:0..7 b1:0..7 b2:0..7 > > Should I reverse the order of this table with a comment at the top, to explain the reason for the reversal? It seems like a good idea. Since you are operating on doublewords here, expressing this as operations on a doubleword instead of bytes would be more intuitive here. I think the lane mappings for little endian are what throw me off.
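The doubleword view suggested above can be modelled in scalar code. The following is a minimal sketch, assuming a portable software emulation of a 64-bit parallel bits extract; the helper names `pext64` and `pack_sextets` are hypothetical, and the sketch ignores vector lane ordering, so it is not the PPC stub code. With the select pattern 00111111 in every byte, the low six bits of each of the eight bytes are gathered into 48 contiguous bits.

```cpp
#include <cstdint>

// Software model of a parallel bits extract: for every set bit in 'mask',
// scanning from the least significant bit, copy the corresponding bit of
// 'src' into the next free low-order bit of the result.
inline uint64_t pext64(uint64_t src, uint64_t mask) {
  uint64_t result = 0;
  int out = 0;
  for (int bit = 0; bit < 64; bit++) {
    if ((mask >> bit) & 1) {
      result |= ((src >> bit) & 1) << out;
      out++;
    }
  }
  return result;
}

// Packs eight bytes of the form 00xxxxxx into 48 contiguous bits.
inline uint64_t pack_sextets(uint64_t decoded) {
  return pext64(decoded, 0x3f3f3f3f3f3f3f3fULL);
}
```

For example, the Base64 text "TWFu" maps to the 6-bit values 0x13, 0x16, 0x05, 0x2e; placing them in the low four bytes of a doubleword and calling `pack_sextets` yields the bytes of "Man" in the low 24 bits.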
------------- PR: https://git.openjdk.java.net/jdk/pull/293 From akozlov at openjdk.java.net Mon Oct 26 20:05:19 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 26 Oct 2020 20:05:19 GMT Subject: RFR: 8255254: Split os::reserve_memory and os::map_memory_to_file interfaces In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 08:23:12 GMT, Stefan Karlsson wrote: >> Hi, >> >> Please review a change to extract map_memory_to_file interface out of reserve_memory when the latter takes file descriptor. >> >> The change should be a pure refactoring without changes in functionality. The only part is disturbing: a comment in original os_posix.cpp:316 seems to refer to else clause and it contradicts to the actual code. > > src/hotspot/os/posix/os_posix.cpp line 362: > >> 360: // After we have an aligned address, we can replace anonymous mapping with file mapping >> 361: if (replace_existing_mapping_with_file_mapping(aligned_base, size, file_desc) == NULL) { >> 362: vm_exit_during_initialization(err_msg("Error in mapping Java heap at the given filesystem directory")); > > There shouldn't be a need to use err_msg for plain strings. Agree, I didn't pay attention while I was moving lines. Unfortunately, there are more cases for exactly this line https://github.com/openjdk/jdk/search?q=%22Error+in+mapping+Java+heap%22 I think it would be interesting to find and eliminate all of them at once. I've filed https://bugs.openjdk.java.net/browse/JDK-8255416 > src/hotspot/share/runtime/os.cpp line 1742: > >> 1740: } >> 1741: return result; >> 1742: } > > It's a bit unfortunate that the two functions behave differently w.r.t. NMT. They have the same name, but only one reports to NMT. I'd personally like to transition the code base so that all os::_memory functions report to NMT, and if you don't want that you'd use pd_ or internal functions. I think there's a pre-existing bug where a call to os::map_memory_to_file misses the NMT reporting.
This is a recurring problem in the os:: layer. Since this is already problematic before your changes, we can investigate this as a separate JBS bug. Nice catch. Agree, this would be out of scope of this patch. I've filed https://bugs.openjdk.java.net/browse/JDK-8255414 ------------- PR: https://git.openjdk.java.net/jdk/pull/812 From akozlov at openjdk.java.net Mon Oct 26 20:16:32 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 26 Oct 2020 20:16:32 GMT Subject: RFR: 8255254: Split os::reserve_memory and os::map_memory_to_file interfaces [v2] In-Reply-To: References: Message-ID: <8OBEzNqpmYnnaGIXG32ksZn2FZGEmyZEd2hZmdutRa8=.c360dd90-b3df-4040-8013-5541b90406b8@github.com> > Hi, > > Please review a change to extract map_memory_to_file interface out of reserve_memory when the latter takes file descriptor. > > The change should be a pure refactoring without changes in functionality. The only part is disturbing: a comment in original os_posix.cpp:316 seems to refer to else clause and it contradicts to the actual code. 
Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: Fix review findings ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/812/files - new: https://git.openjdk.java.net/jdk/pull/812/files/f9423ddd..df6fb834 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=812&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=812&range=00-01 Stats: 7 lines in 2 files changed: 3 ins; 3 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/812.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/812/head:pull/812 PR: https://git.openjdk.java.net/jdk/pull/812 From akozlov at openjdk.java.net Mon Oct 26 20:16:33 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 26 Oct 2020 20:16:33 GMT Subject: RFR: 8255254: Split os::reserve_memory and os::map_memory_to_file interfaces [v2] In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 08:18:59 GMT, Stefan Karlsson wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix review findings > > src/hotspot/os/posix/os_posix.cpp line 343: > >> 341: // mmap but it also may System V shared memory which cannot be uncommitted as a whole, so >> 342: // chopping off and unmapping excess bits back and front (see below) would not work. >> 343: char* extra_base = os::reserve_memory(extra_size); > > It seems like this belonged to the fd != -1 clause. It previously talked about the problems of using os::reserve_memory, and then used reserve_mmapped_memory instead in the fd != -1 case. It's not obvious to me that this comment belongs here. Makes sense, thanks. 
I've restored the comment in the later version ------------- PR: https://git.openjdk.java.net/jdk/pull/812 From bobv at openjdk.java.net Mon Oct 26 20:24:20 2020 From: bobv at openjdk.java.net (Bob Vandette) Date: Mon, 26 Oct 2020 20:24:20 GMT Subject: RFR: 8238263: Create at-requires mechanism for containers [v2] In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 18:13:29 GMT, Harold Seigel wrote: >> Please review this change to add an @requires mechanism called "jdk.containerized" to help mark tests that are incompatible with containers. Users would add "@requires jdk.containerized != true" to the incompatible tests and then use "make test ... OPTIONS=-Djdk.containerized=true" or "bash jib.sh mach5 -- remote-build-and-test ... --test-make-args JTREG=OPTIONS=-Djdk.containerized=true" to exclude those tests when testing with containers. > > Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: > > 8238263: Create at-requires mechanism for containers Marked as reviewed by bobv (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/844 From rehn at openjdk.java.net Mon Oct 26 20:26:21 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 26 Oct 2020 20:26:21 GMT Subject: RFR: 8255389: ConcurrentHashTable::NoOp omits return in non-void return method [v2] In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 19:04:44 GMT, Aleksey Shipilev wrote: >>> This fix is fine by me. >>> Kim's suggestions seems more like enhancements. >> >> So is the proposed change, since nothing seems to actually be broken currently. If we're going to touch it at all, let's touch it once and be done with it. > > @kimbarrett, does new version look better to you? You can even do something like: // Same without DELETE_FUNC. 
template <typename LOOKUP_FUNC> bool remove(Thread* thread, LOOKUP_FUNC& lookup_f) { struct { void operator()(VALUE*) {} } ignore_del_f; return internal_remove(thread, lookup_f, ignore_del_f); } Which I think @kimbarrett hinted at. ------------- PR: https://git.openjdk.java.net/jdk/pull/863 From iignatyev at openjdk.java.net Mon Oct 26 20:27:20 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Mon, 26 Oct 2020 20:27:20 GMT Subject: RFR: 8238263: Create at-requires mechanism for containers [v2] In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 18:13:29 GMT, Harold Seigel wrote: >> Please review this change to add an @requires mechanism called "jdk.containerized" to help mark tests that are incompatible with containers. Users would add "@requires jdk.containerized != true" to the incompatible tests and then use "make test ... OPTIONS=-Djdk.containerized=true" or "bash jib.sh mach5 -- remote-build-and-test ... --test-make-args JTREG=OPTIONS=-Djdk.containerized=true" to exclude those tests when testing with containers. > > Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: > > 8238263: Create at-requires mechanism for containers LGTM, modulo my earlier comment/doubt about the name, but I don't think it's important enough ------------- Marked as reviewed by iignatyev (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/844 From hseigel at openjdk.java.net Mon Oct 26 20:37:20 2020 From: hseigel at openjdk.java.net (Harold Seigel) Date: Mon, 26 Oct 2020 20:37:20 GMT Subject: RFR: 8238263: Create at-requires mechanism for containers [v2] In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 20:24:35 GMT, Igor Ignatyev wrote: >> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: >> >> 8238263: Create at-requires mechanism for containers > > LGTM, modulo my earlier comment/doubt about the name, but I don't think it's important enough Thanks Bob and Igor!
------------- PR: https://git.openjdk.java.net/jdk/pull/844 From hseigel at openjdk.java.net Mon Oct 26 20:37:21 2020 From: hseigel at openjdk.java.net (Harold Seigel) Date: Mon, 26 Oct 2020 20:37:21 GMT Subject: Integrated: 8238263: Create at-requires mechanism for containers In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 18:44:54 GMT, Harold Seigel wrote: > Please review this change to add an @requires mechanism called "jdk.containerized" to help mark tests that are incompatible with containers. Users would add "@requires jdk.containerized != true" to the incompatible tests and then use "make test ... OPTIONS=-Djdk.containerized=true" or "bash jib.sh mach5 -- remote-build-and-test ... --test-make-args JTREG=OPTIONS=-Djdk.containerized=true" to exclude those tests when testing with containers. This pull request has now been integrated. Changeset: ca8bba64 Author: Harold Seigel URL: https://git.openjdk.java.net/jdk/commit/ca8bba64 Stats: 10 lines in 3 files changed: 8 ins; 0 del; 2 mod 8238263: Create at-requires mechanism for containers Reviewed-by: bobv, iignatyev ------------- PR: https://git.openjdk.java.net/jdk/pull/844 From rkennke at openjdk.java.net Mon Oct 26 20:58:25 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 26 Oct 2020 20:58:25 GMT Subject: RFR: 8255401: Shenandoah: Allow oldval and newval registers to overlap in cmpxchg_oop() Message-ID: We encountered a failure in testing: Internal Error (/home/jenkins/workspace/nightly/jdk-jdk/src/hotspot/share/asm/register.hpp:141), pid=15470, tid=15611 assert(a != b && a != c && a != d && b != c && b != d && c != d) failed: registers must be different: a=0x0000000000000000, b=0x0000000000000000, c=0x000000000000000b, d=0x000000000000000a in: Stack: [0x00007fb8fa2e3000,0x00007fb8fa3e4000], sp=0x00007fb8fa3deca0, free space=1007k Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x156890e] 
ShenandoahBarrierSetAssembler::cmpxchg_oop(MacroAssembler*, RegisterImpl*, Address, RegisterImpl*, RegisterImpl*, bool, RegisterImpl*, RegisterImpl*)+0xde V [libjvm.so+0x3ec1d1] compareAndSwapN_shenandoahNode::emit(CodeBuffer&, PhaseRegAlloc*) const+0x571 It seems to appear very rarely. The failure is that both newval and oldval are the same register (rax). I believe it is ok for the two registers to overlap: - It is not expected that newval is preserved across the cmpxchg - The CAS will override newval, but: - The first CAS is unaffected by the overlap - The retry-loop is only entered when previous-value == old-value, and thus newval will still hold the same value For aarch64 it matters even less, because newval is never overridden. Testing: hotspot_gc_shenandoah (x86 & aarch64). ------------- Commit messages: - 8255401: Shenandoah: Allow oldval and newval registers to overlap in cmpxchg_oop() Changes: https://git.openjdk.java.net/jdk/pull/871/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=871&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255401 Stats: 4 lines in 2 files changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/871.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/871/head:pull/871 PR: https://git.openjdk.java.net/jdk/pull/871 From iklam at openjdk.java.net Mon Oct 26 22:20:28 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 26 Oct 2020 22:20:28 GMT Subject: RFR: 8255285: Move JVMFlag origins into a new enum JVMFlagOrigin [v2] In-Reply-To: References: Message-ID: > Many JVM function take an `JVMFlag::Flags` parameter to indicate the origin of the flag -- i.e., "who is setting this flag". E.g., in arguments.hpp: > > static bool parse_argument(const char* arg, JVMFlag::Flags origin); > > However, `JVMFlag::Flags` contains many other bits that are unrelated to the origin. We should add a new enum `JVMFlagOrigin` that has only the valid values for the origin. 
This makes it possible to do more type-safety checks at C++ compilation time. > > This patch also renamed the confusing bit `JVMFlag::ORIG_COMMAND_LINE` to `WAS_SET_IN_COMMAND_LINE` and added documentation, so that it won't be confused with `JVMFlagOrigin::COMMAND_LINE`. Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: Removed aliases of JVMFlagOrigin::X as JVMFlag::X ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/823/files - new: https://git.openjdk.java.net/jdk/pull/823/files/ab814837..53fed1b0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=823&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=823&range=00-01 Stats: 80 lines in 11 files changed: 0 ins; 10 del; 70 mod Patch: https://git.openjdk.java.net/jdk/pull/823.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/823/head:pull/823 PR: https://git.openjdk.java.net/jdk/pull/823 From iklam at openjdk.java.net Mon Oct 26 22:32:17 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 26 Oct 2020 22:32:17 GMT Subject: RFR: 8255285: Move JVMFlag origins into a new enum JVMFlagOrigin In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 06:33:06 GMT, Ioi Lam wrote: > Many JVM function take an `JVMFlag::Flags` parameter to indicate the origin of the flag -- i.e., "who is setting this flag". E.g., in arguments.hpp: > > static bool parse_argument(const char* arg, JVMFlag::Flags origin); > > However, `JVMFlag::Flags` contains many other bits that are unrelated to the origin. We should add a new enum `JVMFlagOrigin` that has only the valid values for the origin. This makes it possible to do more type-safety checks at C++ compilation time. > > This patch also renamed the confusing bit `JVMFlag::ORIG_COMMAND_LINE` to `WAS_SET_IN_COMMAND_LINE` and added documentation, so that it won't be confused with `JVMFlagOrigin::COMMAND_LINE`. 
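The type-safety argument above can be illustrated with a small sketch. It assumes simplified stand-ins: the enumerator values, the `FlagBits` type, and `describe_origin` are hypothetical and do not reproduce the actual jvmFlag.hpp contents. A scoped enum dedicated to origins can be declared before it is defined, and it does not convert implicitly from unrelated flag bits.

```cpp
#include <string>

// Opaque declaration: a scoped enum can be declared before its definition,
// so headers can mention the type without seeing the enumerators.
enum class JVMFlagOrigin : int;

// A function can be declared using only the declaration above.
std::string describe_origin(JVMFlagOrigin origin);

// Full definition (values are hypothetical).
enum class JVMFlagOrigin : int {
  DEFAULT = 0,
  COMMAND_LINE = 1,
  ERGONOMIC = 5
};

// Unrelated status bits live in a separate type (hypothetical value).
enum FlagBits : int {
  WAS_SET_IN_COMMAND_LINE = 1 << 17
};

std::string describe_origin(JVMFlagOrigin origin) {
  switch (origin) {
    case JVMFlagOrigin::DEFAULT:      return "default";
    case JVMFlagOrigin::COMMAND_LINE: return "command line";
    case JVMFlagOrigin::ERGONOMIC:    return "ergonomic";
  }
  return "unknown";
}

// The following would be rejected at compile time, because FlagBits does
// not convert to JVMFlagOrigin:
//   describe_origin(WAS_SET_IN_COMMAND_LINE);
```

With a single combined flags enum, the commented-out call would compile and the mistake would only show up at run time, if at all.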
> _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ > > Hi Ioi, > > On 23/10/2020 4:52 pm, Ioi Lam wrote: > > This patch also renamed the confusing bit `JVMFlag::ORIG_COMMAND_LINE` to `WAS_SET_IN_COMMAND_LINE` and added documentation, so that it won't be confused with `JVMFlagOrigin::COMMAND_LINE`. > > I'm still confused :) Why are we reporting "command line" for a flag > that was ergonomically set, or vice-versa? Surely a flag is either set > via the command-line or via ergonomics but not both ?? We have code like this: void JVMFlag::print_origin(outputStream* st, unsigned int width) const { case JVMFlagOrigin::ERGONOMIC: if (_flags & WAS_SET_IN_COMMAND_LINE) { st->print("command line, "); } st->print("ergonomic"); break; So if FLAG_SET_ERGO changes a flag that was specified in the command-line, we will print out "command line, ergonomic". > I was under the > assumption that ergonomics should not touch a flag explicitly set on the > command-line as that defeats the purpose of setting it. I have no idea why this is the case. Maybe ergonomics is allowed to "fine tune" user-specified values? Anyway, if we want to change this, we should do it in a separate RFE. > > enum class JVMFlagOrigin > > Why not define this as > > enum class Origin > > inside class JVMFlag, so that it is then referred to as JVMFlag::Origin? The reason is to allow `JVMFlagOrigin` to be used in a forward declaration without including jvmFlag.hpp. See vmEnums.hpp. A nested enum like `JVMFlag::Origin` cannot be forward-declared. > > static const JVMFlagOrigin DEFAULT = JVMFlagOrigin::DEFAULT; > > Why is this needed?? To avoid re-typing JVMFlagOrigin? 
Yeah, but I removed this in the latest version [53fed1b](https://github.com/openjdk/jdk/pull/823/commits/53fed1b00784763c873bf81475430e280f06d72c) ------------- PR: https://git.openjdk.java.net/jdk/pull/823 From david.holmes at oracle.com Mon Oct 26 22:40:59 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 27 Oct 2020 08:40:59 +1000 Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) In-Reply-To: References: Message-ID: <499d8dd3-8cd6-c5f9-29c5-e78f5edd96d9@oracle.com> On 27/10/2020 1:35 am, Gerard Ziemski wrote: > On Mon, 26 Oct 2020 04:33:03 GMT, David Holmes wrote: > >> Hi Gerard, >> >> I think we have a fundamental problem here that UseOSErrorReporting was only ever intended for use on Windows. It simply allows VMError::report_and_die to return instead of actually making the VM "die". For Windows this means we can continue to propagate the windows exception and thus allow Windows Error Reporting (WER) to take over. Whether this actually works correctly or not is a different matter. >> >> For non-Windows there is no pre-established alternative code path for report_and_die() returning. >> >> In the bug report you write: >> >>> On Mac/Linux it would look more like this: >>> #1 catch signal in our handler >>> #2 generate hs_err log >>> #3 turn off our signal handler >>> #4 continue the process normally, allowing it to crash again in the same spot, with the same signal being generated >> >> To me you are now inventing what UseOSErrorReporting should mean on non-Windows, and I don't agree with it. I don't think it should mean that we re-crash using the "default" signal response and consider that as using "OS error reporting". To me that is just not valid, especially when we cannot return from a signal handling context in many cases without incurring undefined behaviour. To me #4 is not a valid expectation as we have no way to know what will happen next if the signal handler returns. 
It would also be wrong to just continue execution after an assertion or guarantee fails. >> >> I'm assuming that the motivation here is that on macOS if we use the default signal handling modes then macOS will do its own error reporting? If so I would suggest that the right response may be to return from report_and_die (on macOS only) and then deliberately crash after restoring the default handler. Obviously that will change which "crash" the OS reports but that is likely to happen anyway as you cannot guarantee how you will crash after trying to continue (and this goes beyond our general "best effort" approaches in signal handling.) >> >> Beyond that I share Thomas's concerns about making sweeping changes to installed signal handlers. >> >> So my preferred approaches here would be: >> >> 1. Make UseOSErrorReporting Windows only; or >> 2. Make UseOSErrorReporting Windows and macOS only. Then on macOS do a targeted crash after report_and_die() returns. > > hi David, > > Many thanks for the review and finding the background info on the history of this issue. > > How we do things when a user turns ON the "UseOSErrorReporting" flag is just an implementation detail. No, there is a semantic underpinning as to what it means for there to be OS error reporting on a given platform. Windows has a nicely defined model. Other platforms not so nice. On macOS they really don't want apps to attempt any kind of crash handling on their own. :) > On Windows we forward the crash to the OS to handle it, but just because in this fix we "just" turn off our signal handlers, reset them to SIG_DFL and return to let it crash again, does not mean it's not a meaningful way to forward it to OS, if that's how the OS wants it - please see this technical note from Apple https://developer.apple.com/forums/thread/113742 where Apple suggest the way to let the macOS handle the crash is to: > > "unregister your signal handler (set it to SIG_DFL) and then return.
This will cause the crashed process to continue execution, crash again, and generate a crash report via the Apple crash reporter." > > That's how Apple suggest we do it for Mac. That is a blog by an Apple developer giving some very general advice, and IMO lacking in some necessary detail. That quote above is in the context of answering: "Finally, there's the question of how to exit from your signal handler." The suggestion to "then return" hits UB for the synchronous error signals - a fact not mentioned in the blog entry. The assertion that: "This will cause the crashed process to continue execution, crash again, ... " is a naive oversimplification. If you just seg-faulted doing a read from memory how can you continue execution? What does that mean when the read yielded no value? Will you just continue with a random value? Will the system try to re-execute the read and so crash again? Maybe it will crash again, maybe it won't. Maybe it will do something in the meantime that leads to totally unexpected behaviour (as Thomas previously described). Hence my suggestion that if you are going to attempt this path for macOS then you need to introduce the second crash so we know exactly what will happen. Returning from the original signal handler is not an option IMO. > I can limit the scope of this fix to just macOS here, like I was planning it for JDK-8237727 and worry about Linux in a different issue. Yes please limit to macOS only. We should look at how to remove the flag from platforms where it has no well-defined meaning.
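For illustration only, the "targeted crash" idea discussed above (restore the default handler, then deliberately crash, instead of returning from the original handler) might be sketched as follows. This is a minimal POSIX sketch and not HotSpot code: the helper names `restore_default_handler`, `has_default_handler` and `crash_with_default_handler` are hypothetical, and a multi-threaded program would likely use `pthread_sigmask` rather than `sigprocmask`.

```cpp
#include <cassert>
#include <cstdlib>
#include <signal.h>

// Hypothetical helper (not HotSpot code): reset `sig` to its default
// disposition so that the OS, not the application, handles the next delivery.
// Returns true on success.
bool restore_default_handler(int sig) {
  struct sigaction sa = {};
  sa.sa_handler = SIG_DFL;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = 0;
  return sigaction(sig, &sa, nullptr) == 0;
}

// Hypothetical helper: report whether `sig` currently has the default disposition.
bool has_default_handler(int sig) {
  struct sigaction current = {};
  if (sigaction(sig, nullptr, &current) != 0) {
    return false;
  }
  return current.sa_handler == SIG_DFL;
}

// Sketch of the deliberate second crash: once the error report has been
// written, restore the default disposition and raise the signal explicitly,
// rather than returning from the handler and relying on the faulting
// instruction being re-executed.
[[noreturn]] void crash_with_default_handler(int sig) {
  restore_default_handler(sig);

  // Inside a handler for `sig` the signal is normally blocked, so unblock it
  // to let raise() deliver it immediately. (Assumption: single-threaded use;
  // a multi-threaded program would use pthread_sigmask instead.)
  sigset_t set;
  sigemptyset(&set);
  sigaddset(&set, sig);
  sigprocmask(SIG_UNBLOCK, &set, nullptr);

  raise(sig);
  abort();  // Fallback in case the signal did not terminate the process.
}
```

With this shape the second crash always happens at a known point, at the cost of the OS reporting that point rather than the original fault location.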
Thanks, David ----- > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/813 > From dholmes at openjdk.java.net Mon Oct 26 22:55:32 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 26 Oct 2020 22:55:32 GMT Subject: RFR: 8255231: Avoid upcalls when initializing the statSampler [v4] In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 12:49:43 GMT, Claes Redestad wrote: >> Current implementation of the statSampler does upcalls to System.getProperty to collect values for a number of properties that are all provided by the VM itself. And since the sampling starts before any user code run then no property can have changed. >> >> I suggest refactoring the code so that no upcalls are made normally - while asserting this invariant holds using assert-only upcalls. >> >> This is a small startup optimization - reducing the startup sequence by approx. 300k instructions and 70k branches in my linux-x64 setup. > > Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 21 additional commits since the last revision: > > - Address review comments from David Holmes > - Merge branch 'master' into com_ns > - Refactor to remove stable_java_property_counters and clarify comments > - Merge branch 'master' into com_ns > - Revert unrelated changes to perfData > - Merge branch 'master' into com_ns > - Improve comments > - typo > - Missing definition > - Extract the shorthand java.version from VersionProps and use it in StatSampler > - ... and 11 more: https://git.openjdk.java.net/jdk/compare/145a3876...8572159f Thanks for making the suggested changes. I think we need a further RFE to add some error checking for the sizes of the various property strings in relation to the fixed size arrays that have been allocated to them. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/802 From kbarrett at openjdk.java.net Mon Oct 26 22:59:23 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 26 Oct 2020 22:59:23 GMT Subject: RFR: 8255389: ConcurrentHashTable::NoOp omits return in non-void return method [v2] In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 19:07:32 GMT, Aleksey Shipilev wrote: >> Static analysis complains there is a non-void return method without a return statement: >> >> struct NoOp { >> void operator()(VALUE*) {} >> const VALUE& operator()() {} // <--- here >> void operator()(bool, VALUE*) {} >> } noOp; >> >> AFAICS, this is UB, and we have seen cases like these break compilers in other places. Not in this case, though, because `noOp` is only used as the default functor in `remove`, which does not use this getter-like definition. Still, it would be good to remove that risky definition, so that it is not used accidentally. > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Implement suggestions from review Marked as reviewed by kbarrett (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/863 From kbarrett at openjdk.java.net Mon Oct 26 22:59:24 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 26 Oct 2020 22:59:24 GMT Subject: RFR: 8255389: ConcurrentHashTable::NoOp omits return in non-void return method [v2] In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 22:55:00 GMT, Kim Barrett wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Implement suggestions from review > > Marked as reviewed by kbarrett (Reviewer). > You can even do something like: > > ``` > // Same without DELETE_FUNC. 
> template <typename LOOKUP_FUNC> > bool remove(Thread* thread, LOOKUP_FUNC& lookup_f) { > struct { > void operator()(VALUE*) {} > } ignore_del_f; > return internal_remove(thread, lookup_f, ignore_del_f); > } > ``` > > Which I think @kimbarrett hinted. Something along those lines would be fine too. ------------- PR: https://git.openjdk.java.net/jdk/pull/863 From david.holmes at oracle.com Mon Oct 26 23:07:36 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 27 Oct 2020 09:07:36 +1000 Subject: RFR: 8255285: Move JVMFlag origins into a new enum JVMFlagOrigin In-Reply-To: References: Message-ID: <8663f56c-344e-da2a-893e-a318051cbcc7@oracle.com> Hi Ioi, On 27/10/2020 8:32 am, Ioi Lam wrote: > On Fri, 23 Oct 2020 06:33:06 GMT, Ioi Lam wrote: >> _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ >> >> Hi Ioi, >> >> On 23/10/2020 4:52 pm, Ioi Lam wrote: >>> This patch also renamed the confusing bit `JVMFlag::ORIG_COMMAND_LINE` to `WAS_SET_IN_COMMAND_LINE` and added documentation, so that it won't be confused with `JVMFlagOrigin::COMMAND_LINE`. >> >> I'm still confused :) Why are we reporting "command line" for a flag >> that was ergonomically set, or vice-versa? Surely a flag is either set >> via the command-line or via ergonomics but not both ?? > > We have code like this: > > void JVMFlag::print_origin(outputStream* st, unsigned int width) const { > case JVMFlagOrigin::ERGONOMIC: > if (_flags & WAS_SET_IN_COMMAND_LINE) { > st->print("command line, "); > } > st->print("ergonomic"); break; > > So if FLAG_SET_ERGO changes a flag that was specified in the command-line, we will print out "command line, ergonomic". > >> I was under the >> assumption that ergonomics should not touch a flag explicitly set on the >> command-line as that defeats the purpose of setting it. > > I have no idea why this is the case. Maybe ergonomics is allowed to "fine tune" user-specified values?
Anyway, if we want to change this, we should do it in a separate RFE. Yes separate RFE. This might be necessary/desirable but at a minimum it should be clearly documented when ergonomics can override an explicit user setting. >>> enum class JVMFlagOrigin >> >> Why not define this as >> >> enum class Origin >> >> inside class JVMFlag, so that it is then referred to as JVMFlag::Origin? > > The reason is to allow `JVMFlagOrigin` to be used in a forward declaration without including jvmFlag.hpp. See vmEnums.hpp. > > A nested enum like `JVMFlag::Origin` cannot be forward-declared. That is a pity. I'm not sure I agree with the overall approach of making enums all top-level just to minimise the number of includes needed. Code structure is more important to me than shaving a few seconds off build times. >>> static const JVMFlagOrigin DEFAULT = JVMFlagOrigin::DEFAULT; >> >> Why is this needed?? To avoid re-typing JVMFlagOrigin? > > Yeah, but I removed this in the latest version [53fed1b](https://github.com/openjdk/jdk/pull/823/commits/53fed1b00784763c873bf81475430e280f06d72c) Okay. Thanks, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/823 > From dholmes at openjdk.java.net Mon Oct 26 23:10:21 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 26 Oct 2020 23:10:21 GMT Subject: RFR: 8255285: Move JVMFlag origins into a new enum JVMFlagOrigin [v2] In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 22:20:28 GMT, Ioi Lam wrote: >> Many JVM function take an `JVMFlag::Flags` parameter to indicate the origin of the flag -- i.e., "who is setting this flag". E.g., in arguments.hpp: >> >> static bool parse_argument(const char* arg, JVMFlag::Flags origin); >> >> However, `JVMFlag::Flags` contains many other bits that are unrelated to the origin. We should add a new enum `JVMFlagOrigin` that has only the valid values for the origin. This makes it possible to do more type-safety checks at C++ compilation time. 
>> >> This patch also renamed the confusing bit `JVMFlag::ORIG_COMMAND_LINE` to `WAS_SET_IN_COMMAND_LINE` and added documentation, so that it won't be confused with `JVMFlagOrigin::COMMAND_LINE`. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Removed aliases of JVMFlagOrigin::X as JVMFlag::X Marked as reviewed by dholmes (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/823 From coleenp at openjdk.java.net Mon Oct 26 23:32:19 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 26 Oct 2020 23:32:19 GMT Subject: RFR: 8255233: InterpreterRuntime::at_unwind should be a JRT_LEAF [v2] In-Reply-To: References: Message-ID: <11LLovVJnTjVF411y0jpSbu2YuVA8U4mROM46a01MoI=.695c972f-76f8-4463-b8f1-4cd8bf18261b@github.com> On Mon, 26 Oct 2020 08:38:50 GMT, Erik Österlund wrote: >> InterpreterRuntime::at_unwind is called at the very beginning of remove_activation(), to notify concurrent stack processing that a frame is about to be unwound. It is currently a JRT_ENTRY, because it needs a last_Java_frame to see what frame is about to get unwound. >> >> However, there are special return paths used by JVMTI pop frame, that checks if the caller frame is deoptimized, then calls a special path that removes the top activation, assuming that does not enter the deopt handler. The new JRT_ENTRY makes that reasoning invalid. >> >> Therefore, we need this to be a JRT_LEAF, that sets a last Java frame, to make everyone happy. This patch performs that change. >> >> I have run tier 1-5 testing, and manually tested: >> >> while true; do make test JTREG="RETAIN=all" TEST=open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn002 TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews" ; done >> >> Before the fix it crashes ~1/15 runs with a bad oop. After the fix, it doesn't crash.
I have run it more times than my tmux buffer fits (for a day), and it does not fail any more with this fix. >> >> Unfortunately, my testing on AArch64 has been stalled for a day, so I have sent out this PR without the testing of those bits being finished. I won't push until I get the results back, of course. But I am expecting that to be fine, as there is nothing special going on there and it compiles. Will post a comment when the complete results have arrived. > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > address cast LGTM. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/828 From dholmes at openjdk.java.net Tue Oct 27 01:27:21 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 27 Oct 2020 01:27:21 GMT Subject: RFR: 8255233: InterpreterRuntime::at_unwind should be a JRT_LEAF [v2] In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 08:38:50 GMT, Erik Österlund wrote: >> InterpreterRuntime::at_unwind is called at the very beginning of remove_activation(), to notify concurrent stack processing that a frame is about to be unwound. It is currently a JRT_ENTRY, because it needs a last_Java_frame to see what frame is about to get unwound. >> >> However, there are special return paths used by JVMTI pop frame, that checks if the caller frame is deoptimized, then calls a special path that removes the top activation, assuming that does not enter the deopt handler. The new JRT_ENTRY makes that reasoning invalid. >> >> Therefore, we need this to be a JRT_LEAF, that sets a last Java frame, to make everyone happy. This patch performs that change.
>> >> I have run tier 1-5 testing, and manually tested: >> >> while true; do make test JTREG="RETAIN=all" TEST=open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn002 TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews" ; done >> >> Before the fix it crashes ~1/15 runs with a bad oop. After the fix, it doesn't crash. I have run it more times than my tmux buffer fits (for a day), and it does not fail any more with this fix. >> >> Unfortunately, my testing on AArch64 has been stalled for a day, so I have sent out this PR without the testing of those bits being finished. I won't push until I get the results back, of course. But I am expecting that to be fine, as there is nothing special going on there and it compiles. Will post a comment when the complete results have arrived. > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > address cast Seems okay. One minor requested change below. Thanks, David src/hotspot/share/interpreter/interpreterRuntime.cpp line 1177: > 1175: JRT_LEAF(void, InterpreterRuntime::at_unwind(JavaThread* thread)) > 1176: // JRT_END does an implicit safepoint check, hence we are guaranteed to block > 1177: // if this is called during a safepoint The comments are no longer valid. The implicit safepoint check came from the use of ThreadInVMfromJava as part of the JRT_ENTRY. Also it is far from obvious that StackWatermarkSet::before_unwind meets all the requirements of a JRT_LEAF method. Please assure me it is. :) ------------- Marked as reviewed by dholmes (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/828 From shade at openjdk.java.net Tue Oct 27 05:46:28 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 27 Oct 2020 05:46:28 GMT Subject: RFR: 8255389: ConcurrentHashTable::NoOp omits return in non-void return method [v3] In-Reply-To: References: Message-ID: <-DXpHGTwSf9RyMKfGAFPGDm2jEIYDzQKrEvHcrVwhBk=.b344a829-e066-4cf2-acb6-dc7af89c30a2@github.com> > Static analysis complains there is a non-void return method without a return statement: > > struct NoOp { > void operator()(VALUE*) {} > const VALUE& operator()() {} // <--- here > void operator()(bool, VALUE*) {} > } noOp; > > AFAICS, this is UB, and we have seen cases like these break compilers in other places. Not in this case, though, because `noOp` is only used as the default functor in `remove`, which does not use this getter-like definition. Still, it would be good to remove that risky definition, so that it is not used accidentally. Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Even simpler version ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/863/files - new: https://git.openjdk.java.net/jdk/pull/863/files/1154fea6..d84a461f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=863&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=863&range=01-02 Stats: 8 lines in 1 file changed: 2 ins; 5 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/863.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/863/head:pull/863 PR: https://git.openjdk.java.net/jdk/pull/863 From shade at openjdk.java.net Tue Oct 27 05:46:29 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 27 Oct 2020 05:46:29 GMT Subject: RFR: 8255389: ConcurrentHashTable::NoOp omits return in non-void return method [v2] In-Reply-To: References: Message-ID: <_18H3E51ie2We-5NSyGMCjM4vM8_ekTDUQsl2KLSqIM=.7f0b0ac6-f50d-4a55-a785-e371d85e6019@github.com> On Mon, 26 
Oct 2020 22:56:37 GMT, Kim Barrett wrote: >> Marked as reviewed by kbarrett (Reviewer). > >> You can even do something like: >> >> ``` >> // Same without DELETE_FUNC. >> template <typename LOOKUP_FUNC> >> bool remove(Thread* thread, LOOKUP_FUNC& lookup_f) { >> struct { >> void operator()(VALUE*) {} >> } ignore_del_f; >> return internal_remove(thread, lookup_f, ignore_del_f); >> } >> ``` >> >> Which I think @kimbarrett hinted. > > Something along those lines would be fine too. Okay, new version then, please take a look! ------------- PR: https://git.openjdk.java.net/jdk/pull/863 From rehn at openjdk.java.net Tue Oct 27 07:23:19 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 27 Oct 2020 07:23:19 GMT Subject: RFR: 8255389: ConcurrentHashTable::NoOp omits return in non-void return method [v2] In-Reply-To: <_18H3E51ie2We-5NSyGMCjM4vM8_ekTDUQsl2KLSqIM=.7f0b0ac6-f50d-4a55-a785-e371d85e6019@github.com> References: <_18H3E51ie2We-5NSyGMCjM4vM8_ekTDUQsl2KLSqIM=.7f0b0ac6-f50d-4a55-a785-e371d85e6019@github.com> Message-ID: On Tue, 27 Oct 2020 05:42:23 GMT, Aleksey Shipilev wrote: >>> You can even do something like: >>> >>> ``` >>> // Same without DELETE_FUNC. >>> template <typename LOOKUP_FUNC> >>> bool remove(Thread* thread, LOOKUP_FUNC& lookup_f) { >>> struct { >>> void operator()(VALUE*) {} >>> } ignore_del_f; >>> return internal_remove(thread, lookup_f, ignore_del_f); >>> } >>> ``` >>> >>> Which I think @kimbarrett hinted. >> >> Something along those lines would be fine too. > > Okay, new version then, please take a look! Still fine!
------------- PR: https://git.openjdk.java.net/jdk/pull/863 From shade at openjdk.java.net Tue Oct 27 08:23:24 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 27 Oct 2020 08:23:24 GMT Subject: Integrated: 8255389: ConcurrentHashTable::NoOp omits return in non-void return method In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 09:41:12 GMT, Aleksey Shipilev wrote: > Static analysis complains there is a non-void return method without a return statement: > > struct NoOp { > void operator()(VALUE*) {} > const VALUE& operator()() {} // <--- here > void operator()(bool, VALUE*) {} > } noOp; > > AFAICS, this is UB, and we have seen cases like these break compilers in other places. Not in this case, though, because `noOp` is only used as the default functor in `remove`, which does not use this getter-like definition. Still, it would be good to remove that risky definition, so that it is not used accidentally. This pull request has now been integrated. Changeset: dccfd2b3 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/dccfd2b3 Stats: 11 lines in 1 file changed: 3 ins; 7 del; 1 mod 8255389: ConcurrentHashTable::NoOp omits return in non-void return method Reviewed-by: kbarrett, rehn ------------- PR: https://git.openjdk.java.net/jdk/pull/863 From shade at openjdk.java.net Tue Oct 27 08:23:23 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 27 Oct 2020 08:23:23 GMT Subject: RFR: 8255389: ConcurrentHashTable::NoOp omits return in non-void return method [v2] In-Reply-To: References: <_18H3E51ie2We-5NSyGMCjM4vM8_ekTDUQsl2KLSqIM=.7f0b0ac6-f50d-4a55-a785-e371d85e6019@github.com> Message-ID: On Tue, 27 Oct 2020 07:20:09 GMT, Robbin Ehn wrote: >> Okay, new version then, please take a look! > > Still fine! Cheers @robehn and @kimbarrett. Testing is still clean, so I am pushing. 
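As a rough, self-contained illustration of the approach settled on in this thread (a local no-op functor with only the `void operator()(VALUE*)` overload, so that no value-returning overload is left without a `return`), one might write something like the sketch below. It is not the ConcurrentHashTable code: `internal_remove_value`, `remove_value` and `EqualsSeven` are hypothetical stand-ins, and the "table" is reduced to a single slot.

```cpp
#include <cassert>

// Hypothetical stand-in for the table's internal removal routine: if the
// lookup functor matches the stored value, hand it to del_f and report success.
template <typename VALUE, typename LOOKUP_FUNC, typename DELETE_FUNC>
bool internal_remove_value(VALUE* slot, LOOKUP_FUNC& lookup_f, DELETE_FUNC& del_f) {
  if (slot == nullptr || !lookup_f(*slot)) {
    return false;
  }
  del_f(slot);
  return true;
}

// Removal without a caller-supplied delete functor. The local functor only
// declares the overload that is actually needed, so there is no non-void
// operator() that could fall off its end without returning a value.
template <typename VALUE, typename LOOKUP_FUNC>
bool remove_value(VALUE* slot, LOOKUP_FUNC& lookup_f) {
  struct {
    void operator()(VALUE*) {}
  } ignore_del_f;
  return internal_remove_value(slot, lookup_f, ignore_del_f);
}

// Example lookup functor used only for this illustration.
struct EqualsSeven {
  bool operator()(const int& v) const { return v == 7; }
};
```

Passing the unnamed local type as a template argument relies on C++11 or later.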
------------- PR: https://git.openjdk.java.net/jdk/pull/863 From stefank at openjdk.java.net Tue Oct 27 08:47:20 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 27 Oct 2020 08:47:20 GMT Subject: RFR: 8255254: Split os::reserve_memory and os::map_memory_to_file interfaces [v2] In-Reply-To: <8OBEzNqpmYnnaGIXG32ksZn2FZGEmyZEd2hZmdutRa8=.c360dd90-b3df-4040-8013-5541b90406b8@github.com> References: <8OBEzNqpmYnnaGIXG32ksZn2FZGEmyZEd2hZmdutRa8=.c360dd90-b3df-4040-8013-5541b90406b8@github.com> Message-ID: On Mon, 26 Oct 2020 20:16:32 GMT, Anton Kozlov wrote: >> Hi, >> >> Please review a change to extract map_memory_to_file interface out of reserve_memory when the latter takes file descriptor. >> >> The change should be a pure refactoring without changes in functionality. The only part is disturbing: a comment in original os_posix.cpp:316 seems to refer to else clause and it contradicts to the actual code. > > Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Fix review findings Marked as reviewed by stefank (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/812 From tschatzl at openjdk.java.net Tue Oct 27 09:42:31 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 27 Oct 2020 09:42:31 GMT Subject: RFR: 8255298: Remove SurvivorAlignmentInBytes functionality [v3] In-Reply-To: References: Message-ID: > Hi all, > > can I have reviews to remove the SurvivorAlignmentInBytes functionality? It has not been in use for a long time if ever, and can be removed. Searching the web also indicates that apart from the usual lists of all options and CRs it is never mentioned. > > SurvivorAlignmentInBytes is an experimental option so no further process is required. 
> > Testing: tier1-5 > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: kbarrett review ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/838/files - new: https://git.openjdk.java.net/jdk/pull/838/files/878cac21..fa947b79 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=838&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=838&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/838.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/838/head:pull/838 PR: https://git.openjdk.java.net/jdk/pull/838 From tschatzl at openjdk.java.net Tue Oct 27 09:48:15 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 27 Oct 2020 09:48:15 GMT Subject: RFR: 8255298: Remove SurvivorAlignmentInBytes functionality [v3] In-Reply-To: <4f0U-SmMRo72_dOV8EbLqnship2hTGMhfkcX9EszAXY=.873d3a45-825b-4e7e-b4f3-f1663aa3eec8@github.com> References: <4f0U-SmMRo72_dOV8EbLqnship2hTGMhfkcX9EszAXY=.873d3a45-825b-4e7e-b4f3-f1663aa3eec8@github.com> Message-ID: On Mon, 26 Oct 2020 07:42:20 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> kbarrett review > > Marked as reviewed by ayang (Author). @kimbarrett : fixed in new revision. Thanks! 
------------- PR: https://git.openjdk.java.net/jdk/pull/838 From burban at openjdk.java.net Tue Oct 27 10:08:20 2020 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Tue, 27 Oct 2020 10:08:20 GMT Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build [v4] In-Reply-To: <0kVzFMOZKbbvPLUlyE-VbpYSC5omD-nZoOqxBRt4s8s=.450fc58e-3aa5-401a-bce8-953a52892b87@github.com> References: <0kVzFMOZKbbvPLUlyE-VbpYSC5omD-nZoOqxBRt4s8s=.450fc58e-3aa5-401a-bce8-953a52892b87@github.com> Message-ID: <20qZgoXKo027_hRREaBgTilt_YjIgm96eRdUDyvXbuQ=.2ac1bd03-de20-46af-b6e8-a0cefea17e7d@github.com> On Sun, 18 Oct 2020 09:07:17 GMT, Magnus Ihse Bursie wrote: >> Bernhard Urban-Forster has updated the pull request incrementally with two additional commits since the last revision: >> >> - uppercase suffix >> - add assert > > Build changes look fine now. @theRealAph does the PR look okay to you now? ------------- PR: https://git.openjdk.java.net/jdk/pull/530 From redestad at openjdk.java.net Tue Oct 27 10:38:27 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Tue, 27 Oct 2020 10:38:27 GMT Subject: RFR: 8255231: Avoid upcalls when initializing the statSampler [v4] In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 22:52:39 GMT, David Holmes wrote: >> Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 21 additional commits since the last revision: >> >> - Address review comments from David Holmes >> - Merge branch 'master' into com_ns >> - Refactor to remove stable_java_property_counters and clarify comments >> - Merge branch 'master' into com_ns >> - Revert unrelated changes to perfData >> - Merge branch 'master' into com_ns >> - Improve comments >> - typo >> - Missing definition >> - Extract the shorthand java.version from VersionProps and use it in StatSampler >> - ... 
and 11 more: https://git.openjdk.java.net/jdk/compare/94ef2dbc...8572159f > > Thanks for making the suggested changes. > > I think we need a further RFE to add some error checking for the sizes of the various property strings in relation to the fixed size arrays that have been allocated to them. > > Thanks, > David @dholmes @iklam: thanks for reviewing! ------------- PR: https://git.openjdk.java.net/jdk/pull/802 From redestad at openjdk.java.net Tue Oct 27 10:38:28 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Tue, 27 Oct 2020 10:38:28 GMT Subject: Integrated: 8255231: Avoid upcalls when initializing the statSampler In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 10:11:31 GMT, Claes Redestad wrote: > Current implementation of the statSampler does upcalls to System.getProperty to collect values for a number of properties that are all provided by the VM itself. And since the sampling starts before any user code run then no property can have changed. > > I suggest refactoring the code so that no upcalls are made normally - while asserting this invariant holds using assert-only upcalls. > > This is a small startup optimization - reducing the startup sequence by approx. 300k instructions and 70k branches in my linux-x64 setup. This pull request has now been integrated. 
Changeset: f7c59c66 Author: Claes Redestad URL: https://git.openjdk.java.net/jdk/commit/f7c59c66 Stats: 136 lines in 7 files changed: 51 ins; 41 del; 44 mod 8255231: Avoid upcalls when initializing the statSampler Reviewed-by: iklam, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/802 From shade at openjdk.java.net Tue Oct 27 11:30:30 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 27 Oct 2020 11:30:30 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v11] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: On Wed, 21 Oct 2020 20:48:29 GMT, Roman Kennke wrote: >> Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). Workloads that make heavy use of such weak references will therefore potentially cause significant GC pauses. >> >> There are 3 main items that contribute to pause time linear to number of references, or worse: >> - We need to scan and consider each reference on the various 'discovered' lists. >> - We need to mark through subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size. >> - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' >> >> The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. >> >> The solution to this is two-fold: >> 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires to extend the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable.
Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. >> 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable. >> >> Testing: hotspot_gc_shenadoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Change ShenandoahLRBKind to be an enum class instead of plain enum, and some minor touch-ups src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.inline.hpp line 100: > 98: ShenandoahHeap::cas_oop(fwd, load_addr, obj); > 99: } > 100: return fwd; Unnecessary change? 
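The two kinds of reachability described in the quoted PR text (finalizably reachable objects that can later be "upgraded" to strongly reachable) can be pictured with a toy mark table like the one below. This is a single-threaded sketch under stated assumptions and not Shenandoah's marking bitmap: `Reachability` and `MarkTable` are hypothetical names, objects are identified by plain indices, and a real collector would update mark bits atomically.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reachability levels, ordered so that a larger value is "stronger".
enum class Reachability : std::uint8_t {
  Unmarked = 0,
  Finalizable = 1,  // reachable only through a FinalReference referent
  Strong = 2        // reachable through ordinary (strong) references
};

// Toy mark table: one entry per object index.
class MarkTable {
 public:
  explicit MarkTable(std::size_t object_count)
      : marks_(object_count, Reachability::Unmarked) {}

  // Records that `obj` is reachable at level `r`. Returns true if the stored
  // mark changed, meaning the object was newly marked or upgraded from
  // Finalizable to Strong and should be (re)visited at the new level.
  // A stronger existing mark is never downgraded.
  bool mark(std::size_t obj, Reachability r) {
    if (static_cast<std::uint8_t>(marks_[obj]) >= static_cast<std::uint8_t>(r)) {
      return false;
    }
    marks_[obj] = r;
    return true;
  }

  Reachability get(std::size_t obj) const { return marks_[obj]; }

 private:
  std::vector<Reachability> marks_;
};
```

In this picture, marking through a FinalReference's referent would call `mark(obj, Reachability::Finalizable)` for everything reached from it, while ordinary marking uses `Reachability::Strong` and thereby upgrades any object it meets that was previously only finalizably reachable.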
------------- PR: https://git.openjdk.java.net/jdk/pull/505 From shade at openjdk.java.net Tue Oct 27 11:30:29 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 27 Oct 2020 11:30:29 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: On Thu, 22 Oct 2020 16:04:25 GMT, Roman Kennke wrote: >> Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). Workloads that make heavvy use of such weak references will therefore potentially cause significant GC pauses. >> >> There are 3 main items that contribute to pause time linear to number of references, or worse: >> - We need to scan and consider each reference on the various 'discovered' lists. >> - We need to mark through subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size. >> - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' >> >> The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. >> >> The solution to this is two-fold: >> 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires to extend the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all objects starting from the referent. 
When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. >> 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable. >> >> Testing: hotspot_gc_shenadoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Rename native argument to maybe_narrow_oop for more clarity My initial review follows. I have not digested the whole thing yet. src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp line 360: > 358: > 359: if (ShenandoahBarrierSet::use_load_reference_barrier_native(decorators, type)) { > 360: load_reference_barrier_native(masm, dst, src, (decorators & IN_NATIVE) == 0); I am a bit confused. If we introduce the local variable, would it be `maybe_narrow_oop`? How does it relate to `IS_NATIVE`? Also, see that `use_load_reference_barrier_native` already tests `IS_NATIVE`. 
src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp line 525: > 523: > 524: if (ShenandoahBarrierSet::use_load_reference_barrier_native(decorators, type)) { > 525: load_reference_barrier_native(masm, dst, src, (decorators & IN_NATIVE) == 0); Same comment as in `aarch64` code. src/hotspot/share/gc/shenandoah/c1/shenandoahBarrierSetC1.cpp line 287: > 285: _load_reference_barrier_weakref_rt_code_blob = Runtime1::generate_blob(buffer_blob, -1, > 286: "shenandoah_load_reference_barrier_weakref_slow", > 287: false, &lrb_weakref_code_gen_cl); Nit: Looks like indenting is a bit off here, in comparisons with blocks above. src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.hpp line 41: > 39: NATIVE, > 40: WEAK > 41: }; Descriptions maybe? Also, `ShenandoahLRBKind` seems noisy. Since it is already in `SBS`, maybe `ShenandoahBarrierSet::LRBKind`? src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.hpp line 64: > 62: static bool use_load_reference_barrier_native(DecoratorSet decorators, BasicType type); > 63: static bool need_keep_alive_barrier(DecoratorSet decorators, BasicType type); > 64: static ShenandoahLRBKind access_kind(DecoratorSet decorators, BasicType type); ...or in fact, `ShenandoahBarrierSet::AccessKind` to match the `access_kind` here? Or does it clash with something else? Or rename this to `lrb_kind`? src/hotspot/share/gc/shenandoah/c2/shenandoahBarrierSetC2.cpp line 1062: > 1060: Node* in2 = n->in(2); > 1061: if (in1->bottom_type() == TypePtr::NULL_PTR && > 1062: (in2->Opcode() != Op_ShenandoahLoadReferenceBarrier || This is a bugfix, right? It changes `in1` (seemingly incorrect) to `in2` (seemingly correct). If so, maybe we should split it out to fix previous releases too? src/hotspot/share/gc/shenandoah/c2/shenandoahSupport.cpp line 996: > 994: break; > 995: default: > 996: ShouldNotReachHere(); I expect some compilers to complain here about the uninitialized `name`, please add `name = NULL;` before the `ShouldNotReachHere()`? 
src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.hpp line 104: > 102: > 103: template > 104: inline oop load_reference_barrier_native(oop obj, T* load_addr); These might be forked to a separate cleanup? Not insisting... That would make backports a bit cleaner, though. src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp line 81: > 79: _heap(ShenandoahHeap::heap()), > 80: _mark_context(_heap->marking_context()), > 81: _strong(true) Do we want to turn this to yet another template parameter, like for dedup? That would also resolve passing `true` or `false` to `strong` argument without comments. src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.inline.hpp line 264: > 262: marked = mark_context->mark_strong(obj, marked_first); > 263: } else { > 264: marked = mark_context->mark_final(obj, marked_first); Is this `mark_final` actually `mark_weak`? src/hotspot/share/gc/shenandoah/shenandoahMarkBitMap.inline.hpp line 64: > 62: } > 63: > 64: inline bool ShenandoahMarkBitMap::mark_final(HeapWord* heap_addr, bool& marked_first) { It looks to me that `marked_first` is always the same as the return value? If so, can we drop that out-argument? src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1733: > 1731: } > 1732: } > 1733: Why this move? src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 3053: > 3051: > 3052: ShenandoahWorkerScope scope(workers(), > 3053: ShenandoahWorkerPolicy::calc_workers_for_conc_root_processing(), It probably does not matter, but maybe we should be having a separate `ShenandoahWorkerPolicy` entry for `conc_weak_refs`. src/hotspot/share/gc/shenandoah/shenandoahHeap.inline.hpp line 404: > 402: assert(ctx->is_complete(), "sanity"); > 403: > 404: const ShenandoahMarkBitMap* mark_bit_map = ctx->mark_bit_map(); Why `const`? Not necessarily wrong, but inconsistent with the rest of the method. src/hotspot/share/gc/shenandoah/shenandoahMarkBitMap.cpp line 2: > 1: /* > 2: * Copyright (c) 2020, Red Hat, Inc. and/or its affiliates. 
Was it copied from the `MarkBitMap`? Oracle's copyright needs to be left in place, methinks, along with Red Hat-s for Red Hat-s modifications. src/hotspot/share/gc/shenandoah/shenandoahMarkBitMap.hpp line 2: > 1: /* > 2: * Copyright (c) 2020, Red Hat, Inc. and/or its affiliates. Ditto. src/hotspot/share/gc/shenandoah/shenandoahMarkBitMap.inline.hpp line 2: > 1: /* > 2: * Copyright (c) 2020, Red Hat, Inc. and/or its affiliates. Ditto. src/hotspot/share/gc/shenandoah/shenandoahPhaseTimings.hpp line 85: > 83: f(conc_weak_refs, "Concurrent Weak References") \ > 84: f(conc_weak_refs_work, " Process") \ > 85: SHENANDOAH_PAR_PHASE_DO(conc_weak_refs_work_, " CWR: ", f) \ Eh. Clashes with `CWR:` below... Maybe `CWRF`? src/hotspot/share/gc/shenandoah/shenandoahReferenceProcessor.hpp line 42: > 40: * 1. Concurrent reference marking: Discover all j.l.r.Reference objects and determine reachability of all live objects. > 41: * 2. Concurrent reference processing: For all discoved j.l.r.References, determine whether or not to keep or clean > 42: * them. Also, clean and enqueue relevant references concurrently. "determine whether to keep them alive or clean them", right? These are the choices? src/hotspot/share/gc/shenandoah/shenandoahReferenceProcessor.hpp line 53: > 51: * These reachabilities are implemented in shenandoahMarkBitMap.* > 52: * Conceptually, marking starts with a strong wavefront at the GC roots. Whenever a Reference object is encountered, > 53: * that Reference is discovered, it may be discovered by the ShenandoahReferenceProcessor. If it is discovered, it "Whenever a Reference object is encountered, it may be discovered by the ShenandoahReferenceProcessor"? src/hotspot/share/gc/shenandoah/shenandoahReferenceProcessor.hpp line 55: > 53: * that Reference is discovered, it may be discovered by the ShenandoahReferenceProcessor. 
If it is discovered, it > 54: * gets added to the discovered list, and that wavefront stops there, except when it's a FinalReference, in which > 55: * case the wavefront switches to finalizable marking and marks through the refenent. When a Reference is not "refenent" -> "referent" src/hotspot/share/gc/shenandoah/shenandoahReferenceProcessor.hpp line 57: > 55: * case the wavefront switches to finalizable marking and marks through the refenent. When a Reference is not > 56: * discovered, e.g. if it's a SoftReference that is not eligible for discovery, then marking continues as if the > 57: * Reference were a regular object. Whenever a strong wavefront encounters an object that is already marked "was a regular object" src/hotspot/share/gc/shenandoah/shenandoahTaskqueue.hpp line 154: > 152: public: > 153: ObjArrayChunkedTask(oop o = NULL, bool count_liveness = true, bool strong = true) { > 154: assert(decode_oop(encode_oop(o, count_liveness, strong)) == o, "oop can be encoded: " PTR_FORMAT, p2i(o)); Need encodeability `assert`-s for `count_liveness` and `strong` too? Also in the other constructor? This might get costly, and the whole thing might need rethinking... But at least the initial testing better check there are no bugs here. src/hotspot/share/gc/shenandoah/shenandoahTaskqueue.hpp line 180: > 178: inline uintptr_t encode_oop(oop obj, bool count_liveness, bool strong) const { > 179: uintptr_t encoded_oop = ((uintptr_t)(void*) obj) << oop_shift; > 180: assert((encoded_oop & (count_liveness_decode_mask | strong_decode_mask)) == 0, "need bit for encoding count-liveness and strong bits"); Ah! This assert can be sacrificed for the additional asserts in constructors, right? src/hotspot/share/gc/shenandoah/shenandoahThreadLocalData.hpp line 55: > 53: int _disarmed_value; > 54: double _paced_time; > 55: ShenandoahMarkRefsSuperClosure* _mark_closure; This rubs me the wrong way. Closures are usually stack-allocated, so we are exposing the stack pointer here. 
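The encodeability question raised for `encode_oop` above (does the round trip preserve the oop *and* both flags?) can be sketched roughly as follows. This is an illustration under assumptions, not the `ObjArrayChunkedTask` code: the bit layout, the constants and the helper names (`encode_task`, `decode_task`, `round_trips`) are hypothetical, and a plain `uintptr_t` stands in for an oop.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: the two lowest bits carry the flags and the
// address is shifted left to make room for them.
constexpr unsigned kFlagBits = 2;
constexpr std::uintptr_t kCountLivenessBit = std::uintptr_t(1) << 0;
constexpr std::uintptr_t kStrongBit = std::uintptr_t(1) << 1;

struct DecodedTask {
  std::uintptr_t addr;
  bool count_liveness;
  bool strong;
};

// True if shifting the address left by kFlagBits loses no high bits.
inline bool is_encodable(std::uintptr_t addr) {
  return ((addr << kFlagBits) >> kFlagBits) == addr;
}

inline std::uintptr_t encode_task(std::uintptr_t addr, bool count_liveness, bool strong) {
  assert(is_encodable(addr) && "address must leave room for the flag bits");
  std::uintptr_t encoded = addr << kFlagBits;
  if (count_liveness) encoded |= kCountLivenessBit;
  if (strong) encoded |= kStrongBit;
  return encoded;
}

inline DecodedTask decode_task(std::uintptr_t encoded) {
  return DecodedTask{encoded >> kFlagBits,
                     (encoded & kCountLivenessBit) != 0,
                     (encoded & kStrongBit) != 0};
}

// Round-trip check covering the address and both flags, the kind of
// condition a constructor-level assert could test.
inline bool round_trips(std::uintptr_t addr, bool count_liveness, bool strong) {
  const DecodedTask d = decode_task(encode_task(addr, count_liveness, strong));
  return d.addr == addr && d.count_liveness == count_liveness && d.strong == strong;
}
```

A constructor could then assert `round_trips(...)` once, instead of checking only the decoded address; whether that is cheap enough for the marking hot path is the open question in the comment above.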
src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 97: > 95: if (!CompressedOops::is_null(o)) { > 96: oop obj = CompressedOops::decode_not_null(o); > 97: obj = ShenandoahForwarding::get_forwardee(obj); But... `verify_oop` verifies the consistency of `obj` and its fwdptr. Here, we effectively omit those checks, making verification less effective! Is this for `Reference` classes only? src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 853: > 851: _verify_forwarded_none, // no forwarded references > 852: _verify_marked_complete_except_references, // walk over marked objects too > 853: _verify_cset_disable, // non-forwarded references to cset expected Please make sure these are indented properly. ------------- Changes requested by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/505 From rkennke at openjdk.java.net Tue Oct 27 11:53:23 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 27 Oct 2020 11:53:23 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: On Tue, 27 Oct 2020 10:36:27 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename native argument to maybe_narrow_oop for more clarity > > src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp line 360: > >> 358: >> 359: if (ShenandoahBarrierSet::use_load_reference_barrier_native(decorators, type)) { >> 360: load_reference_barrier_native(masm, dst, src, (decorators & IN_NATIVE) == 0); > > I am a bit confused. If we introduce the local variable, would it be `maybe_narrow_oop`? How does it relate to `IS_NATIVE`? Also, see that `use_load_reference_barrier_native` already tests `IS_NATIVE`. Yeah, this is confusing. Zhengyu also stumbled over this. 
Notice that (decorators & IN_NATIVE) == 0 tests for 'is *not* native'. The point is that native-access is *always* uncompressed-oops, while accessing a referent is narrowOop or oop depending on UseCompressedOops. Hence the distinction. If you have a good suggestion on how to make this less confusing, I'd appreciate it. > src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.hpp line 64: > >> 62: static bool use_load_reference_barrier_native(DecoratorSet decorators, BasicType type); >> 63: static bool need_keep_alive_barrier(DecoratorSet decorators, BasicType type); >> 64: static ShenandoahLRBKind access_kind(DecoratorSet decorators, BasicType type); > > ...or in fact, `ShenandoahBarrierSet::AccessKind` to match the `access_kind` here? Or does it clash with something else? Or rename this to `lrb_kind`? AccessKind seems sensible. I'll try it. > src/hotspot/share/gc/shenandoah/shenandoahThreadLocalData.hpp line 55: > >> 53: int _disarmed_value; >> 54: double _paced_time; >> 55: ShenandoahMarkRefsSuperClosure* _mark_closure; > > This rubs me the wrong way. Closures are usually stack-allocated, so we are exposing the stack pointer here. Yeah we need to pass it between the mark-loop and the reference-processor. It's still thread-local. ------------- PR: https://git.openjdk.java.net/jdk/pull/505 From stuefe at openjdk.java.net Tue Oct 27 12:33:24 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 27 Oct 2020 12:33:24 GMT Subject: RFR: JDK-8255450: runtime/ThreadCountLimit.java causes high system load Message-ID: Hi, this is rather trivial: runtime/ThreadCountLimit.java, introduced with JDK-8222671, caused problems on our test machines (JDK-8222671 is private and cannot be accessed from outside Oracle, so all I know comes from its review thread [1]). The test creates massive amount of threads in order to hit some OS limit which would manifest as an OOM. This affects unrelated processes, unless the test is executed in a jail or with specific limits set. 
This test probably should be executed with a specific limit, but for now lets mark it as @stress and remove it from tier1. ------------- Commit messages: - JDK-8255450 Changes: https://git.openjdk.java.net/jdk/pull/876/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=876&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255450 Stats: 2 lines in 2 files changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/876.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/876/head:pull/876 PR: https://git.openjdk.java.net/jdk/pull/876 From mcimadamore at openjdk.java.net Tue Oct 27 12:59:31 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Tue, 27 Oct 2020 12:59:31 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v14] In-Reply-To: References: Message-ID: <6DXvMCLE3m-Rcv_Sxl-WMZITXkY8LwzctCgf_Ke83Ls=.c8f71977-0f81-415f-b6a5-f56be3a934c1@github.com> > This patch contains the changes associated with the third incubation round of the foreign memory access API incubation (see JEP 393 [1]). This iteration focus on improving the usability of the API in 3 main ways: > > * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from multiple threads > * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee that the memory will be deallocated, eventually > * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class has been added, which defines several useful dereference routines; these are really just thin wrappers around memory access var handles, but they make the barrier of entry for using this API somewhat lower. 
> > A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit of dereference. > > This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done by calling `MemoryAddress::asSegmentRestricted`). > > A list of the API, implementation and test changes is provided below. If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be happy to point at existing discussions, and/or to provide the feedback required. > > A big thank to Erik Osterlund, Vladimir Ivanov and David Holmes, without whom the work on shared memory segment would not have been possible; also I'd like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. 
> > Thanks > Maurizio > > Javadoc: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html > > Specdiff: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254163 > > > > ### API Changes > > * `MemorySegment` > * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) > * added a no-arg factory for a native restricted segment representing entire native heap > * rename `withOwnerThread` to `handoff` > * add new `share` method, to create shared segments > * add new `registerCleaner` method, to register a segment against a cleaner > * add more helpers to create arrays from a segment e.g. `toIntArray` > * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) > * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) > * `MemoryAddress` > * drop `segment` accessor > * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative to a given segment > * `MemoryAccess` > * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a carrier is one of the usual suspect (a Java primitive, minus `boolean`); the access mode can be simple (e.g. access base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte offset,or a high level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. `getByteAtOffset` vs `getByteAtIndex`). 
> * `MemoryHandles` > * drop `withOffset` combinator > * drop `withStride` combinator > * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which it is easy to derive all the other handles using plain var handle combinators. > * `Addressable` > * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. > * `MemoryLayouts` > * A new layout, for machine addresses, has been added to the mix. > > > > ### Implementation changes > > There are two main things to discuss here: support for shared segments, and the general simplification of the memory access var handle support. > > #### Shared segments > > The support for shared segments cuts in pretty deep in the VM. Support for shared segments is notoriously hard to achieve, at least in a way that guarantees optimal access performances. This is caused by the fact that, if a segment is shared, it would be possible for a thread to close it while another is accessing it. > > After considering several options (see [3]), we zeroed onto an approach which is inspired by an happy idea that Andrew Haley had (and that he reminded me of at this year OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently to a close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and the access itself (otherwise it would be possible for the accessing thread to stop just right before an unsafe call). 
It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. > > Sadly, none of these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, we noted that, if we could mark so called *scoped* method with an annotation, it would be very simply to check as to whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). > > The question is, then, once we detect that a thread is accessing the very segment we're about to close, what should happen? We first experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread is accessing the segment being closed. > > As written in the javadoc, this doesn't mean that clients should just catch and try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should be treated as such. > > In terms of gritty implementation, we needed to centralize memory access routines in a single place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` annotation, which tells the VM that something important is going on. > > To achieve this, we created a new (autogenerated) class, called `ScopedMemoryAccess`. 
This class contains all the main memory access primitives (including bulk access, like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during access (which is important when registering segments against cleaners). > > Of course, to make memory access safe, memory access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead of unsafe, so that a liveness check can be triggered (in case a scope is present). > > `ScopedMemoryAccess` has a `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed successfully. > > The implementation of `MemoryScope` (now significantly simplified from what we had before), has two implementations, one for confined segments and one for shared segments; the main difference between the two is what happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. > > #### Memory access var handles overhaul > > The key realization here was that if all memory access var handles took a coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle form. > > This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that e.g. 
additional offset is injected into a base memory access var handle. > > This also helped in simplifying the implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level access on the innards of the memory access var handle. All that code is now gone. > > #### Test changes > > Not much to see here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared segment case. > > [1] - https://openjdk.java.net/jeps/393 > [2] - https://openjdk.java.net/jeps/389 > [3] - https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html > [4] - https://openjdk.java.net/jeps/312 Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 18 commits: - Merge branch 'master' into 8254162 - Address review comment for scoped memory access makefile - Address CSR comments - Back-port of TestByteBuffer fix - Merge branch 'master' into 8254162 - Merge branch 'master' into 8254162 - Merge branch 'master' into 8254162 - Remove spurious check on MemoryScope::confineTo Added tests to make sure no spurious exception is thrown when: * handing off a segment from A to A * sharing an already shared segment - Merge branch 'master' into 8254162 - Simplify example in the toplevel javadoc - ... 
and 8 more: https://git.openjdk.java.net/jdk/compare/cf56c7e0...697c7ca5 ------------- Changes: https://git.openjdk.java.net/jdk/pull/548/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=13 Stats: 7504 lines in 79 files changed: 4797 ins; 1530 del; 1177 mod Patch: https://git.openjdk.java.net/jdk/pull/548.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548 PR: https://git.openjdk.java.net/jdk/pull/548 From stuefe at openjdk.java.net Tue Oct 27 13:14:22 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 27 Oct 2020 13:14:22 GMT Subject: RFR: 8255254: Split os::reserve_memory and os::map_memory_to_file interfaces [v2] In-Reply-To: <8OBEzNqpmYnnaGIXG32ksZn2FZGEmyZEd2hZmdutRa8=.c360dd90-b3df-4040-8013-5541b90406b8@github.com> References: <8OBEzNqpmYnnaGIXG32ksZn2FZGEmyZEd2hZmdutRa8=.c360dd90-b3df-4040-8013-5541b90406b8@github.com> Message-ID: On Mon, 26 Oct 2020 20:16:32 GMT, Anton Kozlov wrote: >> Hi, >> >> Please review a change to extract map_memory_to_file interface out of reserve_memory when the latter takes file descriptor. >> >> The change should be a pure refactoring without changes in functionality. The only part is disturbing: a comment in original os_posix.cpp:316 seems to refer to else clause and it contradicts to the actual code. > > Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Fix review findings Hi Anton, this is good! My remarks are all minor nits and/or things which can be fixed in another RFE. Pity that we have to multiplex now in VirtualSpace, but that is still better than what we do now. 
Cheers, Thomas src/hotspot/share/runtime/os.hpp line 372: > 370: static char* map_memory_to_file_aligned(size_t size, size_t alignment, int fd); > 371: static char* map_memory_to_file(char* base, size_t size, int fd); > 372: static char* attempt_map_memory_to_file(char* base, size_t size, int fd); Can we name this attempt_map_memory_to_file **_at** to have equality to "reserve_memory_at" ? src/hotspot/os/linux/os_linux.cpp line 4207: > 4205: } > 4206: > 4207: char* os::pd_attempt_map_memory_to_file(char* requested_addr, size_t bytes, int file_desc) { This is fine, but I am not sure anymore what the point was of first reserving, then replacing the mapping in the first place. Seems to me one could call os::map_memory_to_file() directly, or am I missing something? I think this was somehow all related to https://openjdk.java.net/jeps/316 and allocating on NVDIMM. src/hotspot/os/posix/os_posix.cpp line 329: > 327: > 328: if (end_offset > 0) { > 329: os::release_memory(extra_base + begin_offset + size, end_offset); Not your patch, but the name "end_offset" is seriously confusing here... src/hotspot/os/posix/os_posix.cpp line 335: > 333: } > 334: > 335: // Multiple threads can race in this code, and can remap over each other with MAP_FIXED, I think you could either completely remove the comment - its somewhat obvious - or remove it down to the call of chop_extra_memory. src/hotspot/os/windows/os_windows.cpp line 3140: > 3138: // virtual space to get requested alignment, like posix-like os's. > 3139: // Windows prevents multiple thread from remapping over each other so this loop is thread-safe. > 3140: char* map_or_reserve_memory_aligned(size_t size, size_t alignment, int file_desc) { static ? 
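For readers following the `begin_offset` / `end_offset` remark on os_posix.cpp above: the usual way to get an aligned range is to over-reserve, pick the first aligned address inside the reservation, and release the slack before and after it. The arithmetic can be sketched as below. This is only an illustration; `AlignedChop` and `chop_extra` are hypothetical names, plain integers stand in for addresses, and no memory is actually reserved or released.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Result of carving an aligned range out of a larger reservation.
struct AlignedChop {
  std::uintptr_t aligned_base;  // first aligned address inside the reservation
  std::size_t begin_offset;     // slack in front: [extra_base, aligned_base)
  std::size_t end_offset;       // slack behind:   [aligned_base + size, extra_base + extra_size)
};

// Computes which parts of an over-sized reservation would have to be
// released. 'alignment' must be a power of two.
inline AlignedChop chop_extra(std::uintptr_t extra_base, std::size_t extra_size,
                              std::size_t size, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
  const std::uintptr_t aligned = (extra_base + mask) & ~mask;  // round up
  assert(aligned + size <= extra_base + extra_size && "reservation too small");
  const std::size_t begin = static_cast<std::size_t>(aligned - extra_base);
  const std::size_t end =
      static_cast<std::size_t>((extra_base + extra_size) - (aligned + size));
  return AlignedChop{aligned, begin, end};
}
```

With these names, the tail to release starts at `extra_base + begin_offset + size` and is `end_offset` bytes long, which matches the shape of the `release_memory` call quoted above; it also shows why "end_offset" is really a length rather than an offset.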
------------- PR: https://git.openjdk.java.net/jdk/pull/812 From eosterlund at openjdk.java.net Tue Oct 27 13:29:31 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Tue, 27 Oct 2020 13:29:31 GMT Subject: RFR: 8255233: InterpreterRuntime::at_unwind should be a JRT_LEAF [v3] In-Reply-To: References: Message-ID: > InterpreterRuntime::at_unwind is called at the very beginning of remove_activation(), to notify concurrent stack processing that a frame is about to be unwound. It is currently a JRT_ENTRY, because it needs a last_Java_frame to see what frame is about to get unwound. > > However, there are special return paths used by JVMTI pop frame, that checks if the caller frame is deoptimized, then calls a special path that removes the top activation, assuming that does not enter the deopt handler. The new JRT_ENTRY makes that reasoning invalid. > > Therefore, we need this to be a JRT_LEAF, that sets a last Java frame, to make everyone happy. This patch performs that change. > > I have run tier 1-5 testing, and manually tested: > > while true; do make test JTREG="RETAIN=all" TEST=open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn002 TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews" ; done > > Before the fix it crashes ~1/15 runs with a bad oop. After the fix, it doesn't crash. I have run it more times than my tmux buffer fits (for a day), and it does not fail any more with this fix. > > Unfortunately, my testing on AArch64 has been stalled for a day, so I have sent out this PR without the testing of those bits being finished. I won't push until I get the results back, of course. But I am expecting that to be fine, as there is nothing special going on there and it compiles. Will post a comment when the complete results have arrived.
Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Remove comment ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/828/files - new: https://git.openjdk.java.net/jdk/pull/828/files/cc3929d1..a9a59f3b Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=828&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=828&range=01-02 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/828.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/828/head:pull/828 PR: https://git.openjdk.java.net/jdk/pull/828 From eosterlund at openjdk.java.net Tue Oct 27 13:31:36 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Tue, 27 Oct 2020 13:31:36 GMT Subject: RFR: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing [v2] In-Reply-To: References: Message-ID: > The escape barrier reallocates scalarized objects potentially deep into the stack of a remote thread. Each allocation can safepoint, causing referenced frames to be invalid. Some sprinklings were added that deal with that, but I believe it was subsequently broken with the integration of the new vector API, that has its own new deoptimization code that did not know about this. Not surprisingly, the integration of the new vector API had no idea about this subtlety, and allocates an object, and then reads an object deep from the stack of a remote thread (using an escape barrier). I suppose the issue is that all these 3 things were integrated at almost the same time. The problematic code sequence is in VectorSupport::allocate_vector() in vectorSupport.cpp, which is called from Deoptimization::realloc_objects(). It first allocates an oop (possibly safepointing), and then reads a vector oop from the stack. This is usually fine, but not through the escape barrier, with concurrent stack scanning.
While I have not seen any crashes yet, I can see from code inspection, that there is no way that this works correctly. > > In order to make this less fragile for future changes, we should really have a RAII object that keeps the target thread's stack of the escape barrier, stable and processed, across safepoints. This patch fixes that. Then it becomes much easier to reason about its correctness, compared to hoping the various hooks are applied after each safepoint. > > With this new robustness fix, the thread running the escape barrier, keeps the target thread stack processed, straight through safepoints on the requesting thread, making it easy and intuitive to understand why this works correctly. The RAII object basically just has to cover the code block that pokes at the remote stack and goes in and out of safepoints, arbitrarily. Arguably, this escape barrier doesn't need to be blazingly fast, and can afford keeping stacks sane through its operation. Erik Österlund has updated the pull request incrementally with two additional commits since the last revision: - Better encapsulate object deoptimization in EscapeBarrier also to facilitate correct interaction with concurrent thread stack processing. The Stackwalk for object deoptimization in VM_GetOrSetLocal::doit_prologue is not prepared for concurrent thread stack processing. EscapeBarrier::deoptimize_objects(int depth) is extended to cover a range of frames from depth d1 to depth d2. It is also prepared for concurrent thread stack processing. With this change it is used to deoptimize objects in the prologue of VM_GetOrSetLocal.
- Review comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/832/files - new: https://git.openjdk.java.net/jdk/pull/832/files/5159c9a2..7db0bab1 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=832&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=832&range=00-01 Stats: 120 lines in 9 files changed: 27 ins; 63 del; 30 mod Patch: https://git.openjdk.java.net/jdk/pull/832.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/832/head:pull/832 PR: https://git.openjdk.java.net/jdk/pull/832 From eosterlund at openjdk.java.net Tue Oct 27 13:31:36 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 27 Oct 2020 13:31:36 GMT Subject: RFR: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing [v2] In-Reply-To: References: <_J8bYqJcDa-5BvnEtZkdc3zIY21IfgEXTYvSSWy7znY=.074f43c7-ceac-4bb2-907e-b141d273dc4e@github.com> <5tKw_M8ud42YEtE-k_93YPjTbpC13BRPT20afBbInbA=.039581b3-7c53-42fa-947f-d672d2192202@github.com> Message-ID: <6a6IAyQBDlOUwyJ4qnxDc9BxK0zJYtMy5PRW33GFhcA=.a6661933-fcca-4700-9265-a4018f206e11@github.com> On Mon, 26 Oct 2020 15:19:51 GMT, Richard Reingruber wrote: >>> Hi Erik, the last commit in https://github.com/reinrich/jdk/commits/pr-832-with-better-encapsulation would be the refactoring I would like to do. It removes the code not compliant with concurrent thread stack processing from VM_GetOrSetLocal::doit_prologue(). Instead EscapeBarrier::deoptimize_objects(int d1, int d2) is called. You added already a KeepStackGCProcessedMark to that method and I changed it to accept a range [d1, d2] of frames do the object deoptimization for. >>> >>> I'm not sure how to handle this from a process point of view. Can the refactoring be done within this change? Should a new item or subtask be created for it. I'd be glad if you could give an advice on that. >>> >>> Thanks, Richard. 
>> >> If you are okay with it, I can add your refactorings into this change, and add you as a co-author of the change. Sounds good? >> >> Thanks, > > It does sound good indeed to me if you don't mind doing that. Thanks! > I have run the tests dedicated to EscapeBarriers with ZGC enabled and also the DeoptimizeObjectsALot stress testing. I will run some more serviceability tests and my team's CI testing until tomorrow. Thanks @reinrich. I uploaded your patch to this PR, and will add you as contributor. Also addressed your review comments. Hope your testing went fine. ------------- PR: https://git.openjdk.java.net/jdk/pull/832 From eosterlund at openjdk.java.net Tue Oct 27 13:32:21 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Tue, 27 Oct 2020 13:32:21 GMT Subject: RFR: 8255233: InterpreterRuntime::at_unwind should be a JRT_LEAF [v2] In-Reply-To: <11LLovVJnTjVF411y0jpSbu2YuVA8U4mROM46a01MoI=.695c972f-76f8-4463-b8f1-4cd8bf18261b@github.com> References: <11LLovVJnTjVF411y0jpSbu2YuVA8U4mROM46a01MoI=.695c972f-76f8-4463-b8f1-4cd8bf18261b@github.com> Message-ID: <-c5XxP5EdTiFr7UQUFRCLAMs4D3mEpryMe8ONgeazSU=.a5cdc674-cc36-488f-bc0c-25b850273924@github.com> On Mon, 26 Oct 2020 23:29:25 GMT, Coleen Phillimore wrote: > LGTM. Thanks for the review.
------------- PR: https://git.openjdk.java.net/jdk/pull/828 From eosterlund at openjdk.java.net Tue Oct 27 13:38:25 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Tue, 27 Oct 2020 13:38:25 GMT Subject: RFR: 8255233: InterpreterRuntime::at_unwind should be a JRT_LEAF [v2] In-Reply-To: References: Message-ID: On Tue, 27 Oct 2020 01:14:07 GMT, David Holmes wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> address cast > > src/hotspot/share/interpreter/interpreterRuntime.cpp line 1177: > >> 1175: JRT_LEAF(void, InterpreterRuntime::at_unwind(JavaThread* thread)) >> 1176: // JRT_END does an implicit safepoint check, hence we are guaranteed to block >> 1177: // if this is called during a safepoint > > The comments are no longer valid. The implicit safepoint check came from the use of ThreadInVMfromJava as part of the JRT_ENTRY. > > Also it is far from obvious that StackWatermarkSet::before_unwind meets all the requirements of a JRT_LEAF method. Please assure me it is. :) Nice catch. I removed the out-of-date comments. Regarding the leafness of the operation: the operations performed in the stack watermark are indeed designed to run in such conditions. When the callbacks are executed, verification code is run, asserting we are not in a state ignored by the GC. It would never work to have code that transitions inside of such callbacks. Today, it merely applies some GC barriers, similar to performing an access API call. Having said that, despite not transitioning in this code path, I do make the stack walkable when calling this method, so that the barrier can see what the last_Java_frame is, and figure out if action needs to be taken or not. But we do that for other leaf functions as well, and that is fine. The crucial thing is that there are zero transitions, which I can assure you of. In particular, the thread will be _in_java throughout the entire operation.
Thanks for the review! ------------- PR: https://git.openjdk.java.net/jdk/pull/828 From shade at openjdk.java.net Tue Oct 27 13:43:20 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 27 Oct 2020 13:43:20 GMT Subject: RFR: JDK-8255450: runtime/ThreadCountLimit.java causes high system load In-Reply-To: References: Message-ID: On Tue, 27 Oct 2020 12:24:58 GMT, Thomas Stuefe wrote: > Hi, > > this is rather trivial: > > runtime/ThreadCountLimit.java, introduced with JDK-8222671, caused problems on our test machines (JDK-8222671 is private and cannot be accessed from outside Oracle, so all I know comes from its review thread [1]). > > The test creates a massive number of threads in order to hit some OS limit which would manifest as an OOM. This affects unrelated processes, unless the test is executed in a jail or with specific limits set. > > This test probably should be executed with a specific limit, but for now let's mark it as @stress and remove it from tier1. This makes sense to me. On my TR 3970X with Linux x86_64 release, it peaks at 70G VIRT, 3G RSS, and fails half of the time. The test itself should probably be made more resilient, but even then it does not look like a `tier1`-grade test. ------------- Marked as reviewed by shade (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/876 From rkennke at openjdk.java.net Tue Oct 27 13:45:38 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 27 Oct 2020 13:45:38 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v13] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: <9OMvQQo4NDb-SXiBw9kh46B5ABD6ykwumz70Q6Z6maQ=.9ada24e7-de57-4aa2-8993-0d8143e29057@github.com> > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). Workloads that make heavvy use of such weak references will therefore potentially cause significant GC pauses. > > There are 3 main items that contribute to pause time linear to number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires to extend the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. 
Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. > 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable. 
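The two kinds of reachability described in item 1 can be pictured with a small sketch. This is not Shenandoah's marking bitmap; it is a single-threaded illustration in which `MarkState` and `TwoLevelMarkTable` are hypothetical names, objects are identified by a plain index, and the atomic updates that real concurrent marking would need are left out. It only shows the rule that a finalizable mark never overrides a strong one, while a strong mark upgrades a finalizable one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical two-level mark state (illustrative names, not HotSpot types).
enum class MarkState : std::uint8_t { Unmarked = 0, Finalizable = 1, Strong = 2 };

// Single-threaded sketch of a mark table that tracks strong and finalizable
// reachability per object index. A real collector would use atomic updates.
class TwoLevelMarkTable {
 public:
  explicit TwoLevelMarkTable(std::size_t object_count)
      : _states(object_count, MarkState::Unmarked) {}

  // Mark as finalizably reachable. Returns true only if the object was
  // unmarked before, i.e. the caller still has to scan its fields.
  bool mark_finalizable(std::size_t index) {
    assert(index < _states.size());
    if (_states[index] != MarkState::Unmarked) {
      return false;  // already finalizable or strong: nothing to do
    }
    _states[index] = MarkState::Finalizable;
    return true;
  }

  // Mark as strongly reachable, upgrading a finalizable mark if present.
  // Returns true if the state changed, so the caller rescans the object
  // with strong strength.
  bool mark_strong(std::size_t index) {
    assert(index < _states.size());
    if (_states[index] == MarkState::Strong) {
      return false;
    }
    _states[index] = MarkState::Strong;
    return true;
  }

  MarkState state(std::size_t index) const {
    assert(index < _states.size());
    return _states[index];
  }

 private:
  std::vector<MarkState> _states;
};
```

In this sketch an object first reached through a FinalReference referent would get `mark_finalizable`, and a later strong path to the same object would call `mark_strong`, which reports a change so that everything below it is upgraded as well.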
> > Testing: hotspot_gc_shenadoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification Roman Kennke has updated the pull request incrementally with three additional commits since the last revision: - Rename ShenandoahLRBKind -> AccessKind - Relax verification only for j.l.r.Reference objects - Intendation fixes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/505/files - new: https://git.openjdk.java.net/jdk/pull/505/files/6418428d..3a3f1c44 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=12 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=11-12 Stats: 82 lines in 14 files changed: 11 ins; 4 del; 67 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From rkennke at openjdk.java.net Tue Oct 27 14:00:33 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 27 Oct 2020 14:00:33 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: On Tue, 27 Oct 2020 10:49:58 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename native argument to maybe_narrow_oop for more clarity > > src/hotspot/share/gc/shenandoah/c2/shenandoahBarrierSetC2.cpp line 1062: > >> 1060: Node* in2 = n->in(2); >> 1061: if (in1->bottom_type() == TypePtr::NULL_PTR && >> 1062: (in2->Opcode() != Op_ShenandoahLoadReferenceBarrier || > > This is a bugfix, right? It changes `in1` (seemingly incorrect) to `in2` (seemingly correct). If so, maybe we should split it out to fix previous releases too? 
Actually I think I *introduced* a bug there. It seems curious that it worked that way :-) I'm fixing it. ------------- PR: https://git.openjdk.java.net/jdk/pull/505 From aph at openjdk.java.net Tue Oct 27 14:06:24 2020 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 27 Oct 2020 14:06:24 GMT Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build [v4] In-Reply-To: References: Message-ID: On Thu, 15 Oct 2020 18:35:30 GMT, Bernhard Urban-Forster wrote: >> I organized this PR so that each commit contains the warning emitted by MSVC as commit message and its relevant fix. >> >> Verified on >> * Linux+ARM64: `{hotspot,jdk,langtools}:tier1`, no failures. >> * Windows+ARM64: `{hotspot,jdk,langtools}:tier1`, no (new) failures. >> * internal macOS+ARM64 port: build without `--disable-warnings-as-errors` still works. Just mentioning this here, because it's yet another toolchain (Xcode / clang) that needs to be kept happy [going forward](https://openjdk.java.net/jeps/391). > > Bernhard Urban-Forster has updated the pull request incrementally with two additional commits since the last revision: > > - uppercase suffix > - add assert Marked as reviewed by aph (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/530 From rkennke at openjdk.java.net Tue Oct 27 14:22:31 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 27 Oct 2020 14:22:31 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v14] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). 
Workloads that make heavvy use of such weak references will therefore potentially cause significant GC pauses. > > There are 3 main items that contribute to pause time linear to number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires to extend the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. > 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by Java reference handler thread). 
In addition to that, we must ensure that no referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable. > > Testing: hotspot_gc_shenadoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 61 commits: - Merge branch 'master' into shenandoah-concurrent-weakrefs - Fix CmpP optimization - Rename ShenandoahLRBKind -> AccessKind - Relax verification only for j.l.r.Reference objects - Intendation fixes - Rename native argument to maybe_narrow_oop for more clarity - Change ShenandoahLRBKind to be an enum class instead of plain enum, and some minor touch-ups - Add fallback support for new properties in ObjArrayChunkedTask - Fix 32bit interpreter LRB-native call - Explicitely use concurrent vs stw reference processing, don't rely on is_at_shenandoah_safepoint() - ... 
and 51 more: https://git.openjdk.java.net/jdk/compare/83a91bfa...d95e88c6 ------------- Changes: https://git.openjdk.java.net/jdk/pull/505/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=13 Stats: 2394 lines in 56 files changed: 1628 ins; 567 del; 199 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From rkennke at openjdk.java.net Tue Oct 27 14:30:42 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 27 Oct 2020 14:30:42 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v15] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). Workloads that make heavvy use of such weak references will therefore potentially cause significant GC pauses. > > There are 3 main items that contribute to pause time linear to number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > > The solution to this is two-fold: > 1. 
Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires to extend the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. > 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable. > > Testing: hotspot_gc_shenadoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 62 commits: - Merge branch 'master' into shenandoah-concurrent-weakrefs - Merge branch 'master' into shenandoah-concurrent-weakrefs - Fix CmpP optimization - Rename ShenandoahLRBKind -> AccessKind - Relax verification only for j.l.r.Reference objects - Intendation fixes - Rename native argument to maybe_narrow_oop for more clarity - Change ShenandoahLRBKind to be an enum class instead of plain enum, and some minor touch-ups - Add fallback support for new properties in ObjArrayChunkedTask - Fix 32bit interpreter LRB-native call - ... and 52 more: https://git.openjdk.java.net/jdk/compare/504cb005...4e63ff73 ------------- Changes: https://git.openjdk.java.net/jdk/pull/505/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=14 Stats: 2394 lines in 56 files changed: 1628 ins; 567 del; 199 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From stuefe at openjdk.java.net Tue Oct 27 14:33:19 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 27 Oct 2020 14:33:19 GMT Subject: RFR: JDK-8255450: runtime/ThreadCountLimit.java causes high system load In-Reply-To: References: Message-ID: On Tue, 27 Oct 2020 13:40:36 GMT, Aleksey Shipilev wrote: >> Hi, >> >> this is rather trivial: >> >> runtime/ThreadCountLimit.java, introduced with JDK-8222671, caused problems on our test machines (JDK-8222671 is private and cannot be accessed from outside Oracle, so all I know comes from its review thread [1]). >> >> The test creates massive amount of threads in order to hit some OS limit which would manifest as an OOM. This affects unrelated processes, unless the test is executed in a jail or with specific limits set. >> >> This test probably should be executed with a specific limit, but for now lets mark it as @stress and remove it from tier1. > > This makes sense to me. 
On my TR 3970X with Linux x86_64 release, it peaks at 70G VIRT, 3G RSS, and fails half of the time. The test itself should probably made more resilient, but even then it does not look like `tier1`-grade test. Following the trivial rule I will integrate this patch. ------------- PR: https://git.openjdk.java.net/jdk/pull/876 From stuefe at openjdk.java.net Tue Oct 27 14:33:21 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 27 Oct 2020 14:33:21 GMT Subject: Integrated: JDK-8255450: runtime/ThreadCountLimit.java causes high system load In-Reply-To: References: Message-ID: On Tue, 27 Oct 2020 12:24:58 GMT, Thomas Stuefe wrote: > Hi, > > this is rather trivial: > > runtime/ThreadCountLimit.java, introduced with JDK-8222671, caused problems on our test machines (JDK-8222671 is private and cannot be accessed from outside Oracle, so all I know comes from its review thread [1]). > > The test creates massive amount of threads in order to hit some OS limit which would manifest as an OOM. This affects unrelated processes, unless the test is executed in a jail or with specific limits set. > > This test probably should be executed with a specific limit, but for now lets mark it as @stress and remove it from tier1. This pull request has now been integrated. Changeset: 7d41a541 Author: Thomas Stuefe URL: https://git.openjdk.java.net/jdk/commit/7d41a541 Stats: 2 lines in 2 files changed: 2 ins; 0 del; 0 mod 8255450: runtime/ThreadCountLimit.java causes high system load Reviewed-by: shade ------------- PR: https://git.openjdk.java.net/jdk/pull/876 From akozlov at openjdk.java.net Tue Oct 27 14:34:38 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 27 Oct 2020 14:34:38 GMT Subject: RFR: 8255254: Split os::reserve_memory and os::map_memory_to_file interfaces [v3] In-Reply-To: References: Message-ID: > Hi, > > Please review a change to extract map_memory_to_file interface out of reserve_memory when the latter takes file descriptor. 
> > The change should be a pure refactoring without changes in functionality. Only one part is troubling: a comment at os_posix.cpp:316 in the original seems to refer to the else clause, and it contradicts the actual code. Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: Fix reviewing findings 2 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/812/files - new: https://git.openjdk.java.net/jdk/pull/812/files/df6fb834..8908acb7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=812&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=812&range=01-02 Stats: 13 lines in 9 files changed: 1 ins; 0 del; 12 mod Patch: https://git.openjdk.java.net/jdk/pull/812.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/812/head:pull/812 PR: https://git.openjdk.java.net/jdk/pull/812 From akozlov at openjdk.java.net Tue Oct 27 14:34:41 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 27 Oct 2020 14:34:41 GMT Subject: RFR: 8255254: Split os::reserve_memory and os::map_memory_to_file interfaces [v2] In-Reply-To: References: <8OBEzNqpmYnnaGIXG32ksZn2FZGEmyZEd2hZmdutRa8=.c360dd90-b3df-4040-8013-5541b90406b8@github.com> Message-ID: On Tue, 27 Oct 2020 12:29:11 GMT, Thomas Stuefe wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix review findings > > src/hotspot/share/runtime/os.hpp line 372: > >> 370: static char* map_memory_to_file_aligned(size_t size, size_t alignment, int fd); >> 371: static char* map_memory_to_file(char* base, size_t size, int fd); >> 372: static char* attempt_map_memory_to_file(char* base, size_t size, int fd); > > Can we name this attempt_map_memory_to_file **_at** to have equality to "reserve_memory_at" ?
Makes sense, thanks, fixed > src/hotspot/os/linux/os_linux.cpp line 4207: > >> 4205: } >> 4206: >> 4207: char* os::pd_attempt_map_memory_to_file(char* requested_addr, size_t bytes, int file_desc) { > > This is fine, but I am not sure anymore what the point was of first reserving, then replacing the mapping in the first place. Seems to me one could call os::map_memory_to_file() directly, or am I missing something? > > I think this was somehow all related to https://openjdk.java.net/jeps/316 and allocating on NVDIMM. `os::map_memory_to_file` treats a non-null base as mandatory and maps it with `MAP_FIXED`: https://github.com/openjdk/jdk/blob/ae72b5283b5b5eee0fbb6c9121494a4e65fb381c/src/hotspot/os/posix/os_posix.cpp#L275 So, to treat the address as a hint only, pd_attempt_reserve_memory_at is called. > src/hotspot/os/posix/os_posix.cpp line 329: > >> 327: >> 328: if (end_offset > 0) { >> 329: os::release_memory(extra_base + begin_offset + size, end_offset); > > Not your patch, but the name "end_offset" is seriously confusing here... Probably yes. But if you don't mind, I would leave that as is. A nice picture above provides some clarification and I'd really like to avoid touching unrelated code. > src/hotspot/os/posix/os_posix.cpp line 335: > >> 333: } >> 334: >> 335: // Multiple threads can race in this code, and can remap over each other with MAP_FIXED, > > I think you could either completely remove the comment - it's somewhat obvious - or remove it down to the call of chop_extra_memory. I would like to keep the comment (as the race is not evident to the reader). As for the placement, the comment describes an alternative implementation of this function and provides the rationale for why the current one is taken. chop_extra_memory is a step of the implementation that has no alternative. I think it would be wrong to move the comment there. But I've added a comment for chop_extra_memory for clarity.
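For readers following the chop_extra_memory discussion, the arithmetic behind trimming an over-sized reservation down to an aligned block can be sketched as below. This is an illustration only, not the HotSpot code: `ChopPlan` and `plan_chop` are hypothetical names, the function only computes addresses and byte counts (it does not map or release anything), and it assumes the caller reserved at least `size + alignment` bytes with a power-of-two alignment. The two slack values are byte counts, which is also how `end_offset` is used in the quoted `os::release_memory(extra_base + begin_offset + size, end_offset)` call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of how to trim an over-sized reservation.
struct ChopPlan {
  std::uintptr_t aligned_base;  // first aligned address >= extra_base
  std::size_t begin_slack;      // bytes to release in front of aligned_base
  std::size_t end_slack;        // bytes to release after aligned_base + size
};

// Computes which parts of [extra_base, extra_base + extra_size) to give back
// so that an aligned block of `size` bytes remains.
// Assumptions (not taken from the thread): alignment is a power of two and
// extra_size >= size + alignment.
inline ChopPlan plan_chop(std::uintptr_t extra_base, std::size_t extra_size,
                          std::size_t size, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  assert(extra_size >= size + alignment);

  const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
  const std::uintptr_t aligned_base = (extra_base + mask) & ~mask;  // round up

  const std::size_t begin_slack =
      static_cast<std::size_t>(aligned_base - extra_base);
  const std::size_t end_slack = extra_size - begin_slack - size;

  return ChopPlan{aligned_base, begin_slack, end_slack};
}
```

A caller would release `begin_slack` bytes at `extra_base` and `end_slack` bytes at `aligned_base + size`, skipping either release when the count is zero.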
> src/hotspot/os/windows/os_windows.cpp line 3140: > >> 3138: // virtual space to get requested alignment, like posix-like os's. >> 3139: // Windows prevents multiple thread from remapping over each other so this loop is thread-safe. >> 3140: char* map_or_reserve_memory_aligned(size_t size, size_t alignment, int file_desc) { > > static ? Thanks, I've missed this. ------------- PR: https://git.openjdk.java.net/jdk/pull/812 From shade at openjdk.java.net Tue Oct 27 14:36:25 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 27 Oct 2020 14:36:25 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) In-Reply-To: References: Message-ID: On Wed, 7 Oct 2020 17:13:22 GMT, Maurizio Cimadamore wrote: > This patch contains the changes associated with the third incubation round of the foreign memory access API incubation (see JEP 393 [1]). This iteration focus on improving the usability of the API in 3 main ways: > > * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from multiple threads > * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee that the memory will be deallocated, eventually > * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class has been added, which defines several useful dereference routines; these are really just thin wrappers around memory access var handles, but they make the barrier of entry for using this API somewhat lower. > > A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit of dereference. 
> > This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done by calling `MemoryAddress::asSegmentRestricted`). > > A list of the API, implementation and test changes is provided below. If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be happy to point at existing discussions, and/or to provide the feedback required. > > A big thank to Erik Osterlund, Vladimir Ivanov and David Holmes, without whom the work on shared memory segment would not have been possible; also I'd like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. 
> > Thanks > Maurizio > > Javadoc: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html > > Specdiff: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254163 > > > > ### API Changes > > * `MemorySegment` > * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) > * added a no-arg factory for a native restricted segment representing entire native heap > * rename `withOwnerThread` to `handoff` > * add new `share` method, to create shared segments > * add new `registerCleaner` method, to register a segment against a cleaner > * add more helpers to create arrays from a segment e.g. `toIntArray` > * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) > * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) > * `MemoryAddress` > * drop `segment` accessor > * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative to a given segment > * `MemoryAccess` > * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a carrier is one of the usual suspect (a Java primitive, minus `boolean`); the access mode can be simple (e.g. access base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte offset,or a high level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. `getByteAtOffset` vs `getByteAtIndex`). 
> * `MemoryHandles` > * drop `withOffset` combinator > * drop `withStride` combinator > * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which it is easy to derive all the other handles using plain var handle combinators. > * `Addressable` > * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2], to add more implementations. Clients can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. > * `MemoryLayouts` > * A new layout, for machine addresses, has been added to the mix. > > > > ### Implementation changes > > There are two main things to discuss here: support for shared segments, and the general simplification of the memory access var handle support. > > #### Shared segments > > The support for shared segments cuts pretty deep into the VM. Support for shared segments is notoriously hard to achieve, at least in a way that guarantees optimal access performance. This is caused by the fact that, if a segment is shared, it would be possible for a thread to close it while another is accessing it. > > After considering several options (see [3]), we zeroed in on an approach which is inspired by a happy idea that Andrew Haley had (and that he reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world (e.g. with a GC pause) while a segment is closed, we could then prevent segments from being accessed concurrently with a close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and the access itself (otherwise it would be possible for the accessing thread to stop right before an unsafe call).
It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. > > Sadly, none of these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). > > The question is, then, once we detect that a thread is accessing the very segment we're about to close, what should happen? We first experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread is accessing the segment being closed. > > As written in the javadoc, this doesn't mean that clients should just catch and try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should be treated as such. > > In terms of gritty implementation, we needed to centralize memory access routines in a single place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` annotation, which tells the VM that something important is going on. > > To achieve this, we created a new (autogenerated) class, called `ScopedMemoryAccess`.
This class contains all the main memory access primitives (including bulk access, like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during access (which is important when registering segments against cleaners). > > Of course, to make memory access safe, memory access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead of unsafe, so that a liveness check can be triggered (in case a scope is present). > > `ScopedMemoryAccess` has a `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed successfully. > > The implementation of `MemoryScope` (now significantly simplified from what we had before), has two implementations, one for confined segments and one for shared segments; the main difference between the two is what happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. > > #### Memory access var handles overhaul > > The key realization here was that if all memory access var handles took a coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle form. > > This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that e.g. 
additional offset is injected into a base memory access var handle. > > This also helped in simplifying the implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level access on the innards of the memory access var handle. All that code is now gone. > > #### Test changes > > Not much to see here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared segment case. > > [1] - https://openjdk.java.net/jeps/393 > [2] - https://openjdk.java.net/jeps/389 > [3] - https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html > [4] - https://openjdk.java.net/jeps/312 @mcimadamore, if you pull from current master, you would get the Linux x86_32 tier1 run "for free". ------------- PR: https://git.openjdk.java.net/jdk/pull/548 From rrich at openjdk.java.net Tue Oct 27 14:42:21 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Tue, 27 Oct 2020 14:42:21 GMT Subject: RFR: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing [v2] In-Reply-To: References: Message-ID: On Tue, 27 Oct 2020 13:31:36 GMT, Erik Österlund wrote: >> The escape barrier reallocates scalarized objects potentially deep into the stack of a remote thread. Each allocation can safepoint, causing referenced frames to be invalid. Some sprinklings were added that deal with that, but I believe it was subsequently broken with the integration of the new vector API, that has its own new deoptimization code that did not know about this.
Not surprisingly, the integration of the new vector API had no idea about this subtlety, and allocates an object, and then reads an object deep from the stack of a remote thread (using an escape barrier). I suppose the issue is that all these 3 things were integrated at almost the same time. The problematic code sequence is in VectorSupport::allocate_vector() in vectorSupport.cpp, which is called from Deoptimization::realloc_objects(). It first allocates an oop (possibly safepointing), and then reads a vector oop from the stack. This is usually fine, but not through the escape barrier, with concurrent stack scanning. While I have not seen any crashes yet, I can see from code inspection that there is no way that this works correctly. >> >> In order to make this less fragile for future changes, we should really have an RAII object that keeps the stack of the escape barrier's target thread stable and processed across safepoints. This patch fixes that. Then it becomes much easier to reason about its correctness, compared to hoping the various hooks are applied after each safepoint. >> >> With this new robustness fix, the thread running the escape barrier keeps the target thread stack processed, straight through safepoints on the requesting thread, making it easy and intuitive to understand why this works correctly. The RAII object basically just has to cover the code block that pokes at the remote stack and goes in and out of safepoints, arbitrarily. Arguably, this escape barrier doesn't need to be blazingly fast, and can afford keeping stacks sane through its operation. > > Erik Österlund has updated the pull request incrementally with two additional commits since the last revision: > > - Better encapsulate object deoptimization in EscapeBarrier also to facilitate correct interaction with concurrent thread stack processing. > > The Stackwalk for object deoptimization in VM_GetOrSetLocal::doit_prologue is not prepared for concurrent thread stack processing.
> EscapeBarrier::deoptimize_objects(int depth) is extended to cover a range of frames from depth d1 to depth d2. It is also prepared for concurrent thread stack processing. With this change it is used to deoptimize objects in the prologue of VM_GetOrSetLocal. > - Review comments Thanks for making EscapeBarriers more robust with regard to concurrent thread stack processing. The change looks good to me. ------------- Marked as reviewed by rrich (Committer). PR: https://git.openjdk.java.net/jdk/pull/832 From rrich at openjdk.java.net Tue Oct 27 14:42:23 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Tue, 27 Oct 2020 14:42:23 GMT Subject: RFR: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing [v2] In-Reply-To: <6a6IAyQBDlOUwyJ4qnxDc9BxK0zJYtMy5PRW33GFhcA=.a6661933-fcca-4700-9265-a4018f206e11@github.com> References: <_J8bYqJcDa-5BvnEtZkdc3zIY21IfgEXTYvSSWy7znY=.074f43c7-ceac-4bb2-907e-b141d273dc4e@github.com> <5tKw_M8ud42YEtE-k_93YPjTbpC13BRPT20afBbInbA=.039581b3-7c53-42fa-947f-d672d2192202@github.com> <6a6IAyQBDlOUwyJ4qnxDc9BxK0zJYtMy5PRW33GFhcA=.a6661933-fcca-4700-9265-a4018f206e11@github.com> Message-ID: On Tue, 27 Oct 2020 13:28:53 GMT, Erik Österlund wrote: >> It does sound good indeed to me if you don't mind doing that. Thanks! >> I have run the tests dedicated to EscapeBarriers with ZGC enabled and also the DeoptimizeObjectsALot stress testing. I will run some more serviceability tests and my team's CI testing until tomorrow. > > Thanks @reinrich. I uploaded your patch to this PR, and will add you as contributor. Also addressed your review comments. Hope your testing went fine. Thanks for importing the patch and for addressing my comments. I've tested hotspot_serviceability, jdk_svc, jdk_jdi, vmTestbase_nsk_jdi, vmTestbase_nsk_jvmti, vmTestbase_nsk_jdwp. Only the test jdk/jdk/jfr/event/runtime/TestClassLoaderStatsEvent.java fails repeatedly with ZGC. It does so independently of this PR.
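The RAII idea from the quoted PR description can be illustrated with a toy guard. This is only a sketch of the pattern: `TargetThreadSketch`, `KeepStackStableSketch` and `poke_at_remote_stack_sketch` are hypothetical names, and the counter merely stands in for whatever HotSpot actually does to keep a remote stack processed across safepoints.

```cpp
#include <cassert>

// Hypothetical stand-in for the target thread of an escape barrier.
struct TargetThreadSketch {
  int keep_stable_count = 0;  // > 0 while some guard wants the stack kept stable
  bool stack_is_stable() const { return keep_stable_count > 0; }
};

// Hypothetical RAII guard: the stack is marked "keep stable" for exactly the
// lifetime of the guard, no matter how the guarded block is left.
class KeepStackStableSketch {
 public:
  explicit KeepStackStableSketch(TargetThreadSketch& target) : _target(target) {
    ++_target.keep_stable_count;
  }
  ~KeepStackStableSketch() { --_target.keep_stable_count; }

  // Non-copyable, so the count cannot be decremented twice by accident.
  KeepStackStableSketch(const KeepStackStableSketch&) = delete;
  KeepStackStableSketch& operator=(const KeepStackStableSketch&) = delete;

 private:
  TargetThreadSketch& _target;
};

// Hypothetical code block that pokes at the remote stack. Any number of
// safepointing allocations could sit between the guard and the return.
static bool poke_at_remote_stack_sketch(TargetThreadSketch& target) {
  KeepStackStableSketch guard(target);
  assert(target.stack_is_stable());
  return target.stack_is_stable();
}
```

The point of the pattern is that the guard covers the whole block, so correctness no longer depends on remembering a hook after each individual safepoint.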
------------- PR: https://git.openjdk.java.net/jdk/pull/832 From stuefe at openjdk.java.net Tue Oct 27 14:43:25 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 27 Oct 2020 14:43:25 GMT Subject: RFR: 8255254: Split os::reserve_memory and os::map_memory_to_file interfaces [v3] In-Reply-To: References: Message-ID: On Tue, 27 Oct 2020 14:34:38 GMT, Anton Kozlov wrote: >> Hi, >> >> Please review a change to extract map_memory_to_file interface out of reserve_memory when the latter takes file descriptor. >> >> The change should be a pure refactoring without changes in functionality. The only part is disturbing: a comment in original os_posix.cpp:316 seems to refer to else clause and it contradicts to the actual code. > > Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Fix reviewing findings 2 LGTM. Thanks, Thomas ------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/812 From stuefe at openjdk.java.net Tue Oct 27 14:43:27 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 27 Oct 2020 14:43:27 GMT Subject: RFR: 8255254: Split os::reserve_memory and os::map_memory_to_file interfaces [v2] In-Reply-To: References: <8OBEzNqpmYnnaGIXG32ksZn2FZGEmyZEd2hZmdutRa8=.c360dd90-b3df-4040-8013-5541b90406b8@github.com> Message-ID: On Tue, 27 Oct 2020 14:30:48 GMT, Anton Kozlov wrote: >> src/hotspot/os/linux/os_linux.cpp line 4207: >> >>> 4205: } >>> 4206: >>> 4207: char* os::pd_attempt_map_memory_to_file(char* requested_addr, size_t bytes, int file_desc) { >> >> This is fine, but I am not sure anymore what the point was of first reserving, then replacing the mapping in the first place. Seems to me one could call os::map_memory_to_file() directly, or am I missing something? >> >> I think this was somehow all related to https://openjdk.java.net/jeps/316 and allocating on NVDIMM. 
> > `os::map_memory_to_file` assumes the base strictly and uses `MAP_FIXED`, when it is not null https://github.com/openjdk/jdk/blob/ae72b5283b5b5eee0fbb6c9121494a4e65fb381c/src/hotspot/os/posix/os_posix.cpp#L275 > So, to treat the address as a hint only, pd_attempt_reserve_memory_at is called Missed that. Okay, I see. It is the same then as `os::replace_existing_mapping_with_file_mapping` - which is actually just a wrapper with an assert - and those could be melded into one. But as I said, I'm fine doing this in a separate cleanup. >> src/hotspot/os/posix/os_posix.cpp line 329: >> >>> 327: >>> 328: if (end_offset > 0) { >>> 329: os::release_memory(extra_base + begin_offset + size, end_offset); >> >> Not your patch, but the name "end_offset" is seriously confusing here... > > Probably yes. But if you don't mind, I would leave that as is. A nice picture above provides some clarification and I'd really like to avoid touching unrelated code Sure. >> src/hotspot/os/posix/os_posix.cpp line 335: >> >>> 333: } >>> 334: >>> 335: // Multiple threads can race in this code, and can remap over each other with MAP_FIXED, >> >> I think you could either completely remove the comment - its somewhat obvious - or remove it down to the call of chop_extra_memory. > > I would like to persist the comment (as the race is not evident for the reader). As for the placement, the comment describes an alternative implementation of this function and provides rationale why the current one is taken. chop_extra_memory is a step of the implementation, that has no alternative. I think it would be wrong to move the comment there. But I've added a comment for chop_extra_memory for clarity. Okay. 
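The hint-versus-`MAP_FIXED` distinction discussed in this thread can be sketched with plain POSIX calls. This is a minimal illustration under stated assumptions, not the HotSpot implementation: `attempt_map_at_sketch` is a hypothetical name, and it maps anonymous memory rather than a file. Without `MAP_FIXED`, the kernel treats the address only as a hint and will not replace an existing mapping, so the helper reports failure whenever it did not get exactly the requested address.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a HotSpot function): try to map `bytes` at
// `requested_addr`, treating the address as a hint only.
// Returns `requested_addr` on success, nullptr otherwise.
static char* attempt_map_at_sketch(char* requested_addr, std::size_t bytes) {
  // No MAP_FIXED: an existing mapping at `requested_addr` is left untouched
  // and the kernel picks some other address instead.
  void* p = mmap(requested_addr, bytes, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) return nullptr;

  if (p != requested_addr) {
    // The hint was not honoured; give the mapping back and report failure.
    munmap(p, bytes);
    return nullptr;
  }
  return requested_addr;
}
```

With `MAP_FIXED` the same call would instead silently replace whatever was mapped at that address, which is why a strict, fixed mapping is only safe over a range the caller already owns.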
------------- PR: https://git.openjdk.java.net/jdk/pull/812 From mcimadamore at openjdk.java.net Tue Oct 27 14:43:33 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Tue, 27 Oct 2020 14:43:33 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v15] In-Reply-To: References: Message-ID: > This patch contains the changes associated with the third incubation round of the foreign memory access API incubation (see JEP 393 [1]). This iteration focus on improving the usability of the API in 3 main ways: > > * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from multiple threads > * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee that the memory will be deallocated, eventually > * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class has been added, which defines several useful dereference routines; these are really just thin wrappers around memory access var handles, but they make the barrier of entry for using this API somewhat lower. > > A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit of dereference. > > This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. 
This improves usability in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done by calling `MemoryAddress::asSegmentRestricted`). > > A list of the API, implementation and test changes is provided below. If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be happy to point at existing discussions, and/or to provide the feedback required. > > A big thank to Erik Osterlund, Vladimir Ivanov and David Holmes, without whom the work on shared memory segment would not have been possible; also I'd like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. > > Thanks > Maurizio > > Javadoc: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html > > Specdiff: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254163 > > > > ### API Changes > > * `MemorySegment` > * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) > * added a no-arg factory for a native restricted segment representing entire native heap > * rename `withOwnerThread` to `handoff` > * add new `share` method, to create shared segments > * add new `registerCleaner` method, to register a segment against a cleaner > * add more helpers to create arrays from a segment e.g. 
`toIntArray` > * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) > * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) > * `MemoryAddress` > * drop `segment` accessor > * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative to a given segment > * `MemoryAccess` > * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a carrier is one of the usual suspect (a Java primitive, minus `boolean`); the access mode can be simple (e.g. access base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte offset,or a high level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. `getByteAtOffset` vs `getByteAtIndex`). > * `MemoryHandles` > * drop `withOffset` combinator > * drop `withStride` combinator > * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which it is easy to derive all the other handles using plain var handle combinators. > * `Addressable` > * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. > * `MemoryLayouts` > * A new layout, for machine addresses, has been added to the mix. > > > > ### Implementation changes > > There are two main things to discuss here: support for shared segments, and the general simplification of the memory access var handle support. > > #### Shared segments > > The support for shared segments cuts in pretty deep in the VM. 
Support for shared segments is notoriously hard to achieve, at least in a way that guarantees optimal access performance. This is caused by the fact that, if a segment is shared, it would be possible for a thread to close it while another is accessing it. > > After considering several options (see [3]), we zeroed in on an approach which is inspired by a happy idea that Andrew Haley had (and that he reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world (e.g. with a GC pause) while a segment is closed, we could then prevent segments from being accessed concurrently with a close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and the access itself (otherwise it would be possible for the accessing thread to stop right before an unsafe call). It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. > > Sadly, none of these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). > > The question is, then, once we detect that a thread is accessing the very segment we're about to close, what should happen? We first experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the machinery for async exceptions is a bit fragile (e.g.
not all the code in hotspot checks for async exceptions); to minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread is accessing the segment being closed. > > As written in the javadoc, this doesn't mean that clients should just catch and try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should be treated as such. > > In terms of gritty implementation, we needed to centralize memory access routines in a single place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` annotation, which tells the VM that something important is going on. > > To achieve this, we created a new (autogenerated) class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during access (which is important when registering segments against cleaners). > > Of course, to make memory access safe, memory access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead of unsafe, so that a liveness check can be triggered (in case a scope is present). > > `ScopedMemoryAccess` has a `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed successfully. 
> > The implementation of `MemoryScope` (now significantly simplified from what we had before), has two implementations, one for confined segments and one for shared segments; the main difference between the two is what happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. > > #### Memory access var handles overhaul > > The key realization here was that if all memory access var handles took a coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle form. > > This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that e.g. additional offset is injected into a base memory access var handle. > > This also helped in simplifying the implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level access on the innards of the memory access var handle. All that code is now gone. > > #### Test changes > > Not much to see here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared segment case. 
> > [1] - https://openjdk.java.net/jeps/393 > [2] - https://openjdk.java.net/jeps/389 > [3] - https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html > [4] - https://openjdk.java.net/jeps/312 Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits: - Remove TestMismatch from 32-bit problem list - Merge branch 'master' into 8254162 - Tweak javadoc for MemorySegment::mapFromPath Tweak alignment for long/double Java layouts on 32 bits platforms - Merge branch 'master' into 8254162 - Address review comment for scoped memory access makefile - Address CSR comments - Back-port of TestByteBuffer fix - Merge branch 'master' into 8254162 - Merge branch 'master' into 8254162 - Merge branch 'master' into 8254162 - ... and 11 more: https://git.openjdk.java.net/jdk/compare/7d41a541...f844f544 ------------- Changes: https://git.openjdk.java.net/jdk/pull/548/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=14 Stats: 7526 lines in 80 files changed: 4814 ins; 1531 del; 1181 mod Patch: https://git.openjdk.java.net/jdk/pull/548.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548 PR: https://git.openjdk.java.net/jdk/pull/548 From mcimadamore at openjdk.java.net Tue Oct 27 14:43:33 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Tue, 27 Oct 2020 14:43:33 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) In-Reply-To: References: Message-ID: On Wed, 7 Oct 2020 17:13:22 GMT, Maurizio Cimadamore wrote: > @mcimadamore, if you pull from current master, you would get the Linux x86_32 tier1 run "for free".
Just did that - I also removed TestMismatch from the problem list in the latest iteration, and fixed the alignment for long/double layouts, after chatting with the team (https://bugs.openjdk.java.net/browse/JDK-8255350) ------------- PR: https://git.openjdk.java.net/jdk/pull/548 From akozlov at openjdk.java.net Tue Oct 27 14:50:23 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 27 Oct 2020 14:50:23 GMT Subject: RFR: 8255254: Split os::reserve_memory and os::map_memory_to_file interfaces [v3] In-Reply-To: References: Message-ID: On Tue, 27 Oct 2020 14:41:00 GMT, Thomas Stuefe wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix reviewing findings 2 > > LGTM. > > Thanks, Thomas Thanks for reviews! ------------- PR: https://git.openjdk.java.net/jdk/pull/812 From rkennke at openjdk.java.net Tue Oct 27 14:59:24 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 27 Oct 2020 14:59:24 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: On Tue, 27 Oct 2020 10:54:25 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename native argument to maybe_narrow_oop for more clarity > > src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.hpp line 104: > >> 102: >> 103: template >> 104: inline oop load_reference_barrier_native(oop obj, T* load_addr); > > These might be forked to a separate cleanup? Not insisting... That would make backports a bit cleaner, though. I am changing back the template arg of load_reference_barrier_mutator() instead, we are using class everywhere else. 
> src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp line 81: > >> 79: _heap(ShenandoahHeap::heap()), >> 80: _mark_context(_heap->marking_context()), >> 81: _strong(true) > > Do we want to turn this to yet another template parameter, like for dedup? That would also resolve passing `true` or `false` to `strong` argument without comments. We need to switch strength in ShenandoahConcurrentMark::do_task() and we get passed-in a ready closure there. I am not sure how we could do that with template-args. Template args only make sense for things that don't change during marking. > src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.inline.hpp line 264: > >> 262: marked = mark_context->mark_strong(obj, marked_first); >> 263: } else { >> 264: marked = mark_context->mark_final(obj, marked_first); > > Is this `mark_final` actually `mark_weak`? We could name it so, but it really means 'reachable through a FinalReference' so 'finalizably reachable' and 'marked final(izable)' seems the more correct term. It is weaker than 'strong' though, so yeah we could rename this. WDYT? > src/hotspot/share/gc/shenandoah/shenandoahMarkBitMap.inline.hpp line 64: > >> 62: } >> 63: >> 64: inline bool ShenandoahMarkBitMap::mark_final(HeapWord* heap_addr, bool& marked_first) { > > It looks to me that `marked_first` is always the same as the return value? If so, can we drop that out-argument? Yes, that is true for mark_final(), but not for mark_strong(). I am changing it as you suggested. 
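The difference between the two marking calls can be illustrated with a small sketch. This is not the Shenandoah code: `TwoBitMarks`, its bit constants and its single-threaded updates are hypothetical stand-ins, and the assumed meaning of `marked_first` ("the object had no mark of any kind before this call") is a guess based on the exchange above. Under that assumption, `mark_final()` can only succeed on an unmarked object, so its return value and `marked_first` always agree, whereas `mark_strong()` can succeed by upgrading a final mark, in which case it returns `true` while `marked_first` is `false`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bit assignments, for illustration only.
constexpr std::uint8_t kStrongBit = 1;
constexpr std::uint8_t kFinalBit = 2;

// Hypothetical, non-atomic stand-in for a mark bitmap with two bits per object.
// A real concurrent collector would update the bits with compare-and-swap.
class TwoBitMarks {
 public:
  explicit TwoBitMarks(std::size_t objects) : bits_(objects, 0) {}

  // Returns true if this call set the strong bit.
  // marked_first is true only if the object carried no mark at all before.
  bool mark_strong(std::size_t idx, bool& marked_first) {
    const std::uint8_t old_bits = bits_[idx];
    if ((old_bits & kStrongBit) != 0) {
      marked_first = false;
      return false;  // already strongly marked
    }
    bits_[idx] = static_cast<std::uint8_t>(old_bits | kStrongBit);
    marked_first = (old_bits == 0);  // false when upgrading a final mark
    return true;
  }

  // Returns true if this call set the final bit.
  // Succeeds only on an unmarked object, so the result equals marked_first.
  bool mark_final(std::size_t idx, bool& marked_first) {
    if (bits_[idx] != 0) {
      marked_first = false;
      return false;  // already marked (strong or final)
    }
    bits_[idx] = kFinalBit;
    marked_first = true;
    return true;
  }

 private:
  std::vector<std::uint8_t> bits_;
};
```

Marking an object final and then strong shows the asymmetry: the second call reports success, but not a first marking.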
------------- PR: https://git.openjdk.java.net/jdk/pull/505 From shade at openjdk.java.net Tue Oct 27 14:59:24 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 27 Oct 2020 14:59:24 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: On Tue, 27 Oct 2020 14:54:41 GMT, Roman Kennke wrote: >> src/hotspot/share/gc/shenandoah/shenandoahMarkBitMap.inline.hpp line 64: >> >>> 62: } >>> 63: >>> 64: inline bool ShenandoahMarkBitMap::mark_final(HeapWord* heap_addr, bool& marked_first) { >> >> It looks to me that `marked_first` is always the same as the return value? If so, can we drop that out-argument? > > Yes, that is true for mark_final(), but not for mark_strong(). I am changing it as you suggested. No wait, maybe that's fine then. Let me think about it. ------------- PR: https://git.openjdk.java.net/jdk/pull/505 From shade at openjdk.java.net Tue Oct 27 15:21:24 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 27 Oct 2020 15:21:24 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: On Tue, 27 Oct 2020 10:51:49 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename native argument to maybe_narrow_oop for more clarity > > src/hotspot/share/gc/shenandoah/c2/shenandoahSupport.cpp line 996: > >> 994: break; >> 995: default: >> 996: ShouldNotReachHere(); > > I expect some compilers to complain here about the uninitialized `name`, please add `name = NULL;` before the `ShouldNotReachHere()`? Ditto for `calladdr`, now that I am looking at this block again. Maybe just initialize them at declaration.
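A minimal sketch of the "initialize at declaration" suggestion follows. Everything in it is hypothetical and only mirrors the shape of the reviewed block: `AccessKind`, the three stub functions, `CallTarget` and `should_not_reach_here()` are illustrative names, not HotSpot identifiers. The point is that when `name` and `calladdr` receive a value where they are declared, a compiler that cannot prove the `default:` branch never returns has nothing left to warn about.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-ins for the runtime entry points chosen by the switch.
using BarrierStub = int (*)(int);
inline int strong_stub(int v) { return v + 1; }
inline int weak_stub(int v) { return v + 2; }
inline int phantom_stub(int v) { return v + 3; }

enum class AccessKind { Strong, Weak, Phantom };

struct CallTarget {
  const char* name;
  BarrierStub calladdr;
};

// Stand-in for an "unreachable" macro that the compiler may not know is fatal.
inline void should_not_reach_here() { std::abort(); }

inline CallTarget select_call_target(AccessKind kind) {
  // Initialized at declaration, so every path out of the switch,
  // including the default branch, leaves both variables defined.
  const char* name = nullptr;
  BarrierStub calladdr = nullptr;
  switch (kind) {
    case AccessKind::Strong:
      name = "strong_stub";
      calladdr = &strong_stub;
      break;
    case AccessKind::Weak:
      name = "weak_stub";
      calladdr = &weak_stub;
      break;
    case AccessKind::Phantom:
      name = "phantom_stub";
      calladdr = &phantom_stub;
      break;
    default:
      should_not_reach_here();
  }
  return CallTarget{name, calladdr};
}
```

Compared with assigning `NULL` inside the `default:` branch, this keeps the defensive value in one place and does not need to be repeated if more variables are added to the switch later.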
------------- PR: https://git.openjdk.java.net/jdk/pull/505 From mcimadamore at openjdk.java.net Tue Oct 27 15:53:31 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Tue, 27 Oct 2020 15:53:31 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v16] In-Reply-To: References: Message-ID: Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: * Add final to MappedByteBuffer::SCOPED_MEMORY_ACCESS field * Tweak TestLayouts to make it 32-bit friendly after recent MemoryLayouts tweaks ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/548/files - new: https://git.openjdk.java.net/jdk/pull/548/files/f844f544..e43f5d76 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=15 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=14-15 Stats: 8 lines in 2 files changed: 0 ins; 3 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/548.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548 PR: https://git.openjdk.java.net/jdk/pull/548 From gziemski at openjdk.java.net Tue Oct 27 16:08:21 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Tue, 27 Oct 2020 16:08:21 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) In-Reply-To: References: Message-ID: <4_dt8K2LhJ6WqhVpPYiT_wbMIi-Qfqf6lt5GCc5Tq6M=.d74d54bf-88dd-4c9a-9ca6-c1e1b86916b6@github.com> On Mon, 26 Oct 2020 15:32:49 GMT, Gerard Ziemski wrote: >> Hi Gerard, >> >> I think we have a fundamental problem here that UseOSErrorReporting was only ever intended for use on Windows. It simply allows VMError::report_and_die to return instead of actually making the VM "die". For Windows this means we can continue to propagate the windows exception and thus allow Windows Error Reporting (WER) to take over. Whether this actually works correctly or not is a different matter. >> >> For non-Windows there is no pre-established alternative code path for report_and_die() returning.
>> >> In the bug report you write: >> >>> On Mac/Linux it would look more like this: >>> >>> #1 catch signal in our handler >>> #2 generate hs_err log >>> #3 turn off our signal handler >>> #4 continue the process normally, allowing it to crash again in the same spot, with the same signal being generated >>> >> >> To me you are now inventing what UseOSErrorReporting should mean on non-Windows, and I don't agree with it. I don't think it should mean that we re-crash using the "default" signal response and consider that as using "OS error reporting". To me that is just not valid, especially when we cannot return from a signal handling context in many cases without incurring undefined behaviour. To me #4 is not a valid expectation as we have no way to know what will happen next if the signal handler returns. It would also be wrong to just continue execution after an assertion or guarantee fails. >> >> I'm assuming that the motivation here is that on macOS if we use the default signal handling modes then macOS will do its own error reporting? If so I would suggest that the right response may be to return from report_and_die (on macOS only) and then deliberately crash after restoring the default handler. Obviously that will change which "crash" the OS reports but that is likely to happen anyway as you cannot guarantee how you will crash after trying to continue (and this goes beyond our general "best effort" approaches in signal handling.) >> >> Beyond that I share Thomas's concerns about making sweeping changes to installed signal handlers. >> >> So my preferred approaches here would be: >> >> 1. Make UseOSErrorReporting Windows only; or >> 2. Make UseOSErrorReporting Windows and macOS only. Then on macOS do a targeted crash after report_and_die() returns. >> >> Thanks, >> David > >> Hi Gerard, >> >> I think we have a fundamental problem here that UseOSErrorReporting was only ever intended for use on Windows. 
It simply allows VMError::report_and_die to return instead of actually making the VM "die". For Windows this means we can continue to propagate the windows exception and thus allow Windows Error Reporting (WER) to take over. Whether this actually works correctly or not is a different matter. >> >> For non-Windows there is no pre-established alternative code path for report_and_die() returning. >> >> In the bug report you write: >> >> > On Mac/Linux it would look more like this: >> > #1 catch signal in our handler >> > #2 generate hs_err log >> > #3 turn off our signal handler >> > #4 continue the process normally, allowing it to crash again in the same spot, with the same signal being generated >> >> To me you are now inventing what UseOSErrorReporting should mean on non-Windows, and I don't agree with it. I don't think it should mean that we re-crash using the "default" signal response and consider that as using "OS error reporting". To me that is just not valid, especially when we cannot return from a signal handling context in many cases without incurring undefined behaviour. To me #4 is not a valid expectation as we have no way to know what will happen next if the signal handler returns. It would also be wrong to just continue execution after an assertion or guarantee fails. >> >> I'm assuming that the motivation here is that on macOS if we use the default signal handling modes then macOS will do its own error reporting? If so I would suggest that the right response may be to return from report_and_die (on macOS only) and then deliberately crash after restoring the default handler. Obviously that will change which "crash" the OS reports but that is likely to happen anyway as you cannot guarantee how you will crash after trying to continue (and this goes beyond our general "best effort" approaches in signal handling.) >> >> Beyond that I share Thomas's concerns about making sweeping changes to installed signal handlers. 
>> >> So my preferred approaches here would be: >> >> 1. Make UseOSErrorReporting Windows only; or >> 2. Make UseOSErrorReporting Windows and macOS only. Then on macOS do a targeted crash after report_and_die() returns. > > hi David, > > Many thanks for the review and finding the background info on the history of this issue. > > How we do things when a user turns ON the "UseOSErrorReporting" flag is just an implementation detail. > > On Windows we forward the crash to the OS to handle it, but just because in this fix we "just" turn off our signal handlers, reset them to SIG_DFL and return to let it crash again, does not mean it's not a meaningful way to forward it to OS, if that's how the OS wants it - please see this technical note from Apple https://developer.apple.com/forums/thread/113742 where Apple suggest the way to let the macOS handle the crash is to: > > "unregister your signal handler (set it to SIG_DFL) and then return. This will cause the crashed process to continue execution, crash again, and generate a crash report via the Apple crash reporter." > > That's how Apple suggest we do it for Mac. > > I can limit the scope of this fix to just macOS here, like I was planning it for JDK-8237727, and for Linux simply disable the flag for now and leave any more sophisticated fix for a next issue. I do think, however, that on Linux anything better than 2 min hang would be better. > _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ > > On 27/10/2020 1:35 am, Gerard Ziemski wrote: > > > On Mon, 26 Oct 2020 04:33:03 GMT, David Holmes wrote: > > > Hi Gerard, > > > I think we have a fundamental problem here that UseOSErrorReporting was only ever intended for use on Windows. It simply allows VMError::report_and_die to return instead of actually making the VM "die". 
For Windows this means we can continue to propagate the windows exception and thus allow Windows Error Reporting (WER) to take over. Whether this actually works correctly or not is a different matter. > > > For non-Windows there is no pre-established alternative code path for report_and_die() returning. > > > In the bug report you write: > > > > On Mac/Linux it would look more like this: > > > > #1 catch signal in our handler > > > > #2 generate hs_err log > > > > #3 turn off our signal handler > > > > #4 continue the process normally, allowing it to crash again in the same spot, with the same signal being generated > > > > > > > > > To me you are now inventing what UseOSErrorReporting should mean on non-Windows, and I don't agree with it. I don't think it should mean that we re-crash using the "default" signal response and consider that as using "OS error reporting". To me that is just not valid, especially when we cannot return from a signal handling context in many cases without incurring undefined behaviour. To me #4 is not a valid expectation as we have no way to know what will happen next if the signal handler returns. It would also be wrong to just continue execution after an assertion or guarantee fails. > > > I'm assuming that the motivation here is that on macOS if we use the default signal handling modes then macOS will do its own error reporting? If so I would suggest that the right response may be to return from report_and_die (on macOS only) and then deliberately crash after restoring the default handler. Obviously that will change which "crash" the OS reports but that is likely to happen anyway as you cannot guarantee how you will crash after trying to continue (and this goes beyond our general "best effort" approaches in signal handling.) > > > Beyond that I share Thomas's concerns about making sweeping changes to installed signal handlers. > > > So my preferred approaches here would be: > > > 1. Make UseOSErrorReporting Windows only; or > > > 2. 
Make UseOSErrorReporting Windows and macOS only. Then on macOS do a targeted crash after report_and_die() returns. > > > > > > hi David, > > Many thanks for the review and finding the background info on the history of this issue. > > How we do things when a user turns ON the "UseOSErrorReporting" flag is just an implementation detail. > > No, there is a semantic underpinning as to what it means for there to be > OS error reporting on a given platform. Windows has a nicely defined > model. Other platforms not so nice. On macOS they really don't want apps > to attempt any kind of crash handling on their own. :) > > > On Windows we forward the crash to the OS to handle it, but just because in this fix we "just" turn off our signal handlers, reset them to SIG_DFL and return to let it crash again, does not mean it's not a meaningful way to forward it to OS, if that's how the OS wants it - please see this technical note from Apple https://developer.apple.com/forums/thread/113742 where Apple suggest the way to let the macOS handle the crash is to: > > "unregister your signal handler (set it to SIG_DFL) and then return. This will cause the crashed process to continue execution, crash again, and generate a crash report via the Apple crash reporter." > > That's how Apple suggest we do it for Mac. > > That is a blog by an Apple developer giving some very general advice, > and IMO lacking in some necessary detail. That quote above is in the > context of answering: > > "Finally, there's the question of how to exit from your signal handler." > > The suggestion to "then return" hits UB for the synchronous error > signals - a fact not mentioned in the blog entry. The assertion that: > > "This will cause the crashed process to continue execution, crash again, > ... " > > is a naive oversimplification. If you just seg-faulted doing a read from > memory how can you continue execution?
My understanding is that we would not continue execution past the seg-faulted instruction, but instead resume at the seg-faulting instruction (with the same memory/register contents, unless our signal handler modified any of that), which would cause the same signal to be raised at the exact same frame, resulting in the exact same behavior. That's what my experimentation shows and what I understood Apple's recommendation to be based on. > What does that mean when the read > yielded no value? Will you just continue with a random value? Will the > system try to re-execute the read and so crash again? Maybe it will > crash again, maybe it won't. Maybe it will do something in the meantime > that leads to totally unexpected behaviour (as Thomas previously > described). Hence my suggestion that if you are going to attempt this > path for macOS then you need to introduce the second crash so we know > exactly what will happen. But that will show up as a different crash and might be confusing. > Returning from the original signal handler is > not an option IMO. I think our differences of opinion all hinge on what happens when code returns from its signal handler: #1 Does it resume and actually redo the exact same instruction? (which this time may succeed?) #2 Does it resume and raise the exact same signal? (exhibits the exact same behavior as original?) You and Thomas seem to believe that it's #1, I thought (based on https://developer.apple.com/forums/thread/113742 ) that it was more like #2. I will continue this investigation in JDK-8237727. Here I will not be as ambitious and I will simply fix the problem at hand: i.e. address the 2-minute hang by disabling the option for macOS and Linux.
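For comparison, the alternative raised earlier in this thread (restore the default handler and then crash deliberately, rather than return from the handler) might be sketched as follows. This is only an illustration under stated assumptions, not HotSpot code: it assumes a single-threaded POSIX process, simulates the initial fault with `raise()`, and the names `report_then_recrash` and `fatal_signal_of_crashing_child` are hypothetical.

```c
#define _DEFAULT_SOURCE            /* expose POSIX declarations on glibc */
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a crash handler: report, then hand the
 * signal back to the OS default action without returning. */
static void report_then_recrash(int sig) {
    static const char msg[] = "[sketch] error report would be written here\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;

    signal(sig, SIG_DFL);                  /* restore the default disposition */

    sigset_t set;                          /* sig is blocked while we run,    */
    sigemptyset(&set);                     /* so unblock it before re-raising */
    sigaddset(&set, sig);
    sigprocmask(SIG_UNBLOCK, &set, NULL);

    raise(sig);                            /* the deliberate second crash     */
    _exit(70);                             /* fallback, not expected to run   */
}

/* Runs the scenario in a child process and returns the signal that
 * terminated it, or -1 if it did not die from a signal. */
static int fatal_signal_of_crashing_child(void) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = report_then_recrash;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGSEGV, &sa, NULL);
        raise(SIGSEGV);                    /* simulated initial fault */
        _exit(71);                         /* not expected to run */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

In this sketch the parent observes the child terminated by the default action for the same signal, so any OS-level reporting would see the second, deliberate crash rather than the original one. That is the trade-off mentioned above: the outcome is predictable, but the reported crash site differs from the original.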
------------- PR: https://git.openjdk.java.net/jdk/pull/813 From akozlov at openjdk.java.net Tue Oct 27 16:58:36 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 27 Oct 2020 16:58:36 GMT Subject: RFR: 8255254: Split os::reserve_memory and os::map_memory_to_file interfaces [v4] In-Reply-To: References: Message-ID: > Hi, > > Please review a change to extract the map_memory_to_file interface out of reserve_memory when the latter takes a file descriptor. > > The change should be a pure refactoring without changes in functionality. Only one part is disturbing: a comment in the original os_posix.cpp:316 seems to refer to the else clause, and it contradicts the actual code. Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - Merge remote-tracking branch 'upstream/master' into 8255254-split-reserve-memory - Fix reviewing findings 2 - Fix review findings - Split reserve and map interfaces ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/812/files - new: https://git.openjdk.java.net/jdk/pull/812/files/8908acb7..6335d8ce Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=812&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=812&range=02-03 Stats: 27860 lines in 409 files changed: 23579 ins; 2888 del; 1393 mod Patch: https://git.openjdk.java.net/jdk/pull/812.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/812/head:pull/812 PR: https://git.openjdk.java.net/jdk/pull/812 From gziemski at openjdk.java.net Tue Oct 27 17:06:32 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Tue, 27 Oct 2020 17:06:32 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) [v2] In-Reply-To: References: Message-ID: > hi all, > > Please review this simple fix for POSIX platforms, which addresses a time out
that occurs while handling a crash with UseOSErrorReporting turned ON. > > It appears that the "UseOSErrorReporting" flag was only ever meant to be used on the Windows platform and was mistakenly left available for other platforms. In this fix we make sure to only use the flag on the Windows platform and make it a NOP for other platforms. > > Note #1: A similar hang issue occurs today even on Windows, with the only difference being that before a process times out (takes 2 minutes) it runs out of stack space in about 250 loops, so that's the only reason it doesn't linger for that long. The Windows issue is tracked separately by https://bugs.openjdk.java.net/browse/JDK-8250782 > > Note #2: Creating a native crash log (on macOS) is a non-trivial effort requiring research, which is tracked by https://bugs.openjdk.java.net/browse/JDK-8237727 > > Note #3: Removal of the "UseOSErrorReporting" flag will depend on whether we can do #2; at that time we can decide whether to keep it and implement it for other platforms, or to remove it if #2 cannot be done reliably. Gerard Ziemski has updated the pull request incrementally with two additional commits since the last revision: - Only use UseOsErrorReporting on Windows - Revert "reset signal handlers to their system defaults if handling crash with UseOSErrorReporting" This reverts commit f6340643974f3e0cc3ab95fbbba51b23b8d9af31.
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/813/files - new: https://git.openjdk.java.net/jdk/pull/813/files/f6340643..74d6c9a6 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=813&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=813&range=00-01 Stats: 44 lines in 6 files changed: 5 ins; 29 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/813.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/813/head:pull/813 PR: https://git.openjdk.java.net/jdk/pull/813 From burban at openjdk.java.net Tue Oct 27 18:27:21 2020 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Tue, 27 Oct 2020 18:27:21 GMT Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build [v4] In-Reply-To: References: Message-ID: On Tue, 27 Oct 2020 14:04:04 GMT, Andrew Haley wrote: >> Bernhard Urban-Forster has updated the pull request incrementally with two additional commits since the last revision: >> >> - uppercase suffix >> - add assert > > Marked as reviewed by aph (Reviewer). Thank you for the reviews, Magnus and Andrew! ------------- PR: https://git.openjdk.java.net/jdk/pull/530 From shade at openjdk.java.net Tue Oct 27 18:51:22 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 27 Oct 2020 18:51:22 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: On Tue, 27 Oct 2020 11:50:10 GMT, Roman Kennke wrote: >> src/hotspot/share/gc/shenandoah/shenandoahThreadLocalData.hpp line 55: >> >>> 53: int _disarmed_value; >>> 54: double _paced_time; >>> 55: ShenandoahMarkRefsSuperClosure* _mark_closure; >> >> This rubs me the wrong way. Closures are usually stack-allocated, so we are exposing the stack pointer here. > > Yeah we need to pass it between the mark-loop and the reference-processor. It's still thread-local. 
While I am browsing the code here... why not record it in `ShenandoahRefProcThreadLocal` then? It has the benefit of not polluting the "common" `ShenandoahThreadLocalData`, and clearly says the whole thing is about the Shenandoah reference processor. ------------- PR: https://git.openjdk.java.net/jdk/pull/505 From mcimadamore at openjdk.java.net Tue Oct 27 19:13:36 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Tue, 27 Oct 2020 19:13:36 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v17] In-Reply-To: References: Message-ID: > This patch contains the changes associated with the third incubation round of the foreign memory access API incubation (see JEP 393 [1]). This iteration focuses on improving the usability of the API in 3 main ways: > > * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from multiple threads > * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee that the memory will be deallocated, eventually > * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class has been added, which defines several useful dereference routines; these are really just thin wrappers around memory access var handles, but they make the barrier to entry for using this API somewhat lower. > > A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit of dereference.
> > This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done by calling `MemoryAddress::asSegmentRestricted`). > > A list of the API, implementation and test changes is provided below. If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be happy to point at existing discussions, and/or to provide the feedback required. > > A big thank you to Erik Osterlund, Vladimir Ivanov and David Holmes, without whom the work on shared memory segments would not have been possible; also I'd like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey.
> > Thanks > Maurizio > > Javadoc: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html > > Specdiff: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254163 > > > > ### API Changes > > * `MemorySegment` > * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) > * added a no-arg factory for a native restricted segment representing entire native heap > * rename `withOwnerThread` to `handoff` > * add new `share` method, to create shared segments > * add new `registerCleaner` method, to register a segment against a cleaner > * add more helpers to create arrays from a segment e.g. `toIntArray` > * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) > * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) > * `MemoryAddress` > * drop `segment` accessor > * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative to a given segment > * `MemoryAccess` > * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a carrier is one of the usual suspects (a Java primitive, minus `boolean`); the access mode can be simple (e.g. access base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte offset, or a high-level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. `getByteAtOffset` vs `getByteAtIndex`).
> * `MemoryHandles` > * drop `withOffset` combinator > * drop `withStride` combinator > * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which it is easy to derive all the other handles using plain var handle combinators. > * `Addressable` > * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2], to add more implementations. Clients can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. > * `MemoryLayouts` > * A new layout, for machine addresses, has been added to the mix. > > > > ### Implementation changes > > There are two main things to discuss here: support for shared segments, and the general simplification of the memory access var handle support. > > #### Shared segments > > The support for shared segments cuts in pretty deep in the VM. Support for shared segments is notoriously hard to achieve, at least in a way that guarantees optimal access performance. This is caused by the fact that, if a segment is shared, it would be possible for a thread to close it while another is accessing it. > > After considering several options (see [3]), we zeroed in on an approach inspired by a happy idea that Andrew Haley had (and that he reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently to a close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and the access itself (otherwise it would be possible for the accessing thread to stop just right before an unsafe call).
It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. > > Sadly, none of these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). > > The question is, then, once we detect that a thread is accessing the very segment we're about to close, what should happen? We first experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread is accessing the segment being closed. > > As written in the javadoc, this doesn't mean that clients should just catch and try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should be treated as such. > > In terms of gritty implementation, we needed to centralize memory access routines in a single place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` annotation, which tells the VM that something important is going on. > > To achieve this, we created a new (autogenerated) class, called `ScopedMemoryAccess`.
This class contains all the main memory access primitives (including bulk access, like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during access (which is important when registering segments against cleaners). > > Of course, to make memory access safe, memory access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead of unsafe, so that a liveness check can be triggered (in case a scope is present). > > `ScopedMemoryAccess` has a `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed successfully. > > `MemoryScope` (now significantly simplified from what we had before) has two implementations, one for confined segments and one for shared segments; the main difference between the two is what happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. > > #### Memory access var handles overhaul > > The key realization here was that if all memory access var handles took a coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle form. > > This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that e.g.
additional offset is injected into a base memory access var handle. > > This also helped in simplifying the implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level access on the innards of the memory access var handle. All that code is now gone. > > #### Test changes > > Not much to see here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared segment case. > > [1] - https://openjdk.java.net/jeps/393 > [2] - https://openjdk.java.net/jeps/389 > [3] - https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html > [4] - https://openjdk.java.net/jeps/312 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: More 32-bit fixes for TestLayouts ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/548/files - new: https://git.openjdk.java.net/jdk/pull/548/files/e43f5d76..b01af093 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=16 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=15-16 Stats: 10 lines in 1 file changed: 2 ins; 4 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/548.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548 PR: https://git.openjdk.java.net/jdk/pull/548 From shade at openjdk.java.net Tue Oct 27 19:25:19 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 27 Oct 2020 19:25:19 GMT Subject: RFR: 8255397: x86: coalesce reference and int entry points into vtos bytecodes In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 14:53:08 
GMT, Claes Redestad wrote: > On x86 - both 32- and 64-bit - the code laid out for transitioning into a vtos bytecode when having a reference and int top-of-stack state is semantically identical, and can be coalesced. > > This patch removes a short jump in some cases which is marginally beneficial when interpreting, while measurably reducing overhead of generating the interpreter itself. It rubs me the wrong way that we are effectively changing `push_ptr` to `push_i` for `aep`. While it is implemented in the same manner in `interp_masm_x86.cpp` -- delegating to `push`, it still means if `push_i` implementation changes, `aep` would do the `push_i` _as if_ it is integer, not pointer. Ditto a change in `push_ptr` (adding verification, maybe?) would miss this code. So, how much of an improvement are we talking about, to sacrifice this? ------------- PR: https://git.openjdk.java.net/jdk/pull/865 From coleenp at openjdk.java.net Tue Oct 27 19:49:23 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 27 Oct 2020 19:49:23 GMT Subject: RFR: 8255397: x86: coalesce reference and int entry points into vtos bytecodes In-Reply-To: References: Message-ID: <1nt4qu8S4j-VPeDuWjIQpERAA1c24GaoOY7kF9X78AU=.2747b674-3237-41fe-b7c0-e1de1ff15426@github.com> On Tue, 27 Oct 2020 19:22:51 GMT, Aleksey Shipilev wrote: >> On x86 - both 32- and 64-bit - the code laid out for transitioning into a vtos bytecode when having a reference and int top-of-stack state is semantically identical, and can be coalesced. >> >> This patch removes a short jump in some cases which is marginally beneficial when interpreting, while measurably reducing overhead of generating the interpreter itself. > > It rubs me the wrong way that we are effectively changing `push_ptr` to `push_i` for `aep`.
While it is implemented in the same manner in `interp_masm_x86.cpp` -- delegating to `push`, it still means if `push_i` implementation changes, `aep` would do the `push_i` _as if_ it is integer, not pointer. Ditto a change in `push_ptr` (adding verification, maybe?) would miss this code. > > So, how much of an improvement are we talking about, to sacrifice this? Yes, I had the same queasy feeling looking at this. It's just unclear at this callsite that push_i doesn't do the wrong thing for 64 bit. ------------- PR: https://git.openjdk.java.net/jdk/pull/865 From redestad at openjdk.java.net Tue Oct 27 19:49:23 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Tue, 27 Oct 2020 19:49:23 GMT Subject: RFR: 8255397: x86: coalesce reference and int entry points into vtos bytecodes In-Reply-To: References: Message-ID: On Tue, 27 Oct 2020 19:22:51 GMT, Aleksey Shipilev wrote: > It rubs me the wrong way that we are effectively changing `push_ptr` to `push_i` for `aep`. While it is implemented in the same manner in `interp_masm_x86.cpp` -- delegating to `push`, it still means if `push_i` implementation changes, `aep` would do the `push_i` _as if_ it is integer, not pointer. Ditto a change in `push_ptr` (adding verification, maybe?) would miss this code. Verification is done explicitly with `__ verify_oop(..)` and friends, so it seems unlikely we'll overload `push_ptr` any time soon (and they have been semantically identical for many years, even before the merging of 32- and 64-bit `interp_masm_x86...`). I acknowledge this adds some fragility here, but perhaps there are some assertions we can add to check that `push_ptr` and `push_i` stay semantically the same? > > So, how much of an improvement are we talking about, to sacrifice this?
A few hundred thousand instructions and branches on Hello World (seems unconditional jumps are logged as branches by `perf`?): Baseline: 103,795,433 instructions # 0.59 insn per cycle ( +- 0.07% ) 20,263,519 branches # 200.867 M/sec ( +- 0.08% ) 731,187 branch-misses # 3.61% of all branches ( +- 0.15% ) 0.067306367 seconds time elapsed ( +- 0.24% ) Patch: 103,466,523 instructions # 0.59 insn per cycle ( +- 0.07% ) 20,068,162 branches # 201.935 M/sec ( +- 0.08% ) 727,575 branch-misses # 3.63% of all branches ( +- 0.13% ) 0.066568115 seconds time elapsed ( +- 0.27% ) For Hello World maybe half of that comes from reduced overhead of generating, the rest from quickening quite a few bytecode transitions. There's a scaling component (seen a few million instruction gains on slightly larger apps), but it's nothing huge. ------------- PR: https://git.openjdk.java.net/jdk/pull/865 From akozlov at openjdk.java.net Tue Oct 27 20:40:19 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 27 Oct 2020 20:40:19 GMT Subject: Integrated: 8255254: Split os::reserve_memory and os::map_memory_to_file interfaces In-Reply-To: References: Message-ID: <0Ph3NtRDiGpD5P5iAAzq5NttVn7JLTI6PPRJ_pV-6FE=.e5baebd3-efa6-4108-b61f-e244eaba1058@github.com> On Thu, 22 Oct 2020 15:40:19 GMT, Anton Kozlov wrote: > Hi, > > Please review a change to extract the map_memory_to_file interface out of reserve_memory when the latter takes a file descriptor. > > The change should be a pure refactoring without changes in functionality. Only one part is disturbing: a comment in the original os_posix.cpp:316 seems to refer to the else clause, and it contradicts the actual code. This pull request has now been integrated.
Changeset: acd0e256 Author: Anton Kozlov Committer: Vladimir Kempik URL: https://git.openjdk.java.net/jdk/commit/acd0e256 Stats: 171 lines in 9 files changed: 88 ins; 55 del; 28 mod 8255254: Split os::reserve_memory and os::map_memory_to_file interfaces Reviewed-by: stefank, stuefe ------------- PR: https://git.openjdk.java.net/jdk/pull/812 From kbarrett at openjdk.java.net Tue Oct 27 21:59:20 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 27 Oct 2020 21:59:20 GMT Subject: RFR: 8255298: Remove SurvivorAlignmentInBytes functionality [v3] In-Reply-To: References: Message-ID: On Tue, 27 Oct 2020 09:42:31 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews to remove the SurvivorAlignmentInBytes functionality? It has not been in use for a long time if ever, and can be removed. Searching the web also indicates that apart from the usual lists of all options and CRs it is never mentioned. >> >> SurvivorAlignmentInBytes is an experimental option so no further process is required. >> >> Testing: tier1-5 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > kbarrett review Marked as reviewed by kbarrett (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/838 From smarks at openjdk.java.net Wed Oct 28 03:49:19 2020 From: smarks at openjdk.java.net (Stuart Marks) Date: Wed, 28 Oct 2020 03:49:19 GMT Subject: RFR: 8188055: (ref) Add Reference::refersTo predicate [v6] In-Reply-To: References: <9x0zaxknpYXGIvHun1CkLP0lEC8NQmPTnANxQKjhHF8=.907bdb15-2e2e-4f84-8fe4-ea4ed50534cd@github.com> <3JzF7OkemZ-Lxc4jZgdEh3qNDzW8wF7ITeq-s7_TOlo=.11e4e40b-b775-47cf-9862-735fbc61ffd3@github.com> <3kV3qhFRXBadf7Tol9n0Yomud_ndV_T_p7ShUfk4eVE=.d7151a63-0066-4020-b0ef-bae0d03dc133@github.com> Message-ID: On Sat, 24 Oct 2020 22:22:56 GMT, Peter Levart wrote: >> Reference instances should not be leaked and so I don't think it is very common that a caller of `Reference::get` does not know the referent's type. It also depends on the `refersTo` check against `null` vs an object. Any known use case would be helpful if any (some existing code that wants to call `refersTo` to compare a `Reference` of raw type with an object of unknown type). >> >> FWIW, when converting a few uses of `Reference::get` to `refersTo` in JDK, there is only one case (an `equals(Object o)` method) that needs the cast. >> >> http://cr.openjdk.java.net/~mchung/jdk15/webrevs/8188055/jdk-use-refersTo/index.html > > @mlchung I don't have many known use cases, but how about WeakHashMap.containsKey(Object key) for example? Currently `WeakHashMap.Entry<K,V> extends WeakReference<Object>` but it would be more type safe if it extended `WeakReference<K>`. In that case an `entry.refersTo(key)` would not compile... > What I'm trying to say is that even if `Reference` instances are not "leaked", you can get an untyped object reference from outside and you may want to identity-compare it with the Reference's referent. Some thoughts regarding the parameter type of refersTo. Summary: I think `refersTo(T)` is fine and that we don't want to change it to `refersTo(Object)`.
I don't think we have a migration issue similar to generifying collections, where there was a possibility of changing `contains(Object)` to `contains(E)`. If that had been done, it would have been a source compatibility issue, because changing the signature of the method potentially affects existing code that calls the method. That doesn't apply here because we're adding a new method. The question now falls to whether it's preferable to have more convenience with `refersTo(Object)` or more type-safety with `refersTo(T)`. With the generic collections issue, the migration issue probably drove the decision to keep `contains(Object)`, but this has resulted in a continual set of complaints about the lack of an error when code passes an instance of the "wrong" type. I think that kind of error is likely to occur with `refersTo`. Since we don't have a source compatibility issue here, we can choose the safer API and avoid this kind of problem entirely. The safer API does raise the possibility of having to add inconvenient unchecked casts and local variables in certain places, but I think Mandy's comment about the code already having a reference of the "right" type is correct. Her prototype webrev linked above shows that having to add unchecked casts is fairly infrequent. ------------- PR: https://git.openjdk.java.net/jdk/pull/498 From smarks at openjdk.java.net Wed Oct 28 03:52:24 2020 From: smarks at openjdk.java.net (Stuart Marks) Date: Wed, 28 Oct 2020 03:52:24 GMT Subject: RFR: 8188055: (ref) Add Reference::refersTo predicate [v6] In-Reply-To: <0dhF_xxcp1VoUowwdZenB2qWa9ILcZjTMe3lsaRrg7k=.3c633db8-f745-4353-ad34-a64fbc96d4e0@github.com> References: <0dhF_xxcp1VoUowwdZenB2qWa9ILcZjTMe3lsaRrg7k=.3c633db8-f745-4353-ad34-a64fbc96d4e0@github.com> Message-ID: On Wed, 21 Oct 2020 02:28:30 GMT, Kim Barrett wrote: >> Finally returning to this review that was started in April 2020. I've >> recast it as a github PR. 
I think the security concern raised by Gil >> has been adequately answered. >> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-April/029203.html >> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-July/030401.html >> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-August/030677.html >> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-September/030793.html >> >> Please review a new function: java.lang.ref.Reference.refersTo. >> >> This function is needed to test the referent of a Reference object without >> artificially extending the lifetime of the referent object, as may happen >> when calling Reference.get. Some garbage collectors require extending the >> lifetime of a weak referent when accessed, in order to maintain collector >> invariants. Lifetime extension may occur with any collector when the >> Reference is a SoftReference, as calling get indicates recent access. This >> new function also allows testing the referent of a PhantomReference, which >> can't be accessed by calling get. >> >> The new function uses native methods whose implementations are in the VM so >> they can use the Access API. It is the intent that these methods will be >> intrinsified by optimizing compilers like C2 or graal, but that hasn't been >> implemented yet. Bear that in mind before rushing off to change existing >> uses of Reference.get. >> >> There are two native methods involved, one in Reference and an override in >> PhantomReference, both package private in java.lang.ref. The reason for this >> split is to simplify the intrinsification. This is a change from the version >> from April 2020; that version had a single native method in Reference, >> implemented using the ON_UNKNOWN_OOP_REF Access reference strength category. >> However, adding support for that category in the compilers adds significant >> implementation effort and complexity. Splitting avoids that complexity. 
>> >> Testing: >> mach5 tier1 >> Locally (linux-x64) verified the new test passes with various garbage collectors. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > improve wording in refersTo javadoc Marked as reviewed by smarks (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/498 From stuefe at openjdk.java.net Wed Oct 28 07:02:17 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 28 Oct 2020 07:02:17 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) [v2] In-Reply-To: <4_dt8K2LhJ6WqhVpPYiT_wbMIi-Qfqf6lt5GCc5Tq6M=.d74d54bf-88dd-4c9a-9ca6-c1e1b86916b6@github.com> References: <4_dt8K2LhJ6WqhVpPYiT_wbMIi-Qfqf6lt5GCc5Tq6M=.d74d54bf-88dd-4c9a-9ca6-c1e1b86916b6@github.com> Message-ID: On Tue, 27 Oct 2020 16:05:44 GMT, Gerard Ziemski wrote: > I think our differences of opinion all hinges on what happens when code returns from its signal handler: > > #1 Does it resume and actually redoes the exact same instruction? (which this time may succeed?) > #2 Does it resume and raise the exact same signal? (exhibits the exact same behavior as original?) > #3 Does it resume past the instruction that originally caused the exception? > > You and Thomas seem to believe that it's #3 (or is that #1 ?), I thought (based on https://developer.apple.com/forums/thread/113742 ) that it was more like #2. > No, not #3. #2 is an interesting thought, but I don't think so. Were it so, our polling page mechanism would not work: triggering a SEGV by accessing a poisoned page, and in signal handling, unpoisoning the page and returning, which then re-executes the same load, but since the page is now unpoisoned no fault happens. Which, btw, is an excellent example of a case where returning from a signal handler does _not_ re-raise the same signal. On purpose in this case, but our point is that the same thing may happen accidentally.
I think what happens is that the register contents - so, the crash context - which had been active when the thread got the first fault gets reinstated after signal handler returns, and we resume processing with the same state. So, all registers are the same, including pc. We would attempt to reload the instruction from the same address and re-execute it. But since the underlying memory could have changed in the meantime (starting at: the point the pc points to had been invalid and is now valid, e.g. a bug in the JIT, to: the instruction was a mov/store and its destination had been invalid and is now valid, and so on) there are conceivable scenarios where we may not crash a second time. > I will continue this investigation in JDK-8237727 > > Here I will not be as ambitious and I will simply fix the problem at hand: i.e. address the 2 minutes hang by disabling the option for macOS and Linux. This is reasonable, thank you. ------------- PR: https://git.openjdk.java.net/jdk/pull/813 From stuefe at openjdk.java.net Wed Oct 28 07:07:22 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 28 Oct 2020 07:07:22 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) [v2] In-Reply-To: References: Message-ID: <-ea8RDBX1RgeWaDuWs2i-ddatSdtPWJSrTJFNajbay8=.05b2bfc2-eeea-4fe9-ab0a-14cc50e9e1b3@github.com> On Tue, 27 Oct 2020 17:06:32 GMT, Gerard Ziemski wrote: >> hi all, >> >> Please review this simple fix for POSIX platforms, which addresses a time out that occurs while handling a crash with UseOSErrorReporting turned ON. >> >> It appears that "UseOSErrorReporting" flag was only ever meant to be used on Windows platform and was mistakenly left available for other platforms. In this fix we make sure to only use the flag on Windows platform and make it a NOP for other platforms. 
>> >> Note #1: A similar hang issue occurs today even on Windows, with the only difference being that before a process times out (takes 2 minutes) it runs out of stack space in about 250 loops, so that's the only reason it doesn't linger for that long. Windows issue is tracked separately by https://bugs.openjdk.java.net/browse/JDK-8250782 >> >> Note #2: Creating native crash log (on macOS) is a non-trivial, research-wise effort, that is tracked by https://bugs.openjdk.java.net/browse/JDK-8237727 >> >> Note #3 Removal of the "UseOSErrorReporting" flag will depend on whether we can do #2 and at that time we can decide whether to keep it and implement it for other platforms or whether to remove it, provided that #2 can not be done reliably. > > Gerard Ziemski has updated the pull request incrementally with two additional commits since the last revision: > > - Only use UseOsErrorReporting on Windows > - Revert "reset signal handlers to their system defaults if handling crash with UseOSErrorReporting" > > This reverts commit f6340643974f3e0cc3ab95fbbba51b23b8d9af31. Could you please do a small cleanup: UseOSErrorReporting is defined as a pd flag, with definitions in all os-dependent globals.. files. Unnecessarily, since the default value is always false. We could remove the pd definitions and just make this a normal flag in globals.hpp. (Would be cleaner to move it to globals_windows.hpp but this would probably need a csr since it's a product flag) ------------- Changes requested by stuefe (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/813 From david.holmes at oracle.com Wed Oct 28 07:44:48 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 28 Oct 2020 17:44:48 +1000 Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) In-Reply-To: <4_dt8K2LhJ6WqhVpPYiT_wbMIi-Qfqf6lt5GCc5Tq6M=.d74d54bf-88dd-4c9a-9ca6-c1e1b86916b6@github.com> References: <4_dt8K2LhJ6WqhVpPYiT_wbMIi-Qfqf6lt5GCc5Tq6M=.d74d54bf-88dd-4c9a-9ca6-c1e1b86916b6@github.com> Message-ID: <22a941da-8201-98ba-eb2d-be84ac33215d@oracle.com> On 28/10/2020 2:08 am, Gerard Ziemski wrote: > On Mon, 26 Oct 2020 15:32:49 GMT, Gerard Ziemski wrote: >>> On Windows we forward the crash to the OS to handle it, but just because in this fix we "just" turn off our signal handlers, reset them to SIG_DFL and return to let it crash again, does not mean it's not a meaningful way to forward it to OS, if that's how the OS wants it - please see this technical note from Apple https://developer.apple.com/forums/thread/113742 where Apple suggests the way to let the macOS handle the crash is to: >>> "unregister your signal handler (set it to SIG_DFL) and then return. This will cause the crashed process to continue execution, crash again, and generate a crash report via the Apple crash reporter." >>> That's how Apple suggests we do it for Mac. >> >> That is a blog by an Apple developer giving some very general advice, >> and IMO lacking in some necessary detail. That quote above is in the >> context of answering: >> >> "Finally, there's the question of how to exit from your signal handler." >> >> The suggestion to "then return" hits UB for the synchronous error >> signals - a fact not mentioned in the blog entry. The assertion that: >> >> "This will cause the crashed process to continue execution, crash again, >> ... " >> >> is a naive oversimplification. If you just seg-faulted doing a read from >> memory how can you continue execution?
> > My understanding is that we would not be going to continue execution past the seg-faulted instruction, but instead resume at the seg-fault instruction (with the same memory/register contents, unless our signal handler modified any of that), which would cause the same signal to be raised at the exact same frame, resulting in the exact same behavior. That's what my experimentation shows and what I understood the Apple's recommendation is based on. > >> What does that mean when the read >> yielded no value? Will you just continue with a random value? Will the >> system try to re-execute the read and so crash again? Maybe it will >> crash again, maybe it won't. Maybe it will do something in the meantime >> that leads to totally unexpected behaviour (as Thomas previously >> described). Hence my suggestion that if you are going to attempt this >> path for macOS then you need to introduce the second crash so we know >> exactly what will happen. > > But that will show up as a different crash and might be confusing. > >> Returning from the original signal handler is >> not an option IMO. > > I think our differences of opinion all hinges on what happens when code returns from its signal handler: > > #1 Does it resume and actually redoes the exact same instruction? (which this time may succeed?) > #2 Does it resume and raise the exact same signal? (exhibits the exact same behavior as original?) > > You and Thomas seem to believe that it's #1, I thought (based on https://developer.apple.com/forums/thread/113742 ) that it was more like #2. My position was based purely on the POSIX specification that returning from a signal handler, for specific signals, leads to undefined behaviour. I had overlooked (thanks Thomas for flagging it!) the fact that we already utilise returning normally from signal handlers for a range of things - safepoint/handshake polls; implicit null pointer checks. 
So I was looking for something more definitive from macOS that things would work as you suggest. And the sigaction manpage does seem to suggest that: "The call to the handler is arranged so that if the signal handling routine returns normally the process will resume execution in the context from before the signal's delivery." So as Thomas discusses the issue is not whether #1 or #2 is correct, as they both are, it just depends on the exact context of the original signal whether re-executing the failed instruction will fail again, or whether it could succeed. While I can imagine general scenarios where the instruction could now succeed, I don't know how realistic they are in the JVM context. > I will continue this investigation in JDK-8237727 > > Here I will not be as ambitious and I will simply fix the problem at hand: i.e. address the 2 minutes hang by disabling the option for macOS and Linux. Okay. Thanks, David ----- > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/813 > From david.holmes at oracle.com Wed Oct 28 07:47:15 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 28 Oct 2020 17:47:15 +1000 Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) [v2] In-Reply-To: <-ea8RDBX1RgeWaDuWs2i-ddatSdtPWJSrTJFNajbay8=.05b2bfc2-eeea-4fe9-ab0a-14cc50e9e1b3@github.com> References: <-ea8RDBX1RgeWaDuWs2i-ddatSdtPWJSrTJFNajbay8=.05b2bfc2-eeea-4fe9-ab0a-14cc50e9e1b3@github.com> Message-ID: On 28/10/2020 5:07 pm, Thomas Stuefe wrote: > On Tue, 27 Oct 2020 17:06:32 GMT, Gerard Ziemski wrote: > >>> hi all, >>> >>> Please review this simple fix for POSIX platforms, which addresses a time out that occurs while handling a crash with UseOSErrorReporting turned ON. >>> >>> It appears that "UseOSErrorReporting" flag was only ever meant to be used on Windows platform and was mistakenly left available for other platforms. In this fix we make sure to only use the flag on Windows platform and make it a NOP for other platforms. 
>>> >>> Note #1: A similar hang issue occurs today even on Windows, with the only difference being that before a process times out (takes 2 minutes) it runs out of stack space in about 250 loops, so that's the only reason it doesn't linger for that long. Windows issue is tracked separately by https://bugs.openjdk.java.net/browse/JDK-8250782 >>> >>> Note #2: Creating native crash log (on macOS) is a non-trivial, research-wise effort, that is tracked by https://bugs.openjdk.java.net/browse/JDK-8237727 >>> >>> Note #3 Removal of the "UseOSErrorReporting" flag will depend on whether we can do #2 and at that time we can decide whether to keep it and implement it for other platforms or whether to remove it, provided that #2 can not be done reliably. >> >> Gerard Ziemski has updated the pull request incrementally with two additional commits since the last revision: >> >> - Only use UseOsErrorReporting on Windows >> - Revert "reset signal handlers to their system defaults if handling crash with UseOSErrorReporting" >> >> This reverts commit f6340643974f3e0cc3ab95fbbba51b23b8d9af31. > > Could you please do a small cleanup: > > UseOSErrorReporting is defined as a pd flag, with definitions in all os-dependent globals.. files. Unnecessarily, since the default value is always false. We could remove the pd definitions and just make this a normal flag in globals.hpp. > > (Would be cleaner to move it to globals_windows.hpp but this would probably need a csr since it's a product flag) Any behavioural change in the existing product flag also requires a CSR request so we may as well make this truly Windows-only, then add in macOS later. Thanks, David > ------------- > > Changes requested by stuefe (Reviewer).
> > PR: https://git.openjdk.java.net/jdk/pull/813 > From shade at openjdk.java.net Wed Oct 28 08:20:27 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 28 Oct 2020 08:20:27 GMT Subject: RFR: 8255397: x86: coalesce reference and int entry points into vtos bytecodes In-Reply-To: References: Message-ID: On Tue, 27 Oct 2020 19:46:29 GMT, Claes Redestad wrote: >> It rubs me the wrong way that we are effectively changing `push_ptr` to `push_i` for `aep`. While it is implemented in the same manner in `interp_masm_x86.cpp` -- delegating to `push`, it still means if `push_i` implementation changes, `aep` would do the `push_i` _as if_ it is integer, not pointer. Ditto a change in `push_ptr` (adding verification, maybe?) would miss this code. >> >> So, how much of the improvement we are talking about to sacrifice this? > >> It rubs me the wrong way that we are effectively changing `push_ptr` to `push_i` for `aep`. While it is implemented in the same manner in `interp_masm_x86.cpp` -- delegating to `push`, it still means if `push_i` implementation changes, `aep` would do the `push_i` _as if_ it is integer, not pointer. Ditto a change in `push_ptr` (adding verification, maybe?) would miss this code. > > Verification is done explicitly with `__ verify_oop(..)` and friends, so it seems unlikely we'll overload `push_ptr` any time soon (and they have been semantically identical for many years, even before the merging of 32- and 64-bit `interp_masm_x86...`). But I acknowledge this adds a fragility here, but perhaps there are some assertions we can add to put a check that `push_ptr` and `push_i` stays semantically the same? > >> >> So, how much of the improvement we are talking about to sacrifice this? 
> > A few hundred thousand instructions and branches on Hello World (seems unconditional jumps are logged as branches by `perf`?): > > Baseline: > 103,795,433 instructions # 0.59 insn per cycle ( +- 0.07% ) > 20,263,519 branches # 200.867 M/sec ( +- 0.08% ) > 731,187 branch-misses # 3.61% of all branches ( +- 0.15% ) 0.067306367 seconds time elapsed ( +- 0.24% ) > > Patch: > 103,466,523 instructions # 0.59 insn per cycle ( +- 0.07% ) > 20,068,162 branches # 201.935 M/sec ( +- 0.08% ) > 727,575 branch-misses # 3.63% of all branches ( +- 0.13% ) 0.066568115 seconds time elapsed ( +- 0.27% ) > > For Hello World maybe half of that comes from reduced overhead of generating, the rest from quickening quite a few bytecode transitions. There's a scaling component (seen a few million instruction gains on slightly larger apps), but it's nothing huge. Okay, so that is 0.3% fewer instructions and ~1% fewer branches on Hello World. That's interesting. Would rebalancing the entry points order give a similar improvement without messing up the code? For example, what happens if we move `aep` to be the last entry point, and set up `[bcsi]ep` for a short jump? There is a middle ground, I think: introduce `push_i_or_ptr` and delegate it to `push`. That would make it clear which usages expect `push_i` and `push_ptr` shapes to match, and if later it proves to be a problem, we could easily revert all new usages to the old form.
------------- PR: https://git.openjdk.java.net/jdk/pull/865 From github.com+10835776+stsypanov at openjdk.java.net Wed Oct 28 08:44:40 2020 From: github.com+10835776+stsypanov at openjdk.java.net (=?UTF-8?B?0KHQtdGA0LPQtdC5?= =?UTF-8?B?IA==?= =?UTF-8?B?0KbRi9C/0LDQvdC+0LI=?=) Date: Wed, 28 Oct 2020 08:44:40 GMT Subject: RFR: 8255299: Drop explicit zeroing at instantiation of Atomic* objects [v2] In-Reply-To: References: Message-ID: > As discussed in https://github.com/openjdk/jdk/pull/510 there is never a reason to explicitly instantiate any instance of `Atomic*` class with its default value, i.e. `new AtomicInteger(0)` could be replaced with `new AtomicInteger()` which is faster: > @State(Scope.Thread) > @OutputTimeUnit(TimeUnit.NANOSECONDS) > @BenchmarkMode(value = Mode.AverageTime) > public class AtomicBenchmark { > @Benchmark > public Object defaultValue() { > return new AtomicInteger(); > } > @Benchmark > public Object explicitValue() { > return new AtomicInteger(0); > } > } > This benchmark demonstrates that `explicitValue()` is much slower: > Benchmark Mode Cnt Score Error Units > AtomicBenchmark.defaultValue avgt 30 4.778 ± 0.403 ns/op > AtomicBenchmark.explicitValue avgt 30 11.846 ± 0.273 ns/op > So meanwhile https://bugs.openjdk.java.net/browse/JDK-8145948 is still in progress we could trivially replace explicit zeroing with default constructors gaining some performance benefit with no risk. > > I've tested the changes locally, both tier1 and tier 2 are ok. > > Could one create an issue for tracking this? Сергей Цыпанов has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains one additional commit since the last revision: 8255299: Drop explicit zeroing at instantiation of Atomic* objects ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/818/files - new: https://git.openjdk.java.net/jdk/pull/818/files/c1fb362f..7dc646d0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=818&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=818&range=00-01 Stats: 4576 lines in 201 files changed: 2659 ins; 1135 del; 782 mod Patch: https://git.openjdk.java.net/jdk/pull/818.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/818/head:pull/818 PR: https://git.openjdk.java.net/jdk/pull/818 From github.com+10835776+stsypanov at openjdk.java.net Wed Oct 28 08:44:41 2020 From: github.com+10835776+stsypanov at openjdk.java.net (=?UTF-8?B?0KHQtdGA0LPQtdC5?= =?UTF-8?B?IA==?= =?UTF-8?B?0KbRi9C/0LDQvdC+0LI=?=) Date: Wed, 28 Oct 2020 08:44:41 GMT Subject: RFR: 8255299: Drop explicit zeroing at instantiation of Atomic* objects [v2] In-Reply-To: References: Message-ID: On Sat, 24 Oct 2020 23:12:09 GMT, Phil Race wrote: >> Сергей Цыпанов has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains one additional commit since the last revision: >> >> 8255299: Drop explicit zeroing at instantiation of Atomic* objects > > client changes are fine Rebased onto master to have the fix introduced in https://github.com/openjdk/jdk/pull/778 ------------- PR: https://git.openjdk.java.net/jdk/pull/818 From serb at openjdk.java.net Wed Oct 28 08:52:19 2020 From: serb at openjdk.java.net (Sergey Bylokhov) Date: Wed, 28 Oct 2020 08:52:19 GMT Subject: RFR: 8255299: Drop explicit zeroing at instantiation of Atomic* objects [v2] In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 08:40:02 GMT, Сергей Цыпанов
wrote: >> client changes are fine > > Rebased onto master to have the fix introduced in https://github.com/openjdk/jdk/pull/778 FYI it is better to use merge, instead of rebase+force push. Rebase breaks history and all existing code comments. ------------- PR: https://git.openjdk.java.net/jdk/pull/818 From plevart at openjdk.java.net Wed Oct 28 08:57:25 2020 From: plevart at openjdk.java.net (Peter Levart) Date: Wed, 28 Oct 2020 08:57:25 GMT Subject: RFR: 8188055: (ref) Add Reference::refersTo predicate [v6] In-Reply-To: References: <9x0zaxknpYXGIvHun1CkLP0lEC8NQmPTnANxQKjhHF8=.907bdb15-2e2e-4f84-8fe4-ea4ed50534cd@github.com> <3JzF7OkemZ-Lxc4jZgdEh3qNDzW8wF7ITeq-s7_TOlo=.11e4e40b-b775-47cf-9862-735fbc61ffd3@github.com> <3kV3qhFRXBadf7Tol9n0Yomud_ndV_T_p7ShUfk4eVE=.d7151a63-0066-4020-b0ef-bae0d03dc133@github.com> Message-ID: On Wed, 28 Oct 2020 03:46:55 GMT, Stuart Marks wrote: > Some thoughts regarding the parameter type of refersTo. Summary: I think `refersTo(T)` is fine and that we don't want to change it to `refersTo(Object)`. > I agree that we don't have a migration problem here that collections had. So let it be `refersTo(T)` then. ------------- PR: https://git.openjdk.java.net/jdk/pull/498 From github.com+10835776+stsypanov at openjdk.java.net Wed Oct 28 08:59:23 2020 From: github.com+10835776+stsypanov at openjdk.java.net (=?UTF-8?B?0KHQtdGA0LPQtdC5?= =?UTF-8?B?IA==?= =?UTF-8?B?0KbRi9C/0LDQvdC+0LI=?=) Date: Wed, 28 Oct 2020 08:59:23 GMT Subject: RFR: 8255299: Drop explicit zeroing at instantiation of Atomic* objects [v2] In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 08:49:38 GMT, Sergey Bylokhov wrote: >> Rebased onto master to have the fix introduced in https://github.com/openjdk/jdk/pull/778 > > FYI it is better to use merge, instead of rebase+force push. Rebase breaks history and all existing code comments. @mrserb thanks for pointing this out!
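For reference, the functional equivalence that this PR relies on can be checked directly: the no-argument `Atomic*` constructors start from the same value as the explicit-zero forms. The sketch below only demonstrates that the initial values agree; the class and method names are made up for this illustration, and it makes no claim about the performance numbers quoted in the PR description.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class AtomicDefaults {
    // Returns true when default-constructed atomics hold the same initial
    // value as their explicitly zeroed counterparts.
    public static boolean sameInitialState() {
        return new AtomicInteger().get() == new AtomicInteger(0).get()
            && new AtomicLong().get() == new AtomicLong(0L).get()
            && new AtomicBoolean().get() == new AtomicBoolean(false).get();
    }

    public static void main(String[] args) {
        System.out.println(sameInitialState());
    }
}
```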
------------- PR: https://git.openjdk.java.net/jdk/pull/818 From rkennke at openjdk.java.net Wed Oct 28 09:17:31 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 28 Oct 2020 09:17:31 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v16] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). Workloads that make heavy use of such weak references will therefore potentially cause significant GC pauses. > > There are 3 main items that contribute to pause time linear to number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires extending the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all objects starting from the referent.
When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. > 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable. > > Testing: hotspot_gc_shenadoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 60 commits: - Merge branch 'master' into shenandoah-concurrent-weakrefs - Rename ShenandoahLRBKind -> AccessKind - Relax verification only for j.l.r.Reference objects - Intendation fixes - Rename native argument to maybe_narrow_oop for more clarity - Change ShenandoahLRBKind to be an enum class instead of plain enum, and some minor touch-ups - Add fallback support for new properties in ObjArrayChunkedTask - Fix 32bit interpreter LRB-native call - Explicitely use concurrent vs stw reference processing, don't rely on is_at_shenandoah_safepoint() - Exclude Shenandoah from TestSoftReferencesBehaviorOnOOME.java, it doesn't play with concurrent reference processing - ... and 50 more: https://git.openjdk.java.net/jdk/compare/7a7ce021...c36f745a ------------- Changes: https://git.openjdk.java.net/jdk/pull/505/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=15 Stats: 2394 lines in 56 files changed: 1628 ins; 567 del; 199 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From eosterlund at openjdk.java.net Wed Oct 28 09:42:32 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Wed, 28 Oct 2020 09:42:32 GMT Subject: RFR: 8255233: InterpreterRuntime::at_unwind should be a JRT_LEAF [v4] In-Reply-To: References: Message-ID: > InterpreterRuntime::at_unwind is called at the very beginning of remove_activation(), to notify concurrent stack processing that a frame is about to be unwound. It is currently a JRT_ENTRY, because it needs a last_Java_frame to see what frame is about to get unwound. > > However, there are special return paths used by JVMTI pop frame, that checks if the caller frame is deoptimized, then calls a special path that removes the top activation, assuming that does not enter the deopt handler. The new JRT_ENTRY makes that reasoning invalid. 
> > Therefore, we need this to be a JRT_LEAF, that sets a last Java frame, to make everyone happy. This patch performs that change. > > I have run tier 1-5 testing, and manually tested: > > while true; do make test JTREG="RETAIN=all" TEST=open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn002 TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews" ; done > > Before the fix it crashes ~1/15 runs with a bad oop. After the fix, it doesn't crash. I have run it more times than my tmux buffer fits (for a day), and it does not fail any more with this fix. > > Unfortunately, my testing on AArch64 has been stalled for a day, so I have sent out this PR without the testing of those bits being finished. I won't push until I get the results back, of course. But I am expecting that to be fine, as there is nothing special going on there and it compiles. Will post a comment when the complete results have arrived. Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: x86 32 bit fix ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/828/files - new: https://git.openjdk.java.net/jdk/pull/828/files/a9a59f3b..5786f6ce Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=828&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=828&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/828.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/828/head:pull/828 PR: https://git.openjdk.java.net/jdk/pull/828 From shade at openjdk.java.net Wed Oct 28 09:46:32 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 28 Oct 2020 09:46:32 GMT Subject: RFR: 8255523: Clean up temporary shared_locs initializations Message-ID: <5OHoOpEGZ0j62A09r5hqDzxIu39qDSrxrC-b2tWOAzg=.4555cf67-82ee-4827-9154-1322b3f4fcf8@github.com> See #648.
Apparently, LLVM 11 complains that we are computing the number of elements over the array of a different type. Instead of ignoring the warning, it seems better to just clean up that code. We can allocate the whole thing as resource array of the same size. `sizeof(relocInfo) == 2`, since it carries `unsigned short`. ------------- Commit messages: - 8255523: Clean up temporary shared_locs initializations Changes: https://git.openjdk.java.net/jdk/pull/897/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=897&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255523 Stats: 9 lines in 1 file changed: 4 ins; 0 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/897.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/897/head:pull/897 PR: https://git.openjdk.java.net/jdk/pull/897 From dfuchs at openjdk.java.net Wed Oct 28 09:55:19 2020 From: dfuchs at openjdk.java.net (Daniel Fuchs) Date: Wed, 28 Oct 2020 09:55:19 GMT Subject: RFR: 8188055: (ref) Add Reference::refersTo predicate [v6] In-Reply-To: References: <9x0zaxknpYXGIvHun1CkLP0lEC8NQmPTnANxQKjhHF8=.907bdb15-2e2e-4f84-8fe4-ea4ed50534cd@github.com> <3JzF7OkemZ-Lxc4jZgdEh3qNDzW8wF7ITeq-s7_TOlo=.11e4e40b-b775-47cf-9862-735fbc61ffd3@github.com> <3kV3qhFRXBadf7Tol9n0Yomud_ndV_T_p7ShUfk4eVE=.d7151a63-0066-4020-b0ef-bae0d03dc133@github.com> Message-ID: On Wed, 28 Oct 2020 08:54:31 GMT, Peter Levart wrote: >> Some thoughts regarding the parameter type of refersTo. Summary: I think `refersTo(T)` is fine and that we don't want to change it to `refersTo(Object)`. >> >> I don't think we have a migration issue similar to generifying collections, where there was a possibility of changing `contains(Object)` to `contains(E)`. If that had been done, it would have been a source compatibility issue, because changing the signature of the method potentially affects existing code that calls the method. That doesn't apply here because we're adding a new method.
>> >> The question now falls to whether it's preferable to have more convenience with `refersTo(Object)` or more type-safety with `refersTo(T)`. With the generic collections issue, the migration issue probably drove the decision to keep `contains(Object)`, but this has resulted in a continual set of complaints about the lack of an error when code passes an instance of the "wrong" type. I think that kind of error is likely to occur with `refersTo`. Since we don't have a source compatibility issue here, we can choose the safer API and avoid this kind of problem entirely. >> >> The safer API does raise the possibility of having to add inconvenient unchecked casts and local variables in certain places, but I think Mandy's comment about the code already having a reference of the "right" type is correct. Her prototype webrev linked above shows that having to add unchecked casts is fairly infrequent. > >> Some thoughts regarding the parameter type of refersTo. Summary: I think `refersTo(T)` is fine and that we don't want to change it to `refersTo(Object)`. >> > I agree that we don't have a migration problem here that collections had. So let it be `refersTo(T)` then. I agree as well. ------------- PR: https://git.openjdk.java.net/jdk/pull/498 From shade at openjdk.java.net Wed Oct 28 09:56:31 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 28 Oct 2020 09:56:31 GMT Subject: RFR: 8142984: Zero: fast accessors should handle both getters and setters [v2] In-Reply-To: <0skRfs7hB88JHFy53lVD0Fvt-JlF2HWGTC05qMgHidA=.346c67ce-af20-41ba-b4fa-a24e8ca6c0e2@github.com> References: <0skRfs7hB88JHFy53lVD0Fvt-JlF2HWGTC05qMgHidA=.346c67ce-af20-41ba-b4fa-a24e8ca6c0e2@github.com> Message-ID: > It started as removing the TODO item in `abstractInterpreter.cpp`. Zero is the only implementation that treats `accessor` to mean `getter`, which makes the awkward choice in the entry selection. 
After going back and forth (including trying to remove the fast accessor methods altogether in [JDK-8255066](https://bugs.openjdk.java.net/browse/JDK-8255066)), I settled on implementing the fast Zero `setter`-s too, plus renaming and whipping the existing `getter` code into shape. The end result seems to be more straight-forward than it was before. > > On the plus side, it improves `make bootcycle-images` in release mode from ~47m40s to ~46m50s, because we are saving time doing the `normal_entry` for setters. > > The "normal", non-Zero template interpreter is not affected, because it does not have any specializations for `accessor`, `getter` or `setter`, and instead just does the normal entry. > > Testing: > - [x] Linux x86_64 {fastdebug, release} Zero `make bootcycle-images` > - [x] Linux aarch64 {fastdebug, release} Zero `make bootcycle-images` > - [x] Linux x86_64 Zero release jcstress > - [x] Linux aarch64 Zero release jcstress Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit: 8142984: Zero: fast accessors should handle both getters and setters ------------- Changes: https://git.openjdk.java.net/jdk/pull/728/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=728&range=01 Stats: 202 lines in 7 files changed: 97 ins; 38 del; 67 mod Patch: https://git.openjdk.java.net/jdk/pull/728.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/728/head:pull/728 PR: https://git.openjdk.java.net/jdk/pull/728 From redestad at openjdk.java.net Wed Oct 28 11:14:03 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 28 Oct 2020 11:14:03 GMT Subject: RFR: 8255397: x86: coalesce reference and int entry points into vtos bytecodes [v2] In-Reply-To: References: Message-ID: > On x86 - both 32- and 64-bit - the code laid out for transitioning into a vtos bytecode when having a reference and int top-of-stack state is semantically identical, and can be coalesced.
> > This patch removes a short jump on some cases which is marginally beneficial when interpreting, while measurably reducing overhead of generating the interpreter itself. Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - Drop verification from comment - Introduce push_i_or_ptr - Merge branch 'master' into atos_itos_opt - x86: coalesce some ptr and int entry points ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/865/files - new: https://git.openjdk.java.net/jdk/pull/865/files/c44b673f..6595372e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=865&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=865&range=00-01 Stats: 2959 lines in 131 files changed: 2034 ins; 559 del; 366 mod Patch: https://git.openjdk.java.net/jdk/pull/865.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/865/head:pull/865 PR: https://git.openjdk.java.net/jdk/pull/865 From shade at openjdk.java.net Wed Oct 28 11:14:15 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 28 Oct 2020 11:14:15 GMT Subject: RFR: 8255397: x86: coalesce reference and int entry points into vtos bytecodes [v2] In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 11:11:19 GMT, Claes Redestad wrote: >> On x86 - both 32- and 64-bit - the code laid out for transitioning into a vtos bytecode when having a reference and int top-of-stack state is semantically identical, and can be coalesced. >> >> This patch removes a short jump on some cases which is marginally beneficial when interpreting, while measurably reducing overhead of generating the interpreter itself. > > Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains four additional commits since the last revision: > > - Drop verification from comment > - Introduce push_i_or_ptr > - Merge branch 'master' into atos_itos_opt > - x86: coalesce some ptr and int entry points Changes requested by shade (Reviewer). src/hotspot/cpu/x86/interp_masm_x86.cpp line 609: > 607: > 608: void InterpreterMacroAssembler::push_i_or_ptr(Register r) { > 609: push_i(r); Should be `push(r)`: it is both cleaner and avoids a middle call to `push_i(r)`. src/hotspot/cpu/x86/interp_masm_x86.hpp line 152: > 150: // an int might have some advantage, while still documenting the fact that a > 151: // ptr might be pushed to the stack. This method will never do any > 152: // verification of the oop. I don't think we need to mention verification here. ------------- PR: https://git.openjdk.java.net/jdk/pull/865 From redestad at openjdk.java.net Wed Oct 28 11:14:20 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 28 Oct 2020 11:14:20 GMT Subject: RFR: 8255397: x86: coalesce reference and int entry points into vtos bytecodes [v2] In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 11:07:53 GMT, Aleksey Shipilev wrote: >> Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: >> >> - Drop verification from comment >> - Introduce push_i_or_ptr >> - Merge branch 'master' into atos_itos_opt >> - x86: coalesce some ptr and int entry points > > Changes requested by shade (Reviewer). > Would rebalancing the entry points order give the similar improvement without messing up the code? For example, what happens if we move `aep` to be the last entry point, and set up `[bcsi]ep` for a short jump? Looks like a wash, at least on Hello World > There is a middle-ground, I think: introduce `push_i_or_ptr` and delegate it to `push`. 
That would make it clear what usages expect `push_i` and `push_ptr` shapes to match, and if later it proves to be a problem, we could easily revert all new usages to the old form. Good suggestion. > src/hotspot/cpu/x86/interp_masm_x86.hpp line 152: > >> 150: // an int might have some advantage, while still documenting the fact that a >> 151: // ptr might be pushed to the stack. This method will never do any >> 152: // verification of the oop. > > I don't think we need to mention verification here. Sure, I'll drop that. ------------- PR: https://git.openjdk.java.net/jdk/pull/865 From redestad at openjdk.java.net Wed Oct 28 11:25:58 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 28 Oct 2020 11:25:58 GMT Subject: RFR: 8255397: x86: coalesce reference and int entry points into vtos bytecodes [v3] In-Reply-To: References: Message-ID: > On x86 - both 32- and 64-bit - the code laid out for transitioning into a vtos bytecode when having a reference and int top-of-stack state is semantically identical, and can be coalesced. > > This patch removes a short jump on some cases which is marginally beneficial when interpreting, while measurably reducing overhead of generating the interpreter itself.
Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: Use push directly in push_i_or_ptr ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/865/files - new: https://git.openjdk.java.net/jdk/pull/865/files/6595372e..77bf4fcf Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=865&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=865&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/865.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/865/head:pull/865 PR: https://git.openjdk.java.net/jdk/pull/865 From shade at openjdk.java.net Wed Oct 28 11:26:00 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 28 Oct 2020 11:26:00 GMT Subject: RFR: 8255397: x86: coalesce reference and int entry points into vtos bytecodes [v3] In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 11:22:32 GMT, Claes Redestad wrote: >> On x86 - both 32- and 64-bit - the code laid out for transitioning into a vtos bytecode when having a reference and int top-of-stack state is semantically identical, and can be coalesced. >> >> This patch removes a short jump on some cases which is marginally beneficial when interpreting, while measurably reducing overhead of generating the interpreter itself. > > Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: > > Use push directly in push_i_or_ptr This looks good to me. Coleen needs to ack too. ------------- Marked as reviewed by shade (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/865 From dholmes at openjdk.java.net Wed Oct 28 11:44:44 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 28 Oct 2020 11:44:44 GMT Subject: RFR: 8255233: InterpreterRuntime::at_unwind should be a JRT_LEAF [v4] In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 09:42:32 GMT, Erik Österlund wrote: >> InterpreterRuntime::at_unwind is called at the very beginning of remove_activation(), to notify concurrent stack processing that a frame is about to be unwound. It is currently a JRT_ENTRY, because it needs a last_Java_frame to see what frame is about to get unwound. >> >> However, there are special return paths used by JVMTI pop frame, that checks if the caller frame is deoptimized, then calls a special path that removes the top activation, assuming that does not enter the deopt handler. The new JRT_ENTRY makes that reasoning invalid. >> >> Therefore, we need this to be a JRT_LEAF, that sets a last Java frame, to make everyone happy. This patch performs that change. >> >> I have run tier 1-5 testing, and manually tested: >> >> while true; do make test JTREG="RETAIN=all" TEST=open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn002 TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews" ; done >> >> Before the fix it crashes ~1/15 runs with a bad oop. After the fix, it doesn't crash. I have run it more times than my tmux buffer fits (for a day), and it does not fail any more with this fix. >> >> Unfortunately, my testing on AArch64 has been stalled for a day, so I have sent out this PR without the testing of those bits being finished. I won't push until I get the results back, of course. But I am expecting that to be fine, as there is nothing special going on there and it compiles. Will post a comment when the complete results have arrived.
> > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > x86 32 bit fix Marked as reviewed by dholmes (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/828 From github.com+10835776+stsypanov at openjdk.java.net Wed Oct 28 12:14:48 2020 From: github.com+10835776+stsypanov at openjdk.java.net (Сергей Цыпанов) Date: Wed, 28 Oct 2020 12:14:48 GMT Subject: Integrated: 8255299: Drop explicit zeroing at instantiation of Atomic* objects In-Reply-To: References: Message-ID: On Thu, 22 Oct 2020 20:46:15 GMT, Сергей Цыпанов wrote: > As discussed in https://github.com/openjdk/jdk/pull/510 there is never a reason to explicitly instantiate any instance of `Atomic*` class with its default value, i.e. `new AtomicInteger(0)` could be replaced with `new AtomicInteger()` which is faster: > @State(Scope.Thread) > @OutputTimeUnit(TimeUnit.NANOSECONDS) > @BenchmarkMode(value = Mode.AverageTime) > public class AtomicBenchmark { > @Benchmark > public Object defaultValue() { > return new AtomicInteger(); > } > @Benchmark > public Object explicitValue() { > return new AtomicInteger(0); > } > } > This benchmark demonstrates that `explicitValue()` is much slower: > Benchmark Mode Cnt Score Error Units > AtomicBenchmark.defaultValue avgt 30 4.778 ± 0.403 ns/op > AtomicBenchmark.explicitValue avgt 30 11.846 ± 0.273 ns/op > So while https://bugs.openjdk.java.net/browse/JDK-8145948 is still in progress, we could trivially replace explicit zeroing with default constructors, gaining some performance benefit with no risk. > > I've tested the changes locally, both tier1 and tier 2 are ok. > > Could someone create an issue for tracking this? This pull request has now been integrated.
Changeset: 3c4fc793 Author: Sergey Tsypanov Committer: Daniel Fuchs URL: https://git.openjdk.java.net/jdk/commit/3c4fc793 Stats: 19 lines in 17 files changed: 0 ins; 3 del; 16 mod 8255299: Drop explicit zeroing at instantiation of Atomic* objects Reviewed-by: redestad, serb, prr ------------- PR: https://git.openjdk.java.net/jdk/pull/818 From dfuchs at openjdk.java.net Wed Oct 28 12:14:47 2020 From: dfuchs at openjdk.java.net (Daniel Fuchs) Date: Wed, 28 Oct 2020 12:14:47 GMT Subject: RFR: 8255299: Drop explicit zeroing at instantiation of Atomic* objects [v2] In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 08:56:05 GMT, Сергей Цыпанов wrote: >> FYI it is better to use merge, instead of rebase+force push. Rebase breaks history and all existing code comments. > > @mrserb thanks for pointing this out! Thanks for updating with latest master changes Sergey! My tests were all green. ------------- PR: https://git.openjdk.java.net/jdk/pull/818 From rkennke at openjdk.java.net Wed Oct 28 12:33:51 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 28 Oct 2020 12:33:51 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: On Tue, 27 Oct 2020 13:57:28 GMT, Roman Kennke wrote: >> src/hotspot/share/gc/shenandoah/c2/shenandoahBarrierSetC2.cpp line 1062: >> >>> 1060: Node* in2 = n->in(2); >>> 1061: if (in1->bottom_type() == TypePtr::NULL_PTR && >>> 1062: (in2->Opcode() != Op_ShenandoahLoadReferenceBarrier || >> >> This is a bugfix, right? It changes `in1` (seemingly incorrect) to `in2` (seemingly correct). If so, maybe we should split it out to fix previous releases too? > > Actually I think I *introduced* a bug there. It seems curious that it worked that way :-) I'm fixing it. You are right.
Tracking & fixing this here: https://bugs.openjdk.java.net/browse/JDK-8255534 ------------- PR: https://git.openjdk.java.net/jdk/pull/505 From rkennke at openjdk.java.net Wed Oct 28 13:56:48 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 28 Oct 2020 13:56:48 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: <5hNdcR3PevZ5mvvcPbZcivU9EdEUL6Kw3K6fv_rnzeA=.5150351b-814e-474f-9399-70580e3f4d30@github.com> On Tue, 27 Oct 2020 11:04:02 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename native argument to maybe_narrow_oop for more clarity > > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1733: > >> 1731: } >> 1732: } >> 1733: > > Why this move? That seems to be a leftover from when the verifier was not yet working correctly. I think it can be reverted. ------------- PR: https://git.openjdk.java.net/jdk/pull/505 From coleenp at openjdk.java.net Wed Oct 28 14:06:48 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 28 Oct 2020 14:06:48 GMT Subject: RFR: 8255397: x86: coalesce reference and int entry points into vtos bytecodes [v3] In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 11:25:58 GMT, Claes Redestad wrote: >> On x86 - both 32- and 64-bit - the code laid out for transitioning into a vtos bytecode when having a reference and int top-of-stack state is semantically identical, and can be coalesced. >> >> This patch removes a short jump on some cases which is marginally beneficial when interpreting, while measurably reducing overhead of generating the interpreter itself.
> > Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: > > Use push directly in push_i_or_ptr Yes that was a good suggestion @shipilev and the change looks much better to me now. Thanks! ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/865 From redestad at openjdk.java.net Wed Oct 28 14:18:50 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 28 Oct 2020 14:18:50 GMT Subject: RFR: 8255397: x86: coalesce reference and int entry points into vtos bytecodes [v3] In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 11:22:32 GMT, Aleksey Shipilev wrote: >> Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: >> >> Use push directly in push_i_or_ptr > > This looks good to me. Coleen needs to ack too. @shipilev @coleenp - thank you for reviewing! ------------- PR: https://git.openjdk.java.net/jdk/pull/865 From redestad at openjdk.java.net Wed Oct 28 14:18:51 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 28 Oct 2020 14:18:51 GMT Subject: Integrated: 8255397: x86: coalesce reference and int entry points into vtos bytecodes In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 14:53:08 GMT, Claes Redestad wrote: > On x86 - both 32- and 64-bit - the code laid out for transitioning into a vtos bytecode when having a reference and int top-of-stack state is semantically identical, and can be coalesced. > > This patch removes a short jump on some cases which is marginally beneficial when interpreting, while measurably reducing overhead of generating the interpreter itself. This pull request has now been integrated.
Changeset: bbf0a31e Author: Claes Redestad URL: https://git.openjdk.java.net/jdk/commit/bbf0a31e Stats: 18 lines in 3 files changed: 13 ins; 3 del; 2 mod 8255397: x86: coalesce reference and int entry points into vtos bytecodes Reviewed-by: shade, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/865 From eosterlund at openjdk.java.net Wed Oct 28 14:20:45 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Wed, 28 Oct 2020 14:20:45 GMT Subject: Integrated: 8255233: InterpreterRuntime::at_unwind should be a JRT_LEAF In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 09:45:11 GMT, Erik Österlund wrote: > InterpreterRuntime::at_unwind is called at the very beginning of remove_activation(), to notify concurrent stack processing that a frame is about to be unwound. It is currently a JRT_ENTRY, because it needs a last_Java_frame to see what frame is about to get unwound. > > However, there are special return paths used by JVMTI pop frame, that checks if the caller frame is deoptimized, then calls a special path that removes the top activation, assuming that does not enter the deopt handler. The new JRT_ENTRY makes that reasoning invalid. > > Therefore, we need this to be a JRT_LEAF, that sets a last Java frame, to make everyone happy. This patch performs that change. > > I have run tier 1-5 testing, and manually tested: > > while true; do make test JTREG="RETAIN=all" TEST=open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn002 TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews" ; done > > Before the fix it crashes ~1/15 runs with a bad oop. After the fix, it doesn't crash. I have run it more times than my tmux buffer fits (for a day), and it does not fail any more with this fix.
> > Unfortunately, my testing on AArch64 has been stalled for a day, so I have sent out this PR without the testing of those bits being finished. I won't push until I get the results back, of course. But I am expecting that to be fine, as there is nothing special going on there and it compiles. Will post a comment when the complete results have arrived. This pull request has now been integrated. Changeset: aaf4f690 Author: Erik Österlund URL: https://git.openjdk.java.net/jdk/commit/aaf4f690 Stats: 10 lines in 3 files changed: 4 ins; 3 del; 3 mod 8255233: InterpreterRuntime::at_unwind should be a JRT_LEAF Reviewed-by: coleenp, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/828 From rkennke at openjdk.java.net Wed Oct 28 14:41:54 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 28 Oct 2020 14:41:54 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: <4jaq_DRQhEuLVTL3CoeYEJrlanL_svXcTVhiFzjH37o=.31457feb-c243-4ca6-bd87-bb03563161ac@github.com> On Wed, 28 Oct 2020 14:38:02 GMT, Roman Kennke wrote: >> src/hotspot/share/gc/shenandoah/shenandoahHeap.inline.hpp line 404: >> >>> 402: assert(ctx->is_complete(), "sanity"); >>> 403: >>> 404: const ShenandoahMarkBitMap* mark_bit_map = ctx->mark_bit_map(); >> >> Why `const`? Not necessarily wrong, but inconsistent with the rest of the method. > > mark_bit_map() only returns a const now, and I think that's better: *if* we are to expose the bitmap, then better make it read-only. OTOH, it is only used to call get_next_marked_addr() which we can do just as well through ctx, and not expose the bitmap to begin with.
------------- PR: https://git.openjdk.java.net/jdk/pull/505 From rkennke at openjdk.java.net Wed Oct 28 14:41:53 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 28 Oct 2020 14:41:53 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: On Tue, 27 Oct 2020 11:06:28 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename native argument to maybe_narrow_oop for more clarity > > src/hotspot/share/gc/shenandoah/shenandoahHeap.inline.hpp line 404: > >> 402: assert(ctx->is_complete(), "sanity"); >> 403: >> 404: const ShenandoahMarkBitMap* mark_bit_map = ctx->mark_bit_map(); > > Why `const`? Not necessarily wrong, but inconsistent with the rest of the method. mark_bit_map() only returns a const now, and I think that's better: *if* we are to expose the bitmap, then better make it read-only. ------------- PR: https://git.openjdk.java.net/jdk/pull/505 From aph at openjdk.java.net Wed Oct 28 15:56:56 2020 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 28 Oct 2020 15:56:56 GMT Subject: RFR: JDK-8255544: Create a checked cast Message-ID: In many places we've added C-style casts to silence compiler warnings, for example when truncating a size_t to an int when we know the size_t is a small struct. Such casts are inherently risky, because they effectively disable useful compiler warnings. We should add a form of cast that checks at runtime that a truncation does not overflow. 
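To make the idea concrete, here is a minimal sketch of a cast that checks at runtime that a truncation did not lose information. It is an illustration only, not the code from the pull request: the name `checked_cast`, the `SmallStruct` type, and the use of the standard `assert` macro (HotSpot has its own assert machinery) are all assumptions. The check converts the result back to the source type and compares it with the input.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not taken from the patch): convert `value` to type To
// and verify that converting the result back yields the original value.
template <typename To, typename From>
To checked_cast(From value) {
  To result = static_cast<To>(value);
  // If the value did not fit into To, the round trip does not reproduce it.
  assert(static_cast<From>(result) == value && "checked_cast: value does not fit");
  return result;
}

// Illustrative type standing in for "a small struct" whose size fits in an int.
struct SmallStruct {
  short a;
  short b;
};

// Typical use: truncate a size_t to an int where a C-style cast was used before.
int small_struct_size() {
  return checked_cast<int>(sizeof(SmallStruct));
}
```

Two caveats about this sketch: the standard `assert` is compiled out when `NDEBUG` is defined, so the check only runs in debug-style builds, and a round-trip comparison catches truncation but not a sign change between types of the same width (for example `int` to `unsigned int`).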
------------- Commit messages: - JDK-8255544: Create a checked cast Changes: https://git.openjdk.java.net/jdk/pull/904/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=904&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255544 Stats: 15 lines in 1 file changed: 15 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/904.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/904/head:pull/904 PR: https://git.openjdk.java.net/jdk/pull/904 From rkennke at openjdk.java.net Wed Oct 28 15:59:00 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 28 Oct 2020 15:59:00 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v17] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: <3b2ySofmKxLflg6HTT4grDQrG05_e7jM9tpDFwlbFuw=.6fde5caa-5d73-49b7-b230-96a83b58c106@github.com> > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity) have been processed during GC pauses. Workloads that make heavy use of such weak references will therefore potentially cause significant GC pauses. > > There are 3 main items that contribute to pause time linear to number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause.
However, that is only a band-aid. > > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires extending the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. > 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no referents become resurrected by accessing Reference.get() on them. In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable. > > Testing: hotspot_gc_shenandoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification Roman Kennke has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 71 commits: - Add ShenandoahWorkerPolicy entry for conc-weak-refs - Merge branch 'master' into shenandoah-concurrent-weakrefs - Better encapsulation of the bitmap within ShMarkingContext - Fix docs in shenandoahReferenceProcessor.hpp - Change CWR to CWRF phase timing to avoid clash with conc-weak-roots - Fix copyrights of shenandoahMarkBitMap.* - Move back before-evac verification to where it has been - Whitespace fixes - Use template-class instead of template-typename for load_reference_barrier* entries - Initialize name and calladdr to make compiler happy about empty default branch in switch - ... and 61 more: https://git.openjdk.java.net/jdk/compare/aaf4f690...69c81d74 ------------- Changes: https://git.openjdk.java.net/jdk/pull/505/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=16 Stats: 2416 lines in 55 files changed: 1645 ins; 565 del; 206 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From alanb at openjdk.java.net Wed Oct 28 15:59:54 2020 From: alanb at openjdk.java.net (Alan Bateman) Date: Wed, 28 Oct 2020 15:59:54 GMT Subject: RFR: 8188055: (ref) Add Reference::refersTo predicate [v6] In-Reply-To: <0dhF_xxcp1VoUowwdZenB2qWa9ILcZjTMe3lsaRrg7k=.3c633db8-f745-4353-ad34-a64fbc96d4e0@github.com> References: <0dhF_xxcp1VoUowwdZenB2qWa9ILcZjTMe3lsaRrg7k=.3c633db8-f745-4353-ad34-a64fbc96d4e0@github.com> Message-ID: On Wed, 21 Oct 2020 02:28:30 GMT, Kim Barrett wrote: >> Finally returning to this review that was started in April 2020. I've >> recast it as a github PR. I think the security concern raised by Gil >> has been adequately answered. 
>> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-April/029203.html >> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-July/030401.html >> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-August/030677.html >> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-September/030793.html >> >> Please review a new function: java.lang.ref.Reference.refersTo. >> >> This function is needed to test the referent of a Reference object without >> artificially extending the lifetime of the referent object, as may happen >> when calling Reference.get. Some garbage collectors require extending the >> lifetime of a weak referent when accessed, in order to maintain collector >> invariants. Lifetime extension may occur with any collector when the >> Reference is a SoftReference, as calling get indicates recent access. This >> new function also allows testing the referent of a PhantomReference, which >> can't be accessed by calling get. >> >> The new function uses native methods whose implementations are in the VM so >> they can use the Access API. It is the intent that these methods will be >> intrinsified by optimizing compilers like C2 or graal, but that hasn't been >> implemented yet. Bear that in mind before rushing off to change existing >> uses of Reference.get. >> >> There are two native methods involved, one in Reference and an override in >> PhantomReference, both package private in java.lang.ref. The reason for this >> split is to simplify the intrinsification. This is a change from the version >> from April 2020; that version had a single native method in Reference, >> implemented using the ON_UNKNOWN_OOP_REF Access reference strength category. >> However, adding support for that category in the compilers adds significant >> implementation effort and complexity. Splitting avoids that complexity. >> >> Testing: >> mach5 tier1 >> Locally (linux-x64) verified the new test passes with various garbage collectors. 
> > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > improve wording in refersTo javadoc The API looks good, thanks for getting this in. ------------- Marked as reviewed by alanb (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/498 From aph at openjdk.java.net Wed Oct 28 16:06:46 2020 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 28 Oct 2020 16:06:46 GMT Subject: RFR: JDK-8255544: Create a checked cast In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 15:50:52 GMT, Andrew Haley wrote: > In many places we've added C-style casts to silence compiler warnings, for example when truncating a size_t to an int when we know the size_t is a small struct. Such casts are inherently risky, because they effectively disable useful compiler warnings. We should add a form of cast that checks at runtime that a truncation does not overflow. One thing I should have said: the need for this was triggered by a recent patch to silence many warnings emitted the MSVC AArch64 compiler. It would have been possible to put the checks into the AArch64 back end, but I think we need a centralized way to do it. 
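As a rough illustration of the idea described above, a checked narrowing cast could look something like the following minimal sketch. The names `fits_in`, `checked_narrow` and `small_size_as_int` are hypothetical and chosen only for this example; they are not the API proposed in the pull request, and plain `assert` stands in for whatever checking facility HotSpot would actually use.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not the API from the PR): returns true when `value`
// survives the conversion to `To` with both its value and its sign intact.
template <typename To, typename From>
constexpr bool fits_in(From value) {
  return static_cast<From>(static_cast<To>(value)) == value &&
         ((static_cast<To>(value) < To(0)) == (value < From(0)));
}

// Hypothetical checked narrowing cast: yields the same result as static_cast,
// but a lossy conversion trips the assert instead of passing silently.
template <typename To, typename From>
To checked_narrow(From value) {
  assert(fits_in<To>(value) && "narrowing conversion lost information");
  return static_cast<To>(value);
}

// Example use: a size_t that is known to be small is narrowed to int.
inline int small_size_as_int(std::size_t size) {
  return checked_narrow<int>(size);
}
```

Unlike a C-style cast, the sketch keeps the intent ("this value is expected to fit") visible at the call site and checks it in debug builds.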
------------- PR: https://git.openjdk.java.net/jdk/pull/904 From gziemski at openjdk.java.net Wed Oct 28 16:18:53 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Wed, 28 Oct 2020 16:18:53 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) [v2] In-Reply-To: <-ea8RDBX1RgeWaDuWs2i-ddatSdtPWJSrTJFNajbay8=.05b2bfc2-eeea-4fe9-ab0a-14cc50e9e1b3@github.com> References: <-ea8RDBX1RgeWaDuWs2i-ddatSdtPWJSrTJFNajbay8=.05b2bfc2-eeea-4fe9-ab0a-14cc50e9e1b3@github.com> Message-ID: On Wed, 28 Oct 2020 07:04:40 GMT, Thomas Stuefe wrote: >> Gerard Ziemski has updated the pull request incrementally with two additional commits since the last revision: >> >> - Only use UseOsErrorReporting on Windows >> - Revert "reset signal handlers to their system defaults if handling crash with UseOSErrorReporting" >> >> This reverts commit f6340643974f3e0cc3ab95fbbba51b23b8d9af31. > > Could you please do a small cleanup: > > UseOSErrorReporting is defined as pd flag, with definitions in all os-dependent globals.. files. Unnecessarily, since the default value is always false. We could remove the pd definitions and just make this a normal flag in globals.hpp. > > (Would be cleaner to move it to globals_windows.hpp but this would probably need a csr since its a product flag) Thank you Thomas and David, I'm learning a lot from your reviews! Can you please take a look at the current fix in Webrevs section? 
(the trivial change - make it Windows only fix, but don't remove the flag yet from other platforms) ------------- PR: https://git.openjdk.java.net/jdk/pull/813 From gziemski at openjdk.java.net Wed Oct 28 16:52:01 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Wed, 28 Oct 2020 16:52:01 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) [v3] In-Reply-To: References: Message-ID: <6rybIs1odojqcKQ6zzl39wj2IxGxnMCVQXgpeajzqns=.64424fc9-74d7-4e2b-9d89-df6f7da44770@github.com> > hi all, > > Please review this simple fix for POSIX platforms, which addresses a time out that occurs while handling a crash with UseOSErrorReporting turned ON. > > It appears that "UseOSErrorReporting" flag was only ever meant to be used on Windows platform and was mistakenly left available for other platforms. In this fix we make sure to only use the flag on Windows platform and make it a NOP for other platforms. > > Note #1: A similar hang issue occurs today even on Windows, with the only difference being that before a process times out (takes 2 minutes) it runs out of stack space in about 250 loops, so that's the only reason it doesn't linger for that long. Windows issue is tracked separately by https://bugs.openjdk.java.net/browse/JDK-8250782 > > Note #2: Creating native crash log (on macOS) is a non-trivial, research wise effort, that is tracked by https://bugs.openjdk.java.net/browse/JDK-8237727 > > Note #3 Removal of the "UseOSErrorReporting" flag will be depended on whether we can do #2 and at that time we can decide whether to keep it and implement it for other platforms or whether to remove it, provided that #2 can not be done reliably. 
Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: make UseOSErrorReporting flag Windows only ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/813/files - new: https://git.openjdk.java.net/jdk/pull/813/files/74d6c9a6..b849b3c4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=813&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=813&range=01-02 Stats: 18 lines in 5 files changed: 3 ins; 8 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/813.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/813/head:pull/813 PR: https://git.openjdk.java.net/jdk/pull/813 From akozlov at openjdk.java.net Wed Oct 28 16:53:00 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Wed, 28 Oct 2020 16:53:00 GMT Subject: RFR: 8255416: Investigate err_msg to detect unnecessary uses Message-ID: Hi, When a single string without formatting arguments is provided to `err_msg`, it's redundancy, as the same message could be used without any err_msg. This is a follow-up to the discussion https://github.com/openjdk/jdk/pull/812#discussion_r511784050 Please review a change that makes `err_msg` with a single string to fail compilation. Detected uses of err_msg with a single string were eliminated as well. 
------------- Commit messages: - Fix unnecessary err_msg uses - Prevent err_msg without format arguments Changes: https://git.openjdk.java.net/jdk/pull/905/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=905&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255416 Stats: 34 lines in 9 files changed: 18 ins; 0 del; 16 mod Patch: https://git.openjdk.java.net/jdk/pull/905.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/905/head:pull/905 PR: https://git.openjdk.java.net/jdk/pull/905 From stuefe at openjdk.java.net Wed Oct 28 17:16:47 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 28 Oct 2020 17:16:47 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) [v3] In-Reply-To: <6rybIs1odojqcKQ6zzl39wj2IxGxnMCVQXgpeajzqns=.64424fc9-74d7-4e2b-9d89-df6f7da44770@github.com> References: <6rybIs1odojqcKQ6zzl39wj2IxGxnMCVQXgpeajzqns=.64424fc9-74d7-4e2b-9d89-df6f7da44770@github.com> Message-ID: <0peTckbitQpAZsGhPCZsBBh74gmIBgea50MLkLezS94=.f6187eec-ff8c-40e1-aad1-315bc4de70ee@github.com> On Wed, 28 Oct 2020 16:52:01 GMT, Gerard Ziemski wrote: >> hi all, >> >> Please review this simple fix for POSIX platforms, which addresses a time out that occurs while handling a crash with UseOSErrorReporting turned ON. >> >> It appears that "UseOSErrorReporting" flag was only ever meant to be used on Windows platform and was mistakenly left available for other platforms. In this fix we make sure to only use the flag on Windows platform and make it a NOP for other platforms. >> >> Note #1: A similar hang issue occurs today even on Windows, with the only difference being that before a process times out (takes 2 minutes) it runs out of stack space in about 250 loops, so that's the only reason it doesn't linger for that long. 
Windows issue is tracked separately by https://bugs.openjdk.java.net/browse/JDK-8250782 >> >> Note #2: Creating native crash log (on macOS) is a non-trivial, research wise effort, that is tracked by https://bugs.openjdk.java.net/browse/JDK-8237727 >> >> Note #3 Removal of the "UseOSErrorReporting" flag will be depended on whether we can do #2 and at that time we can decide whether to keep it and implement it for other platforms or whether to remove it, provided that #2 can not be done reliably. > > Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: > > make UseOSErrorReporting flag Windows only HI Gerard, The patch is fine in its current form to me (including your last push). Whether or not to do a CSR I leave up to you and David. As my final remark to our "return from signal handler" discussion: I'd probably be more chill if this were a simple application. Like vi :) But we do so many unusual things (including generating, then running our own code) and the VM is the base for such a large software stack that I rather be careful. All my remaining remarks are nits. Take what you like, ignore the rest. Thank you, Thomas src/hotspot/share/utilities/vmError.cpp line 1437: > 1435: } else { > 1436: #if defined(_WINDOWS) > 1437: // If UseOsErrorReporting we call this for each level of the call stack Could you please change this comment to refer to UseOSErrorReporting? (Note the capital s). Makes it easier to grep for it. Same goes for os_windows.cpp:2357 . src/hotspot/share/utilities/vmError.cpp line 1631: > 1629: } > 1630: > 1631: #if defined(_WINDOWS) If you like you could abbreviate this Hunk with something like if (WINDOWS_ONLY(!UseOsErrorReporting) NOT_WINDOWS(true)) { but this is fine too, I leave it up to you. src/hotspot/share/utilities/vmError.cpp line 1440: > 1438: // while searching for the exception handler. Only the first level needs > 1439: // to be reported. 
> 1440: if (UseOSErrorReporting && log_done) return; This has nothing to do with your patch, which is fine: But the more I look at this line the more confused I get. I am not sure what the point is. log_done means we have written the hs-err file successfully and got a signal after the call to VMError::report() but before returning from this function resp. before calling abort. That covers a whole section between lines 1545 and 1629, I am surprised how much we do there. I am almost certain some things will not behave well when secondary crashes happen and we re-enter this function, e.g.: 1556 JFR_ONLY(Jfr::on_vm_shutdown(true);) 1557 1558 if (PrintNMTStatistics) { 1559 fdStream fds(fd_out); 1560 MemTracker::final_report(&fds); 1561 } both should be guarded against re-entering when this function is called repeatedly and they have had their song-and-dance already. Otherwise e.g. we may see the NMT output twice if a signal occurs after line 1560. All idle musings, potentially a future cleanup. ------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/813 From rkennke at openjdk.java.net Wed Oct 28 18:15:01 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 28 Oct 2020 18:15:01 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v18] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: <82-q-nPH3pC7oWA8mTZQtE5t5Qc6Gs7Dsf2G-NbbLvU=.3ed9e332-f0f0-4f47-b5dd-8ce7aaaa6d2e@github.com> > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). Workloads that make heavy use of such weak references will therefore potentially cause significant GC pauses.
> > There are 3 main items that contribute to pause time linear to number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires to extend the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. > 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no referents become resurrected by accessing Reference.get() on it. 
In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable. > > Testing: hotspot_gc_shenandoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Consolidate native-LRB invocation ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/505/files - new: https://git.openjdk.java.net/jdk/pull/505/files/69c81d74..0dc1ed88 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=17 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=16-17 Stats: 5 lines in 1 file changed: 4 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From shade at openjdk.java.net Wed Oct 28 18:18:51 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 28 Oct 2020 18:18:51 GMT Subject: RFR: 8255416: Investigate err_msg to detect unnecessary uses In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 16:46:47 GMT, Anton Kozlov wrote: > Hi, > > When a single string without formatting arguments is provided to `err_msg`, it's redundancy, as the same message could be used without any err_msg. This is a follow-up to the discussion https://github.com/openjdk/jdk/pull/812#discussion_r511784050 > > Please review a change that makes `err_msg` with a single string to fail compilation. > > Detected uses of err_msg with a single string were eliminated as well.
src/hotspot/share/gc/shenandoah/mode/shenandoahMode.hpp line 35: > 33: do { \ > 34: if (!(name)) { \ > 35: const char *msg = "GC mode needs -XX:+" #name " to work correctly"; \ Please decide which way the `*` leans in this change. I prefer `const char* msg`, like in the change below. ------------- PR: https://git.openjdk.java.net/jdk/pull/905 From shade at openjdk.java.net Wed Oct 28 18:18:54 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 28 Oct 2020 18:18:54 GMT Subject: RFR: 8255416: Investigate err_msg to detect unnecessary uses In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 18:13:55 GMT, Aleksey Shipilev wrote: >> Hi, >> >> When a single string without formatting arguments is provided to `err_msg`, it's redundancy, as the same message could be used without any err_msg. This is a follow-up to the discussion https://github.com/openjdk/jdk/pull/812#discussion_r511784050 >> >> Please review a change that makes `err_msg` with a single string to fail compilation. >> >> Detected uses of err_msg with a single string were eliminated as well. > > src/hotspot/share/gc/shenandoah/mode/shenandoahMode.hpp line 35: > >> 33: do { \ >> 34: if (!(name)) { \ >> 35: const char *msg = "GC mode needs -XX:+" #name " to work correctly"; \ > > Please decide which way the `*` leans in this change. I prefer `const char* msg`, like in the change below. In fact, maybe just inline this literal down in `vm_exit_during_initialization` invocation. 
------------- PR: https://git.openjdk.java.net/jdk/pull/905 From shade at openjdk.java.net Wed Oct 28 18:30:59 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 28 Oct 2020 18:30:59 GMT Subject: RFR: 8255550: x86: Assembler::cmpq(Address dst, Register src) encoding is incorrect Message-ID: Compare: void Assembler::cmpq(Address dst, Register src) { InstructionMark im(this); emit_int16(get_prefixq(dst, src), 0x3B); emit_operand(src, dst); } void Assembler::cmpq(Register dst, Address src) { InstructionMark im(this); emit_int16(get_prefixq(src, dst), 0x3B); emit_operand(dst, src); } They use the same opcode -- `0x3B`, which is for `CMP r, r/m`. While `cmpq(Address,Register)` actually should be using `0x39` for `CMP r/m, r`. I also suspect they emit basically the same instruction, because the `get_prefixq` and `emit_operand` argument order is irrelevant. AFAIU, it does not break horribly, because the `cmpq(Address,Register)` is not used anywhere except the new code in `MacroAssembler::safepoint_poll`, added by [JDK-8253180](https://bugs.openjdk.java.net/browse/JDK-8253180). This was found by Zhengyu, when he tried to enable that new code on x86_32 by inverting `cmpq(addr, reg); jcc(above, slow_path)` to `cmpptr(reg, addr); jcc(belowEquals, slow_path)`. Then, everything blew up, because the semantics of `cmpq(addr,reg)` was wrong, and this inversion was subtly broken. Current candidate patch encodes this `cmpq` properly. Since that changes the semantics, I had to flip the condition code in its only use. Alternatives: - I considered removing `cmpq(Address,Register)` altogether, but it would require more work to untangle `cmpptr(Address,Register)` and `cmpptr(Address,AddressLiteral)` for x86_32. - We can also split out `MacroAssembler::safepoint_poll` change to use `cmpq(Register,Address)` to begin with, but current shape gives us a way to test the encoding. 
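The encoding difference discussed above can be sketched with a tiny stand-alone encoder. This is only an illustration under simplifying assumptions, not the HotSpot assembler: the helper names (`encode_cmp`, `cmp_mem_reg`, `cmp_reg_mem`) are hypothetical, and only plain `[base]` operands with registers that need no REX.R/REX.B bit, SIB byte or displacement are handled. It shows that the two forms share the REX.W prefix and the ModRM byte and differ only in the opcode (`0x39` for `CMP r/m64, r64`, `0x3B` for `CMP r64, r/m64`).

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Register numbers for this illustration only. The set is restricted to
// registers that work as a plain [base] operand without REX.B, SIB or disp.
enum Reg : std::uint8_t { rax = 0, rcx = 1, rdx = 2, rbx = 3, rsi = 6, rdi = 7 };

// Hypothetical helper: REX.W prefix, opcode, ModRM byte with mod=00 ([base]).
inline std::array<std::uint8_t, 3> encode_cmp(std::uint8_t opcode, Reg reg, Reg base) {
  const std::uint8_t rex_w = 0x48;
  const std::uint8_t modrm = static_cast<std::uint8_t>((reg << 3) | base);
  return {{rex_w, opcode, modrm}};
}

// CMP r/m64, r64: flags are set from [base] - reg.
inline std::array<std::uint8_t, 3> cmp_mem_reg(Reg base, Reg reg) {
  return encode_cmp(0x39, reg, base);
}

// CMP r64, r/m64: flags are set from reg - [base].
inline std::array<std::uint8_t, 3> cmp_reg_mem(Reg reg, Reg base) {
  return encode_cmp(0x3B, reg, base);
}
```

In this sketch `cmp_mem_reg(rdi, rax)` yields the bytes `48 39 07` and `cmp_reg_mem(rax, rdi)` yields `48 3B 07`. Because the operand order of the subtraction is swapped between the two forms, a condition such as `above` after one form corresponds to `below` after the other.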
Additional testing: - [x] tier1 with Shenandoah (a few failures are pre-existing) - [x] tier1 with Z (AFAICS, all failing tests are OOME'ing or break SA, and probably are problem-listed) ------------- Commit messages: - 8255550: x86: Assembler::cmpq(Address dst, Register src) encoding is incorrect Changes: https://git.openjdk.java.net/jdk/pull/910/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=910&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255550 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/910.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/910/head:pull/910 PR: https://git.openjdk.java.net/jdk/pull/910 From stuefe at openjdk.java.net Wed Oct 28 18:41:44 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 28 Oct 2020 18:41:44 GMT Subject: RFR: 8255416: Investigate err_msg to detect unnecessary uses In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 16:46:47 GMT, Anton Kozlov wrote: > Hi, > > When a single string without formatting arguments is provided to `err_msg`, it's redundancy, as the same message could be used without any err_msg. This is a follow-up to the discussion https://github.com/openjdk/jdk/pull/812#discussion_r511784050 > > Please review a change that makes `err_msg` with a single string to fail compilation. > > Detected uses of err_msg with a single string were eliminated as well. Changes requested by stuefe (Reviewer). src/hotspot/share/gc/shenandoah/mode/shenandoahMode.hpp line 44: > 42: if ((name)) { \ > 43: const char* msg = "GC mode needs -XX:-" #name " to work correctly"; \ > 44: vm_exit_during_initialization("Error", msg); \ Same as above, can be inlined into one call. No need for the temporary variable. src/hotspot/share/utilities/formatBuffer.hpp line 124: > 122: // If compilation fails because of ambiguity between this and real constructor, you > 123: // could drop err_msg use at all. 
> 124: inline FormatErrBuffer(const char* msg) { ShouldNotReachHere(); } I do not think this is a good idea (apart from it being too complex for a not that serious issue). The asserts fire, of course, only at runtime. But this function is usually used in some error context, as part of error reporting. I do not think our tests cover all those paths. Either somehow make this a compile time error or just leave it as it is. We also could, since this buffer object is usually used as input for vm_exit_during_initialization(), give that function a var-arg overload. ------------- PR: https://git.openjdk.java.net/jdk/pull/905 From shade at openjdk.java.net Wed Oct 28 18:59:49 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 28 Oct 2020 18:59:49 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: On Tue, 27 Oct 2020 11:47:47 GMT, Roman Kennke wrote: >> src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp line 360: >> >>> 358: >>> 359: if (ShenandoahBarrierSet::use_load_reference_barrier_native(decorators, type)) { >>> 360: load_reference_barrier_native(masm, dst, src, (decorators & IN_NATIVE) == 0); >> >> I am a bit confused. If we introduce the local variable, would it be `maybe_narrow_oop`? How does it relate to `IS_NATIVE`? Also, see that `use_load_reference_barrier_native` already tests `IS_NATIVE`. > > Yeah, this is confusing. Zhengyu also stumbled over this. Notice that (decorators & IN_NATIVE) == 0 tests for 'is *not* native'. The point is that native-access is *always* uncompressed-oops, while accessing a referent is narrowOop or oop depending on UseCompressedOops. Hence the distinction. If you have a good suggestion on how to make this less confusing, I'd appreciate it. But that's the thing that gets my head spinning.
Why do we call into `lrb_native` for referents? This contradicts the idea that "native-access is always uncompressed-oops". I think this tries to overload "native" with more meaning that it is equipped to carry. I think at very least it should say: if (ShenandoahBarrierSet::use_load_reference_barrier_native(decorators, type)) { // API impedance: when used on native refs, lrb-native necessarily works with full oops, // but when used for weak/phantom refs, it might need to work with narrow oops. // Therefore, we need to ask barrier code to look back at UseCompressedOops and // decide, when lrb-native is not IN_NATIVE. TODO: Resolve this API impedance. bool maybe_narrow_oop = (decorators & IN_NATIVE) == 0; load_reference_barrier_native(masm, dst, src, maybe_narrow_oop); } else { load_reference_barrier(masm, dst, src); } ------------- PR: https://git.openjdk.java.net/jdk/pull/505 From shade at openjdk.java.net Wed Oct 28 18:59:51 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 28 Oct 2020 18:59:51 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: <4d4Sj-7PwPgKu80Uv1jE0d9Xcg6kQeojjIUuDtXtaIY=.7f3024d0-9f63-4ebd-9c67-fe6fe17c584f@github.com> On Tue, 27 Oct 2020 10:41:39 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename native argument to maybe_narrow_oop for more clarity > > src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp line 525: > >> 523: >> 524: if (ShenandoahBarrierSet::use_load_reference_barrier_native(decorators, type)) { >> 525: load_reference_barrier_native(masm, dst, src, (decorators & IN_NATIVE) == 0); > > Same comment as in `aarch64` code. This conversation can be resolved, as it is the same as for aarch64 code. 
------------- PR: https://git.openjdk.java.net/jdk/pull/505 From redestad at openjdk.java.net Wed Oct 28 19:33:47 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 28 Oct 2020 19:33:47 GMT Subject: RFR: 8255285: Move JVMFlag origins into a new enum JVMFlagOrigin [v2] In-Reply-To: References: Message-ID: On Mon, 26 Oct 2020 22:20:28 GMT, Ioi Lam wrote: >> Many JVM functions take a `JVMFlag::Flags` parameter to indicate the origin of the flag -- i.e., "who is setting this flag". E.g., in arguments.hpp: >> >> static bool parse_argument(const char* arg, JVMFlag::Flags origin); >> >> However, `JVMFlag::Flags` contains many other bits that are unrelated to the origin. We should add a new enum `JVMFlagOrigin` that has only the valid values for the origin. This makes it possible to do more type-safety checks at C++ compilation time. >> >> This patch also renamed the confusing bit `JVMFlag::ORIG_COMMAND_LINE` to `WAS_SET_IN_COMMAND_LINE` and added documentation, so that it won't be confused with `JVMFlagOrigin::COMMAND_LINE`. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Removed aliases of JVMFlagOrigin::X as JVMFlag::X Marked as reviewed by redestad (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/823 From sspitsyn at openjdk.java.net Wed Oct 28 19:47:47 2020 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Wed, 28 Oct 2020 19:47:47 GMT Subject: RFR: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing [v2] In-Reply-To: References: Message-ID: On Tue, 27 Oct 2020 13:31:36 GMT, Erik Österlund wrote: >> The escape barrier reallocates scalarized objects potentially deep into the stack of a remote thread. Each allocation can safepoint, causing referenced frames to be invalid.
Some sprinklings were added that deal with that, but I believe it was subsequently broken with the integration of the new vector API, that has its own new deoptimization code that did not know about this. Not surprisingly, the integration of the new vector API had no idea about this subtlety, and allocates an object, and then reads an object deep from the stack of a remote thread (using an escape barrier). I suppose the issue is that all these 3 things were integrated at almost the same time. The problematic code sequence is in VectorSupport::allocate_vector() in vectorSupport.cpp, which is called from Deoptimization::realloc_objects(). It first allocates an oop (possibly safepointing), and then reads a vector oop from the stack. This is usually fine, but not through the escape barrier, with concurrent stack scanning. While I have not seen any crashes yet, I can see from code inspection, that there is no way that this works correctly. >> >> In order to make this less fragile for future changes, we should really have a RAII object that keeps the target thread's stack of the escape barrier, stable and processed, across safepoints. This patch fixes that. Then it becomes much easier to reason about its correctness, compared to hoping the various hooks are applied after each safepoint. >> >> With this new robustness fix, the thread running the escape barrier, keeps the target thread stack processed, straight through safepoints on the requesting thread, making it easy and intuitive to understand why this works correctly. The RAII object basically just has to cover the code block that pokes at the remote stack and goes in and out of safepoints, arbitrarily. Arguably, this escape barrier doesn't need to be blazingly fast, and can afford keeping stacks sane through its operation.
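The RAII pattern referred to above can be sketched generically as follows. This is not the code from the patch: `DemoThread`, `DemoKeepStackStable` and `stable_while_guarded` are hypothetical names invented for this example, and a simple counter stands in for whatever mechanism the VM uses to keep a remote stack processed. The sketch only shows the scoping idea, namely that the guarantee holds for exactly as long as the guard object is alive.

```cpp
#include <cassert>

// Hypothetical stand-in for a target thread. It only counts how many guards
// currently ask for its stack to be kept stable and processed.
struct DemoThread {
  int keep_stable_requests = 0;
  bool stack_kept_stable() const { return keep_stable_requests > 0; }
};

// Hypothetical RAII guard: the request is registered in the constructor and
// withdrawn in the destructor, so it covers the whole enclosing scope.
class DemoKeepStackStable {
 public:
  explicit DemoKeepStackStable(DemoThread* target) : _target(target) {
    ++_target->keep_stable_requests;
  }
  ~DemoKeepStackStable() { --_target->keep_stable_requests; }

  DemoKeepStackStable(const DemoKeepStackStable&) = delete;
  DemoKeepStackStable& operator=(const DemoKeepStackStable&) = delete;

 private:
  DemoThread* _target;
};

// Example scope: everything between the guard's construction and the end of
// the function runs with the request in place, no matter how it is left.
inline bool stable_while_guarded(DemoThread* target) {
  DemoKeepStackStable guard(target);
  // ... code that allocates and reads from the remote stack would go here ...
  return target->stack_kept_stable();
}
```

The appeal of this shape is that correctness follows from the scope of one object instead of from individual hooks placed after each possible safepoint.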
> > Erik Österlund has updated the pull request incrementally with two additional commits since the last revision: > > - Better encapsulate object deoptimization in EscapeBarrier also to facilitate correct interaction with concurrent thread stack processing. > > The Stackwalk for object deoptimization in VM_GetOrSetLocal::doit_prologue is not prepared for concurrent thread stack processing. > EscapeBarrier::deoptimize_objects(int depth) is extended to cover a range of frames from depth d1 to depth d2. It is also prepared for concurrent thread stack processing. With this change it is used to deoptimize objects in the prologue of VM_GetOrSetLocal. > - Review comments Hi Erik and Richard, Changes in the serviceability files looks fine. Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/832 From rkennke at openjdk.java.net Wed Oct 28 19:58:48 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 28 Oct 2020 19:58:48 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: On Wed, 28 Oct 2020 18:55:39 GMT, Aleksey Shipilev wrote: >> Yeah, this is confusing. Zhengyu also stumbled over this. Notice that (decorators & IN_NATIVE) == 0 tests for 'is *not* native'. The point is that native-access is *always* uncompressed-oops, while accessing a referent is narrowOop or oop depending on UseCompressedOops. Hence the distinction. If you have a good suggestion on how to make this less confusing, I'd appreciate it. > > But that's the thing that gets my head spinning. Why do we call into `lrb_native` for referents? This contradicts the idea that "native-access is always uncompressed-oops". I think this tries to overload "native" with more meaning that it is equipped to carry.
> > I think at very least it should say: > > if (ShenandoahBarrierSet::use_load_reference_barrier_native(decorators, type)) { > // API impedance: when used on native refs, lrb-native necessarily works with full oops, > // but when used for weak/phantom refs, it might need to work with narrow oops. > // Therefore, we need to ask barrier code to look back at UseCompressedOops and > // decide, when lrb-native is not IN_NATIVE. TODO: Resolve this API impedance. > bool maybe_narrow_oop = (decorators & IN_NATIVE) == 0; > load_reference_barrier_native(masm, dst, src, maybe_narrow_oop); > } else { > load_reference_barrier(masm, dst, src); > } I thought about it today. This whole idea of 'LRB-native' is flawed. What it does is prevent resurrection of objects when loading from a field or off-heap-location that is 'weak' or 'phantom'. This has nothing to do with -native. It has to do with the field being not-strong. For this reason I think we should call that variant of LRB something like LRB-weak instead. This warrants a larger reshuffling that I'd do either before or after this change goes in. I'd probably also merge our 3(!) different runtime LRB impls into one, with templated path to prevent resurrection, etc. I guess the 2 interpreter LRB entries can also be unified, and differ only in the target entry being called. The only distinction for which IN_NATIVE is relevant is for figuring out whether or not the reference is compressed or not. That is all. ------------- PR: https://git.openjdk.java.net/jdk/pull/505 From eosterlund at openjdk.java.net Wed Oct 28 21:39:44 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Wed, 28 Oct 2020 21:39:44 GMT Subject: RFR: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing [v2] In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 19:44:33 GMT, Serguei Spitsyn wrote: > Hi Erik and Richard, > > Changes in the serviceability files looks fine.
> > Thanks, > > Serguei > > Thanks for the review Serguei! ------------- PR: https://git.openjdk.java.net/jdk/pull/832 From akozlov at openjdk.java.net Wed Oct 28 21:57:56 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Wed, 28 Oct 2020 21:57:56 GMT Subject: RFR: 8255416: Investigate err_msg to detect unnecessary uses [v2] In-Reply-To: References: Message-ID: <5rGPbr1hKFnW_88PKXtDsnCBL1KoUK1Hl72uAJW-hD0=.9f4072b2-878a-44a2-90ab-5e8a2841a047@github.com> > Hi, > > When a single string without formatting arguments is provided to `err_msg`, it's redundancy, as the same message could be used without any err_msg. This is a follow-up to the discussion https://github.com/openjdk/jdk/pull/812#discussion_r511784050 > > Please review a change that makes `err_msg` with a single string to fail compilation. > > Detected uses of err_msg with a single string were eliminated as well. Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: Fix codestyle ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/905/files - new: https://git.openjdk.java.net/jdk/pull/905/files/93af93b0..52ff2ccb Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=905&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=905&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/905.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/905/head:pull/905 PR: https://git.openjdk.java.net/jdk/pull/905 From akozlov at openjdk.java.net Wed Oct 28 21:57:57 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Wed, 28 Oct 2020 21:57:57 GMT Subject: RFR: 8255416: Investigate err_msg to detect unnecessary uses [v2] In-Reply-To: References: Message-ID: <0_DgjolEWtNQOrJDn_ydLCTYH1gfPwE2tmdoQ9R3hHk=.cecf2d74-6c79-4270-905e-c651f076bbda@github.com> On Wed, 28 Oct 2020 18:15:47 GMT, Aleksey Shipilev wrote: >> 
src/hotspot/share/gc/shenandoah/mode/shenandoahMode.hpp line 35: >> >>> 33: do { \ >>> 34: if (!(name)) { \ >>> 35: const char *msg = "GC mode needs -XX:+" #name " to work correctly"; \ >> >> Please decide which way the `*` leans in this change. I prefer `const char* msg`, like in the change below. > > In fact, maybe just inline this literal down in `vm_exit_during_initialization` invocation. Of course it should align to the left, like in the rest of hotspot. Thanks for noticing! As for inlining of the message, there are pros and cons. The arguments should be aligned, so it would become do { \ if (!(name)) { \ vm_exit_during_initialization("Error", \ "GC mode needs -XX:+" #name " to work correctly"); \ } \ } while (0) (not pretty at all, you see). After that, there is an option to split the string into multiple lines in an attempt to shrink the length, but then the line would become ungreppable (a much bigger evil). Among the options, I decided to respect the intention and style of the original author, who introduced a variable for err_msg, which is unusual and should have some valid rationale behind it, like the one above. ------------- PR: https://git.openjdk.java.net/jdk/pull/905 From akozlov at openjdk.java.net Wed Oct 28 21:57:58 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Wed, 28 Oct 2020 21:57:58 GMT Subject: RFR: 8255416: Investigate err_msg to detect unnecessary uses [v2] In-Reply-To: References: Message-ID: <-P5id4x50oiTzG6JHl4GD-HJbqYM4juVm9IpsNvXlV4=.7e07be8c-8389-41e2-a563-1f3be709ac51@github.com> On Wed, 28 Oct 2020 18:22:37 GMT, Thomas Stuefe wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix codestyle > > src/hotspot/share/gc/shenandoah/mode/shenandoahMode.hpp line 44: > >> 42: if ((name)) { \ >> 43: const char* msg = "GC mode needs -XX:-" #name " to work correctly"; \ >> 44: vm_exit_during_initialization("Error", msg); \ > > Same as above, can be inlined into one call.
No need for the temporary variable. Please see my comment in the thread above. > src/hotspot/share/utilities/formatBuffer.hpp line 124: > >> 122: // If compilation fails because of ambiguity between this and real constructor, you >> 123: // could drop err_msg use at all. >> 124: inline FormatErrBuffer(const char* msg) { ShouldNotReachHere(); } > > I do not think this is a good idea (apart from it being too complex for a not that serious issue). > > The asserts fire, of course, only at runtime. But this function is used usually in some error context, as part of error reporting. I do not think our tests cover all those paths. > > Either somehow make this a compile time error or just leave it as it is. We also could, since this buffer object is used usually as input for vm_exit_during_initialization(), give that function a var-arg overload. Actually, this code rejects a single string argument at compile time. It relies on two constructors to introduce an ambiguity in overload resolution, which makes the C++ compiler complain and abort compilation. The comment above the dummy constructor should clarify that for anyone who hits the compile error. inline FormatErrBuffer(const char* format, ...) ATTRIBUTE_PRINTF(2, 3); inline FormatErrBuffer(const char* msg) { ShouldNotReachHere(); } As a sanity check, I've used this sample code https://godbolt.org/z/szs6rE And the cases that were fixed have been detected by the compiler. I don't think this problem is a serious issue either. But to me, it is a minor increase in code complexity that ensures the minor problem of an extra string copy will never appear again.
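The overload-ambiguity argument above can be checked in isolation. The following is only a minimal sketch, not the HotSpot code: `FormatErrBufferSketch`, `single_string_rejected`, `formatted_use_accepted` and `check_sketch` are hypothetical names invented for illustration, and only the two constructor signatures mirror the ones quoted in the message. It assumes that a call with a lone `const char*` matches the variadic and the single-argument constructor equally well, which the standard type trait `std::is_constructible` then reports as "not constructible".

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for HotSpot's FormatErrBuffer. Only the two
// constructor signatures follow the thread; the bodies are placeholders.
struct FormatErrBufferSketch {
  // The "real" printf-style constructor.
  FormatErrBufferSketch(const char* format, ...) { (void)format; }
  // Dummy constructor whose only job is to make a single-string call ambiguous.
  FormatErrBufferSketch(const char* msg) { (void)msg; }
};

// A lone string matches both constructors equally well, so such a
// construction is ill-formed and the trait reports false.
static_assert(!std::is_constructible<FormatErrBufferSketch, const char*>::value,
              "single-string construction should be rejected as ambiguous");

// With a formatting argument only the variadic constructor is viable.
static_assert(std::is_constructible<FormatErrBufferSketch, const char*, int>::value,
              "format string plus argument should still compile");

// Runtime-visible views of the same compile-time facts.
bool single_string_rejected() {
  return !std::is_constructible<FormatErrBufferSketch, const char*>::value;
}

bool formatted_use_accepted() {
  return std::is_constructible<FormatErrBufferSketch, const char*, int>::value;
}

// Small self-check that a caller could run from a test.
void check_sketch() {
  assert(single_string_rejected());
  assert(formatted_use_accepted());
}
```

In this sketch the body of the dummy constructor is never reached for a single-string call, because such a call does not compile in the first place; that is the point made in the reply above.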
------------- PR: https://git.openjdk.java.net/jdk/pull/905 From rkennke at openjdk.java.net Wed Oct 28 22:12:59 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 28 Oct 2020 22:12:59 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v19] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: <-ziqs4ofM_qJ3qmhcOFvAm9sbfO2AfAAAfiL4M6JgNc=.d7bd465c-a120-4aca-a159-f237950e085e@github.com> > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity) have been processed during a GC pause. Workloads that make heavy use of such weak references will therefore potentially cause significant GC pauses. > > There are 3 main items that contribute to pause time linear in the number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through the subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires extending the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable.
Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. > 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable. > > Testing: hotspot_gc_shenadoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 73 commits: - Merge branch 'master' into shenandoah-concurrent-weakrefs - Consolidate native-LRB invocation - Add ShenandoahWorkerPolicy entry for conc-weak-refs - Merge branch 'master' into shenandoah-concurrent-weakrefs - Better encapsulation of the bitmap within ShMarkingContext - Fix docs in shenandoahReferenceProcessor.hpp - Change CWR to CWRF phase timing to avoid clash with conc-weak-roots - Fix copyrights of shenandoahMarkBitMap.* - Move back before-evac verification to where it has been - Whitespace fixes - ... and 63 more: https://git.openjdk.java.net/jdk/compare/3f20612e...0561c2a1 ------------- Changes: https://git.openjdk.java.net/jdk/pull/505/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=18 Stats: 2387 lines in 54 files changed: 1630 ins; 565 del; 192 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From sviswanathan at openjdk.java.net Wed Oct 28 22:24:46 2020 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Wed, 28 Oct 2020 22:24:46 GMT Subject: RFR: 8255550: x86: Assembler::cmpq(Address dst, Register src) encoding is incorrect In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 18:24:10 GMT, Aleksey Shipilev wrote: > Compare: > > void Assembler::cmpq(Address dst, Register src) { > InstructionMark im(this); > emit_int16(get_prefixq(dst, src), 0x3B); > emit_operand(src, dst); > } > > void Assembler::cmpq(Register dst, Address src) { > InstructionMark im(this); > emit_int16(get_prefixq(src, dst), 0x3B); > emit_operand(dst, src); > } > > They use the same opcode -- `0x3B`, which is for `CMP r, r/m`. While `cmpq(Address,Register)` actually should be using `0x39` for `CMP r/m, r`. I also suspect they emit basically the same instruction, because the `get_prefixq` and `emit_operand` argument order is irrelevant. 
> > AFAIU, it does not break horribly, because the `cmpq(Address,Register)` is not used anywhere except the new code in `MacroAssembler::safepoint_poll`, added by [JDK-8253180](https://bugs.openjdk.java.net/browse/JDK-8253180). This was found by Zhengyu, when he tried to enable that new code on x86_32 by inverting `cmpq(addr, reg); jcc(above, slow_path)` to `cmpptr(reg, addr); jcc(belowEquals, slow_path)`. Then, everything blew up, because the semantics of `cmpq(addr,reg)` was wrong, and this inversion was subtly broken. > > Current candidate patch encodes this `cmpq` properly. Since that changes the semantics, I had to flip the condition code in its only use. I opted to do this, because _maybe_ some code in downstream projects want to use this odd `cmpq`. Although even if so, the uses could be trivially rewritten. > > Alternatives: > - I considered removing `cmpq(Address,Register)` altogether, but it would require more work to untangle `cmpptr(Address,Register)` and `cmpptr(Address,AddressLiteral)` for x86_32. > - We can also split out `MacroAssembler::safepoint_poll` change to use `cmpq(Register,Address)` to begin with, but current shape gives us a way to test the encoding. > > Additional testing: > - [x] tier1 with Shenandoah (a few failures are pre-existing) > - [x] tier1 with Z (AFAICS, all failing tests are OOME'ing or break SA, and probably are problem-listed) The fix looks good to me. cmpq(Address,Register) should be using 0x39 as the opcode. 
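To make the opcode point concrete, here is a small stand-alone sketch of the two encodings. It is not the HotSpot assembler: `Reg`, `encode`, `cmpq_mem_reg`, `cmpq_reg_mem` and `above` are hypothetical names, and the encoder is deliberately restricted to `[base]` addressing with low registers that need neither a SIB byte nor a displacement. The opcode values follow the thread (`0x39` for `CMP r/m, r`, `0x3B` for `CMP r, r/m`); the REX.W and ModRM layout is the standard x86-64 one.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical register subset for this sketch; values are the x86-64
// register numbers. rsp/rbp and r8-r15 are left out on purpose because
// they need SIB, displacement, or REX extension bits.
enum Reg : uint8_t { rax = 0, rcx = 1, rdx = 2, rbx = 3, rsi = 6, rdi = 7 };

// Emits REX.W, the opcode, and a ModRM byte with mod=00 (plain [base]).
static std::vector<uint8_t> encode(uint8_t opcode, Reg reg, Reg base) {
  const uint8_t rex_w = 0x48;  // 0100 W=1 R=0 X=0 B=0
  const uint8_t modrm = static_cast<uint8_t>((reg << 3) | base);
  return {rex_w, opcode, modrm};
}

// CMP r/m64, r64 (REX.W + 39 /r): flags are set from [base] - src.
std::vector<uint8_t> cmpq_mem_reg(Reg base, Reg src) {
  return encode(0x39, src, base);
}

// CMP r64, r/m64 (REX.W + 3B /r): flags are set from dst - [base].
std::vector<uint8_t> cmpq_reg_mem(Reg dst, Reg base) {
  return encode(0x3B, dst, base);
}

// Models the "above" condition: unsigned first operand > second operand.
bool above(uint64_t first, uint64_t second) { return first > second; }
```

Both forms share the same ModRM byte for a given register and base; only the opcode tells the CPU which operand comes first. That is why emitting `0x3B` for both overloads silently turns `cmpq(addr, reg)` into a comparison of `reg` against `[addr]`, and why any `above`/`below` condition that follows it tests the opposite relation from the one the call site reads as.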
------------- PR: https://git.openjdk.java.net/jdk/pull/910 From rkennke at openjdk.java.net Wed Oct 28 22:32:03 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 28 Oct 2020 22:32:03 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v20] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity) have been processed during a GC pause. Workloads that make heavy use of such weak references will therefore potentially cause significant GC pauses. > > There are 3 main items that contribute to pause time linear in the number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through the subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires extending the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all objects starting from the referent.
When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. > 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable. > > Testing: hotspot_gc_shenadoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Add missing merge changes of shenandoahTaskQueue.hpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/505/files - new: https://git.openjdk.java.net/jdk/pull/505/files/0561c2a1..072e3817 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=19 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=18-19 Stats: 38 lines in 1 file changed: 28 ins; 0 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From kvn at openjdk.java.net Wed Oct 28 23:13:52 2020 From: kvn at 
openjdk.java.net (Vladimir Kozlov) Date: Wed, 28 Oct 2020 23:13:52 GMT Subject: RFR: 8255550: x86: Assembler::cmpq(Address dst, Register src) encoding is incorrect In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 18:24:10 GMT, Aleksey Shipilev wrote: > Compare: > > void Assembler::cmpq(Address dst, Register src) { > InstructionMark im(this); > emit_int16(get_prefixq(dst, src), 0x3B); > emit_operand(src, dst); > } > > void Assembler::cmpq(Register dst, Address src) { > InstructionMark im(this); > emit_int16(get_prefixq(src, dst), 0x3B); > emit_operand(dst, src); > } > > They use the same opcode -- `0x3B`, which is for `CMP r, r/m`. While `cmpq(Address,Register)` actually should be using `0x39` for `CMP r/m, r`. I also suspect they emit basically the same instruction, because the `get_prefixq` and `emit_operand` argument order is irrelevant. > > AFAIU, it does not break horribly, because the `cmpq(Address,Register)` is not used anywhere except the new code in `MacroAssembler::safepoint_poll`, added by [JDK-8253180](https://bugs.openjdk.java.net/browse/JDK-8253180). This was found by Zhengyu, when he tried to enable that new code on x86_32 by inverting `cmpq(addr, reg); jcc(above, slow_path)` to `cmpptr(reg, addr); jcc(belowEquals, slow_path)`. Then, everything blew up, because the semantics of `cmpq(addr,reg)` was wrong, and this inversion was subtly broken. > > Current candidate patch encodes this `cmpq` properly. Since that changes the semantics, I had to flip the condition code in its only use. I opted to do this, because _maybe_ some code in downstream projects want to use this odd `cmpq`. Although even if so, the uses could be trivially rewritten. > > Alternatives: > - I considered removing `cmpq(Address,Register)` altogether, but it would require more work to untangle `cmpptr(Address,Register)` and `cmpptr(Address,AddressLiteral)` for x86_32. 
> - We can also split out `MacroAssembler::safepoint_poll` change to use `cmpq(Register,Address)` to begin with, but current shape gives us a way to test the encoding. > > Additional testing: > - [x] tier1 with Shenandoah (a few failures are pre-existing) > - [x] tier1 with Z (AFAICS, all failing tests are OOME'ing or break SA, and probably are problem-listed) Good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/910 From dholmes at openjdk.java.net Thu Oct 29 05:07:44 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 29 Oct 2020 05:07:44 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) [v3] In-Reply-To: <6rybIs1odojqcKQ6zzl39wj2IxGxnMCVQXgpeajzqns=.64424fc9-74d7-4e2b-9d89-df6f7da44770@github.com> References: <6rybIs1odojqcKQ6zzl39wj2IxGxnMCVQXgpeajzqns=.64424fc9-74d7-4e2b-9d89-df6f7da44770@github.com> Message-ID: <3D8q_g6SYuSs4guk3x1SVKxinDjSuh3J9Oa8CSbs8tQ=.11c0f642-dcd2-4b5f-aca9-64efcba604c3@github.com> On Wed, 28 Oct 2020 16:52:01 GMT, Gerard Ziemski wrote: >> hi all, >> >> Please review this simple fix for POSIX platforms, which addresses a time out that occurs while handling a crash with UseOSErrorReporting turned ON. >> >> It appears that "UseOSErrorReporting" flag was only ever meant to be used on Windows platform and was mistakenly left available for other platforms. In this fix we make sure to only use the flag on Windows platform and make it a NOP for other platforms. >> >> Note #1: A similar hang issue occurs today even on Windows, with the only difference being that before a process times out (takes 2 minutes) it runs out of stack space in about 250 loops, so that's the only reason it doesn't linger for that long. 
Windows issue is tracked separately by https://bugs.openjdk.java.net/browse/JDK-8250782 >> >> Note #2: Creating native crash log (on macOS) is a non-trivial, research wise effort, that is tracked by https://bugs.openjdk.java.net/browse/JDK-8237727 >> >> Note #3 Removal of the "UseOSErrorReporting" flag will be depended on whether we can do #2 and at that time we can decide whether to keep it and implement it for other platforms or whether to remove it, provided that #2 can not be done reliably. > > Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: > > make UseOSErrorReporting flag Windows only Looks fine to me, but will require a trivial CSR request. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/813 From dholmes at openjdk.java.net Thu Oct 29 05:07:46 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 29 Oct 2020 05:07:46 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) [v3] In-Reply-To: <0peTckbitQpAZsGhPCZsBBh74gmIBgea50MLkLezS94=.f6187eec-ff8c-40e1-aad1-315bc4de70ee@github.com> References: <6rybIs1odojqcKQ6zzl39wj2IxGxnMCVQXgpeajzqns=.64424fc9-74d7-4e2b-9d89-df6f7da44770@github.com> <0peTckbitQpAZsGhPCZsBBh74gmIBgea50MLkLezS94=.f6187eec-ff8c-40e1-aad1-315bc4de70ee@github.com> Message-ID: On Wed, 28 Oct 2020 17:11:05 GMT, Thomas Stuefe wrote: >> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: >> >> make UseOSErrorReporting flag Windows only > > src/hotspot/share/utilities/vmError.cpp line 1440: > >> 1438: // while searching for the exception handler. Only the first level needs >> 1439: // to be reported. >> 1440: if (UseOSErrorReporting && log_done) return; > > This has nothing to do with you patch, which is fine: > > But the more I look at this line the more confused I get. I am not sure what the point is. 
log_done means we have written the hs-err file successfully and got a signal after the call to VMError::report() but before returning from this function resp. before calling abort. > > That covers a whole section between lines 1545 and 1629, I am surprised how much we do there. I am almost certain some things will not behave well when secondary crashes happen and we re-enter this function, e.g.: > 1556 JFR_ONLY(Jfr::on_vm_shutdown(true);) > 1557 > 1558 if (PrintNMTStatistics) { > 1559 fdStream fds(fd_out); > 1560 MemTracker::final_report(&fds); > 1561 } > both should be guarded against re-entering when this function is called repeatedly and they have had their song-and-dance already. Otherwise e.g. we may see the NMT output twice if a signal occurs after line 1560. > > All idle musings, potentially a future cleanup. That code was added by: https://bugs.openjdk.java.net/browse/JDK-4997835 But it is not clear how that code actually relates to that bug report! The comment indicates we will call report_and_die multiple times as we search for the exception handler - in which case we only want to report once - but I'm not seeing where these calls originate from. ------------- PR: https://git.openjdk.java.net/jdk/pull/813 From dholmes at openjdk.java.net Thu Oct 29 05:11:46 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 29 Oct 2020 05:11:46 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) [v3] In-Reply-To: <6rybIs1odojqcKQ6zzl39wj2IxGxnMCVQXgpeajzqns=.64424fc9-74d7-4e2b-9d89-df6f7da44770@github.com> References: <6rybIs1odojqcKQ6zzl39wj2IxGxnMCVQXgpeajzqns=.64424fc9-74d7-4e2b-9d89-df6f7da44770@github.com> Message-ID: On Wed, 28 Oct 2020 16:52:01 GMT, Gerard Ziemski wrote: >> hi all, >> >> Please review this simple fix for POSIX platforms, which addresses a time out that occurs while handling a crash with UseOSErrorReporting turned ON. 
>> >> It appears that "UseOSErrorReporting" flag was only ever meant to be used on Windows platform and was mistakenly left available for other platforms. In this fix we make sure to only use the flag on Windows platform and make it a NOP for other platforms. >> >> Note #1: A similar hang issue occurs today even on Windows, with the only difference being that before a process times out (takes 2 minutes) it runs out of stack space in about 250 loops, so that's the only reason it doesn't linger for that long. Windows issue is tracked separately by https://bugs.openjdk.java.net/browse/JDK-8250782 >> >> Note #2: Creating native crash log (on macOS) is a non-trivial, research wise effort, that is tracked by https://bugs.openjdk.java.net/browse/JDK-8237727 >> >> Note #3 Removal of the "UseOSErrorReporting" flag will be depended on whether we can do #2 and at that time we can decide whether to keep it and implement it for other platforms or whether to remove it, provided that #2 can not be done reliably. > > Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: > > make UseOSErrorReporting flag Windows only src/hotspot/os/windows/globals_windows.hpp line 39: > 37: constraint) \ > 38: \ > 39: product(bool, UseOSErrorReporting, false \ Comma missing after "false" ------------- PR: https://git.openjdk.java.net/jdk/pull/813 From iklam at openjdk.java.net Thu Oct 29 05:37:06 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 29 Oct 2020 05:37:06 GMT Subject: RFR: 8255285: Move JVMFlag origins into a new enum JVMFlagOrigin [v3] In-Reply-To: References: Message-ID: > Many JVM function take an `JVMFlag::Flags` parameter to indicate the origin of the flag -- i.e., "who is setting this flag". E.g., in arguments.hpp: > > static bool parse_argument(const char* arg, JVMFlag::Flags origin); > > However, `JVMFlag::Flags` contains many other bits that are unrelated to the origin. 
We should add a new enum `JVMFlagOrigin` that has only the valid values for the origin. This makes it possible to do more type-safety checks at C++ compilation time. > > This patch also renamed the confusing bit `JVMFlag::ORIG_COMMAND_LINE` to `WAS_SET_IN_COMMAND_LINE` and added documentation, so that it won't be confused with `JVMFlagOrigin::COMMAND_LINE`. Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: - fixed build - Merge branch 'master' into 8255285-new-enum-JVMFlagOrigin - renamed WAS_SET_IN_COMMAND_LINE to WAS_SET_ON_COMMAND_LINE - Removed aliases of JVMFlagOrigin::X as JVMFlag::X - fixed whitespaces - jvmflagorigin ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/823/files - new: https://git.openjdk.java.net/jdk/pull/823/files/53fed1b0..b1d53802 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=823&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=823&range=01-02 Stats: 76759 lines in 1201 files changed: 56580 ins; 15008 del; 5171 mod Patch: https://git.openjdk.java.net/jdk/pull/823.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/823/head:pull/823 PR: https://git.openjdk.java.net/jdk/pull/823 From shade at openjdk.java.net Thu Oct 29 06:20:43 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 29 Oct 2020 06:20:43 GMT Subject: RFR: 8255550: x86: Assembler::cmpq(Address dst, Register src) encoding is incorrect In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 23:10:54 GMT, Vladimir Kozlov wrote: >> Compare: >> >> void Assembler::cmpq(Address dst, Register src) { >> InstructionMark im(this); >> emit_int16(get_prefixq(dst, src), 0x3B); >> emit_operand(src, dst); >> } >> >> void Assembler::cmpq(Register dst, Address src) { >> InstructionMark im(this); >> emit_int16(get_prefixq(src, dst), 
0x3B); >> emit_operand(dst, src); >> } >> >> They use the same opcode -- `0x3B`, which is for `CMP r, r/m`. While `cmpq(Address,Register)` actually should be using `0x39` for `CMP r/m, r`. I also suspect they emit basically the same instruction, because the `get_prefixq` and `emit_operand` argument order is irrelevant. >> >> AFAIU, it does not break horribly, because the `cmpq(Address,Register)` is not used anywhere except the new code in `MacroAssembler::safepoint_poll`, added by [JDK-8253180](https://bugs.openjdk.java.net/browse/JDK-8253180). This was found by Zhengyu, when he tried to enable that new code on x86_32 by inverting `cmpq(addr, reg); jcc(above, slow_path)` to `cmpptr(reg, addr); jcc(belowEquals, slow_path)`. Then, everything blew up, because the semantics of `cmpq(addr,reg)` was wrong, and this inversion was subtly broken. >> >> Current candidate patch encodes this `cmpq` properly. Since that changes the semantics, I had to flip the condition code in its only use. I opted to do this, because _maybe_ some code in downstream projects want to use this odd `cmpq`. Although even if so, the uses could be trivially rewritten. >> >> Alternatives: >> - I considered removing `cmpq(Address,Register)` altogether, but it would require more work to untangle `cmpptr(Address,Register)` and `cmpptr(Address,AddressLiteral)` for x86_32. >> - We can also split out `MacroAssembler::safepoint_poll` change to use `cmpq(Register,Address)` to begin with, but current shape gives us a way to test the encoding. >> >> Additional testing: >> - [x] tier1 with Shenandoah (a few failures are pre-existing) >> - [x] tier1 with Z (AFAICS, all failing tests are OOME'ing or break SA, and probably are problem-listed) > > Good. Thanks for review @vnkozlov and @sviswa7! @fisk, does the changed code in `safepoint_poll` still looks good for you? 
------------- PR: https://git.openjdk.java.net/jdk/pull/910 From jbhateja at openjdk.java.net Thu Oct 29 06:33:50 2020 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Thu, 29 Oct 2020 06:33:50 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v4] In-Reply-To: References: Message-ID: <6MClf7up0tikZCf-1JAmKXaNMstf2aELFl3ArqQU7DE=.50c1fa2a-93e5-4501-973a-84a942e6d409@github.com> On Mon, 19 Oct 2020 18:33:22 GMT, Vladimir Kozlov wrote: >> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: >> >> - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 >> - Replacing explicit type checks with existing type checking routines >> - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 >> - 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions. > > There is regression after 8252847 changes: 8254890. > It should be fixed before we proceed with these changes. @vnkozlov , @neliasso , @nsjian , kindly let me know if there are further review comments on this patch. ------------- PR: https://git.openjdk.java.net/jdk/pull/302 From redestad at openjdk.java.net Thu Oct 29 07:27:49 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Thu, 29 Oct 2020 07:27:49 GMT Subject: RFR: 8255285: Move JVMFlag origins into a new enum JVMFlagOrigin [v3] In-Reply-To: References: Message-ID: <3LBoCf3p7ka9vYFoRk-1edEOIZKGOk8RM1GU_lU1ewI=.05cee9e6-2dd4-47b3-8350-df78fee62546@github.com> On Thu, 29 Oct 2020 05:37:06 GMT, Ioi Lam wrote: >> Many JVM function take an `JVMFlag::Flags` parameter to indicate the origin of the flag -- i.e., "who is setting this flag". 
E.g., in arguments.hpp: >> >> static bool parse_argument(const char* arg, JVMFlag::Flags origin); >> >> However, `JVMFlag::Flags` contains many other bits that are unrelated to the origin. We should add a new enum `JVMFlagOrigin` that has only the valid values for the origin. This makes it possible to do more type-safety checks at C++ compilation time. >> >> This patch also renamed the confusing bit `JVMFlag::ORIG_COMMAND_LINE` to `WAS_SET_IN_COMMAND_LINE` and added documentation, so that it won't be confused with `JVMFlagOrigin::COMMAND_LINE`. > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: > > - fixed build > - Merge branch 'master' into 8255285-new-enum-JVMFlagOrigin > - renamed WAS_SET_IN_COMMAND_LINE to WAS_SET_ON_COMMAND_LINE > - Removed aliases of JVMFlagOrigin::X as JVMFlag::X > - fixed whitespaces > - jvmflagorigin Marked as reviewed by redestad (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/823 From eosterlund at openjdk.java.net Thu Oct 29 07:30:45 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 29 Oct 2020 07:30:45 GMT Subject: RFR: 8255550: x86: Assembler::cmpq(Address dst, Register src) encoding is incorrect In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 18:24:10 GMT, Aleksey Shipilev wrote: > Compare: > > void Assembler::cmpq(Address dst, Register src) { > InstructionMark im(this); > emit_int16(get_prefixq(dst, src), 0x3B); > emit_operand(src, dst); > } > > void Assembler::cmpq(Register dst, Address src) { > InstructionMark im(this); > emit_int16(get_prefixq(src, dst), 0x3B); > emit_operand(dst, src); > } > > They use the same opcode -- `0x3B`, which is for `CMP r, r/m`. While `cmpq(Address,Register)` actually should be using `0x39` for `CMP r/m, r`. 
I also suspect they emit basically the same instruction, because the `get_prefixq` and `emit_operand` argument order is irrelevant. > > AFAIU, it does not break horribly, because the `cmpq(Address,Register)` is not used anywhere except the new code in `MacroAssembler::safepoint_poll`, added by [JDK-8253180](https://bugs.openjdk.java.net/browse/JDK-8253180). This was found by Zhengyu, when he tried to enable that new code on x86_32 by inverting `cmpq(addr, reg); jcc(above, slow_path)` to `cmpptr(reg, addr); jcc(belowEquals, slow_path)`. Then, everything blew up, because the semantics of `cmpq(addr,reg)` was wrong, and this inversion was subtly broken. > > Current candidate patch encodes this `cmpq` properly. Since that changes the semantics, I had to flip the condition code in its only use. I opted to do this, because _maybe_ some code in downstream projects want to use this odd `cmpq`. Although even if so, the uses could be trivially rewritten. > > Alternatives: > - I considered removing `cmpq(Address,Register)` altogether, but it would require more work to untangle `cmpptr(Address,Register)` and `cmpptr(Address,AddressLiteral)` for x86_32. > - We can also split out `MacroAssembler::safepoint_poll` change to use `cmpq(Register,Address)` to begin with, but current shape gives us a way to test the encoding. > > Additional testing: > - [x] tier1 with Shenandoah (a few failures are pre-existing) > - [x] tier1 with Z (AFAICS, all failing tests are OOME'ing or break SA, and probably are problem-listed) Thanks for poking me. I would prefer to change to the cmpq instruction that has the opposite order in the stack watermark barrier instead. Everywhere in the code I talk about the condition being sp being "above" watermark. Changing it to less makes me twist my head in ways that heads should not twist. ------------- Changes requested by eosterlund (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/910 From shade at openjdk.java.net Thu Oct 29 07:34:45 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 29 Oct 2020 07:34:45 GMT Subject: RFR: 8255550: x86: Assembler::cmpq(Address dst, Register src) encoding is incorrect In-Reply-To: References: Message-ID: <-5qhSYs7-YkY1N2eZTHZo6JUORs3nzdNJkKWfk-UHTA=.e4178ac7-1973-4e2d-a445-9d831356747e@github.com> On Thu, 29 Oct 2020 07:28:29 GMT, Erik Österlund wrote: > Thanks for poking me. I would prefer to change to the cmpq instruction that has the opposite order in the stack watermark barrier instead. Everywhere in the code I talk about the condition being sp being "above" watermark. Changing it to less makes me twist my head in ways that heads should not twist. Ok, so let's do this: can you change the parameter order in `cmpq` instruction in `safepoint_poll`, run the tests you probably know are sensitive to this, and I'll drop the `safepoint_poll` hunk from here after that change lands? ------------- PR: https://git.openjdk.java.net/jdk/pull/910 From shade at openjdk.java.net Thu Oct 29 07:40:44 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 29 Oct 2020 07:40:44 GMT Subject: RFR: 8255550: x86: Assembler::cmpq(Address dst, Register src) encoding is incorrect In-Reply-To: <-5qhSYs7-YkY1N2eZTHZo6JUORs3nzdNJkKWfk-UHTA=.e4178ac7-1973-4e2d-a445-9d831356747e@github.com> References: <-5qhSYs7-YkY1N2eZTHZo6JUORs3nzdNJkKWfk-UHTA=.e4178ac7-1973-4e2d-a445-9d831356747e@github.com> Message-ID: On Thu, 29 Oct 2020 07:31:48 GMT, Aleksey Shipilev wrote: > > Thanks for poking me. I would prefer to change to the cmpq instruction that has the opposite order in the stack watermark barrier instead. Everywhere in the code I talk about the condition being sp being "above" watermark. Changing it to less makes me twist my head in ways that heads should not twist.
> > Ok, so let's do this: can you change the parameter order in `cmpq` instruction in `safepoint_poll`, run the tests you probably know are sensitive to this, and I'll drop the `safepoint_poll` hunk from here after that change lands? What I meant was "in a separate PR", not to mess up with the change here. I think it amounts to: diff --git a/src/hotspot/cpu/x86/macroAssembler_x86.cpp b/src/hotspot/cpu/x86/macroAssembler_x86.cpp index a8da3aa17b8..81303ea76c4 100644 --- a/src/hotspot/cpu/x86/macroAssembler_x86.cpp +++ b/src/hotspot/cpu/x86/macroAssembler_x86.cpp @@ -2765,7 +2765,7 @@ void MacroAssembler::safepoint_poll(Label& slow_path, Register thread_reg, bool if (at_return) { // Note that when in_nmethod is set, the stack pointer is incremented before the poll. Therefore, // we may safely use rsp instead to perform the stack watermark check. - cmpq(Address(thread_reg, Thread::polling_word_offset()), in_nmethod ? rsp : rbp); + cmpq(in_nmethod ? rsp : rbp, Address(thread_reg, Thread::polling_word_offset())); jcc(Assembler::above, slow_path); return; } I can do that, if you want, and if you trust `tier1` is enough. ------------- PR: https://git.openjdk.java.net/jdk/pull/910 From shade at openjdk.java.net Thu Oct 29 07:52:43 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 29 Oct 2020 07:52:43 GMT Subject: RFR: 8255416: Investigate err_msg to detect unnecessary uses [v2] In-Reply-To: <0_DgjolEWtNQOrJDn_ydLCTYH1gfPwE2tmdoQ9R3hHk=.cecf2d74-6c79-4270-905e-c651f076bbda@github.com> References: <0_DgjolEWtNQOrJDn_ydLCTYH1gfPwE2tmdoQ9R3hHk=.cecf2d74-6c79-4270-905e-c651f076bbda@github.com> Message-ID: <5w-53vDjzVgTBPYbZBkfo6hVSlOWJXDTM9oMdvA7LuU=.e76c93fc-919f-4406-bb71-7a00ad622901@github.com> On Wed, 28 Oct 2020 21:53:11 GMT, Anton Kozlov wrote: >> In fact, maybe just inline this literal down in `vm_exit_during_initialization` invocation. > > Of course it should align to the left, like in the rest of hotspot. Thanks for noticing! 
> > As for inlining of the message, there are pros and cons. The arguments should be aligned, so it would become > do { \ > if (!(name)) { \ > vm_exit_during_initialization("Error", msg); \ > "GC mode needs -XX:+" #name " to work correctly"; \ > } \ > } while (0) > > (not pretty at all, you see). > > After that, there is an option to split the string into multiple lines in attempt to shrink the length, but then the line would become ungreppable (much bigger evil). > > Among options, I decided to respect intention and style of original author, who introduced a variable for err_msg, which is unusual and should have some valid rationale behind, like the one above. I _am_ the author of those `SHENANDOAH_CHECK_FLAG_SET` blocks ;) This should be fine: do { \ if (!(name)) { \ vm_exit_during_initialization("Error", \ "GC mode needs -XX:+" #name " to work correctly"); \ } \ } while (0) ------------- PR: https://git.openjdk.java.net/jdk/pull/905 From njian at openjdk.java.net Thu Oct 29 08:02:44 2020 From: njian at openjdk.java.net (Ningsheng Jian) Date: Thu, 29 Oct 2020 08:02:44 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v9] In-Reply-To: References: <8Ryyxuf5P2D6WNyj4riYCTgN0U6WLrLpBmxhNbnmPpQ=.b2ed5660-99d0-49d1-83e0-8b2de518d7b8@github.com> Message-ID: On Fri, 23 Oct 2020 12:00:55 GMT, Jatin Bhateja wrote: >> src/hotspot/share/opto/vectornode.cpp line 775: >> >>> 773: VectorMaskGenNode* make(int opc, Node* src, const Type* ty, const Type* ety) { >>> 774: return new VectorMaskGenNode(src, ty, ety); >>> 775: } >> >> These are not used? > > This is a just a helper routine not used currently though. So maybe the nodes creation code in generate_partial_inlining_block() can use these helper functions? 
------------- PR: https://git.openjdk.java.net/jdk/pull/302 From njian at openjdk.java.net Thu Oct 29 08:05:43 2020 From: njian at openjdk.java.net (Ningsheng Jian) Date: Thu, 29 Oct 2020 08:05:43 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v9] In-Reply-To: References: <8Ryyxuf5P2D6WNyj4riYCTgN0U6WLrLpBmxhNbnmPpQ=.b2ed5660-99d0-49d1-83e0-8b2de518d7b8@github.com> Message-ID: On Fri, 23 Oct 2020 12:00:46 GMT, Jatin Bhateja wrote: > As currently there is no support for mask registers in RA, for X86 long ideal type is sufficient for a mask producing node (def operand is a mask register) ; But for complete support returning Op_RegVMask as an ideal_reg() type for masked Ideal node should do the trick without creating an explicit new ideal Type for mask generating nodes. Spill sizes and number of slots may be different for X86 and ARM (SVE). So, do you have a plan to support Op_RegVMask? In SVE, we will use this kind of node for mask/predicate type. > Shallow copy during Node::clone should be sufficient here since encapsulated element type will be preserved. Checking the code in Node::clone() again, I think the object copy relies on size_of(), so you need to override that to get the correct object size for copying. ------------- PR: https://git.openjdk.java.net/jdk/pull/302 From shade at openjdk.java.net Thu Oct 29 08:17:43 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 29 Oct 2020 08:17:43 GMT Subject: RFR: 8255550: x86: Assembler::cmpq(Address dst, Register src) encoding is incorrect In-Reply-To: References: <-5qhSYs7-YkY1N2eZTHZo6JUORs3nzdNJkKWfk-UHTA=.e4178ac7-1973-4e2d-a445-9d831356747e@github.com> Message-ID: On Thu, 29 Oct 2020 07:37:35 GMT, Aleksey Shipilev wrote: >>> Thanks for poking me. I would prefer to change to the cmpq instruction that has the opposite order in the stack watermark barrier instead. 
Everywhere in the code I talk about the condition being sp being "above" watermark. Changing it to less makes me twist my head in ways that heads should not twist. >> >> Ok, so let's do this: can you change the parameter order in `cmpq` instruction in `safepoint_poll`, run the tests you probably know are sensitive to this, and I'll drop the `safepoint_poll` hunk from here after that change lands? > >> > Thanks for poking me. I would prefer to change to the cmpq instruction that has the opposite order in the stack watermark barrier instead. Everywhere in the code I talk about the condition being sp being "above" watermark. Changing it to less makes me twist my head in ways that heads should not twist. >> >> Ok, so let's do this: can you change the parameter order in `cmpq` instruction in `safepoint_poll`, run the tests you probably know are sensitive to this, and I'll drop the `safepoint_poll` hunk from here after that change lands? > > What I meant was "in a separate PR", not to mess up with the change here. I think it amounts to: > > diff --git a/src/hotspot/cpu/x86/macroAssembler_x86.cpp b/src/hotspot/cpu/x86/macroAssembler_x86.cpp > index a8da3aa17b8..81303ea76c4 100644 > --- a/src/hotspot/cpu/x86/macroAssembler_x86.cpp > +++ b/src/hotspot/cpu/x86/macroAssembler_x86.cpp > @@ -2765,7 +2765,7 @@ void MacroAssembler::safepoint_poll(Label& slow_path, Register thread_reg, bool > if (at_return) { > // Note that when in_nmethod is set, the stack pointer is incremented before the poll. Therefore, > // we may safely use rsp instead to perform the stack watermark check. > - cmpq(Address(thread_reg, Thread::polling_word_offset()), in_nmethod ? rsp : rbp); > + cmpq(in_nmethod ? rsp : rbp, Address(thread_reg, Thread::polling_word_offset())); > jcc(Assembler::above, slow_path); > return; > } > > I can do that, if you want, and if you trust `tier1` is enough. 
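To make the operand-order point above concrete, here is a small standalone C++ sketch. It is not HotSpot code: `Cond`, `taken`, `swapped`, `intended` and `mis_encoded` are hypothetical names, and `CMP a, b` followed by an unsigned `jcc` is modeled as a plain unsigned comparison. It assumes, as described in this thread, that the old `cmpq(Address,Register)` effectively compared its operands in swapped order.

```cpp
#include <cstdint>

// Hypothetical model: "CMP a, b; jcc(cond)" for unsigned condition codes.
enum class Cond { above, below, aboveEqual, belowEqual };

// True when the jump would be taken after comparing a against b.
constexpr bool taken(std::uint64_t a, std::uint64_t b, Cond c) {
  return c == Cond::above      ? a >  b
       : c == Cond::below      ? a <  b
       : c == Cond::aboveEqual ? a >= b
       :                         a <= b;
}

// Condition code that keeps the same meaning once the operands are swapped.
constexpr Cond swapped(Cond c) {
  return c == Cond::above      ? Cond::below
       : c == Cond::below      ? Cond::above
       : c == Cond::aboveEqual ? Cond::belowEqual
       :                         Cond::aboveEqual;
}

// What a source-level cmpq(mem, reg); jcc(c) reads like.
constexpr bool intended(std::uint64_t mem, std::uint64_t reg, Cond c) {
  return taken(mem, reg, c);
}

// What the mis-encoded form tested (assumption taken from the thread):
// the same comparison with the operands swapped.
constexpr bool mis_encoded(std::uint64_t mem, std::uint64_t reg, Cond c) {
  return taken(reg, mem, c);
}

// Example values: stack pointer above the watermark.
constexpr std::uint64_t kWatermark = 0x1000;
constexpr std::uint64_t kSp        = 0x2000;

// The existing behaviour equals cmpq(reg, mem); jcc(above) ...
static_assert(mis_encoded(kWatermark, kSp, Cond::above) ==
              taken(kSp, kWatermark, Cond::above), "same test, same condition");
// ... while a correctly encoded cmpq(mem, reg) needs the flipped condition.
static_assert(intended(kWatermark, kSp, swapped(Cond::above)) ==
              taken(kSp, kWatermark, Cond::above), "swap operands, flip condition");
```

Under these assumptions, moving to the `(Register, Address)` form keeps `above` as the condition, whereas fixing the encoding of the `(Address, Register)` form in place would require flipping it.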
Forked `safepoint_poll` change to JDK-8255579, #924 ------------- PR: https://git.openjdk.java.net/jdk/pull/910 From akozlov at openjdk.java.net Thu Oct 29 08:20:45 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 29 Oct 2020 08:20:45 GMT Subject: RFR: 8255416: Investigate err_msg to detect unnecessary uses [v2] In-Reply-To: <5w-53vDjzVgTBPYbZBkfo6hVSlOWJXDTM9oMdvA7LuU=.e76c93fc-919f-4406-bb71-7a00ad622901@github.com> References: <0_DgjolEWtNQOrJDn_ydLCTYH1gfPwE2tmdoQ9R3hHk=.cecf2d74-6c79-4270-905e-c651f076bbda@github.com> <5w-53vDjzVgTBPYbZBkfo6hVSlOWJXDTM9oMdvA7LuU=.e76c93fc-919f-4406-bb71-7a00ad622901@github.com> Message-ID: On Thu, 29 Oct 2020 07:49:21 GMT, Aleksey Shipilev wrote: >> Of course it should align to the left, like in the rest of hotspot. Thanks for noticing! >> >> As for inlining of the message, there are pros and cons. The arguments should be aligned, so it would become >> do { \ >> if (!(name)) { \ >> vm_exit_during_initialization("Error", msg); \ >> "GC mode needs -XX:+" #name " to work correctly"; \ >> } \ >> } while (0) >> >> (not pretty at all, you see). >> >> After that, there is an option to split the string into multiple lines in attempt to shrink the length, but then the line would become ungreppable (much bigger evil). >> >> Among options, I decided to respect intention and style of original author, who introduced a variable for err_msg, which is unusual and should have some valid rationale behind, like the one above. > > I _am_ the author of those `SHENANDOAH_CHECK_FLAG_SET` blocks ;) > > This should be fine: > > do { \ > if (!(name)) { \ > vm_exit_during_initialization("Error", \ > "GC mode needs -XX:+" #name " to work correctly"); \ > } \ > } while (0) OK, if you're OK with that. But let's introduce some consistency with e.g. 
https://github.com/openjdk/jdk/blob/a804c6a6/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp#L420 and use 8 spaces for the argument indentation do { \ if (!(name)) { \ vm_exit_during_initialization("Error", \ "GC mode needs -XX:+" #name " to work correctly"); \ } \ } while (0) ------------- PR: https://git.openjdk.java.net/jdk/pull/905 From shade at openjdk.java.net Thu Oct 29 08:20:45 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 29 Oct 2020 08:20:45 GMT Subject: RFR: 8255416: Investigate err_msg to detect unnecessary uses [v2] In-Reply-To: References: <0_DgjolEWtNQOrJDn_ydLCTYH1gfPwE2tmdoQ9R3hHk=.cecf2d74-6c79-4270-905e-c651f076bbda@github.com> <5w-53vDjzVgTBPYbZBkfo6hVSlOWJXDTM9oMdvA7LuU=.e76c93fc-919f-4406-bb71-7a00ad622901@github.com> Message-ID: On Thu, 29 Oct 2020 08:15:42 GMT, Anton Kozlov wrote: >> I _am_ the author of those `SHENANDOAH_CHECK_FLAG_SET` blocks ;) >> >> This should be fine: >> >> do { \ >> if (!(name)) { \ >> vm_exit_during_initialization("Error", \ >> "GC mode needs -XX:+" #name " to work correctly"); \ >> } \ >> } while (0) > > OK, if you're OK with that. But let's introduce some consistency with e.g. https://github.com/openjdk/jdk/blob/a804c6a6/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp#L420 and use 8 spaces for the argument indentation > > do { \ > if (!(name)) { \ > vm_exit_during_initialization("Error", \ > "GC mode needs -XX:+" #name " to work correctly"); \ > } \ > } while (0) Fine with me! ------------- PR: https://git.openjdk.java.net/jdk/pull/905 From shade at openjdk.java.net Thu Oct 29 08:21:52 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 29 Oct 2020 08:21:52 GMT Subject: RFR: 8255579: x86: Use cmpq(Register,Address) in safepoint_poll Message-ID: JDK-8253180 added a new block in `safepoint_poll` that uses broken `cmpq` (JDK-8255550): it effectively does the comparison with operands swapped. 
It makes sense to use the non-broken `cmpq`, as to avoid changing the condition code and thus making the code less understandable. See the discussion in #910. Testing: - [x] `tier1` with Z (some SA tests fail, and some other fail with OOME -- seem to be expected/problem-listed) ------------- Commit messages: - 8255579: x86: Use cmpq(Register,Address) in safepoint_poll Changes: https://git.openjdk.java.net/jdk/pull/924/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=924&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255579 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/924.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/924/head:pull/924 PR: https://git.openjdk.java.net/jdk/pull/924 From shade at openjdk.java.net Thu Oct 29 08:26:48 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 29 Oct 2020 08:26:48 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: On Wed, 28 Oct 2020 19:55:37 GMT, Roman Kennke wrote: >> But that's the thing that gets my head spinning. Why do we call into `lrb_native` for referents? This contradicts the idea that "native-access is always uncompressed-oops". I think this tries to overload "native" with more meaning that it is equipped to carry. >> >> I think at very least it should say: >> >> if (ShenandoahBarrierSet::use_load_reference_barrier_native(decorators, type)) { >> // API impedance: when used on native refs, lrb-native necessarily works with full oops, >> // but when used for weak/phantom refs, it might need to work with narrow oops. >> // Therefore, we need to ask barrier code to look back at UseCompressedOops and >> // decide, when lrb-native is not IN_NATIVE. TODO: Resolve this API impedance. 
>> bool maybe_narrow_oop = (decorators & IN_NATIVE) == 0; >> load_reference_barrier_native(masm, dst, src, maybe_narrow_oop); >> } else { >> load_reference_barrier(masm, dst, src); >> } > > I thought about it today. This whole idea of 'LRB-native' is flawed. What it does is prevent resurrection of objects when loading from a field or off-heap-location that is 'weak' or 'phantom'. This has nothing to do with -native. It has to do with the field being not-strong. For this reason I think we should call that variant of LRB something like LRB-weak instead. This warrants a larger reshuffling that I'd do either before or after this change goes in. I'd probably also merge our 3(!) different runtime LRB impls into one, with templated path to prevent resurrection, etc. I guess the 2 interpreter LRB entries can also be unified, and differ only in the target entry being called. > > The only distinction for which IN_NATIVE is relevant is for figuring out whether or not the reference is compressed or not. That is all. OK, so that would resolve the `TODO` I suggested in the code comment above, right? If so, we can just put in the code comment for now. ------------- PR: https://git.openjdk.java.net/jdk/pull/505 From stuefe at openjdk.java.net Thu Oct 29 08:36:44 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 29 Oct 2020 08:36:44 GMT Subject: RFR: 8255416: Investigate err_msg to detect unnecessary uses [v2] In-Reply-To: <-P5id4x50oiTzG6JHl4GD-HJbqYM4juVm9IpsNvXlV4=.7e07be8c-8389-41e2-a563-1f3be709ac51@github.com> References: <-P5id4x50oiTzG6JHl4GD-HJbqYM4juVm9IpsNvXlV4=.7e07be8c-8389-41e2-a563-1f3be709ac51@github.com> Message-ID: On Wed, 28 Oct 2020 21:53:19 GMT, Anton Kozlov wrote: >> src/hotspot/share/utilities/formatBuffer.hpp line 124: >> >>> 122: // If compilation fails because of ambiguity between this and real constructor, you >>> 123: // could drop err_msg use at all. 
>>> 124: inline FormatErrBuffer(const char* msg) { ShouldNotReachHere(); } >> >> I do not think this is a good idea (apart from it being too complex for a not that serious issue). >> >> The asserts fire, of course, only at runtime. But this function is used usually in some error context, as part of error reporting. I do not think our tests cover all those paths. >> >> Either somehow make this a compile time error or just leave it as it is. We also could, since this buffer object is used usually as input for vm_exit_during_initialization(), give that function a var-arg overload. > > Actually, this code prevents a single string argument in compile time. It relies on two constructors to introduce ambiguity in overload resolution that makes C++ compiler complain and abort compilation. Comment above the dummy constructor should clarify that for anyone stepping on compile error. > inline FormatErrBuffer(const char* format, ...) ATTRIBUTE_PRINTF(2, 3); > inline FormatErrBuffer(const char* msg) { ShouldNotReachHere(); } > For sanity check, I've used this sample code https://godbolt.org/z/szs6rE > > And the cases that were fixed have been detected by the compiler. > > I don't think that this problem is a serious issue as well. But as for me, it is a minor code complexity increase to ensure that the minor problem of extra string copy will never appear again. Oh, now I see it. Smart. But it occurred to me that someone may want to start an err_msg buffer up with a string literal, only to then add additional content via FormatBuffer::append(). So initializing it with a literal and no arguments may be a valid use case. So I'd still prefer keeping this out. Alternatively, if you like to keep it in, could we just not implement the second constructor? Which should give us linker errors if the static check fails.
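The overload-ambiguity trick discussed above can be sketched in isolation. This is a minimal illustration, not the HotSpot class: `Buf` is a hypothetical stand-in, and the dummy constructor is only declared, never defined, which is one reading of the "do not implement the second constructor" suggestion.

```cpp
#include <type_traits>

// Hypothetical stand-in for a printf-style message buffer.
struct Buf {
  // Real constructor: format string plus arguments.
  Buf(const char*, ...) {}
  // Dummy overload, declared only. With exactly one argument both
  // constructors match equally well, so the call is ambiguous and
  // the compiler rejects it.
  Buf(const char*);
};

// A format string with at least one argument selects the variadic constructor.
static_assert(std::is_constructible<Buf, const char*, int>::value,
              "format plus argument compiles");

// A lone string is rejected at compile time because the call is ambiguous.
static_assert(!std::is_constructible<Buf, const char*>::value,
              "single string is ambiguous");

// Buf ok("value=%d", 42);   // compiles
// Buf bad("plain text");    // error: call of overloaded constructor is ambiguous
```

The `std::is_constructible` checks are only a way to state the outcome without breaking the build; in real code the rejected call simply fails to compile at the call site.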
>> src/hotspot/share/gc/shenandoah/mode/shenandoahMode.hpp line 44: >> >>> 42: if ((name)) { \ >>> 43: const char* msg = "GC mode needs -XX:-" #name " to work correctly"; \ >>> 44: vm_exit_during_initialization("Error", msg); \ >> >> Same as above, can be inlined into one call. No need for the temporary variable. > > Please see my comment in the thread above. Whatever you decide with Alexey is fine. The temporary variable is my least favorite option. ------------- PR: https://git.openjdk.java.net/jdk/pull/905 From akozlov at openjdk.java.net Thu Oct 29 08:42:57 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 29 Oct 2020 08:42:57 GMT Subject: RFR: 8255416: Investigate err_msg to detect unnecessary uses [v3] In-Reply-To: References: Message-ID: > Hi, > > When a single string without formatting arguments is provided to `err_msg`, it's redundancy, as the same message could be used without any err_msg. This is a follow-up to the discussion https://github.com/openjdk/jdk/pull/812#discussion_r511784050 > > Please review a change that makes `err_msg` with a single string to fail compilation. > > Detected uses of err_msg with a single string were eliminated as well. 
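As a small illustration of why no formatting buffer is needed for such messages: with preprocessor stringification and adjacent string-literal concatenation the whole text is already a single literal at compile time. The sketch below is standalone; `FLAG_MESSAGE`, `str_eq` and `SomeFlag` are hypothetical names, not identifiers from the patch.

```cpp
// Hypothetical macro: #name turns the flag identifier into a string literal,
// and adjacent literals are concatenated by the compiler.
#define FLAG_MESSAGE(name) "GC mode needs -XX:+" #name " to work correctly"

// Compile-time string comparison (C++11-style recursive constexpr).
constexpr bool str_eq(const char* a, const char* b) {
  return *a == *b && (*a == '\0' || str_eq(a + 1, b + 1));
}

// The message is a plain const char* literal; nothing is formatted at runtime.
constexpr const char* kExample = FLAG_MESSAGE(SomeFlag);

static_assert(str_eq(kExample, "GC mode needs -XX:+SomeFlag to work correctly"),
              "stringified flag name is spliced into the literal");
```

Because the result is one literal, it can be passed directly wherever a `const char*` message is expected, and it stays greppable as long as the literal pieces are kept on one line.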
Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: shenandoah: inline message to vm_exit call ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/905/files - new: https://git.openjdk.java.net/jdk/pull/905/files/52ff2ccb..8a99cdcc Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=905&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=905&range=01-02 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/905.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/905/head:pull/905 PR: https://git.openjdk.java.net/jdk/pull/905 From hoffmann at mountainminds.com Thu Oct 29 08:43:13 2020 From: hoffmann at mountainminds.com (Marc Hoffmann) Date: Thu, 29 Oct 2020 09:43:13 +0100 Subject: arm32 builds continue to fail for me after 8253540 and 8253901 In-Reply-To: References: <56ff08d5-a4e5-788a-1c29-02f76e8755d2@redhat.com> <17F91692-4F3D-4FAA-AB94-361B6C84F982@mountainminds.com> <111D12C0-BE9F-4B2F-BAB1-CB0445CAB219@mountainminds.com> Message-ID: <00F0209A-F00D-4B06-B696-7DBD8E9D34F6@mountainminds.com> Hi Boris, thanks for coming back on this! 
Current master is still red for me with the same failure (https://pici.beachhub.io/#/jdk): # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (/workspace/src/hotspot/share/asm/register.hpp:160), pid=15243, tid=15248 # assert(a != b && a != c && a != d && a != e && b != c && b != d && b != e && c != d && c != e && d != e) failed: registers must be different: a=0x00000002, b=0x00000003, c=0x00000000, d=0x00000000, e=0x0000000c # # JRE version: (16.0) (fastdebug build ) # Java VM: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace, mixed mode, g1 gc, linux-arm) # Problematic frame: # V [libjvm.so+0x751404] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63 Also I don't see how the last commit you mention is related to the issue. The problem seems to be inconsistent assertions in InterpreterMacroAssembler::unlock_object which was not touched since commit [2] in your list: https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/arm/interp_masm_arm.cpp#L1000 In line 990 it is asserted that Rlock == R0, but then in line 1000 the registers should be different: assert_different_registers(Robj, Rmark, Rlock, R0, Rtemp); Regards, -marc > On 28. Oct 2020, at 23:39, Boris Ulasevich wrote: > > Hi Marc, > > Sorry for being unavailable for too long! > > My understanding of the case is that > - my change [2] fixed the issue introduced in [1] > - with the fix JVM build still crashed on RPi > - the build issue on RPi was introduced by the change [3] and was > fixed by the change [4] > > Can I ask you to check that the current repo build works Ok for you?
> > regards, > Boris > > > [1] > commit 77a0f3999afa322b64643afd4a161164440af975 > Author: Coleen Phillimore > Date: Mon Sep 28 15:49:02 2020 +0000 > 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function > [2] > commit fd0cb98ed03c6214c02ccd3503c1e6d77065a428 > Author: Boris Ulasevich > Date: Thu Oct 8 06:52:27 2020 +0000 > 8253901: ARM32: SIGSEGV during monitorexit due to incorrect register > use (after JDK-8253540) > > [3] > commit ea27a54bf0ff526effb47f9daaec51ced2d2bb71 > Author: Calvin Cheung > Date: Mon Oct 5 16:52:00 2020 +0000 > 8224509: Incorrect alignment in CDS related allocation code on 32-bit platforms > [4] > commit 5145bed0282575a580cf3ebc343038c1dc8ddb8d > Author: Ioi Lam > Date: Fri Oct 16 05:14:46 2020 +0000 > 8254125: Assertion in cppVtables.cpp during builds on 32bit Windows > > > On Fri, Oct 16, 2020 at 10:19 AM Marc Hoffmann > wrote: >> >> Dear Boris, >> >> if it helps to fix the 32 arm build: >> >> 1) I gave you write access to my JDK fork at GitHub: https://github.com/marchof/jdk >> 2) You can (force) push to the branch called "build" >> 3) The build results are here: https://pici.beachhub.io/#/jdk-marchof >> >> The repo is polled every 30', the build takes another 30' until it fails. >> >> Regards, >> -marc >> >> >> On 13. Oct 2020, at 08:48, Boris Ulasevich wrote: >> >> Hi Marc, >> >> I created JDK-8254661 for the issue. I would love to fix it, but still >> can't reproduce the crash (even on Raspberry Pi). >> What configuration do you have?
The following sequence works Ok for me: >> pi@raspberrypi $ git clone https://github.com/openjdk/jdk >> pi@raspberrypi $ cd jdk >> pi@raspberrypi $ bash configure --with-boot-jdk=/home/pi/jdk-15 >> pi@raspberrypi $ make >> >> Your debug build shows that I did not fix the >> assert_different_registers in >> InterpreterMacroAssembler::unlock_object() >> body (and the function comment by the way!), though with eyeballing I >> do not see what is wrong for Rlock=R0: >> https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/arm/interp_masm_arm.cpp#L1000 >> >> regards, >> Boris >> >> On Mon, Oct 12, 2020 at 11:34 PM Marc Hoffmann >> wrote: >> >> >> Hi Aleksey, hi Boris, >> >> for me the crash is always reproducible: Every single build after >> >> 77a0f3999afa322b64643afd4a161164440af975 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function >> >> fails on arm32 (build on ubuntu in docker on a raspberry pi 4). Before this commit I haven't encountered any failures. >> >> Here is the hs_err file with --enable-debug (reproduced with current master c7f00640627eab38b77d23d07876cf0247fa18f3). >> >> Cheers, >> -marc >> >> >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (/workspace/src/hotspot/share/asm/register.hpp:160), pid=14700, tid=14705 >> # assert(a != b && a != c && a != d && a != e && b != c && b != d && b != e && c != d && c != e && d != e) failed: registers must be different: a=0x00000002, b=0x00000003, c=0x00000000, d=0x00000000, e=0x0000000c >> # >> # JRE version: (16.0) (fastdebug build ) >> # Java VM: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace, mixed mode, g1 gc, linux-arm) >> # Problematic frame: >> # V [libjvm.so+0x7571fc] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63 >> # >> # Core dump will be written.
Default location: /workspace/make/core >> # >> # >> >> --------------- S U M M A R Y ------------ >> >> Command Line: -Xms64M -Xmx768M --add-exports=java.base/jdk.internal.module=ALL-UNNAMED build.tools.jigsaw.AddPackagesAttribute /workspace/build/linux-arm-server-fastdebug/jdk >> >> Host: 20431585315d, rev 3 (v7l), 4 cores, 3G, Ubuntu 18.04.3 LTS >> Time: Mon Oct 12 20:22:15 2020 UTC elapsed time: 0.144243 seconds (0d 0h 0m 0s) >> >> --------------- T H R E A D --------------- >> >> Current thread (0xb5b16460): JavaThread "Unknown thread" [_thread_in_vm, id=14705, stack(0xb5c6e000,0xb5cbe000)] >> >> Stack: [0xb5c6e000,0xb5cbe000], sp=0xb5cbc2d0, free space=312k >> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x7571fc] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63 >> >> Registers: >> r0 = 0x00000003 >> r1 = 0x000000a0 >> r2 = 0x00000002 >> r3 = 0x00000000 >> r4 = 0xb5b168b0 >> r5 = 0x0000000c >> r6 = 0x00000000 >> r7 = 0xb5cbc2e8 >> r8 = 0xb6db1fa8 >> r9 = 0xb5cbc760 >> r10 = 0xe3520000 >> fp = 0xb6db1fa8 >> r12 = 0xb6ff8000 >> sp = 0xb5cbc2d0 >> lr = 0x00000058 >> pc = 0xb64961fc >> cpsr = 0x200f0030 >> >> Top of Stack: (sp=0xb5cbc2d0) >> 0xb5cbc2d0: 00000002 00000003 00000000 00000000 >> 0xb5cbc2e0: 0000000c 00000048 0000006e 00000000 >> 0xb5cbc2f0: 00000000 00000000 00000000 0000007c >> 0xb5cbc300: 00000000 00000077 b5cbc378 00000000 >> 0xb5cbc310: b5cbc380 0000000f b6db1fa8 b5cbc340 >> 0xb5cbc320: b5b168b0 b6db1fa8 b5cbc4c4 b6008a2b >> 0xb5cbc330: b5cbc454 b5cbc348 0000000f b61971cf >> 0xb5cbc340: b5cbc380 b5cbc3b0 b5b168b0 b5cbc388 >> >> Instructions: (pc=0xb64961fc) >> 0xb64960fc: 440be9c7 a034f8c7 f5e06139 68ebf7ed >> 0xb649610c: f040689a 46184164 1180f441 68996011 >> 0xb649611c: f5e03104 4b12f781 46284622 1003f858 >> 0xb649612c: f998f2e4 f64f68eb f2ce7210 6899122f >> 0xb649613c: 600a4618 31046899 f76ef5e0 46284631 >> 0xb649614c: f4e6f69f f5e04630 
f507f79b 46bd7786 >> 0xb649615c: 8ff0e8bd 0091bfd0 00006ee4 000059d0 >> 0xb649616c: 00007d1c 4a084b07 b480447b 589baf00 >> 0xb649617c: b91b781b f85d46bd 47707b04 f85d46bd >> 0xb649618c: e7197b04 0091be30 00006a24 bf182900 >> 0xb649619c: f1a1290c e92d0202 b0f34ff0 2301bf08 >> 0xb64961ac: bf18af06 f8df2300 2a018268 461abf8c >> 0xb64961bc: 0201f043 f04f460e 44f833ff 30d0f8c7 >> 0xb64961cc: f8c74604 23003140 333de9c7 30fcf887 >> 0xb64961dc: 3359e9c7 316cf887 4a8eb1da 0e58f04f >> 0xb64961ec: 250c2003 1002f858 f8d12202 21a0c000 >> 0xb64961fc: e000f88c 2000e9cd 6302e9cd 4b874a86 >> 0xb649620c: 447a4887 9504447b f58f4478 f00cfaa3 >> 0xb649621c: 68e2f2d5 4340f44f 33a0f2ce 0a04f04f >> 0xb649622c: 7980f04f 0b00f04f 46106891 600b2501 >> 0xb649623c: 44516891 f6f0f5e0 4a7b497a 3001f858 >> 0xb649624c: 0140f107 a048f8c7 4608643e f8c7681b >> 0xb649625c: 60fb904c f858647b f8c72002 f8c7b054 >> 0xb649626c: 60bab058 e9c73208 653abb19 66fd607a >> 0xb649627c: f732f5e0 689a68e3 4164f040 f4414618 >> 0xb649628c: 60111181 44516899 f6c6f5e0 0110f107 >> 0xb649629c: 68fb687a f8c74608 623aa018 6304e9c7 >> 0xb64962ac: 901cf8c7 bb09e9c7 bb0de9c7 f5e063fd >> 0xb64962bc: 68e3f713 f0406899 46184264 4240f442 >> 0xb64962cc: 6899600a f1074451 f5e00ad0 4b57f6a5 >> 0xb64962dc: 3003f858 2b00781b 8093f040 f04f68bb >> 0xb64962ec: 68f97c80 33082500 c0acf8c7 0b01f04f >> >> >> >> --------------- P R O C E S S --------------- >> >> uid : 0 euid : 0 gid : 0 egid : 0 >> >> umask: 0022 (----w--w-) >> >> Threads class SMR info: >> _java_thread_list=0xb6e56078, length=0, elements={ >> } >> _java_thread_list_alloc_cnt=1, _java_thread_list_free_cnt=0, _java_thread_list_max=0, _nested_thread_list_max=0 >> _delete_lock_wait_cnt=0, _delete_lock_wait_max=0 >> _to_delete_list_cnt=0, _to_delete_list_max=0 >> >> Java Threads: ( => current thread ) >> >> Other Threads: >> 0xb5b73188 GCTaskThread "GC Thread#0" [stack: 0x81d00000,0x81d80000] [id=14706] >> 0xb5b77dc0 ConcurrentGCThread "G1 Main Marker" [stack: 
0x81c7e000,0x81cfe000] [id=14707] >> 0xb5b790c0 ConcurrentGCThread "G1 Conc#0" [stack: 0x81a80000,0x81b00000] [id=14708] >> 0xb5bde230 ConcurrentGCThread "G1 Refine#0" [stack: 0x81780000,0x81800000] [id=14709] >> 0xb5bdf488 ConcurrentGCThread "G1 Service" [stack: 0x81580000,0x81600000] [id=14710] >> >> =>0xb5b16460 (exited) JavaThread "Unknown thread" [_thread_in_vm, id=14705, stack(0xb5c6e000,0xb5cbe000)] >> >> Threads with active compile tasks: >> >> VM state: not at safepoint (not fully initialized) >> >> VM Mutex/Monitor currently owned by a thread: None >> >> GC Precious Log: >> CPUs: 4 total, 4 available >> Memory: 3827M >> Large Page Support: Disabled >> NUMA Support: Disabled >> Compressed Oops: Disabled >> Heap Region Size: 1M >> Heap Min Capacity: 64M >> Heap Initial Capacity: 64M >> Heap Max Capacity: 768M >> Pre-touch: Disabled >> Parallel Workers: 4 >> Concurrent Workers: 1 >> Concurrent Refinement Workers: 4 >> Periodic GC: Disabled >> >> Heap: >> garbage-first heap total 65536K, used 0K [0x83a00000, 0xb3a00000) >> region size 1024K, 1 young (1024K), 0 survivors (0K) >> Metaspace used 944K, capacity 2200K, committed 2200K, reserved 4400K >> >> Heap Regions: E=young(eden), S=young(survivor), O=old, HS=humongous(starts), HC=humongous(continues), CS=collection set, F=free, OA=open archive, CA=closed archive, TAMS=top-at-mark-start (previous, next) >> | 0|0x83a00000, 0x83a00000, 0x83b00000| 0%| F| |TAMS 0x83a00000, 0x83a00000| Untracked >> | 1|0x83b00000, 0x83b00000, 0x83c00000| 0%| F| |TAMS 0x83b00000, 0x83b00000| Untracked >> | 2|0x83c00000, 0x83c00000, 0x83d00000| 0%| F| |TAMS 0x83c00000, 0x83c00000| Untracked >> | 3|0x83d00000, 0x83d00000, 0x83e00000| 0%| F| |TAMS 0x83d00000, 0x83d00000| Untracked >> | 4|0x83e00000, 0x83e00000, 0x83f00000| 0%| F| |TAMS 0x83e00000, 0x83e00000| Untracked >> | 5|0x83f00000, 0x83f00000, 0x84000000| 0%| F| |TAMS 0x83f00000, 0x83f00000| Untracked >> | 6|0x84000000, 0x84000000, 0x84100000| 0%| F| |TAMS 0x84000000, 
0x84000000| Untracked >> | 7|0x84100000, 0x84100000, 0x84200000| 0%| F| |TAMS 0x84100000, 0x84100000| Untracked >> | 8|0x84200000, 0x84200000, 0x84300000| 0%| F| |TAMS 0x84200000, 0x84200000| Untracked >> | 9|0x84300000, 0x84300000, 0x84400000| 0%| F| |TAMS 0x84300000, 0x84300000| Untracked >> | 10|0x84400000, 0x84400000, 0x84500000| 0%| F| |TAMS 0x84400000, 0x84400000| Untracked >> | 11|0x84500000, 0x84500000, 0x84600000| 0%| F| |TAMS 0x84500000, 0x84500000| Untracked >> | 12|0x84600000, 0x84600000, 0x84700000| 0%| F| |TAMS 0x84600000, 0x84600000| Untracked >> | 13|0x84700000, 0x84700000, 0x84800000| 0%| F| |TAMS 0x84700000, 0x84700000| Untracked >> | 14|0x84800000, 0x84800000, 0x84900000| 0%| F| |TAMS 0x84800000, 0x84800000| Untracked >> | 15|0x84900000, 0x84900000, 0x84a00000| 0%| F| |TAMS 0x84900000, 0x84900000| Untracked >> | 16|0x84a00000, 0x84a00000, 0x84b00000| 0%| F| |TAMS 0x84a00000, 0x84a00000| Untracked >> | 17|0x84b00000, 0x84b00000, 0x84c00000| 0%| F| |TAMS 0x84b00000, 0x84b00000| Untracked >> | 18|0x84c00000, 0x84c00000, 0x84d00000| 0%| F| |TAMS 0x84c00000, 0x84c00000| Untracked >> | 19|0x84d00000, 0x84d00000, 0x84e00000| 0%| F| |TAMS 0x84d00000, 0x84d00000| Untracked >> | 20|0x84e00000, 0x84e00000, 0x84f00000| 0%| F| |TAMS 0x84e00000, 0x84e00000| Untracked >> | 21|0x84f00000, 0x84f00000, 0x85000000| 0%| F| |TAMS 0x84f00000, 0x84f00000| Untracked >> | 22|0x85000000, 0x85000000, 0x85100000| 0%| F| |TAMS 0x85000000, 0x85000000| Untracked >> | 23|0x85100000, 0x85100000, 0x85200000| 0%| F| |TAMS 0x85100000, 0x85100000| Untracked >> | 24|0x85200000, 0x85200000, 0x85300000| 0%| F| |TAMS 0x85200000, 0x85200000| Untracked >> | 25|0x85300000, 0x85300000, 0x85400000| 0%| F| |TAMS 0x85300000, 0x85300000| Untracked >> | 26|0x85400000, 0x85400000, 0x85500000| 0%| F| |TAMS 0x85400000, 0x85400000| Untracked >> | 27|0x85500000, 0x85500000, 0x85600000| 0%| F| |TAMS 0x85500000, 0x85500000| Untracked >> | 28|0x85600000, 0x85600000, 0x85700000| 0%| F| |TAMS 0x85600000, 
0x85600000| Untracked >> | 29|0x85700000, 0x85700000, 0x85800000| 0%| F| |TAMS 0x85700000, 0x85700000| Untracked >> | 30|0x85800000, 0x85800000, 0x85900000| 0%| F| |TAMS 0x85800000, 0x85800000| Untracked >> | 31|0x85900000, 0x85900000, 0x85a00000| 0%| F| |TAMS 0x85900000, 0x85900000| Untracked >> | 32|0x85a00000, 0x85a00000, 0x85b00000| 0%| F| |TAMS 0x85a00000, 0x85a00000| Untracked >> | 33|0x85b00000, 0x85b00000, 0x85c00000| 0%| F| |TAMS 0x85b00000, 0x85b00000| Untracked >> | 34|0x85c00000, 0x85c00000, 0x85d00000| 0%| F| |TAMS 0x85c00000, 0x85c00000| Untracked >> | 35|0x85d00000, 0x85d00000, 0x85e00000| 0%| F| |TAMS 0x85d00000, 0x85d00000| Untracked >> | 36|0x85e00000, 0x85e00000, 0x85f00000| 0%| F| |TAMS 0x85e00000, 0x85e00000| Untracked >> | 37|0x85f00000, 0x85f00000, 0x86000000| 0%| F| |TAMS 0x85f00000, 0x85f00000| Untracked >> | 38|0x86000000, 0x86000000, 0x86100000| 0%| F| |TAMS 0x86000000, 0x86000000| Untracked >> | 39|0x86100000, 0x86100000, 0x86200000| 0%| F| |TAMS 0x86100000, 0x86100000| Untracked >> | 40|0x86200000, 0x86200000, 0x86300000| 0%| F| |TAMS 0x86200000, 0x86200000| Untracked >> | 41|0x86300000, 0x86300000, 0x86400000| 0%| F| |TAMS 0x86300000, 0x86300000| Untracked >> | 42|0x86400000, 0x86400000, 0x86500000| 0%| F| |TAMS 0x86400000, 0x86400000| Untracked >> | 43|0x86500000, 0x86500000, 0x86600000| 0%| F| |TAMS 0x86500000, 0x86500000| Untracked >> | 44|0x86600000, 0x86600000, 0x86700000| 0%| F| |TAMS 0x86600000, 0x86600000| Untracked >> | 45|0x86700000, 0x86700000, 0x86800000| 0%| F| |TAMS 0x86700000, 0x86700000| Untracked >> | 46|0x86800000, 0x86800000, 0x86900000| 0%| F| |TAMS 0x86800000, 0x86800000| Untracked >> | 47|0x86900000, 0x86900000, 0x86a00000| 0%| F| |TAMS 0x86900000, 0x86900000| Untracked >> | 48|0x86a00000, 0x86a00000, 0x86b00000| 0%| F| |TAMS 0x86a00000, 0x86a00000| Untracked >> | 49|0x86b00000, 0x86b00000, 0x86c00000| 0%| F| |TAMS 0x86b00000, 0x86b00000| Untracked >> | 50|0x86c00000, 0x86c00000, 0x86d00000| 0%| F| |TAMS 
0x86c00000, 0x86c00000| Untracked >> | 51|0x86d00000, 0x86d00000, 0x86e00000| 0%| F| |TAMS 0x86d00000, 0x86d00000| Untracked >> | 52|0x86e00000, 0x86e00000, 0x86f00000| 0%| F| |TAMS 0x86e00000, 0x86e00000| Untracked >> | 53|0x86f00000, 0x86f00000, 0x87000000| 0%| F| |TAMS 0x86f00000, 0x86f00000| Untracked >> | 54|0x87000000, 0x87000000, 0x87100000| 0%| F| |TAMS 0x87000000, 0x87000000| Untracked >> | 55|0x87100000, 0x87100000, 0x87200000| 0%| F| |TAMS 0x87100000, 0x87100000| Untracked >> | 56|0x87200000, 0x87200000, 0x87300000| 0%| F| |TAMS 0x87200000, 0x87200000| Untracked >> | 57|0x87300000, 0x87300000, 0x87400000| 0%| F| |TAMS 0x87300000, 0x87300000| Untracked >> | 58|0x87400000, 0x87400000, 0x87500000| 0%| F| |TAMS 0x87400000, 0x87400000| Untracked >> | 59|0x87500000, 0x87500000, 0x87600000| 0%| F| |TAMS 0x87500000, 0x87500000| Untracked >> | 60|0x87600000, 0x87600000, 0x87700000| 0%| F| |TAMS 0x87600000, 0x87600000| Untracked >> | 61|0x87700000, 0x87700000, 0x87800000| 0%| F| |TAMS 0x87700000, 0x87700000| Untracked >> | 62|0x87800000, 0x87800000, 0x87900000| 0%| F| |TAMS 0x87800000, 0x87800000| Untracked >> | 63|0x87900000, 0x87942908, 0x87a00000| 26%| E| |TAMS 0x87900000, 0x87900000| Complete >> >> Card table byte_map: [0x83700000,0x83880000] _byte_map_base: 0x832e3000 >> >> Marking Bits (Prev, Next): (CMBitMap*) 0xb5b74324, (CMBitMap*) 0xb5b74344 >> Prev Bits: [0x82980000, 0x83580000) >> Next Bits: [0x81d80000, 0x82980000) >> >> GC Heap History (0 events): >> No events >> >> Deoptimization events (0 events): >> No events >> >> Classes unloaded (0 events): >> No events >> >> Classes redefined (0 events): >> No events >> >> Internal exceptions (0 events): >> No events >> >> Events (20 events): >> Event: 0.113 loading class java/lang/Character >> Event: 0.114 loading class java/lang/Character done >> Event: 0.114 loading class java/lang/Float >> Event: 0.115 loading class java/lang/Number >> Event: 0.115 loading class java/lang/Number done >> Event: 0.115 
loading class java/lang/Float done >> Event: 0.115 loading class java/lang/Double >> Event: 0.116 loading class java/lang/Double done >> Event: 0.116 loading class java/lang/Byte >> Event: 0.116 loading class java/lang/Byte done >> Event: 0.116 loading class java/lang/Short >> Event: 0.117 loading class java/lang/Short done >> Event: 0.117 loading class java/lang/Integer >> Event: 0.118 loading class java/lang/Integer done >> Event: 0.118 loading class java/lang/Long >> Event: 0.119 loading class java/lang/Long done >> Event: 0.119 loading class java/util/Iterator >> Event: 0.119 loading class java/util/Iterator done >> Event: 0.119 loading class java/lang/reflect/RecordComponent >> Event: 0.119 loading class java/lang/reflect/RecordComponent done >> >> >> Dynamic libraries: >> 00410000-00411000 r-xp 00000000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java >> 00420000-00421000 r--p 00000000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java >> 00421000-00422000 rw-p 00001000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java >> 019b6000-019d7000 rw-p 00000000 00:00 0 [heap] >> 809c9000-80e00000 rw-p 00000000 00:00 0 >> 80e00000-80e8e000 rw-p 00000000 00:00 0 >> 80e8e000-80f00000 ---p 00000000 00:00 0 >> 80fb4000-811da000 rw-p 00000000 00:00 0 >> 811da000-81400000 ---p 00000000 00:00 0 >> 81400000-81421000 rw-p 00000000 00:00 0 >> 81421000-81500000 ---p 00000000 00:00 0 >> 8157e000-8157f000 ---p 00000000 00:00 0 >> 8157f000-81600000 rw-p 00000000 00:00 0 >> 81600000-81621000 rw-p 00000000 00:00 0 >> 81621000-81700000 ---p 00000000 00:00 0 >> 8177e000-8177f000 ---p 00000000 00:00 0 >> 8177f000-81800000 rw-p 00000000 00:00 0 >> 81800000-81821000 rw-p 00000000 00:00 0 >> 81821000-81900000 ---p 00000000 00:00 0 >> 81900000-81921000 rw-p 00000000 00:00 0 >> 81921000-81a00000 ---p 00000000 00:00 0 >> 81a7e000-81a7f000 ---p 00000000 00:00 0 >> 81a7f000-81b00000 rw-p 00000000 00:00 0 >> 81b00000-81b21000 rw-p 
00000000 00:00 0 >> 81b21000-81c00000 ---p 00000000 00:00 0 >> 81c21000-81c7c000 rw-p 00000000 00:00 0 >> 81c7c000-81c7d000 ---p 00000000 00:00 0 >> 81c7d000-81cfe000 rw-p 00000000 00:00 0 >> 81cfe000-81cff000 ---p 00000000 00:00 0 >> 81cff000-81e80000 rw-p 00000000 00:00 0 >> 81e80000-82980000 ---p 00000000 00:00 0 >> 82980000-82a80000 rw-p 00000000 00:00 0 >> 82a80000-83580000 ---p 00000000 00:00 0 >> 83580000-835a0000 rw-p 00000000 00:00 0 >> 835a0000-83700000 ---p 00000000 00:00 0 >> 83700000-83720000 rw-p 00000000 00:00 0 >> 83720000-83880000 ---p 00000000 00:00 0 >> 83880000-838a0000 rw-p 00000000 00:00 0 >> 838a0000-83a00000 ---p 00000000 00:00 0 >> 83a00000-87a00000 rw-p 00000000 00:00 0 >> 87a00000-b3a00000 ---p 00000000 00:00 0 >> b3a25000-b3a76000 rw-p 00000000 00:00 0 >> b3a76000-b3ab3000 ---p 00000000 00:00 0 >> b3ab3000-b3c33000 rwxp 00000000 00:00 0 >> b3c33000-b5ab3000 ---p 00000000 00:00 0 >> b5ab3000-b5ac8000 r-xp 00000000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >> b5ac8000-b5ad8000 ---p 00015000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >> b5ad8000-b5ad9000 r--p 00015000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >> b5ad9000-b5ada000 rw-p 00016000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >> b5ada000-b5ae2000 rw-s 00000000 b3:02 2576900 /tmp/hsperfdata_root/14700 >> b5ae2000-b5ae9000 r-xp 00000000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >> b5ae9000-b5af8000 ---p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >> b5af8000-b5af9000 r--p 00006000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >> b5af9000-b5afa000 rw-p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >> b5afa000-b5b00000 rw-p 00000000 00:00 0 >> b5b00000-b5c00000 rw-p 00000000 00:00 0 >> b5c00000-b5c0d000 r-xp 00000000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >> 
b5c0d000-b5c1c000 ---p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >> b5c1c000-b5c1d000 r--p 0000c000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >> b5c1d000-b5c1e000 rw-p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >> b5c1e000-b5c20000 rw-p 00000000 00:00 0 >> b5c20000-b5c27000 r-xp 00000000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >> b5c27000-b5c36000 ---p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >> b5c36000-b5c37000 r--p 00006000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >> b5c37000-b5c38000 rw-p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >> b5c38000-b5c3d000 r-xp 00000000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >> b5c3d000-b5c4c000 ---p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >> b5c4c000-b5c4d000 r--p 00004000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >> b5c4d000-b5c4e000 rw-p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >> b5c4e000-b5c5d000 r-xp 00000000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >> b5c5d000-b5c6c000 ---p 0000f000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >> b5c6c000-b5c6d000 r--p 0000e000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >> b5c6d000-b5c6e000 rw-p 0000f000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >> b5c6e000-b5c71000 ---p 00000000 00:00 0 >> b5c71000-b5cbe000 rw-p 00000000 00:00 0 >> b5cbe000-b5d2d000 r-xp 00000000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >> b5d2d000-b5d3d000 ---p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >> b5d3d000-b5d3e000 r--p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >> b5d3e000-b5d3f000 rw-p 00070000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >> b5d3f000-b6d56000 r-xp 00000000 b3:02 144078 
/workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >> b6d56000-b6d65000 ---p 01017000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >> b6d65000-b6dba000 r--p 01016000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >> b6dba000-b6dd2000 rw-p 0106b000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >> b6dd2000-b6e5e000 rw-p 00000000 00:00 0 >> b6e5e000-b6e6f000 r-xp 00000000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >> b6e6f000-b6e7f000 ---p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >> b6e7f000-b6e80000 r--p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >> b6e80000-b6e81000 rw-p 00012000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >> b6e81000-b6e83000 rw-p 00000000 00:00 0 >> b6e83000-b6e85000 r-xp 00000000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >> b6e85000-b6e94000 ---p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >> b6e94000-b6e95000 r--p 00001000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >> b6e95000-b6e96000 rw-p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >> b6e96000-b6eaf000 r-xp 00000000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >> b6eaf000-b6ebe000 ---p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >> b6ebe000-b6ebf000 r--p 00018000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >> b6ebf000-b6ec0000 rw-p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >> b6ec0000-b6fa2000 r-xp 00000000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >> b6fa2000-b6fb2000 ---p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >> b6fb2000-b6fb4000 r--p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >> b6fb4000-b6fb5000 rw-p 000e4000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >> b6fb5000-b6fb8000 rw-p 00000000 00:00 0 >> b6fb8000-b6fc2000 
r-xp 00000000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >> b6fc2000-b6fd1000 ---p 0000a000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >> b6fd1000-b6fd2000 r--p 00009000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >> b6fd2000-b6fd3000 rw-p 0000a000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >> b6fd3000-b6feb000 r-xp 00000000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so >> b6ff2000-b6ff4000 rw-p 00000000 00:00 0 >> b6ff6000-b6ff7000 ---p 00000000 00:00 0 >> b6ff7000-b6ff8000 r--p 00000000 00:00 0 >> b6ff8000-b6ff9000 rwxp 00000000 00:00 0 >> b6ff9000-b6ffb000 rw-p 00000000 00:00 0 >> b6ffb000-b6ffc000 r--p 00018000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so >> b6ffc000-b6ffd000 rw-p 00019000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so >> bed97000-bedb8000 rw-p 00000000 00:00 0 [stack] >> beeb0000-beeb1000 r-xp 00000000 00:00 0 [sigpage] >> beeb1000-beeb2000 r--p 00000000 00:00 0 [vvar] >> beeb2000-beeb3000 r-xp 00000000 00:00 0 [vdso] >> ffff0000-ffff1000 r-xp 00000000 00:00 0 [vectors] >> >> >> VM Arguments: >> jvm_args: -Xms64M -Xmx768M --add-exports=java.base/jdk.internal.module=ALL-UNNAMED >> java_command: build.tools.jigsaw.AddPackagesAttribute /workspace/build/linux-arm-server-fastdebug/jdk >> java_class_path (initial): /workspace/build/linux-arm-server-fastdebug/buildtools/tools_jigsaw_classes >> Launcher Type: SUN_STANDARD >> >> [Global flags] >> uint ConcGCThreads = 1 {product} {ergonomic} Number of threads concurrent gc will use >> uint G1ConcRefinementThreads = 4 {product} {ergonomic} The number of parallel rem set update threads. Will be set ergonomically by default. >> size_t G1HeapRegionSize = 1048576 {product} {ergonomic} Size of the G1 regions. 
>> uintx GCDrainStackTargetSize = 64 {product} {ergonomic} Number of entries we will try to leave on the stack during parallel gc >> size_t InitialHeapSize = 67108864 {product} {command line} Initial heap size (in bytes); zero means use ergonomics >> size_t MarkStackSize = 32768 {product} {ergonomic} Size of marking stack >> size_t MaxHeapSize = 805306368 {product} {command line} Maximum heap size (in bytes) >> size_t MaxNewSize = 482344960 {product} {ergonomic} Maximum new generation size (in bytes), max_uintx means set ergonomically >> size_t MinHeapDeltaBytes = 1048576 {product} {ergonomic} The minimum change in heap space due to GC (in bytes) >> size_t MinHeapSize = 67108864 {product} {command line} Minimum heap size (in bytes); zero means use ergonomics >> uintx NonProfiledCodeHeapSize = 0 {pd product} {ergonomic} Size of code heap with non-profiled methods (in bytes) >> uintx ProfiledCodeHeapSize = 0 {pd product} {ergonomic} Size of code heap with profiled methods (in bytes) >> size_t SoftMaxHeapSize = 805306368 {manageable} {ergonomic} Soft limit for maximum heap size (in bytes) >> bool UseG1GC = true {product} {ergonomic} Use the Garbage-First garbage collector >> >> Logging: >> Log output configuration: >> #0: stdout all=warning uptime,level,tags >> #1: stderr all=off uptime,level,tags >> >> Environment Variables: >> JAVA_HOME=/opt/java/openjdk >> PATH=/opt/java/openjdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin >> LC_ALL=C >> >> Signal Handlers: >> SIGSEGV: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGBUS: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGFPE: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGPIPE: [libjvm.so+0xc9aa9d], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGXFSZ: [libjvm.so+0xc9aa9d], 
sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGILL: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGUSR2: [libjvm.so+0xc9ad95], sa_mask[0]=00000000000000000000000000000000, sa_flags=SA_RESTART|SA_SIGINFO >> SIGHUP: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >> SIGINT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >> SIGTERM: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >> SIGQUIT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >> >> >> --------------- S Y S T E M --------------- >> >> OS: >> DISTRIB_ID=Ubuntu >> DISTRIB_RELEASE=18.04 >> DISTRIB_CODENAME=bionic >> DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS" >> uname: Linux 20431585315d 5.4.51-v7l+ #1333 SMP Mon Aug 10 16:51:40 BST 2020 armv7l >> OS uptime: 14 days 7:59 hours >> libc: glibc 2.27 NPTL 2.27 >> rlimit (soft/hard): STACK 8192k/infinity , CORE infinity/infinity , NPROC infinity/infinity , NOFILE 1048576/1048576 , AS infinity/infinity , CPU infinity/infinity , DATA infinity/infinity , FSIZE infinity/infinity , MEMLOCK 64k/64k >> load average: 3.37 3.26 3.09 >> >> /proc/meminfo: >> MemTotal: 3919812 kB >> MemFree: 1255688 kB >> MemAvailable: 3518740 kB >> Buffers: 134316 kB >> Cached: 2117828 kB >> SwapCached: 0 kB >> Active: 1266624 kB >> Inactive: 1167412 kB >> Active(anon): 110360 kB >> Inactive(anon): 80744 kB >> Active(file): 1156264 kB >> Inactive(file): 1086668 kB >> Unevictable: 16 kB >> Mlocked: 16 kB >> HighTotal: 3264512 kB >> HighFree: 1038848 kB >> LowTotal: 655300 kB >> LowFree: 216840 kB >> SwapTotal: 102396 kB >> SwapFree: 102396 kB >> Dirty: 24916 kB >> Writeback: 0 kB >> AnonPages: 181884 kB >> Mapped: 125864 kB >> Shmem: 16892 kB >> KReclaimable: 181816 kB >> Slab: 205164 kB >> SReclaimable: 181816 kB >> SUnreclaim: 23348 kB >> KernelStack: 2240 kB >> PageTables: 2684 kB >> NFS_Unstable: 0 kB >> Bounce: 0 kB >> 
WritebackTmp: 0 kB >> CommitLimit: 2062300 kB >> Committed_AS: 1125176 kB >> VmallocTotal: 245760 kB >> VmallocUsed: 5520 kB >> VmallocChunk: 0 kB >> Percpu: 512 kB >> CmaTotal: 262144 kB >> CmaFree: 171244 kB >> >> /sys/kernel/mm/transparent_hugepage/enabled: >> /sys/kernel/mm/transparent_hugepage/defrag (defrag/compaction efforts parameter): >> >> Process Memory: >> Virtual Size: 888828K (peak: 888828K) >> Resident Set Size: 25020K (peak: 25020K) (anon: 11372K, file: 13648K, shmem: 0K) >> Swapped out: 0K >> C-Heap outstanding allocations: 1636K >> >> /proc/sys/kernel/threads-max (system-wide limit on the number of threads): 57119 >> /proc/sys/vm/max_map_count (maximum number of memory map areas a process may have): 65530 >> /proc/sys/kernel/pid_max (system-wide limit on number of process identifiers): 32768 >> >> Steal ticks since vm start: 0 >> Steal ticks percentage since vm start: 0.000 >> >> CPU: total 4 (initial active 4) (ARMv7), vfp, vfp3-32, simd, mp_ext >> /proc/cpuinfo: >> processor : 0 >> model name : ARMv7 Processor rev 3 (v7l) >> BogoMIPS : 270.00 >> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >> CPU implementer : 0x41 >> CPU architecture: 7 >> CPU variant : 0x0 >> CPU part : 0xd08 >> CPU revision : 3 >> >> processor : 1 >> model name : ARMv7 Processor rev 3 (v7l) >> BogoMIPS : 270.00 >> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >> CPU implementer : 0x41 >> CPU architecture: 7 >> CPU variant : 0x0 >> CPU part : 0xd08 >> CPU revision : 3 >> >> processor : 2 >> model name : ARMv7 Processor rev 3 (v7l) >> BogoMIPS : 270.00 >> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >> CPU implementer : 0x41 >> CPU architecture: 7 >> CPU variant : 0x0 >> CPU part : 0xd08 >> CPU revision : 3 >> >> processor : 3 >> model name : ARMv7 Processor rev 3 (v7l) >> BogoMIPS : 270.00 >> Features : half thumb 
fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >> CPU implementer : 0x41 >> CPU architecture: 7 >> CPU variant : 0x0 >> CPU part : 0xd08 >> CPU revision : 3 >> >> Hardware : BCM2711 >> Revision : c03111 >> Serial : 100000001c47254f >> Model : Raspberry Pi 4 Model B Rev 1.1 >> >> Online cpus: 0-3 >> Offline cpus: >> >> Memory: 4k page, physical 3919812k(1255688k free), swap 102396k(102396k free) >> >> vm_info: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace) for linux-arm JRE (16-internal+0-adhoc..workspace), built on Oct 12 2020 19:49:51 by "" with gcc 7.5.0 >> >> END. >> >> >> >> >> On 12. Oct 2020, at 20:24, Aleksey Shipilev wrote: >> >> Hi, >> >> On 10/12/20 8:12 PM, Marc Hoffmann wrote: >> >> Please find the build log and the hs_err file for commit fd0cb98ed03c6214c02ccd3503c1e6d77065a428 attached. >> >> >> Please try to build with fastdebug (./configure --enable-debug), so that JVM asserts meaningfully somewhere? >> >> Is there any additional information I can provide to help getting these builds fixed again? >> >> >> I am seeing plenty of weird x86_32 crashes since last week. Pretty sure some of them would manifest on ARM32 as well. This is why building with fastdebug is the next step: it maps out the bug symptoms. >> >> -- >> Thanks, >> -Aleksey >> >> >> From shade at openjdk.java.net Thu Oct 29 08:49:47 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 29 Oct 2020 08:49:47 GMT Subject: RFR: 8255416: Investigate err_msg to detect unnecessary uses [v3] In-Reply-To: References: <-P5id4x50oiTzG6JHl4GD-HJbqYM4juVm9IpsNvXlV4=.7e07be8c-8389-41e2-a563-1f3be709ac51@github.com> Message-ID: <33_QxzGck_FYstuHHF_usmlcLQIL45p-xKBwAWxBQkA=.bf31affe-035f-4466-a2e7-a7cd13cb2617@github.com> On Thu, 29 Oct 2020 08:20:57 GMT, Thomas Stuefe wrote: >> Actually, this code prevents a single string argument at compile time.
It relies on two constructors to introduce ambiguity in overload resolution that makes the C++ compiler complain and abort compilation. The comment above the dummy constructor should clarify that for anyone hitting the compile error. >> inline FormatErrBuffer(const char* format, ...) ATTRIBUTE_PRINTF(2, 3); >> inline FormatErrBuffer(const char* msg) { ShouldNotReachHere(); } >> As a sanity check, I've used this sample code https://godbolt.org/z/szs6rE >> >> And the cases that were fixed have been detected by the compiler. >> >> I don't think this problem is a serious issue either. But to me, it is a minor increase in code complexity to ensure that the minor problem of an extra string copy never appears again. > > Oh, now I see it. Smart. > > But it occurred to me that someone may want to start an err_msg buffer up with a string literal, only to then add additional content via FormatBuffer::append(). So initializing it with a literal and no arguments may be a valid use case. > > So I'd still prefer keeping this out. Alternatively, if you'd like to keep it in, could we just not implement the second constructor? Which should give us linker errors if the static check fails. +1 to leave this undefined to get a linkage error. There are already precedents to do this in Hotspot code, for example Node(const Node&); // not defined; linker error to use these ... // should never be used AdapterHandlerEntry(); ------------- PR: https://git.openjdk.java.net/jdk/pull/905 From tschatzl at openjdk.java.net Thu Oct 29 08:51:46 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 29 Oct 2020 08:51:46 GMT Subject: RFR: 8255298: Remove SurvivorAlignmentInBytes functionality [v3] In-Reply-To: References: Message-ID: On Tue, 27 Oct 2020 21:56:32 GMT, Kim Barrett wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> kbarrett review > > Marked as reviewed by kbarrett (Reviewer).
Thanks @kstefanj @kimbarrett @shipilev for your reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/838 From tschatzl at openjdk.java.net Thu Oct 29 08:51:48 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 29 Oct 2020 08:51:48 GMT Subject: Integrated: 8255298: Remove SurvivorAlignmentInBytes functionality In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 15:16:57 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews to remove the SurvivorAlignmentInBytes functionality? It has not been in use for a long time if ever, and can be removed. Searching the web also indicates that apart from the usual lists of all options and CRs it is never mentioned. > > SurvivorAlignmentInBytes is an experimental option so no further process is required. > > Testing: tier1-5 > > Thanks, > Thomas This pull request has now been integrated. Changeset: 38574d51 Author: Thomas Schatzl URL: https://git.openjdk.java.net/jdk/commit/38574d51 Stats: 1426 lines in 24 files changed: 0 ins; 1420 del; 6 mod 8255298: Remove SurvivorAlignmentInBytes functionality Reviewed-by: shade, ayang, kbarrett ------------- PR: https://git.openjdk.java.net/jdk/pull/838 From ayang at openjdk.java.net Thu Oct 29 09:02:43 2020 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Thu, 29 Oct 2020 09:02:43 GMT Subject: RFR: 8255232: G1: Make G1BiasedMappedArray freeable In-Reply-To: <966v_gu-j37Do5XhKCgwhCu0DDdwyKPQ04oUQkvzEIs=.96808fbd-a650-448e-96af-58a2bf7b9c2d@github.com> References: <966v_gu-j37Do5XhKCgwhCu0DDdwyKPQ04oUQkvzEIs=.96808fbd-a650-448e-96af-58a2bf7b9c2d@github.com> Message-ID: <6NGThp_w_vEvRq4pM4qt5M9pwnhKLTX-Ie6Vzw27Q2c=.927b3fed-6188-443c-b8f4-a6ce9d48c14f@github.com> On Thu, 22 Oct 2020 13:47:00 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that makes G1BiasedMappedArray freeable? > > Previously all G1BiasedMappedArray were created as unfreeable i.e. assigned to static variables. 
However with JDK-8253600 I need one such biased map for the full collector which is created and deleted during full GC. So the biased array should also be freed as necessary to avoid a memory leak. > > The alternative would be to statically allocate that map anyway and provide it to the current G1FullCollector instance, but I do not think the single malloc call is perf sensitive compared to full collector work and there is not much point in doing something more complicated at this time. In the future I hope that the young gen collector will also be extracted from G1CollectedHeap with the same need. If/when allocation of these helper data structures becomes a problem I would suggest looking into this again. > > One option then could be using some ResourceArea for these things in the future. > > For this change there should be no change in behavior at all. > > Testing: tier1-5 > > Thanks, > Thomas Changes requested by ayang (Author). src/hotspot/share/gc/g1/g1BiasedArray.cpp line 46: > 44: _bias = 0; > 45: _shift_by = 0; > 46: } 1. `FreeHeap` makes more sense to me, since the allocation sites use `AllocateHeap`. 2. Why reset those fields to zero? The only resource we need to release in this destructor is memory, right?
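[Editorial sketch, not part of the original message.] The two review points above can be illustrated with a minimal, self-contained example. It assumes a simplified stand-in class: `BiasedArray` and all of its members are hypothetical and are not the HotSpot `G1BiasedMappedArray` code, and `std::calloc`/`std::free` merely stand in for a matching allocate/release pair such as `AllocateHeap`/`FreeHeap`. The destructor releases the memory with the counterpart of the allocation call and does nothing else.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical, simplified stand-in for a biased array; not the HotSpot class.
class BiasedArray {
  unsigned char* _base;    // start of the allocated storage
  std::size_t    _length;  // number of elements
  std::size_t    _bias;    // offset subtracted from every biased index

public:
  BiasedArray(std::size_t length, std::size_t bias)
    : _base(static_cast<unsigned char*>(std::calloc(length, 1))),
      _length(length), _bias(bias) {
    assert(_base != nullptr && "allocation failed");
  }

  // Release with the counterpart of the allocation call. Memory is the only
  // resource owned here, so no fields are reset.
  ~BiasedArray() { std::free(_base); }

  // Copying would lead to a double free, so forbid it.
  BiasedArray(const BiasedArray&) = delete;
  BiasedArray& operator=(const BiasedArray&) = delete;

  // Access an element by its biased index.
  unsigned char& at(std::size_t biased_index) {
    assert(biased_index >= _bias && biased_index - _bias < _length);
    return _base[biased_index - _bias];
  }

  std::size_t length() const { return _length; }
};
```

With this shape, an instance created for the duration of one operation (for example inside a scope that models a single full collection) gives its storage back when it goes out of scope.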
------------- PR: https://git.openjdk.java.net/jdk/pull/808 From tschatzl at openjdk.java.net Thu Oct 29 09:15:43 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 29 Oct 2020 09:15:43 GMT Subject: RFR: 8255232: G1: Make G1BiasedMappedArray freeable In-Reply-To: <6NGThp_w_vEvRq4pM4qt5M9pwnhKLTX-Ie6Vzw27Q2c=.927b3fed-6188-443c-b8f4-a6ce9d48c14f@github.com> References: <966v_gu-j37Do5XhKCgwhCu0DDdwyKPQ04oUQkvzEIs=.96808fbd-a650-448e-96af-58a2bf7b9c2d@github.com> <6NGThp_w_vEvRq4pM4qt5M9pwnhKLTX-Ie6Vzw27Q2c=.927b3fed-6188-443c-b8f4-a6ce9d48c14f@github.com> Message-ID: <5-99VX5dSlnD2-vUuclmha-B6sZ7UH00onzKIJNVinI=.97fdac90-7cc4-477c-af2b-3be48bf7d92c@github.com> On Thu, 29 Oct 2020 09:00:00 GMT, Albert Mingkun Yang wrote: >> Hi all, >> >> can I have reviews for this change that makes G1BiasedMappedArray freeable? >> >> Previously all G1BiasedMappedArray were created as unfreeable i.e. assigned to static variables. However with JDK-8253600 I need one such biased map for the full collector which is created and deleted during full GC. So the biased array should also be freed as necessary to avoid a memory leak. >> >> The alternative would be to statically allocate that map anyway and provide it to the current G1FullCollector instance, but I do not think the single malloc call is perf sensitive compared to full collector work and there is not much point in doing something more complicated at this time. In the future I hope that the young gen collector will also be extracted from G1CollectedHeap with the same need. If/when allocation of these helper data structures becomes a problem I would suggest looking into this again. >> >> One option then could be using some ResourceArea for these things in the future. >> >> For this change there should be no change in behavior at all. >> >> Testing: tier1-5 >> >> Thanks, >> Thomas > > src/hotspot/share/gc/g1/g1BiasedArray.cpp line 46: > >> 44: _bias = 0; >> 45: _shift_by = 0; >> 46: } > > 1.
`FreeHeap` makes more sense to me, since the allocation sites use `AllocateHeap`. > 2. Why reset those fields to zero? The only resource we need to release in this destructor is memory, right? 1. Will fix. 2. Some debugging code to see if anyone else is using it after the freeing as previously these tables were never freed - I was not sure if I should keep them or not so I wanted to see if anyone would complain :) I'll remove these then. ------------- PR: https://git.openjdk.java.net/jdk/pull/808 From stefank at openjdk.java.net Thu Oct 29 09:38:57 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 29 Oct 2020 09:38:57 GMT Subject: RFR: 8254877: GCLogPrecious::_lock rank constrains what locks you are allowed to have when crashing [v2] In-Reply-To: <_mcGIfKtXbuDVOTisGl5s38hnMVVwAWIuaqG3mwlKj4=.8fe7adf8-cae8-4ccb-8d98-91ea7d308243@github.com> References: <_mcGIfKtXbuDVOTisGl5s38hnMVVwAWIuaqG3mwlKj4=.8fe7adf8-cae8-4ccb-8d98-91ea7d308243@github.com> Message-ID: <11e7qlaOBcw8kmtM9nvtHjGfO3Uy7WdZua8lDbHsAkU=.e19bdb12-9b81-4379-ad5e-57e7d69b9e50@github.com> > This is an alternative version of the fix proposed in 900: > https://github.com/openjdk/jdk/pull/900 > > Erik's description: >> Today, when you crash, the GCLogPrecious::_lock is taken. This effectively limits you to only get clean crash reports if you crash or assert without holding a lock of rank tty or lower. It is arguably difficult to know what locks you are going to have when crashing. Therefore, I don't think the precious GC log should constrain possible crashing contexts in that fashion. > > As Erik mentioned in that PR, I'd like to retain the ability to easily dump the precious log when debugging. The proposed fix changes the Mutex to a Semaphore, and uses trywait to safely access the buffer. In the unlikely event that another thread is holding the lock, the hs_err printer skips printing the log.
> > This also makes it possible to call precious logging from within the stack watermark processing code. I think there's a possibility that we might call the following error logging, when we fail to commit memory for a ZPage, when relocating, during stack watermark processing: > `log_error_p(gc)("Failed to commit memory (%s)", err.to_string());` Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: Review 1 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/903/files - new: https://git.openjdk.java.net/jdk/pull/903/files/fdadc38a..88e2b4e2 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=903&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=903&range=00-01 Stats: 81 lines in 4 files changed: 66 ins; 3 del; 12 mod Patch: https://git.openjdk.java.net/jdk/pull/903.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/903/head:pull/903 PR: https://git.openjdk.java.net/jdk/pull/903 From stefank at openjdk.java.net Thu Oct 29 09:38:57 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 29 Oct 2020 09:38:57 GMT Subject: RFR: 8254877: GCLogPrecious::_lock rank constrains what locks you are allowed to have when crashing [v2] In-Reply-To: References: <_mcGIfKtXbuDVOTisGl5s38hnMVVwAWIuaqG3mwlKj4=.8fe7adf8-cae8-4ccb-8d98-91ea7d308243@github.com> Message-ID: On Wed, 28 Oct 2020 15:27:57 GMT, Erik Österlund wrote: >> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: >> >> Review 1 > > Looks good. In the latest update I added two new helper classes: `SemaphoreLock` and `SemaphoreLocker`. I think this makes the code nicer. Since those classes are more broadly used, I'll go ahead and split them out into a separate PR.
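For readers following along, here is a minimal sketch of what a lock-style wrapper around a binary semaphore, plus a scoped locker, could look like. It is not the HotSpot code from this PR: `SketchSemaphore`, `SketchSemaphoreLock` and `SketchSemaphoreLocker` are hypothetical names, and the semaphore is built on the C++ standard library purely so the example is self-contained.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical counting semaphore; it only mirrors the
// wait/trywait/signal shape discussed in this thread.
class SketchSemaphore {
 public:
  explicit SketchSemaphore(unsigned initial) : _count(initial) {}

  // Blocks until a unit is available, then takes it.
  void wait() {
    std::unique_lock<std::mutex> guard(_mutex);
    _cv.wait(guard, [this] { return _count > 0; });
    --_count;
  }

  // Non-blocking: returns false instead of waiting.
  bool trywait() {
    std::lock_guard<std::mutex> guard(_mutex);
    if (_count == 0) {
      return false;
    }
    --_count;
    return true;
  }

  // Returns one unit and wakes a waiter, if any.
  void signal() {
    {
      std::lock_guard<std::mutex> guard(_mutex);
      ++_count;
    }
    _cv.notify_one();
  }

 private:
  std::mutex _mutex;
  std::condition_variable _cv;
  unsigned _count;
};

// A binary semaphore (initial count 1) behind a lock-style interface.
class SketchSemaphoreLock {
 public:
  SketchSemaphoreLock() : _semaphore(1) {}
  void lock() { _semaphore.wait(); }
  bool try_lock() { return _semaphore.trywait(); }
  void unlock() { _semaphore.signal(); }

 private:
  SketchSemaphore _semaphore;
};

// Scoped helper: acquires in the constructor, releases in the destructor.
class SketchSemaphoreLocker {
 public:
  explicit SketchSemaphoreLocker(SketchSemaphoreLock* lock) : _lock(lock) {
    _lock->lock();
  }
  ~SketchSemaphoreLocker() { _lock->unlock(); }
  SketchSemaphoreLocker(const SketchSemaphoreLocker&) = delete;
  SketchSemaphoreLocker& operator=(const SketchSemaphoreLocker&) = delete;

 private:
  SketchSemaphoreLock* _lock;
};
```

Normal logging paths would take the lock through the scoped locker, while a crash reporter would call `try_lock()` and give up if it fails.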
------------- PR: https://git.openjdk.java.net/jdk/pull/903 From stefank at openjdk.java.net Thu Oct 29 09:38:58 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 29 Oct 2020 09:38:58 GMT Subject: RFR: 8254877: GCLogPrecious::_lock rank constrains what locks you are allowed to have when crashing [v2] In-Reply-To: References: <_mcGIfKtXbuDVOTisGl5s38hnMVVwAWIuaqG3mwlKj4=.8fe7adf8-cae8-4ccb-8d98-91ea7d308243@github.com> Message-ID: On Wed, 28 Oct 2020 15:02:40 GMT, Per Liden wrote:

>> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Review 1
>
> src/hotspot/share/gc/shared/gcLogPrecious.cpp line 93:
>
>> 91: }
>> 92:
>> 93: _lock->signal();
>
> As we discussed off-line, perhaps we might want to print something even if the log isn't initialized and/or is empty. Something like:
>
>     st->print_cr("GC Precious Log:");
>
>     if (_lines == NULL) {
>       st->print_cr("");
>       return;
>     }
>
>     if (!_lock->trywait()) {
>       st->print_cr("");
>       return;
>     }
>
>     if (_lines->size() == 0) {
>       st->print_cr("");
>     } else {
>       st->print_cr("%s", _lines->base());
>     }
>
>     _lock->signal();
>
> You decide. Looks good otherwise.

Updated with suggestion. I also added extra newlines to make the output look pretty. That was needed because _lines->base() is always terminated with a newline.
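The control flow in the suggestion above can be tried outside the VM with a small stand-alone sketch. Everything here is an assumption made for illustration: `SketchPreciousLog` is a hypothetical stand-in for the real class, an atomic flag plays the role of the binary semaphore, and the three angle-bracket strings are placeholders of my own, not the messages used in the patch.

```cpp
#include <atomic>
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for the precious log state.
struct SketchPreciousLog {
  std::atomic<bool> busy{false};  // true while some thread owns the buffer
  std::string* lines = nullptr;   // nullptr until the log is initialized
};

// Prints the log if it can be accessed safely, and a placeholder otherwise.
void print_on_error(SketchPreciousLog& log, std::ostream& out) {
  out << "GC Precious Log:\n";

  if (log.lines == nullptr) {
    out << "<not initialized>\n";  // placeholder text, an assumption
    return;
  }

  // Try-acquire: if another thread holds the buffer, skip instead of blocking.
  if (log.busy.exchange(true, std::memory_order_acquire)) {
    out << "<skipped>\n";  // placeholder text, an assumption
    return;
  }

  if (log.lines->empty()) {
    out << "<empty>\n";  // placeholder text, an assumption
  } else {
    out << *log.lines;  // stored text already ends with a newline
  }

  log.busy.store(false, std::memory_order_release);
}
```

The point of the try-acquire is that the error reporter never blocks: it prints the header in every case and omits the contents when it cannot get exclusive access.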
------------- PR: https://git.openjdk.java.net/jdk/pull/903 From stefank at openjdk.java.net Thu Oct 29 09:38:59 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 29 Oct 2020 09:38:59 GMT Subject: RFR: 8254877: GCLogPrecious::_lock rank constrains what locks you are allowed to have when crashing [v2] In-Reply-To: References: <_mcGIfKtXbuDVOTisGl5s38hnMVVwAWIuaqG3mwlKj4=.8fe7adf8-cae8-4ccb-8d98-91ea7d308243@github.com> Message-ID: <2pEEe-qXj9Y4FYxKo_5_QlQfj3NfdccdNVr_BO6k4LQ=.ee8a5f48-c70f-40df-8d6a-dd420c465a74@github.com> On Wed, 28 Oct 2020 16:02:37 GMT, Albert Mingkun Yang wrote: >> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: >> >> Review 1 > > src/hotspot/share/gc/shared/gcLogPrecious.cpp line 30: > >> 28: stringStream* GCLogPrecious::_lines = NULL; >> 29: stringStream* GCLogPrecious::_temp = NULL; >> 30: Semaphore* GCLogPrecious::_lock = NULL; > > Maybe renaming `_lock` to `_semaphore`? Additionally, since it's a binary semaphore, `new Semaphore(1)`, some comments explaining why `Mutex` is **not** suitable could avoid some future confusions. > > PS: not a review, just a comment in passing. It's used as a lock, so I think the name `_lock` is appropriate. Instead I introduced a new class: `SemaphoreLock`, to make the code more readable (IMHO). Also added a comment. Hopefully, this addressed your comments. 
------------- PR: https://git.openjdk.java.net/jdk/pull/903 From ayang at openjdk.java.net Thu Oct 29 09:42:50 2020 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Thu, 29 Oct 2020 09:42:50 GMT Subject: RFR: 8254877: GCLogPrecious::_lock rank constrains what locks you are allowed to have when crashing [v2] In-Reply-To: <2pEEe-qXj9Y4FYxKo_5_QlQfj3NfdccdNVr_BO6k4LQ=.ee8a5f48-c70f-40df-8d6a-dd420c465a74@github.com> References: <_mcGIfKtXbuDVOTisGl5s38hnMVVwAWIuaqG3mwlKj4=.8fe7adf8-cae8-4ccb-8d98-91ea7d308243@github.com> <2pEEe-qXj9Y4FYxKo_5_QlQfj3NfdccdNVr_BO6k4LQ=.ee8a5f48-c70f-40df-8d6a-dd420c465a74@github.com> Message-ID: On Thu, 29 Oct 2020 09:34:12 GMT, Stefan Karlsson wrote: >> src/hotspot/share/gc/shared/gcLogPrecious.cpp line 30: >> >>> 28: stringStream* GCLogPrecious::_lines = NULL; >>> 29: stringStream* GCLogPrecious::_temp = NULL; >>> 30: Semaphore* GCLogPrecious::_lock = NULL; >> >> Maybe renaming `_lock` to `_semaphore`? Additionally, since it's a binary semaphore, `new Semaphore(1)`, some comments explaining why `Mutex` is **not** suitable could avoid some future confusions. >> >> PS: not a review, just a comment in passing. > > It's used as a lock, so I think the name `_lock` is appropriate. Instead I introduced a new class: `SemaphoreLock`, to make the code more readable (IMHO). Also added a comment. Hopefully, this addressed your comments. Thank you; it does look more readable. BTW, mutex has `try_lock` as well. 
------------- PR: https://git.openjdk.java.net/jdk/pull/903 From stefank at openjdk.java.net Thu Oct 29 09:50:49 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 29 Oct 2020 09:50:49 GMT Subject: RFR: 8254877: GCLogPrecious::_lock rank constrains what locks you are allowed to have when crashing [v2] In-Reply-To: References: <_mcGIfKtXbuDVOTisGl5s38hnMVVwAWIuaqG3mwlKj4=.8fe7adf8-cae8-4ccb-8d98-91ea7d308243@github.com> <2pEEe-qXj9Y4FYxKo_5_QlQfj3NfdccdNVr_BO6k4LQ=.ee8a5f48-c70f-40df-8d6a-dd420c465a74@github.com> Message-ID: On Thu, 29 Oct 2020 09:40:19 GMT, Albert Mingkun Yang wrote: >> It's used as a lock, so I think the name `_lock` is appropriate. Instead I introduced a new class: `SemaphoreLock`, to make the code more readable (IMHO). Also added a comment. Hopefully, this addressed your comments. > > Thank you; it does look more readable. BTW, mutex has `try_lock` as well. I see. Then I'll remove that part of the new comment. ------------- PR: https://git.openjdk.java.net/jdk/pull/903 From eosterlund at openjdk.java.net Thu Oct 29 09:54:45 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 29 Oct 2020 09:54:45 GMT Subject: RFR: 8255579: x86: Use cmpq(Register,Address) in safepoint_poll In-Reply-To: References: Message-ID: On Thu, 29 Oct 2020 08:14:27 GMT, Aleksey Shipilev wrote: > JDK-8253180 added a new block in `safepoint_poll` that uses broken `cmpq` (JDK-8255550): it effectively does the comparison with operands swapped. It makes sense to use the non-broken `cmpq`, as to avoid changing the condition code and thus making the code less understandable. See the discussion in #910. > > Testing: > - [x] `tier1` with Z (some SA tests fail, and some other fail with OOME -- seem to be expected/problem-listed) Looks good. I also got confused about this. 
When this instruction didn't work the way I thought it would, I thought it must be a weird style thing, that everything is right to left in the assembler, for consistency, so it would read from right to left. Then I convinced myself it made sense. But the instruction being incorrectly encoded also checks out. Sigh. Anyway, I ran some local testing that would just fall over immediately if this didn't work. And it seems to work. ------------- Marked as reviewed by eosterlund (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/924 From shade at openjdk.java.net Thu Oct 29 09:54:46 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 29 Oct 2020 09:54:46 GMT Subject: RFR: 8255579: x86: Use cmpq(Register,Address) in safepoint_poll In-Reply-To: References: Message-ID: <21WMr5NDWAkN_bYcBgV2bEwvphgDVC6CZAhUaBk1rzs=.efc1c597-93fe-42d1-bc23-e08b73583e01@github.com> On Thu, 29 Oct 2020 09:49:30 GMT, Erik Österlund wrote: >> JDK-8253180 added a new block in `safepoint_poll` that uses broken `cmpq` (JDK-8255550): it effectively does the comparison with operands swapped. It makes sense to use the non-broken `cmpq`, as to avoid changing the condition code and thus making the code less understandable. See the discussion in #910. >> >> Testing: >> - [x] `tier1` with Z (some SA tests fail, and some other fail with OOME -- seem to be expected/problem-listed) > > Looks good. I also got confused about this. When this instruction didn't work the way I thought it would, I thought it must be a weird style thing, that everything is right to left in the assembler, for consistency, so it would read from right to left. Then I convinced myself it made sense. But the instruction being incorrectly encoded also checks out. Sigh. Anyway, I ran some local testing that would just fall over immediately if this didn't work. And it seems to work. Thanks @fisk! Would you like someone else to look at this, or is your review enough?
------------- PR: https://git.openjdk.java.net/jdk/pull/924 From tschatzl at openjdk.java.net Thu Oct 29 09:56:01 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 29 Oct 2020 09:56:01 GMT Subject: RFR: 8255232: G1: Make G1BiasedMappedArray freeable [v2] In-Reply-To: <966v_gu-j37Do5XhKCgwhCu0DDdwyKPQ04oUQkvzEIs=.96808fbd-a650-448e-96af-58a2bf7b9c2d@github.com> References: <966v_gu-j37Do5XhKCgwhCu0DDdwyKPQ04oUQkvzEIs=.96808fbd-a650-448e-96af-58a2bf7b9c2d@github.com> Message-ID: > Hi all, > > can I have reviews for this change that makes G1BiasedMappedArray freeable? > > Previously all G1BiasedMappedArray were created as unfreeable i.e. assigned to static variables. However with JDK-8253600 I need one such biased map for the full collector which is created and deleted during full GC. So the biased array should also be freed as necessary to avoid a memory leak. > > The alternative would be to statically allocate that map anyway and provide it to the current G1FullCollector instance, but I do not think the single malloc call is perf sensitive compared to full collector work and there is much point in doing something more complicated at this time. In the future I hope that the young gen collector will also be extracted from G1CollectedHeap with the same need. If/when allocation of these helper data structures becomes a problem I would suggest looking into this again. > > One option then could be using some ResoureArea for these things in the future. > > For this change there should be no change in behavior at all. 
> > Testing: tier1-5 > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: ayang review ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/808/files - new: https://git.openjdk.java.net/jdk/pull/808/files/51b297bb..c6835bae Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=808&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=808&range=00-01 Stats: 9 lines in 1 file changed: 0 ins; 8 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/808.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/808/head:pull/808 PR: https://git.openjdk.java.net/jdk/pull/808 From eosterlund at openjdk.java.net Thu Oct 29 09:58:44 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 29 Oct 2020 09:58:44 GMT Subject: RFR: 8255579: x86: Use cmpq(Register,Address) in safepoint_poll In-Reply-To: <21WMr5NDWAkN_bYcBgV2bEwvphgDVC6CZAhUaBk1rzs=.efc1c597-93fe-42d1-bc23-e08b73583e01@github.com> References: <21WMr5NDWAkN_bYcBgV2bEwvphgDVC6CZAhUaBk1rzs=.efc1c597-93fe-42d1-bc23-e08b73583e01@github.com> Message-ID: On Thu, 29 Oct 2020 09:52:11 GMT, Aleksey Shipilev wrote: >> Looks good. I also got confused about this. When this instruction didn't work the way I thought it would, I thought it must be a weird style thing, that everything is right to left in the assembler, for consistency, so it would read from right to left. Then I convinced myself it made sense. But the instruction being incorrectly encoded also checks out. Sigh. Anyway, I ran some local testing that would just fall over immediately if this didn't work. And it seems to work. > > Thanks @fisk! Would you like someone else to look at this, or your review is enough? Since the encodings of the two instructions are literally identical, I think this classifies as a trivial change. So I'm okay with one review. 
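As background on why operand order and condition code are tied together, here is a small model of an x86-style unsigned compare. It is only a sketch: `SketchFlags`, `sketch_cmp`, `is_above` and `is_below` are hypothetical names, and the model keeps just the two flags that the unsigned "above"/"below" conditions look at.

```cpp
#include <cassert>
#include <cstdint>

// Flags kept from an x86-style compare, which subtracts rhs from lhs
// and discards the result.
struct SketchFlags {
  bool zero;   // ZF: the operands were equal
  bool carry;  // CF: unsigned borrow, i.e. lhs < rhs
};

SketchFlags sketch_cmp(uint64_t lhs, uint64_t rhs) {
  return SketchFlags{lhs == rhs, lhs < rhs};
}

// Unsigned "above", as tested by ja: CF == 0 and ZF == 0.
bool is_above(SketchFlags f) { return !f.carry && !f.zero; }

// Unsigned "below", as tested by jb: CF == 1.
bool is_below(SketchFlags f) { return f.carry; }
```

With a compare that silently swaps its operands, a caller who wants "lhs below rhs" has to test "above" to get the right branch. Using the compare with the intended operand order lets the condition code read the same way as the source.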
------------- PR: https://git.openjdk.java.net/jdk/pull/924 From ayang at openjdk.java.net Thu Oct 29 10:06:47 2020 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Thu, 29 Oct 2020 10:06:47 GMT Subject: RFR: 8255232: G1: Make G1BiasedMappedArray freeable [v2] In-Reply-To: References: <966v_gu-j37Do5XhKCgwhCu0DDdwyKPQ04oUQkvzEIs=.96808fbd-a650-448e-96af-58a2bf7b9c2d@github.com> Message-ID: On Thu, 29 Oct 2020 09:56:01 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews for this change that makes G1BiasedMappedArray freeable? >> >> Previously all G1BiasedMappedArray were created as unfreeable i.e. assigned to static variables. However with JDK-8253600 I need one such biased map for the full collector which is created and deleted during full GC. So the biased array should also be freed as necessary to avoid a memory leak. >> >> The alternative would be to statically allocate that map anyway and provide it to the current G1FullCollector instance, but I do not think the single malloc call is perf sensitive compared to full collector work and there is much point in doing something more complicated at this time. In the future I hope that the young gen collector will also be extracted from G1CollectedHeap with the same need. If/when allocation of these helper data structures becomes a problem I would suggest looking into this again. >> >> One option then could be using some ResoureArea for these things in the future. >> >> For this change there should be no change in behavior at all. >> >> Testing: tier1-5 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > ayang review Thank you for the change. ------------- Marked as reviewed by ayang (Author). 
PR: https://git.openjdk.java.net/jdk/pull/808 From stefank at openjdk.java.net Thu Oct 29 10:06:58 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 29 Oct 2020 10:06:58 GMT Subject: RFR: 8254877: GCLogPrecious::_lock rank constrains what locks you are allowed to have when crashing [v3] In-Reply-To: <_mcGIfKtXbuDVOTisGl5s38hnMVVwAWIuaqG3mwlKj4=.8fe7adf8-cae8-4ccb-8d98-91ea7d308243@github.com> References: <_mcGIfKtXbuDVOTisGl5s38hnMVVwAWIuaqG3mwlKj4=.8fe7adf8-cae8-4ccb-8d98-91ea7d308243@github.com> Message-ID: > This is an alternative version of the fix proposed in 900: > https://github.com/openjdk/jdk/pull/900 > > Erik's description: >> Today, when you crash, the GCLogPrecious::_lock is taken. This effectively limits you to only get clean crash reports if you crash or assert without holding a lock of rank tty or lower. It is arguably difficult to know what locks you are going to have when crashing. Therefore, I don't think the precious GC log should constrain possible crashing contexts in that fashion. > > As Erik mentioned in that PR, I'd like to retain the ability to easily dump the precious log when debugging. The proposed fix changes the Mutex to a Semaphore, and use trywait to safely access the buffer. In the unlikely event that another thread is holding the lock, the hs_err printer skips printing the log. > > This also makes it possible to call precious logging from within the stack watermark processing code. 
I think there's a possibility that we might call the following error logging, when we fail to commit memory for a ZPage, when relocating, during stack watermark processing: > `log_error_p(gc)("Failed to commit memory (%s)", err.to_string());` Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: Review 2 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/903/files - new: https://git.openjdk.java.net/jdk/pull/903/files/88e2b4e2..63a9473a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=903&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=903&range=01-02 Stats: 12 lines in 3 files changed: 0 ins; 2 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/903.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/903/head:pull/903 PR: https://git.openjdk.java.net/jdk/pull/903 From stefank at openjdk.java.net Thu Oct 29 10:06:58 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 29 Oct 2020 10:06:58 GMT Subject: RFR: 8254877: GCLogPrecious::_lock rank constrains what locks you are allowed to have when crashing [v3] In-Reply-To: References: <_mcGIfKtXbuDVOTisGl5s38hnMVVwAWIuaqG3mwlKj4=.8fe7adf8-cae8-4ccb-8d98-91ea7d308243@github.com> Message-ID: On Thu, 29 Oct 2020 09:35:45 GMT, Stefan Karlsson wrote: >> Looks good. > > In the latest update I added two new helper classes: `SemaphoreLock` and `SemaphoreLocker`. I think this makes the code nicer. Since those classes are more broadly used, I'll go ahead and split them out into a separate PR.
Forked off the SemaphoreLock part into https://github.com/openjdk/jdk/pull/927 ------------- PR: https://git.openjdk.java.net/jdk/pull/903 From shade at openjdk.java.net Thu Oct 29 10:08:58 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 29 Oct 2020 10:08:58 GMT Subject: RFR: 8255579: x86: Use cmpq(Register, Address) in safepoint_poll [v2] In-Reply-To: References: Message-ID: > JDK-8253180 added a new block in `safepoint_poll` that uses broken `cmpq` (JDK-8255550): it effectively does the comparison with operands swapped. It makes sense to use the non-broken `cmpq`, as to avoid changing the condition code and thus making the code less understandable. See the discussion in #910. > > Testing: > - [x] `tier1` with Z (some SA tests fail, and some other fail with OOME -- seem to be expected/problem-listed) Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains two additional commits since the last revision: - Merge branch 'master' into JDK-8255579-safepoint-poll - 8255579: x86: Use cmpq(Register,Address) in safepoint_poll ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/924/files - new: https://git.openjdk.java.net/jdk/pull/924/files/1f156d72..cc98eaae Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=924&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=924&range=00-01 Stats: 1637 lines in 27 files changed: 139 ins; 1454 del; 44 mod Patch: https://git.openjdk.java.net/jdk/pull/924.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/924/head:pull/924 PR: https://git.openjdk.java.net/jdk/pull/924 From eosterlund at openjdk.java.net Thu Oct 29 10:13:44 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 29 Oct 2020 10:13:44 GMT Subject: RFR: 8255579: x86: Use cmpq(Register, Address) in safepoint_poll [v2] In-Reply-To: References: Message-ID: On Thu, 29 Oct 2020 10:08:58 GMT, Aleksey Shipilev wrote: >> JDK-8253180 added a new block in `safepoint_poll` that uses broken `cmpq` (JDK-8255550): it effectively does the comparison with operands swapped. It makes sense to use the non-broken `cmpq`, as to avoid changing the condition code and thus making the code less understandable. See the discussion in #910. >> >> Testing: >> - [x] `tier1` with Z (some SA tests fail, and some other fail with OOME -- seem to be expected/problem-listed) > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge branch 'master' into JDK-8255579-safepoint-poll > - 8255579: x86: Use cmpq(Register,Address) in safepoint_poll Marked as reviewed by eosterlund (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/924 From sjohanss at openjdk.java.net Thu Oct 29 10:25:51 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Thu, 29 Oct 2020 10:25:51 GMT Subject: RFR: 8253600: G1: Fully support pinned regions for full gc In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 07:40:45 GMT, Thomas Schatzl wrote: > Hi all, > > can I get reviews for this change that implements "proper" support for pinned regions in the G1 full collector? > > By proper I mean that at the end of gc, pinned regions contain the correct TAMS and bitmap markings under the TAMS so that dead objects within them are supported? > > Currently all (pinned) regions have their TAMS set to bottom() and their bitmap above TAMS cleared (at least logically :) ). This works as long objects within these regions can't be dead as it is the case now: > - humongous regions are either live or fully reclaimed. > - all other pinned regions are archive regions at the moment that are always treated as fully live (and do not contain dead objects). > > This change is a requirement for fixing JDK-8253081 as some earlier change made it possible to have dead objects within open archive regions. It also enables supporting removal of gclocker use for g1, i.e. using region pinning. > > Based on the PR#808 (https://github.com/openjdk/jdk/pull/808). > > Testing: tier1-8, testing with prototype for region pinning, testing with prototype for JDK-8253081. > Performance testing: no regressions > > Some comments for questions that might come up during review: > > - how does this work with the bitmaps now: > - at start of full gc the next bitmap is cleared > - full gc marks the next bitmap > - for all pinned regions, keep TAMS and top() (*), otherwise set TAMS to bottom > - swap bitmaps > - clear next bitmap for next marking > > (*) this means that from a usage POV pinned regions are considered full. 
This is inaccurate, but sufficient: full gc clears all remembered sets anyway, so we do not need that information for gc efficiency purposes anyway to evacuate later. The next marking before old gen evacuation will update it to the correct values anyway. G1 does not support allocation into "holes" in pinned regions that can be open archive only at this time too, so there is no need to be more exact. > > - use of a region attribute table for phase 2+ only: compared to before we need fast access to information whether a given reference goes into a pinned region (as opposed to an archive region) wrt to adjusting that pointer to avoid doing work for these references. > > Phase 1 marking could have used this information for the do-we-need-to-preserve-the-mark check too: however this would have required g1 to add an extra another pass over all regions to update that. This seemed slower than just checking this information "more slowly" for the objects that need mark preservation. Tests showed that this is the case for <0.00% (yeah, these references that need mark preservation are rounding errors in cases it matters) of overall references, so I did not add that pass. > (Additionally g1 full gc is a last-ditch effort, and while marking takes a significant time, it does not completely dominate it). > > I.e. the second clause in the condition of this hunk is intentionally slower than could be: > @@ -52,7 +52,9 @@ inline bool G1FullGCMarker::mark_object(oop obj) { > // Marked by us, preserve if needed. > markWord mark = obj->mark(); > if (obj->mark_must_be_preserved(mark) && > // It is not necessary to preserve marks for objects in pinned regions because > // we do not change their headers (i.e. forward them). > !G1CollectedHeap::heap()->heap_region_containing(obj)->is_pinned()) { > preserved_stack()->push(obj, mark); > } > - there is no code yet that checks for empty pinned regions yet. 
Only JDK-8253081 introduces that because still all contents of all archive regions are live forever. > > Also please note that the 51b297b change is from the #808 change. > > Thanks, > Thomas Nice change Thomas, mostly just small comments. src/hotspot/share/gc/g1/g1FullGCAdjustTask.cpp line 70: > 68: oop obj = oop(r->humongous_start_region()->bottom()); > 69: obj->oop_iterate(&cl, MemRegion(r->bottom(), r->top())); > 70: } else if (!(r->is_closed_archive() || r->is_free())) { I would prefer: Suggestion: } else if (!r->is_closed_archive() && !r->is_free()) { src/hotspot/share/gc/g1/g1FullGCAdjustTask.cpp line 118: > 116: > 117: // Now adjust pointers region by region > 118: G1AdjustRegionClosure blk(collector(), collector()->mark_bitmap(), worker_id); Just pass `collector()` and retrieve the bitmap in the constructor. src/hotspot/share/gc/g1/g1FullGCMarker.hpp line 46: > 44: > 45: class G1CMBitMap; > 46: class G1FullCollector; Not needed (yet). src/hotspot/share/gc/g1/g1FullGCPrepareTask.cpp line 87: > 85: Ticks start = Ticks::now(); > 86: G1FullGCCompactionPoint* compaction_point = collector()->compaction_point(worker_id); > 87: G1CalculatePointersClosure closure(collector(), collector()->mark_bitmap(), compaction_point); I would prefer just passing `collector()` and get the bitmap in the constructor here as well. src/hotspot/share/gc/g1/g1FullGCPrepareTask.cpp line 111: > 109: > 110: void G1FullGCPrepareTask::G1CalculatePointersClosure::free_humongous_region(HeapRegion* hr) { > 111: assert(hr->is_humongous(), "handled elsewhere"); Improve message a bit maybe including some information about the HR, like the type. 
src/hotspot/share/gc/g1/heapRegion.hpp line 173:

> 171: void non_pinned_complete_compaction();
> 172: void pinned_complete_compaction();
> 173: void complete_compaction_common();

What do you think about using these names instead:
Suggestion:

    void complete_compaction();
    void reset_pinned();
    void reset_after_full();

Another alternative would be to use `reset_after_compaction()` instead of `complete_compaction()`.

src/hotspot/share/gc/g1/g1FullCollector.hpp line 57:

> 55:
> 56: // This table is used to store some per-region attributes needed during collection.
> 57: class G1FullGCHeapRegionAttrBiasedMappedArray : public G1BiasedMappedArray {

I would like this class to be moved to its own file.

------------- Changes requested by sjohanss (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/824 From jbhateja at openjdk.java.net Thu Oct 29 10:28:00 2020 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Thu, 29 Oct 2020 10:28:00 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v11] In-Reply-To: References: Message-ID:

> Summary:
>
> 1) Partial in-lining technique avoids call overhead penalty for small array copy operations with size less than 32 bytes.
> 2) At runtime, a conditional check based on copy length either calls an array-copy stub or executes an optimized instruction sequence using AVX-512 masked instructions emitted at the call site.
> 3) New runtime flag ArrayCopyPartialInlineSize=0/32(default)/64 bytes determines the maximum size for partial in-lining.
> 4) Based on the perf results seen in benchmarks currently partial in-lining is performed only for arraycopy involving sub-word types (bool/byte/char/short). Once PR-61 gets integrated we can extend this patch to cover all the primitive types.
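The length check in point 2 of the summary can be pictured with a scalar sketch. This is not the code the JIT emits: `sketch_array_copy`, `sketch_masked_copy`, `sketch_copy_stub` and `kSketchPartialInlineSize` are hypothetical names, the masked vector move is emulated lane by lane in plain C++, and treating the threshold as inclusive is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed threshold, mirroring the 32-byte default named in the summary.
constexpr std::size_t kSketchPartialInlineSize = 32;

// Stand-in for the out-of-line array-copy stub.
void sketch_copy_stub(uint8_t* dst, const uint8_t* src, std::size_t len) {
  std::memmove(dst, src, len);
}

// Scalar emulation of a masked vector load followed by a masked vector store:
// only lanes whose mask bit is set are read and written.
void sketch_masked_copy(uint8_t* dst, const uint8_t* src, uint32_t mask) {
  uint8_t lanes[32] = {0};
  for (unsigned lane = 0; lane < 32; ++lane) {
    if (mask & (1u << lane)) {
      lanes[lane] = src[lane];
    }
  }
  for (unsigned lane = 0; lane < 32; ++lane) {
    if (mask & (1u << lane)) {
      dst[lane] = lanes[lane];
    }
  }
}

// Short copies take the inline masked path; longer ones call the stub.
void sketch_array_copy(uint8_t* dst, const uint8_t* src, std::size_t len) {
  if (len <= kSketchPartialInlineSize) {
    // Set the low `len` bits; len == 32 is special-cased to avoid shifting by 32.
    uint32_t mask = (len == 32) ? 0xFFFFFFFFu : ((1u << len) - 1u);
    sketch_masked_copy(dst, src, mask);
  } else {
    sketch_copy_stub(dst, src, len);
  }
}
```

The mask is what lets one fixed-width move handle any length up to the threshold without a loop or a call, which is where the short-copy saving would come from.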
>
> Performance Results:
> System : CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz
> Micros : test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java
> ArrayCopyPartialInlineSize : 32
>
> JMH | Block Size | Baseline (ns/op) | Partial Inlining (ns/op) | Gain
> -- | -- | -- | -- | --
> ArrayCopyAligned.testByte | 1 | 5.417 | 2.696 | 2.009272997
> ArrayCopyAligned.testByte | 3 | 5.494 | 2.702 | 2.03330866
> ArrayCopyAligned.testByte | 5 | 5.417 | 2.637 | 2.05422829
> ArrayCopyAligned.testByte | 10 | 5.343 | 2.703 | 1.976692564
> ArrayCopyAligned.testByte | 20 | 5.837 | 2.636 | 2.214339909
> ArrayCopyAligned.testByte | 70 | 5.86 | 6 | 0.976666667
> ArrayCopyAligned.testByte | 150 | 6.766 | 6.906 | 0.979727773
> ArrayCopyAligned.testByte | 300 | 7.605 | 7.952 | 0.956363179
> ArrayCopyAligned.testByte | 600 | 11.989 | 12.007 | 0.998500874
> ArrayCopyAligned.testByte | 1200 | 16.447 | 16.585 | 0.991679228
> ArrayCopyAligned.testChar | 1 | 5.02 | 2.828 | 1.775106082
> ArrayCopyAligned.testChar | 3 | 5.129 | 2.762 | 1.85698769
> ArrayCopyAligned.testChar | 5 | 5.041 | 2.762 | 1.82512672
> ArrayCopyAligned.testChar | 10 | 5.716 | 2.762 | 2.069514844
> ArrayCopyAligned.testChar | 20 | 5.111 | 5.399 | 0.946656788
> ArrayCopyAligned.testChar | 70 | 6.271 | 6.242 | 1.004645947
> ArrayCopyAligned.testChar | 150 | 7.45 | 7.599 | 0.980392157
> ArrayCopyAligned.testChar | 300 | 9.904 | 10.112 | 0.97943038
> ArrayCopyAligned.testChar | 600 | 17.131 | 17.167 | 0.997902953
> ArrayCopyAligned.testChar | 1200 | 29.556 | 29.851 | 0.990117584
> ArrayCopyUnalignedBoth.testByte | 1 | 5.419 | 2.702 | 2.005551443
> ArrayCopyUnalignedBoth.testByte | 3 | 5.558 | 2.636 | 2.108497724
> ArrayCopyUnalignedBoth.testByte | 5 | 5.43 | 2.636 | 2.059939302
> ArrayCopyUnalignedBoth.testByte | 10 | 5.378 | 2.637 | 2.039438756
> ArrayCopyUnalignedBoth.testByte | 20 | 5.914 | 2.636 | 2.243550836
> ArrayCopyUnalignedBoth.testByte | 70 | 5.882 | 5.954 | 0.987907289
> ArrayCopyUnalignedBoth.testByte | 150 | 6.784 | 6.88 | 0.986046512
> ArrayCopyUnalignedBoth.testByte | 300 | 7.635 | 7.968 | 0.958207831
> ArrayCopyUnalignedBoth.testByte | 600 | 12.226 | 12.129 | 1.007997362
> ArrayCopyUnalignedBoth.testByte | 1200 | 16.992 | 20.717 | 0.820195974
> ArrayCopyUnalignedBoth.testChar | 1 | 5.019 | 2.828 | 1.774752475
> ArrayCopyUnalignedBoth.testChar | 3 | 5.163 | 2.763 | 1.868621064
> ArrayCopyUnalignedBoth.testChar | 5 | 5.042 | 2.827 | 1.783516095
> ArrayCopyUnalignedBoth.testChar | 10 | 5.718 | 2.828 | 2.021923621
> ArrayCopyUnalignedBoth.testChar | 20 | 5.111 | 5.404 | 0.945780903
> ArrayCopyUnalignedBoth.testChar | 70 | 6.367 | 6.235 | 1.02117081
> ArrayCopyUnalignedBoth.testChar | 150 | 7.367 | 8.269 | 0.890917886
> ArrayCopyUnalignedBoth.testChar | 300 | 10.358 | 10.642 | 0.973313287
> ArrayCopyUnalignedBoth.testChar | 600 | 20.84 | 17.522 | 1.189361945
> ArrayCopyUnalignedBoth.testChar | 1200 | 31.895 | 31.892 | 1.000094067
> ArrayCopyUnalignedDst.testByte | 1 | 5.455 | 2.637 | 2.068638604
> ArrayCopyUnalignedDst.testByte | 3 | 5.562 | 2.702 | 2.058475204
> ArrayCopyUnalignedDst.testByte | 5 | 5.427 | 2.702 | 2.008512213
> ArrayCopyUnalignedDst.testByte | 10 | 5.367 | 2.696 | 1.990727003
> ArrayCopyUnalignedDst.testByte | 20 | 5.839 | 2.637 | 2.214258627
> ArrayCopyUnalignedDst.testByte | 70 | 5.888 | 5.968 | 0.986595174
> ArrayCopyUnalignedDst.testByte | 150 | 6.785 | 6.773 | 1.001771741
> ArrayCopyUnalignedDst.testByte | 300 | 7.606 | 7.972 | 0.954089313
> ArrayCopyUnalignedDst.testByte | 600 | 11.986 | 21.195 | 0.565510734
> ArrayCopyUnalignedDst.testByte | 1200 | 16.54 | 16.784 | 0.985462345
> ArrayCopyUnalignedDst.testChar | 1 | 5.02 | 2.827 | 1.775733994
> ArrayCopyUnalignedDst.testChar | 3 | 5.131 | 2.762 | 1.857711803
> ArrayCopyUnalignedDst.testChar | 5 | 5.038 | 2.762 | 1.82404055
> ArrayCopyUnalignedDst.testChar | 10 | 5.718 | 2.762 | 2.070238957
> ArrayCopyUnalignedDst.testChar | 20 | 5.113 | 5.401 | 0.946676541
> ArrayCopyUnalignedDst.testChar | 70 | 6.222 | 6.214 | 1.001287416
> ArrayCopyUnalignedDst.testChar | 150 | 7.367 | 8.125 | 0.906707692
> ArrayCopyUnalignedDst.testChar | 300 | 10.204 | 10.082 | 1.012100774
> ArrayCopyUnalignedDst.testChar | 600 | 16.978 | 17.135 | 0.990837467
> ArrayCopyUnalignedDst.testChar | 1200 | 32.351 | 31.996 | 1.011095137
> ArrayCopyUnalignedSrc.testByte | 1 | 5.414 | 2.696 | 2.008160237
> ArrayCopyUnalignedSrc.testByte | 3 | 5.494 | 2.637 | 2.083428138
> ArrayCopyUnalignedSrc.testByte | 5 | 5.431 | 2.637 | 2.059537353
> ArrayCopyUnalignedSrc.testByte | 10 | 5.344 | 2.703 | 1.977062523
> ArrayCopyUnalignedSrc.testByte | 20 | 5.834 | 2.696 | 2.163946588
> ArrayCopyUnalignedSrc.testByte | 70 | 5.883 | 6.009 | 0.979031453
> ArrayCopyUnalignedSrc.testByte | 150 | 6.729 | 6.87 | 0.979475983
> ArrayCopyUnalignedSrc.testByte | 300 | 7.603 | 7.97 | 0.953952321
> ArrayCopyUnalignedSrc.testByte | 600 | 12.004 | 12.16 | 0.987171053
> ArrayCopyUnalignedSrc.testByte | 1200 | 16.534 | 16.643 | 0.9934507
> ArrayCopyUnalignedSrc.testChar | 1 | 5.021 | 2.762 | 1.81788559
> ArrayCopyUnalignedSrc.testChar | 3 | 5.13 | 2.762 | 1.857349747
> ArrayCopyUnalignedSrc.testChar | 5 | 5.042 | 2.827 | 1.783516095
> ArrayCopyUnalignedSrc.testChar | 10 | 5.726 | 2.761 | 2.073886273
> ArrayCopyUnalignedSrc.testChar | 20 | 5.112 | 5.401 | 0.94649139
> ArrayCopyUnalignedSrc.testChar | 70 | 6.113 | 6.227 | 0.981692629
> ArrayCopyUnalignedSrc.testChar | 150 | 7.493 | 7.888 | 0.949923935
> ArrayCopyUnalignedSrc.testChar | 300 | 10.234 | 10.501 | 0.97457385
> ArrayCopyUnalignedSrc.testChar | 600 | 17.175 | 17.142 | 1.001925096
> ArrayCopyUnalignedSrc.testChar | 1200 | 31.926 | 31.987 | 0.998092975
>
> Detailed Reports:
> Baseline : [http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_Baseline.txt)
> WithOpt :
[http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt](http://cr.openjdk.java.net/~jbhateja/8252848/JMH_results/JMH_With_PI_Opts.txt) Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: - JDK-8252848: Review comments addressed. - Merge remote-tracking branch 'origin' into JDK-8252848 - JDK-8252848 : Replacing generic assembler routine evmovdqu with macro assembly routine calling type specific leaf level assembly functions. - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 - JDK-8252848 : Review comments resolution. - Merge remote-tracking branch 'upstream' into JDK-8252848 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 - Replacing explicit type checks with existing type checking routines - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8252848 - ... 
and 1 more: https://git.openjdk.java.net/jdk/compare/4031cb41...9e85592a ------------- Changes: https://git.openjdk.java.net/jdk/pull/302/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=302&range=10 Stats: 527 lines in 27 files changed: 477 ins; 23 del; 27 mod Patch: https://git.openjdk.java.net/jdk/pull/302.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/302/head:pull/302 PR: https://git.openjdk.java.net/jdk/pull/302 From jbhateja at openjdk.java.net Thu Oct 29 10:28:00 2020 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Thu, 29 Oct 2020 10:28:00 GMT Subject: RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions [v9] In-Reply-To: References: <8Ryyxuf5P2D6WNyj4riYCTgN0U6WLrLpBmxhNbnmPpQ=.b2ed5660-99d0-49d1-83e0-8b2de518d7b8@github.com> Message-ID: On Thu, 29 Oct 2020 08:03:23 GMT, Ningsheng Jian wrote: > > As currently there is no support for mask registers in RA, for X86 long ideal type is sufficient for a mask producing node (def operand is a mask register) ; But for complete support returning Op_RegVMask as an ideal_reg() type for masked Ideal node should do the trick without creating an explicit new ideal Type for mask generating nodes. Spill sizes and number of slots may be different for X86 and ARM (SVE). > > So, do you have a plan to support Op_RegVMask? In SVE, we will use this kind of node for mask/predicate type. > Not as the part of this patch but as a separate RFE, we may benefit from decoupling b/w ideal type(bottom_type) and ideal_reg() for a given Ideal node; this should allow us to build any future extension on top of these masked generating nodes (VectorMaskGen). > > Shallow copy during Node::clone should be sufficient here since encapsulated element type will be preserved. > > Checking the code in Node::clone() again, I think the object copy relies on size_of(), so you need to override that to get the correct object size for copying. 
Thanks for pointing out; I missed this earlier. ------------- PR: https://git.openjdk.java.net/jdk/pull/302 From stefank at openjdk.java.net Thu Oct 29 10:37:43 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 29 Oct 2020 10:37:43 GMT Subject: RFR: 8255582: Introduce SemaphoreLock and SemaphoreLocker In-Reply-To: <7xyhNYWM9-ZzeiQszzibqDrnZNaCCKLd__6oY4Mtmoo=.b9897b00-fb2e-4c35-ab69-47ad62f9ef66@github.com> References: <_JWavDFnX9LXuj9uXsd6RDwRDrsoVb-tG-CVQ6fMA90=.5d37cc4c-b6cf-4dc8-ac47-0372cf368ead@github.com> <7xyhNYWM9-ZzeiQszzibqDrnZNaCCKLd__6oY4Mtmoo=.b9897b00-fb2e-4c35-ab69-47ad62f9ef66@github.com> Message-ID: On Thu, 29 Oct 2020 10:09:50 GMT, Albert Mingkun Yang wrote: >> Semaphores can be used as low-level locks, but the readability of the code using them could be better. I propose that we introduce two new classes: >> >> SemaphoreLock - which provides the operations lock, unlock, try_lock. >> >> SemaphoreLocker - Equivalent to MutexLocker. > > Marked as reviewed by ayang (Author). Moved over to hotspot-dev. ------------- PR: https://git.openjdk.java.net/jdk/pull/927 From rkennke at openjdk.java.net Thu Oct 29 11:09:08 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 29 Oct 2020 11:09:08 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v21] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). Workloads that make heavy use of such weak references will therefore potentially cause significant GC pauses.
> > There are 3 main items that contribute to pause time linear to number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires to extend the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. > 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no referents become resurrected by accessing Reference.get() on it. 
In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable. > > Testing: hotspot_gc_shenandoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Call into native-LRB on unknown oop strenght (i.e. reflection) too - Put in comment about API impedence mismatch around interpreter native LRB ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/505/files - new: https://git.openjdk.java.net/jdk/pull/505/files/072e3817..96f2a3ca Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=20 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=19-20 Stats: 9 lines in 2 files changed: 6 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From pliden at openjdk.java.net Thu Oct 29 11:10:48 2020 From: pliden at openjdk.java.net (Per Liden) Date: Thu, 29 Oct 2020 11:10:48 GMT Subject: RFR: 8254877: GCLogPrecious::_lock rank constrains what locks you are allowed to have when crashing [v3] In-Reply-To: References: <_mcGIfKtXbuDVOTisGl5s38hnMVVwAWIuaqG3mwlKj4=.8fe7adf8-cae8-4ccb-8d98-91ea7d308243@github.com> Message-ID: On Thu, 29 Oct 2020 10:03:30 GMT, Stefan Karlsson wrote: >> In the latest update I added two new helper classes: `SemaphoreLock` and `SemaphoreLocker`. I think this makes the code nicer. Since those classes are more broadly used, I'll go ahead and split them out into a separate PR. > > Forked off the SemaphoreLock part into https://github.com/openjdk/jdk/pull/927 Updates look good!
------------- PR: https://git.openjdk.java.net/jdk/pull/903 From pliden at openjdk.java.net Thu Oct 29 11:13:42 2020 From: pliden at openjdk.java.net (Per Liden) Date: Thu, 29 Oct 2020 11:13:42 GMT Subject: RFR: 8255582: Introduce SemaphoreLock and SemaphoreLocker In-Reply-To: <_JWavDFnX9LXuj9uXsd6RDwRDrsoVb-tG-CVQ6fMA90=.5d37cc4c-b6cf-4dc8-ac47-0372cf368ead@github.com> References: <_JWavDFnX9LXuj9uXsd6RDwRDrsoVb-tG-CVQ6fMA90=.5d37cc4c-b6cf-4dc8-ac47-0372cf368ead@github.com> Message-ID: On Thu, 29 Oct 2020 10:01:22 GMT, Stefan Karlsson wrote: > Semaphores can be used as low-level locks, but the readability of the code using them could be better. I propose that we introduce two new classes: > > SemaphoreLock - which provides the operations lock, unlock, try_lock. > > SemaphoreLocker - Equivalent to MutexLocker. For low-level locks, an alternative could be to use PlatformMutex. ------------- PR: https://git.openjdk.java.net/jdk/pull/927 From rkennke at openjdk.java.net Thu Oct 29 11:17:00 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 29 Oct 2020 11:17:00 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v22] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). Workloads that make heavy use of such weak references will therefore potentially cause significant GC pauses. > > There are 3 main items that contribute to pause time linear to number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists.
> - We need to mark through subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires to extend the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. > 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable. 
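The two-kind marking in item 1 can be illustrated with a small standalone sketch. This is not Shenandoah's code: `Mark`, `Kind`, `Obj`, and `Marker` are hypothetical names, the "heap" is just a vector of objects addressed by index, and everything runs single-threaded. It only shows the rules stated above: weak references are discovered without marking through the referent, a FinalReference marks its referent's subgraph as finalizably reachable, and a later strong visit upgrades finalizably reachable objects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; a toy model, not HotSpot code.
enum class Mark { None, Finalizable, Strong };   // ordered weakest to strongest
enum class Kind { Plain, WeakRef, FinalRef };

struct Obj {
  Kind kind = Kind::Plain;
  std::vector<int> fields;   // indices of ordinary (strong) fields
  int referent = -1;         // only used by WeakRef / FinalRef
};

struct Marker {
  const std::vector<Obj>& heap;
  std::vector<Mark> marks;        // one mark state per object
  std::vector<int> discovered;    // reference objects found during marking

  explicit Marker(const std::vector<Obj>& h)
      : heap(h), marks(h.size(), Mark::None) {}

  // Marks everything reachable from 'root' with at least 'level'.
  void mark(int root, Mark level) {
    struct Task { int idx; Mark level; };
    std::vector<Task> stack;
    stack.push_back({root, level});
    while (!stack.empty()) {
      Task t = stack.back();
      stack.pop_back();
      assert(t.idx >= 0 && static_cast<std::size_t>(t.idx) < heap.size());
      // Already marked at this level or stronger: nothing to do.
      if (marks[t.idx] >= t.level) continue;
      bool first_visit = (marks[t.idx] == Mark::None);
      marks[t.idx] = t.level;   // first mark, or upgrade Finalizable -> Strong
      const Obj& o = heap[t.idx];
      // Any Reference object is put on the discovered list once.
      if (o.kind != Kind::Plain && first_visit) discovered.push_back(t.idx);
      // Ordinary fields inherit the current level.
      for (int f : o.fields) stack.push_back({f, t.level});
      // Only FinalReference marks through its referent, and only finalizably.
      if (o.kind == Kind::FinalRef && o.referent >= 0)
        stack.push_back({o.referent, Mark::Finalizable});
    }
  }
};
```

Calling `mark(root, Mark::Strong)` for each root leaves a WeakRef's referent unmarked unless something else reaches it, while a FinalRef's referent ends up `Finalizable` until a strong path to it is found.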
> > Testing: hotspot_gc_shenandoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Some more ShMarkTask cleanups ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/505/files - new: https://git.openjdk.java.net/jdk/pull/505/files/96f2a3ca..f2bf4edc Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=21 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=20-21 Stats: 31 lines in 1 file changed: 10 ins; 6 del; 15 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From rehn at openjdk.java.net Thu Oct 29 11:19:45 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 29 Oct 2020 11:19:45 GMT Subject: RFR: 8255582: Introduce SemaphoreLock and SemaphoreLocker In-Reply-To: References: <_JWavDFnX9LXuj9uXsd6RDwRDrsoVb-tG-CVQ6fMA90=.5d37cc4c-b6cf-4dc8-ac47-0372cf368ead@github.com> Message-ID: <8tzipodcXSFSV0-cLFk6JbptRFqq-M9dLZwtlysqr7k=.f3ae8624-bd62-4118-bed1-9386d44cd3f3@github.com> On Thu, 29 Oct 2020 11:11:10 GMT, Per Liden wrote: >> Semaphores can be used as low-level locks, but the readability of the code using them could be better. I propose that we introduce two new classes: >> >> SemaphoreLock - which provides the operations lock, unlock, try_lock. >> >> SemaphoreLocker - Equivalent to MutexLocker. > > For low-level locks, an alternative could be to use PlatformMutex. If you are using a semaphore exactly like a mutex, why are you not using a mutex? Since they are not part of lock ranks they are deadlock-prone.
The main use seems to be people not wanting to deal with lock ranks, but that only leads to deadlocks sooner or later. Another issue is: SemaphoreLock sl; sl.unlock(); sl.unlock(); This might even be correct code if you wished to signal twice. Thus, the whole: { SemaphoreLocker sl(&lock); // Not mutually exclusive } If you are not supposed to be able to unlock twice, then it works exactly like a Mutex and I think we instead should fix lock ranks (whatever that means). If you are supposed to be able to unlock twice, then it's not a lock, is it? :) E.g.: ThreadIdExclusiveAccess is one of the places where this SemaphoreLocker could be used, but there seem to be no issues using a Mutex instead (except choosing a rank). ------------- PR: https://git.openjdk.java.net/jdk/pull/927 From hoffmann at mountainminds.com Thu Oct 29 11:36:47 2020 From: hoffmann at mountainminds.com (Marc Hoffmann) Date: Thu, 29 Oct 2020 12:36:47 +0100 Subject: arm32 builds continue to fail for me after 8253540 and 8253901 In-Reply-To: References: <56ff08d5-a4e5-788a-1c29-02f76e8755d2@redhat.com> <17F91692-4F3D-4FAA-AB94-361B6C84F982@mountainminds.com> <111D12C0-BE9F-4B2F-BAB1-CB0445CAB219@mountainminds.com> <00F0209A-F00D-4B06-B696-7DBD8E9D34F6@mountainminds.com> Message-ID: Hi Boris, I can confirm that the build is green without fastdebug. Many thanks, -marc > On 29. Oct 2020, at 11:39, Boris Ulasevich wrote: > > Hi Marc, > > Oh, yes, fastdebug fails because I forgot to update the assertion > statement - it is a separate issue, and I must fix it. > But I am sure that the real issue is in another place. And it seems > that it was fixed: release builds work OK on my RPi since [4]. > > Boris > > On Thu, Oct 29, 2020 at 11:43 AM Marc Hoffmann > wrote: >> >> Hi Boris, >> >> thanks for coming back on this!
>> >> Current master is still red for me with the same failure (https://pici.beachhub.io/#/jdk): >> >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (/workspace/src/hotspot/share/asm/register.hpp:160), pid=15243, tid=15248 >> # assert(a != b && a != c && a != d && a != e && b != c && b != d && b != e && c != d && c != e && d != e) failed: registers must be different: a=0x00000002, b=0x00000003, c=0x00000000, d=0x00000000, e=0x0000000c >> # >> # JRE version: (16.0) (fastdebug build ) >> # Java VM: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace, mixed mode, g1 gc, linux-arm) >> # Problematic frame: >> # V [libjvm.so+0x751404] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63 >> >> Also I don't see how the last commit you mention is related to the issue. The problem seems to be inconsistent assertions in InterpreterMacroAssembler::unlock_object which was not touched since commit [2] in your list: >> >> https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/arm/interp_masm_arm.cpp#L1000 >> >> In line 990 it is asserted that Rlock == R0, but then in line 1000 the registers should be different: >> >> assert_different_registers(Robj, Rmark, Rlock, R0, Rtemp); >> >> >> Regards, >> -marc >> >> >> >> >> On 28. Oct 2020, at 23:39, Boris Ulasevich wrote: >> >> Hi Marc, >> >> Sorry for being unavailable for too long! >> >> My understanding of the case is that >> - my change [2] fixed the issue introduced in [1] >> - with the fix JVM build still crashed on RPi >> - the build issue on RPi was introduced by the change [3] and was >> fixed by the change [4] >> >> Can I ask you to check that the current repo build works Ok for you?
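For reference, the assertion output quoted above reports c=0x00000000 and d=0x00000000, which is what one would expect if Rlock and R0 are the same register and both appear in the argument list. A minimal standalone sketch of such a pairwise-distinct check follows; `Reg` and `all_different` are hypothetical stand-ins rather than HotSpot's `assert_different_registers`, and the encodings used in the usage note are the ones printed in the crash message.

```cpp
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical stand-in for a machine register: only its encoding matters here.
struct Reg { int encoding; };

// Returns true when no two registers in the list share an encoding.
inline bool all_different(std::initializer_list<Reg> regs) {
  std::vector<Reg> v(regs);
  for (std::size_t i = 0; i < v.size(); ++i) {
    for (std::size_t j = i + 1; j < v.size(); ++j) {
      if (v[i].encoding == v[j].encoding) return false;
    }
  }
  return true;
}
```

With Robj=2, Rmark=3, R0=0, Rtemp=12 and Rlock aliased to R0, `all_different({Robj, Rmark, Rlock, R0, Rtemp})` is false, while dropping either Rlock or R0 from the list makes it true.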
>> >> regards, >> Boris >> >> >> [1] >> commit 77a0f3999afa322b64643afd4a161164440af975 >> Author: Coleen Phillimore >> Date: Mon Sep 28 15:49:02 2020 +0000 >> 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function >> [2] >> commit fd0cb98ed03c6214c02ccd3503c1e6d77065a428 >> Author: Boris Ulasevich >> Date: Thu Oct 8 06:52:27 2020 +0000 >> 8253901: ARM32: SIGSEGV during monitorexit due to incorrect register >> use (after JDK-8253540) >> >> [3] >> commit ea27a54bf0ff526effb47f9daaec51ced2d2bb71 >> Author: Calvin Cheung >> Date: Mon Oct 5 16:52:00 2020 +0000 >> 8224509: Incorrect alignment in CDS related allocation code on 32-bit platforms >> [4] >> commit 5145bed0282575a580cf3ebc343038c1dc8ddb8d >> Author: Ioi Lam >> Date: Fri Oct 16 05:14:46 2020 +0000 >> 8254125: Assertion in cppVtables.cpp during builds on 32bit Windows >> >> >> On Fri, Oct 16, 2020 at 10:19 AM Marc Hoffmann >> wrote: >> >> >> Dear Boris, >> >> if it helps to fix the 32 arm build: >> >> 1) I gave you write access to my JDK fork at GitHub: https://github.com/marchof/jdk >> 2) You can (force) push to the branch called "build" >> 3) The build results are here: https://pici.beachhub.io/#/jdk-marchof >> >> The repo is polled every 30', the build takes another 30' until it fails. >> >> Regards, >> -marc >> >> >> On 13. Oct 2020, at 08:48, Boris Ulasevich wrote: >> >> Hi Marc, >> >> I created JDK-8254661 for the issue. I would love to fix it, but still >> can't reproduce the crash (even on Raspberry Pi). >> What configuration do you have?
The following sequence works Ok for me: >> pi@raspberrypi $ git clone https://github.com/openjdk/jdk >> pi@raspberrypi $ cd jdk >> pi@raspberrypi $ bash configure --with-boot-jdk=/home/pi/jdk-15 >> pi@raspberrypi $ make >> >> Your debug build shows that I did not fix the >> assert_different_registers in >> InterpreterMacroAssembler::unlock_object() >> body (and the function comment by the way!), though with eyeballing I >> do not see what is wrong for Rlock=R0: >> https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/arm/interp_masm_arm.cpp#L1000 >> >> regards, >> Boris >> >> On Mon, Oct 12, 2020 at 11:34 PM Marc Hoffmann >> wrote: >> >> >> Hi Aleksey, hi Boris, >> >> for me the crash is always reproducible: Every single build after >> >> 77a0f3999afa322b64643afd4a161164440af975 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function >> >> fails on arm32 (build on ubuntu in docker on a raspberry pi 4). Before this commit I haven't encountered any failures. >> >> Here is the hs_err file with --enable-debug (reproduced with current master c7f00640627eab38b77d23d07876cf0247fa18f3). >> >> Cheers, >> -marc >> >> >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (/workspace/src/hotspot/share/asm/register.hpp:160), pid=14700, tid=14705 >> # assert(a != b && a != c && a != d && a != e && b != c && b != d && b != e && c != d && c != e && d != e) failed: registers must be different: a=0x00000002, b=0x00000003, c=0x00000000, d=0x00000000, e=0x0000000c >> # >> # JRE version: (16.0) (fastdebug build ) >> # Java VM: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace, mixed mode, g1 gc, linux-arm) >> # Problematic frame: >> # V [libjvm.so+0x7571fc] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63 >> # >> # Core dump will be written.
Default location: /workspace/make/core >> # >> # >> >> --------------- S U M M A R Y ------------ >> >> Command Line: -Xms64M -Xmx768M --add-exports=java.base/jdk.internal.module=ALL-UNNAMED build.tools.jigsaw.AddPackagesAttribute /workspace/build/linux-arm-server-fastdebug/jdk >> >> Host: 20431585315d, rev 3 (v7l), 4 cores, 3G, Ubuntu 18.04.3 LTS >> Time: Mon Oct 12 20:22:15 2020 UTC elapsed time: 0.144243 seconds (0d 0h 0m 0s) >> >> --------------- T H R E A D --------------- >> >> Current thread (0xb5b16460): JavaThread "Unknown thread" [_thread_in_vm, id=14705, stack(0xb5c6e000,0xb5cbe000)] >> >> Stack: [0xb5c6e000,0xb5cbe000], sp=0xb5cbc2d0, free space=312k >> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x7571fc] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63 >> >> Registers: >> r0 = 0x00000003 >> r1 = 0x000000a0 >> r2 = 0x00000002 >> r3 = 0x00000000 >> r4 = 0xb5b168b0 >> r5 = 0x0000000c >> r6 = 0x00000000 >> r7 = 0xb5cbc2e8 >> r8 = 0xb6db1fa8 >> r9 = 0xb5cbc760 >> r10 = 0xe3520000 >> fp = 0xb6db1fa8 >> r12 = 0xb6ff8000 >> sp = 0xb5cbc2d0 >> lr = 0x00000058 >> pc = 0xb64961fc >> cpsr = 0x200f0030 >> >> Top of Stack: (sp=0xb5cbc2d0) >> 0xb5cbc2d0: 00000002 00000003 00000000 00000000 >> 0xb5cbc2e0: 0000000c 00000048 0000006e 00000000 >> 0xb5cbc2f0: 00000000 00000000 00000000 0000007c >> 0xb5cbc300: 00000000 00000077 b5cbc378 00000000 >> 0xb5cbc310: b5cbc380 0000000f b6db1fa8 b5cbc340 >> 0xb5cbc320: b5b168b0 b6db1fa8 b5cbc4c4 b6008a2b >> 0xb5cbc330: b5cbc454 b5cbc348 0000000f b61971cf >> 0xb5cbc340: b5cbc380 b5cbc3b0 b5b168b0 b5cbc388 >> >> Instructions: (pc=0xb64961fc) >> 0xb64960fc: 440be9c7 a034f8c7 f5e06139 68ebf7ed >> 0xb649610c: f040689a 46184164 1180f441 68996011 >> 0xb649611c: f5e03104 4b12f781 46284622 1003f858 >> 0xb649612c: f998f2e4 f64f68eb f2ce7210 6899122f >> 0xb649613c: 600a4618 31046899 f76ef5e0 46284631 >> 0xb649614c: f4e6f69f f5e04630 
f507f79b 46bd7786 >> 0xb649615c: 8ff0e8bd 0091bfd0 00006ee4 000059d0 >> 0xb649616c: 00007d1c 4a084b07 b480447b 589baf00 >> 0xb649617c: b91b781b f85d46bd 47707b04 f85d46bd >> 0xb649618c: e7197b04 0091be30 00006a24 bf182900 >> 0xb649619c: f1a1290c e92d0202 b0f34ff0 2301bf08 >> 0xb64961ac: bf18af06 f8df2300 2a018268 461abf8c >> 0xb64961bc: 0201f043 f04f460e 44f833ff 30d0f8c7 >> 0xb64961cc: f8c74604 23003140 333de9c7 30fcf887 >> 0xb64961dc: 3359e9c7 316cf887 4a8eb1da 0e58f04f >> 0xb64961ec: 250c2003 1002f858 f8d12202 21a0c000 >> 0xb64961fc: e000f88c 2000e9cd 6302e9cd 4b874a86 >> 0xb649620c: 447a4887 9504447b f58f4478 f00cfaa3 >> 0xb649621c: 68e2f2d5 4340f44f 33a0f2ce 0a04f04f >> 0xb649622c: 7980f04f 0b00f04f 46106891 600b2501 >> 0xb649623c: 44516891 f6f0f5e0 4a7b497a 3001f858 >> 0xb649624c: 0140f107 a048f8c7 4608643e f8c7681b >> 0xb649625c: 60fb904c f858647b f8c72002 f8c7b054 >> 0xb649626c: 60bab058 e9c73208 653abb19 66fd607a >> 0xb649627c: f732f5e0 689a68e3 4164f040 f4414618 >> 0xb649628c: 60111181 44516899 f6c6f5e0 0110f107 >> 0xb649629c: 68fb687a f8c74608 623aa018 6304e9c7 >> 0xb64962ac: 901cf8c7 bb09e9c7 bb0de9c7 f5e063fd >> 0xb64962bc: 68e3f713 f0406899 46184264 4240f442 >> 0xb64962cc: 6899600a f1074451 f5e00ad0 4b57f6a5 >> 0xb64962dc: 3003f858 2b00781b 8093f040 f04f68bb >> 0xb64962ec: 68f97c80 33082500 c0acf8c7 0b01f04f >> >> >> >> --------------- P R O C E S S --------------- >> >> uid : 0 euid : 0 gid : 0 egid : 0 >> >> umask: 0022 (----w--w-) >> >> Threads class SMR info: >> _java_thread_list=0xb6e56078, length=0, elements={ >> } >> _java_thread_list_alloc_cnt=1, _java_thread_list_free_cnt=0, _java_thread_list_max=0, _nested_thread_list_max=0 >> _delete_lock_wait_cnt=0, _delete_lock_wait_max=0 >> _to_delete_list_cnt=0, _to_delete_list_max=0 >> >> Java Threads: ( => current thread ) >> >> Other Threads: >> 0xb5b73188 GCTaskThread "GC Thread#0" [stack: 0x81d00000,0x81d80000] [id=14706] >> 0xb5b77dc0 ConcurrentGCThread "G1 Main Marker" [stack: 
0x81c7e000,0x81cfe000] [id=14707] >> 0xb5b790c0 ConcurrentGCThread "G1 Conc#0" [stack: 0x81a80000,0x81b00000] [id=14708] >> 0xb5bde230 ConcurrentGCThread "G1 Refine#0" [stack: 0x81780000,0x81800000] [id=14709] >> 0xb5bdf488 ConcurrentGCThread "G1 Service" [stack: 0x81580000,0x81600000] [id=14710] >> >> =>0xb5b16460 (exited) JavaThread "Unknown thread" [_thread_in_vm, id=14705, stack(0xb5c6e000,0xb5cbe000)] >> >> Threads with active compile tasks: >> >> VM state: not at safepoint (not fully initialized) >> >> VM Mutex/Monitor currently owned by a thread: None >> >> GC Precious Log: >> CPUs: 4 total, 4 available >> Memory: 3827M >> Large Page Support: Disabled >> NUMA Support: Disabled >> Compressed Oops: Disabled >> Heap Region Size: 1M >> Heap Min Capacity: 64M >> Heap Initial Capacity: 64M >> Heap Max Capacity: 768M >> Pre-touch: Disabled >> Parallel Workers: 4 >> Concurrent Workers: 1 >> Concurrent Refinement Workers: 4 >> Periodic GC: Disabled >> >> Heap: >> garbage-first heap total 65536K, used 0K [0x83a00000, 0xb3a00000) >> region size 1024K, 1 young (1024K), 0 survivors (0K) >> Metaspace used 944K, capacity 2200K, committed 2200K, reserved 4400K >> >> Heap Regions: E=young(eden), S=young(survivor), O=old, HS=humongous(starts), HC=humongous(continues), CS=collection set, F=free, OA=open archive, CA=closed archive, TAMS=top-at-mark-start (previous, next) >> | 0|0x83a00000, 0x83a00000, 0x83b00000| 0%| F| |TAMS 0x83a00000, 0x83a00000| Untracked >> | 1|0x83b00000, 0x83b00000, 0x83c00000| 0%| F| |TAMS 0x83b00000, 0x83b00000| Untracked >> | 2|0x83c00000, 0x83c00000, 0x83d00000| 0%| F| |TAMS 0x83c00000, 0x83c00000| Untracked >> | 3|0x83d00000, 0x83d00000, 0x83e00000| 0%| F| |TAMS 0x83d00000, 0x83d00000| Untracked >> | 4|0x83e00000, 0x83e00000, 0x83f00000| 0%| F| |TAMS 0x83e00000, 0x83e00000| Untracked >> | 5|0x83f00000, 0x83f00000, 0x84000000| 0%| F| |TAMS 0x83f00000, 0x83f00000| Untracked >> | 6|0x84000000, 0x84000000, 0x84100000| 0%| F| |TAMS 0x84000000, 
0x84000000| Untracked >> | 7|0x84100000, 0x84100000, 0x84200000| 0%| F| |TAMS 0x84100000, 0x84100000| Untracked >> | 8|0x84200000, 0x84200000, 0x84300000| 0%| F| |TAMS 0x84200000, 0x84200000| Untracked >> | 9|0x84300000, 0x84300000, 0x84400000| 0%| F| |TAMS 0x84300000, 0x84300000| Untracked >> | 10|0x84400000, 0x84400000, 0x84500000| 0%| F| |TAMS 0x84400000, 0x84400000| Untracked >> | 11|0x84500000, 0x84500000, 0x84600000| 0%| F| |TAMS 0x84500000, 0x84500000| Untracked >> | 12|0x84600000, 0x84600000, 0x84700000| 0%| F| |TAMS 0x84600000, 0x84600000| Untracked >> | 13|0x84700000, 0x84700000, 0x84800000| 0%| F| |TAMS 0x84700000, 0x84700000| Untracked >> | 14|0x84800000, 0x84800000, 0x84900000| 0%| F| |TAMS 0x84800000, 0x84800000| Untracked >> | 15|0x84900000, 0x84900000, 0x84a00000| 0%| F| |TAMS 0x84900000, 0x84900000| Untracked >> | 16|0x84a00000, 0x84a00000, 0x84b00000| 0%| F| |TAMS 0x84a00000, 0x84a00000| Untracked >> | 17|0x84b00000, 0x84b00000, 0x84c00000| 0%| F| |TAMS 0x84b00000, 0x84b00000| Untracked >> | 18|0x84c00000, 0x84c00000, 0x84d00000| 0%| F| |TAMS 0x84c00000, 0x84c00000| Untracked >> | 19|0x84d00000, 0x84d00000, 0x84e00000| 0%| F| |TAMS 0x84d00000, 0x84d00000| Untracked >> | 20|0x84e00000, 0x84e00000, 0x84f00000| 0%| F| |TAMS 0x84e00000, 0x84e00000| Untracked >> | 21|0x84f00000, 0x84f00000, 0x85000000| 0%| F| |TAMS 0x84f00000, 0x84f00000| Untracked >> | 22|0x85000000, 0x85000000, 0x85100000| 0%| F| |TAMS 0x85000000, 0x85000000| Untracked >> | 23|0x85100000, 0x85100000, 0x85200000| 0%| F| |TAMS 0x85100000, 0x85100000| Untracked >> | 24|0x85200000, 0x85200000, 0x85300000| 0%| F| |TAMS 0x85200000, 0x85200000| Untracked >> | 25|0x85300000, 0x85300000, 0x85400000| 0%| F| |TAMS 0x85300000, 0x85300000| Untracked >> | 26|0x85400000, 0x85400000, 0x85500000| 0%| F| |TAMS 0x85400000, 0x85400000| Untracked >> | 27|0x85500000, 0x85500000, 0x85600000| 0%| F| |TAMS 0x85500000, 0x85500000| Untracked >> | 28|0x85600000, 0x85600000, 0x85700000| 0%| F| |TAMS 0x85600000, 
0x85600000| Untracked >> | 29|0x85700000, 0x85700000, 0x85800000| 0%| F| |TAMS 0x85700000, 0x85700000| Untracked >> | 30|0x85800000, 0x85800000, 0x85900000| 0%| F| |TAMS 0x85800000, 0x85800000| Untracked >> | 31|0x85900000, 0x85900000, 0x85a00000| 0%| F| |TAMS 0x85900000, 0x85900000| Untracked >> | 32|0x85a00000, 0x85a00000, 0x85b00000| 0%| F| |TAMS 0x85a00000, 0x85a00000| Untracked >> | 33|0x85b00000, 0x85b00000, 0x85c00000| 0%| F| |TAMS 0x85b00000, 0x85b00000| Untracked >> | 34|0x85c00000, 0x85c00000, 0x85d00000| 0%| F| |TAMS 0x85c00000, 0x85c00000| Untracked >> | 35|0x85d00000, 0x85d00000, 0x85e00000| 0%| F| |TAMS 0x85d00000, 0x85d00000| Untracked >> | 36|0x85e00000, 0x85e00000, 0x85f00000| 0%| F| |TAMS 0x85e00000, 0x85e00000| Untracked >> | 37|0x85f00000, 0x85f00000, 0x86000000| 0%| F| |TAMS 0x85f00000, 0x85f00000| Untracked >> | 38|0x86000000, 0x86000000, 0x86100000| 0%| F| |TAMS 0x86000000, 0x86000000| Untracked >> | 39|0x86100000, 0x86100000, 0x86200000| 0%| F| |TAMS 0x86100000, 0x86100000| Untracked >> | 40|0x86200000, 0x86200000, 0x86300000| 0%| F| |TAMS 0x86200000, 0x86200000| Untracked >> | 41|0x86300000, 0x86300000, 0x86400000| 0%| F| |TAMS 0x86300000, 0x86300000| Untracked >> | 42|0x86400000, 0x86400000, 0x86500000| 0%| F| |TAMS 0x86400000, 0x86400000| Untracked >> | 43|0x86500000, 0x86500000, 0x86600000| 0%| F| |TAMS 0x86500000, 0x86500000| Untracked >> | 44|0x86600000, 0x86600000, 0x86700000| 0%| F| |TAMS 0x86600000, 0x86600000| Untracked >> | 45|0x86700000, 0x86700000, 0x86800000| 0%| F| |TAMS 0x86700000, 0x86700000| Untracked >> | 46|0x86800000, 0x86800000, 0x86900000| 0%| F| |TAMS 0x86800000, 0x86800000| Untracked >> | 47|0x86900000, 0x86900000, 0x86a00000| 0%| F| |TAMS 0x86900000, 0x86900000| Untracked >> | 48|0x86a00000, 0x86a00000, 0x86b00000| 0%| F| |TAMS 0x86a00000, 0x86a00000| Untracked >> | 49|0x86b00000, 0x86b00000, 0x86c00000| 0%| F| |TAMS 0x86b00000, 0x86b00000| Untracked >> | 50|0x86c00000, 0x86c00000, 0x86d00000| 0%| F| |TAMS 
0x86c00000, 0x86c00000| Untracked >> | 51|0x86d00000, 0x86d00000, 0x86e00000| 0%| F| |TAMS 0x86d00000, 0x86d00000| Untracked >> | 52|0x86e00000, 0x86e00000, 0x86f00000| 0%| F| |TAMS 0x86e00000, 0x86e00000| Untracked >> | 53|0x86f00000, 0x86f00000, 0x87000000| 0%| F| |TAMS 0x86f00000, 0x86f00000| Untracked >> | 54|0x87000000, 0x87000000, 0x87100000| 0%| F| |TAMS 0x87000000, 0x87000000| Untracked >> | 55|0x87100000, 0x87100000, 0x87200000| 0%| F| |TAMS 0x87100000, 0x87100000| Untracked >> | 56|0x87200000, 0x87200000, 0x87300000| 0%| F| |TAMS 0x87200000, 0x87200000| Untracked >> | 57|0x87300000, 0x87300000, 0x87400000| 0%| F| |TAMS 0x87300000, 0x87300000| Untracked >> | 58|0x87400000, 0x87400000, 0x87500000| 0%| F| |TAMS 0x87400000, 0x87400000| Untracked >> | 59|0x87500000, 0x87500000, 0x87600000| 0%| F| |TAMS 0x87500000, 0x87500000| Untracked >> | 60|0x87600000, 0x87600000, 0x87700000| 0%| F| |TAMS 0x87600000, 0x87600000| Untracked >> | 61|0x87700000, 0x87700000, 0x87800000| 0%| F| |TAMS 0x87700000, 0x87700000| Untracked >> | 62|0x87800000, 0x87800000, 0x87900000| 0%| F| |TAMS 0x87800000, 0x87800000| Untracked >> | 63|0x87900000, 0x87942908, 0x87a00000| 26%| E| |TAMS 0x87900000, 0x87900000| Complete >> >> Card table byte_map: [0x83700000,0x83880000] _byte_map_base: 0x832e3000 >> >> Marking Bits (Prev, Next): (CMBitMap*) 0xb5b74324, (CMBitMap*) 0xb5b74344 >> Prev Bits: [0x82980000, 0x83580000) >> Next Bits: [0x81d80000, 0x82980000) >> >> GC Heap History (0 events): >> No events >> >> Deoptimization events (0 events): >> No events >> >> Classes unloaded (0 events): >> No events >> >> Classes redefined (0 events): >> No events >> >> Internal exceptions (0 events): >> No events >> >> Events (20 events): >> Event: 0.113 loading class java/lang/Character >> Event: 0.114 loading class java/lang/Character done >> Event: 0.114 loading class java/lang/Float >> Event: 0.115 loading class java/lang/Number >> Event: 0.115 loading class java/lang/Number done >> Event: 0.115 
loading class java/lang/Float done >> Event: 0.115 loading class java/lang/Double >> Event: 0.116 loading class java/lang/Double done >> Event: 0.116 loading class java/lang/Byte >> Event: 0.116 loading class java/lang/Byte done >> Event: 0.116 loading class java/lang/Short >> Event: 0.117 loading class java/lang/Short done >> Event: 0.117 loading class java/lang/Integer >> Event: 0.118 loading class java/lang/Integer done >> Event: 0.118 loading class java/lang/Long >> Event: 0.119 loading class java/lang/Long done >> Event: 0.119 loading class java/util/Iterator >> Event: 0.119 loading class java/util/Iterator done >> Event: 0.119 loading class java/lang/reflect/RecordComponent >> Event: 0.119 loading class java/lang/reflect/RecordComponent done >> >> >> Dynamic libraries: >> 00410000-00411000 r-xp 00000000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java >> 00420000-00421000 r--p 00000000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java >> 00421000-00422000 rw-p 00001000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java >> 019b6000-019d7000 rw-p 00000000 00:00 0 [heap] >> 809c9000-80e00000 rw-p 00000000 00:00 0 >> 80e00000-80e8e000 rw-p 00000000 00:00 0 >> 80e8e000-80f00000 ---p 00000000 00:00 0 >> 80fb4000-811da000 rw-p 00000000 00:00 0 >> 811da000-81400000 ---p 00000000 00:00 0 >> 81400000-81421000 rw-p 00000000 00:00 0 >> 81421000-81500000 ---p 00000000 00:00 0 >> 8157e000-8157f000 ---p 00000000 00:00 0 >> 8157f000-81600000 rw-p 00000000 00:00 0 >> 81600000-81621000 rw-p 00000000 00:00 0 >> 81621000-81700000 ---p 00000000 00:00 0 >> 8177e000-8177f000 ---p 00000000 00:00 0 >> 8177f000-81800000 rw-p 00000000 00:00 0 >> 81800000-81821000 rw-p 00000000 00:00 0 >> 81821000-81900000 ---p 00000000 00:00 0 >> 81900000-81921000 rw-p 00000000 00:00 0 >> 81921000-81a00000 ---p 00000000 00:00 0 >> 81a7e000-81a7f000 ---p 00000000 00:00 0 >> 81a7f000-81b00000 rw-p 00000000 00:00 0 >> 81b00000-81b21000 rw-p 
00000000 00:00 0 >> 81b21000-81c00000 ---p 00000000 00:00 0 >> 81c21000-81c7c000 rw-p 00000000 00:00 0 >> 81c7c000-81c7d000 ---p 00000000 00:00 0 >> 81c7d000-81cfe000 rw-p 00000000 00:00 0 >> 81cfe000-81cff000 ---p 00000000 00:00 0 >> 81cff000-81e80000 rw-p 00000000 00:00 0 >> 81e80000-82980000 ---p 00000000 00:00 0 >> 82980000-82a80000 rw-p 00000000 00:00 0 >> 82a80000-83580000 ---p 00000000 00:00 0 >> 83580000-835a0000 rw-p 00000000 00:00 0 >> 835a0000-83700000 ---p 00000000 00:00 0 >> 83700000-83720000 rw-p 00000000 00:00 0 >> 83720000-83880000 ---p 00000000 00:00 0 >> 83880000-838a0000 rw-p 00000000 00:00 0 >> 838a0000-83a00000 ---p 00000000 00:00 0 >> 83a00000-87a00000 rw-p 00000000 00:00 0 >> 87a00000-b3a00000 ---p 00000000 00:00 0 >> b3a25000-b3a76000 rw-p 00000000 00:00 0 >> b3a76000-b3ab3000 ---p 00000000 00:00 0 >> b3ab3000-b3c33000 rwxp 00000000 00:00 0 >> b3c33000-b5ab3000 ---p 00000000 00:00 0 >> b5ab3000-b5ac8000 r-xp 00000000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >> b5ac8000-b5ad8000 ---p 00015000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >> b5ad8000-b5ad9000 r--p 00015000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >> b5ad9000-b5ada000 rw-p 00016000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so >> b5ada000-b5ae2000 rw-s 00000000 b3:02 2576900 /tmp/hsperfdata_root/14700 >> b5ae2000-b5ae9000 r-xp 00000000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >> b5ae9000-b5af8000 ---p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >> b5af8000-b5af9000 r--p 00006000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >> b5af9000-b5afa000 rw-p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so >> b5afa000-b5b00000 rw-p 00000000 00:00 0 >> b5b00000-b5c00000 rw-p 00000000 00:00 0 >> b5c00000-b5c0d000 r-xp 00000000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >> 
b5c0d000-b5c1c000 ---p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >> b5c1c000-b5c1d000 r--p 0000c000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >> b5c1d000-b5c1e000 rw-p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so >> b5c1e000-b5c20000 rw-p 00000000 00:00 0 >> b5c20000-b5c27000 r-xp 00000000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >> b5c27000-b5c36000 ---p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >> b5c36000-b5c37000 r--p 00006000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >> b5c37000-b5c38000 rw-p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so >> b5c38000-b5c3d000 r-xp 00000000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >> b5c3d000-b5c4c000 ---p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >> b5c4c000-b5c4d000 r--p 00004000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >> b5c4d000-b5c4e000 rw-p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so >> b5c4e000-b5c5d000 r-xp 00000000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >> b5c5d000-b5c6c000 ---p 0000f000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >> b5c6c000-b5c6d000 r--p 0000e000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >> b5c6d000-b5c6e000 rw-p 0000f000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so >> b5c6e000-b5c71000 ---p 00000000 00:00 0 >> b5c71000-b5cbe000 rw-p 00000000 00:00 0 >> b5cbe000-b5d2d000 r-xp 00000000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >> b5d2d000-b5d3d000 ---p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >> b5d3d000-b5d3e000 r--p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >> b5d3e000-b5d3f000 rw-p 00070000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so >> b5d3f000-b6d56000 r-xp 00000000 b3:02 144078 
/workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >> b6d56000-b6d65000 ---p 01017000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >> b6d65000-b6dba000 r--p 01016000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >> b6dba000-b6dd2000 rw-p 0106b000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so >> b6dd2000-b6e5e000 rw-p 00000000 00:00 0 >> b6e5e000-b6e6f000 r-xp 00000000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >> b6e6f000-b6e7f000 ---p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >> b6e7f000-b6e80000 r--p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >> b6e80000-b6e81000 rw-p 00012000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so >> b6e81000-b6e83000 rw-p 00000000 00:00 0 >> b6e83000-b6e85000 r-xp 00000000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >> b6e85000-b6e94000 ---p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >> b6e94000-b6e95000 r--p 00001000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >> b6e95000-b6e96000 rw-p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so >> b6e96000-b6eaf000 r-xp 00000000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >> b6eaf000-b6ebe000 ---p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >> b6ebe000-b6ebf000 r--p 00018000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >> b6ebf000-b6ec0000 rw-p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 >> b6ec0000-b6fa2000 r-xp 00000000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >> b6fa2000-b6fb2000 ---p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >> b6fb2000-b6fb4000 r--p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >> b6fb4000-b6fb5000 rw-p 000e4000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so >> b6fb5000-b6fb8000 rw-p 00000000 00:00 0 >> b6fb8000-b6fc2000 
r-xp 00000000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >> b6fc2000-b6fd1000 ---p 0000a000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >> b6fd1000-b6fd2000 r--p 00009000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >> b6fd2000-b6fd3000 rw-p 0000a000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so >> b6fd3000-b6feb000 r-xp 00000000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so >> b6ff2000-b6ff4000 rw-p 00000000 00:00 0 >> b6ff6000-b6ff7000 ---p 00000000 00:00 0 >> b6ff7000-b6ff8000 r--p 00000000 00:00 0 >> b6ff8000-b6ff9000 rwxp 00000000 00:00 0 >> b6ff9000-b6ffb000 rw-p 00000000 00:00 0 >> b6ffb000-b6ffc000 r--p 00018000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so >> b6ffc000-b6ffd000 rw-p 00019000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so >> bed97000-bedb8000 rw-p 00000000 00:00 0 [stack] >> beeb0000-beeb1000 r-xp 00000000 00:00 0 [sigpage] >> beeb1000-beeb2000 r--p 00000000 00:00 0 [vvar] >> beeb2000-beeb3000 r-xp 00000000 00:00 0 [vdso] >> ffff0000-ffff1000 r-xp 00000000 00:00 0 [vectors] >> >> >> VM Arguments: >> jvm_args: -Xms64M -Xmx768M --add-exports=java.base/jdk.internal.module=ALL-UNNAMED >> java_command: build.tools.jigsaw.AddPackagesAttribute /workspace/build/linux-arm-server-fastdebug/jdk >> java_class_path (initial): /workspace/build/linux-arm-server-fastdebug/buildtools/tools_jigsaw_classes >> Launcher Type: SUN_STANDARD >> >> [Global flags] >> uint ConcGCThreads = 1 {product} {ergonomic} Number of threads concurrent gc will use >> uint G1ConcRefinementThreads = 4 {product} {ergonomic} The number of parallel rem set update threads. Will be set ergonomically by default. >> size_t G1HeapRegionSize = 1048576 {product} {ergonomic} Size of the G1 regions. 
>> uintx GCDrainStackTargetSize = 64 {product} {ergonomic} Number of entries we will try to leave on the stack during parallel gc >> size_t InitialHeapSize = 67108864 {product} {command line} Initial heap size (in bytes); zero means use ergonomics >> size_t MarkStackSize = 32768 {product} {ergonomic} Size of marking stack >> size_t MaxHeapSize = 805306368 {product} {command line} Maximum heap size (in bytes) >> size_t MaxNewSize = 482344960 {product} {ergonomic} Maximum new generation size (in bytes), max_uintx means set ergonomically >> size_t MinHeapDeltaBytes = 1048576 {product} {ergonomic} The minimum change in heap space due to GC (in bytes) >> size_t MinHeapSize = 67108864 {product} {command line} Minimum heap size (in bytes); zero means use ergonomics >> uintx NonProfiledCodeHeapSize = 0 {pd product} {ergonomic} Size of code heap with non-profiled methods (in bytes) >> uintx ProfiledCodeHeapSize = 0 {pd product} {ergonomic} Size of code heap with profiled methods (in bytes) >> size_t SoftMaxHeapSize = 805306368 {manageable} {ergonomic} Soft limit for maximum heap size (in bytes) >> bool UseG1GC = true {product} {ergonomic} Use the Garbage-First garbage collector >> >> Logging: >> Log output configuration: >> #0: stdout all=warning uptime,level,tags >> #1: stderr all=off uptime,level,tags >> >> Environment Variables: >> JAVA_HOME=/opt/java/openjdk >> PATH=/opt/java/openjdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin >> LC_ALL=C >> >> Signal Handlers: >> SIGSEGV: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGBUS: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGFPE: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGPIPE: [libjvm.so+0xc9aa9d], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGXFSZ: [libjvm.so+0xc9aa9d], 
sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGILL: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO >> SIGUSR2: [libjvm.so+0xc9ad95], sa_mask[0]=00000000000000000000000000000000, sa_flags=SA_RESTART|SA_SIGINFO >> SIGHUP: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >> SIGINT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >> SIGTERM: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >> SIGQUIT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none >> >> >> --------------- S Y S T E M --------------- >> >> OS: >> DISTRIB_ID=Ubuntu >> DISTRIB_RELEASE=18.04 >> DISTRIB_CODENAME=bionic >> DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS" >> uname: Linux 20431585315d 5.4.51-v7l+ #1333 SMP Mon Aug 10 16:51:40 BST 2020 armv7l >> OS uptime: 14 days 7:59 hours >> libc: glibc 2.27 NPTL 2.27 >> rlimit (soft/hard): STACK 8192k/infinity , CORE infinity/infinity , NPROC infinity/infinity , NOFILE 1048576/1048576 , AS infinity/infinity , CPU infinity/infinity , DATA infinity/infinity , FSIZE infinity/infinity , MEMLOCK 64k/64k >> load average: 3.37 3.26 3.09 >> >> /proc/meminfo: >> MemTotal: 3919812 kB >> MemFree: 1255688 kB >> MemAvailable: 3518740 kB >> Buffers: 134316 kB >> Cached: 2117828 kB >> SwapCached: 0 kB >> Active: 1266624 kB >> Inactive: 1167412 kB >> Active(anon): 110360 kB >> Inactive(anon): 80744 kB >> Active(file): 1156264 kB >> Inactive(file): 1086668 kB >> Unevictable: 16 kB >> Mlocked: 16 kB >> HighTotal: 3264512 kB >> HighFree: 1038848 kB >> LowTotal: 655300 kB >> LowFree: 216840 kB >> SwapTotal: 102396 kB >> SwapFree: 102396 kB >> Dirty: 24916 kB >> Writeback: 0 kB >> AnonPages: 181884 kB >> Mapped: 125864 kB >> Shmem: 16892 kB >> KReclaimable: 181816 kB >> Slab: 205164 kB >> SReclaimable: 181816 kB >> SUnreclaim: 23348 kB >> KernelStack: 2240 kB >> PageTables: 2684 kB >> NFS_Unstable: 0 kB >> Bounce: 0 kB >> 
WritebackTmp: 0 kB >> CommitLimit: 2062300 kB >> Committed_AS: 1125176 kB >> VmallocTotal: 245760 kB >> VmallocUsed: 5520 kB >> VmallocChunk: 0 kB >> Percpu: 512 kB >> CmaTotal: 262144 kB >> CmaFree: 171244 kB >> >> /sys/kernel/mm/transparent_hugepage/enabled: >> /sys/kernel/mm/transparent_hugepage/defrag (defrag/compaction efforts parameter): >> >> Process Memory: >> Virtual Size: 888828K (peak: 888828K) >> Resident Set Size: 25020K (peak: 25020K) (anon: 11372K, file: 13648K, shmem: 0K) >> Swapped out: 0K >> C-Heap outstanding allocations: 1636K >> >> /proc/sys/kernel/threads-max (system-wide limit on the number of threads): 57119 >> /proc/sys/vm/max_map_count (maximum number of memory map areas a process may have): 65530 >> /proc/sys/kernel/pid_max (system-wide limit on number of process identifiers): 32768 >> >> Steal ticks since vm start: 0 >> Steal ticks percentage since vm start: 0.000 >> >> CPU: total 4 (initial active 4) (ARMv7), vfp, vfp3-32, simd, mp_ext >> /proc/cpuinfo: >> processor : 0 >> model name : ARMv7 Processor rev 3 (v7l) >> BogoMIPS : 270.00 >> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >> CPU implementer : 0x41 >> CPU architecture: 7 >> CPU variant : 0x0 >> CPU part : 0xd08 >> CPU revision : 3 >> >> processor : 1 >> model name : ARMv7 Processor rev 3 (v7l) >> BogoMIPS : 270.00 >> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >> CPU implementer : 0x41 >> CPU architecture: 7 >> CPU variant : 0x0 >> CPU part : 0xd08 >> CPU revision : 3 >> >> processor : 2 >> model name : ARMv7 Processor rev 3 (v7l) >> BogoMIPS : 270.00 >> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 >> CPU implementer : 0x41 >> CPU architecture: 7 >> CPU variant : 0x0 >> CPU part : 0xd08 >> CPU revision : 3 >> >> processor : 3 >> model name : ARMv7 Processor rev 3 (v7l) >> BogoMIPS : 270.00 >> Features : half thumb 
fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
>> CPU implementer : 0x41
>> CPU architecture: 7
>> CPU variant : 0x0
>> CPU part : 0xd08
>> CPU revision : 3
>>
>> Hardware : BCM2711
>> Revision : c03111
>> Serial : 100000001c47254f
>> Model : Raspberry Pi 4 Model B Rev 1.1
>>
>> Online cpus: 0-3
>> Offline cpus:
>>
>> Memory: 4k page, physical 3919812k(1255688k free), swap 102396k(102396k free)
>>
>> vm_info: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace) for linux-arm JRE (16-internal+0-adhoc..workspace), built on Oct 12 2020 19:49:51 by "" with gcc 7.5.0
>>
>> END.
>>
>>
>>
>>
>> On 12. Oct 2020, at 20:24, Aleksey Shipilev wrote:
>>
>> Hi,
>>
>> On 10/12/20 8:12 PM, Marc Hoffmann wrote:
>>
>> Please find the build log and the hs_err file for commit fd0cb98ed03c6214c02ccd3503c1e6d77065a428 attached.
>>
>>
>> Please try to build with fastdebug (./configure --enable-debug), so that JVM asserts meaningfully somewhere?
>>
>> Is there any additional information I can provide to help getting these builds fixed again?
>>
>>
>> I am seeing plenty of weird x86_32 crashes since last week. Pretty sure some of them would manifest on ARM32 as well. This is why building with fastdebug is the next step: it maps out the bug symptoms.
>>
>> --
>> Thanks,
>> -Aleksey
>>
>>
>>
>>

From kbarrett at openjdk.java.net Thu Oct 29 11:44:44 2020
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Thu, 29 Oct 2020 11:44:44 GMT
Subject: RFR: 8255582: Introduce SemaphoreLock and SemaphoreLocker
In-Reply-To: <_JWavDFnX9LXuj9uXsd6RDwRDrsoVb-tG-CVQ6fMA90=.5d37cc4c-b6cf-4dc8-ac47-0372cf368ead@github.com>
References: <_JWavDFnX9LXuj9uXsd6RDwRDrsoVb-tG-CVQ6fMA90=.5d37cc4c-b6cf-4dc8-ac47-0372cf368ead@github.com>
Message-ID:

On Thu, 29 Oct 2020 10:01:22 GMT, Stefan Karlsson wrote:

> Semaphores can be used as low-level locks, but the readability of the code using them could be better.
I propose that we introduce two new classes:
>
> SemaphoreLock - which provides the operations lock, unlock, try_lock.
>
> SemaphoreLocker - Equivalent to MutexLocker.

I don't think we should be using semaphores as a substitute for fixing mutex rankings or using some rankless mutex (like PlatformMutex). So I'm not in favor of this change.

-------------

PR: https://git.openjdk.java.net/jdk/pull/927

From dholmes at openjdk.java.net Thu Oct 29 11:44:45 2020
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 29 Oct 2020 11:44:45 GMT
Subject: RFR: 8255582: Introduce SemaphoreLock and SemaphoreLocker
In-Reply-To:
References: <_JWavDFnX9LXuj9uXsd6RDwRDrsoVb-tG-CVQ6fMA90=.5d37cc4c-b6cf-4dc8-ac47-0372cf368ead@github.com>
Message-ID:

On Thu, 29 Oct 2020 11:39:32 GMT, Kim Barrett wrote:

>> Semaphores can be used as low-level locks, but the readability of the code using them could be better. I propose that we introduce two new classes:
>>
>> SemaphoreLock - which provides the operations lock, unlock, try_lock.
>>
>> SemaphoreLocker - Equivalent to MutexLocker.
>
> I don't think we should be using semaphores as a substitute for fixing mutex rankings or using some rankless mutex (like PlatformMutex). So I'm not in favor of this change.

As I recall one of the reasons Semaphore is used as a lock in places is because of initialization constraints related to Mutex and PlatformMutex. Then there is also the ability to use semaphores in signal handlers.

The key difference between "locks" and "a binary semaphore used like a lock" is that true locks have a notion of ownership and can generally only be unlocked by their owner. That is not enforced by SemaphoreLock making it somewhat not-a-lock. That said the name and API at least convey the intent. But it is critical to document why you need to use a semaphore as a lock instead of using a "real" lock like Mutex or PlatformMutex. If it is only to avoid rank issues then I agree with others that that is not sufficient justification.
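
[Editorial sketch, not part of the original message.] The ownership point above can be illustrated with a small standalone C++ example. Everything here is hypothetical: the `Sketch*` names are invented for illustration, the semaphore is a stand-in built on `std::mutex` and `std::condition_variable`, and none of it is the code proposed in the PR or HotSpot's own Semaphore class. It only assumes the three operations named in the proposal (lock, unlock, try_lock) plus a scoped locker.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical counting semaphore; a stand-in, not HotSpot's Semaphore.
class SketchSemaphore {
  std::mutex _m;
  std::condition_variable _cv;
  unsigned _count;
public:
  explicit SketchSemaphore(unsigned initial) : _count(initial) {}
  void signal() {
    { std::lock_guard<std::mutex> g(_m); ++_count; }
    _cv.notify_one();
  }
  void wait() {
    std::unique_lock<std::mutex> g(_m);
    _cv.wait(g, [this] { return _count > 0; });
    --_count;
  }
  bool trywait() {
    std::lock_guard<std::mutex> g(_m);
    if (_count == 0) return false;
    --_count;
    return true;
  }
};

// A binary semaphore (initial count 1) presented through a lock-like API.
class SketchSemaphoreLock {
  SketchSemaphore _sem;
public:
  SketchSemaphoreLock() : _sem(1) {}
  void lock()     { _sem.wait(); }
  // No owner is recorded, so any thread may call unlock(); this is the
  // "not-a-lock" property discussed in the message above.
  void unlock()   { _sem.signal(); }
  bool try_lock() { return _sem.trywait(); }
};

// Scoped holder, analogous in spirit to a MutexLocker-style RAII guard.
class SketchSemaphoreLocker {
  SketchSemaphoreLock& _lock;
public:
  explicit SketchSemaphoreLocker(SketchSemaphoreLock& l) : _lock(l) { _lock.lock(); }
  ~SketchSemaphoreLocker() { _lock.unlock(); }
  SketchSemaphoreLocker(const SketchSemaphoreLocker&) = delete;
  SketchSemaphoreLocker& operator=(const SketchSemaphoreLocker&) = delete;
};

// Usage sketch: while the guard is alive, try_lock() fails; after the
// scope ends the lock can be taken again.
inline bool sketch_usage() {
  SketchSemaphoreLock lock;
  {
    SketchSemaphoreLocker guard(lock);
    assert(!lock.try_lock());
  }
  bool reacquired = lock.try_lock();
  if (reacquired) lock.unlock();
  return reacquired;
}
```

Because unlock() only signals the semaphore, nothing stops a thread that never called lock() from releasing it. A mutex with owner tracking could assert against that, which is why the message asks for the reason behind each semaphore-as-lock use to be documented.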
-------------

PR: https://git.openjdk.java.net/jdk/pull/927

From boris.ulasevich at bell-sw.com Thu Oct 29 12:02:29 2020
From: boris.ulasevich at bell-sw.com (Boris Ulasevich)
Date: Thu, 29 Oct 2020 15:02:29 +0300
Subject: arm32 builds continue to fail for me after 8253540 and 8253901
In-Reply-To:
References: <56ff08d5-a4e5-788a-1c29-02f76e8755d2@redhat.com> <17F91692-4F3D-4FAA-AB94-361B6C84F982@mountainminds.com> <111D12C0-BE9F-4B2F-BAB1-CB0445CAB219@mountainminds.com> <00F0209A-F00D-4B06-B696-7DBD8E9D34F6@mountainminds.com>
Message-ID:

Hi Marc,

Good! Thank you.

regards,
Boris

On Thu, Oct 29, 2020 at 2:36 PM Marc Hoffmann wrote:
>
> Hi Boris,
>
> I can confirm that the build is green without fastdebug.
>
> Many thanks,
> -marc
>
>
> > On 29. Oct 2020, at 11:39, Boris Ulasevich wrote:
> >
> > Hi Marc,
> >
> > Oh, yes, fastdebug fails because I forgot to update the assertion
> > statement - it is a separate issue, and I must fix it.
> > But I am sure that the real issue is in another place. And it seems
> > that it was fixed: release builds work Ok on my RPI since [4].
> >
> > Boris
> >
> > On Thu, Oct 29, 2020 at 11:43 AM Marc Hoffmann
> > wrote:
> >>
> >> Hi Boris,
> >>
> >> thanks for coming back on this!
> >>
> >> Current master is still red for me with the same failure (https://pici.beachhub.io/#/jdk):
> >>
> >> # A fatal error has been detected by the Java Runtime Environment:
> >> #
> >> # Internal Error (/workspace/src/hotspot/share/asm/register.hpp:160), pid=15243, tid=15248
> >> # assert(a != b && a != c && a != d && a != e && b != c && b != d && b != e && c != d && c != e && d != e) failed: registers must be different: a=0x00000002, b=0x00000003, c=0x00000000, d=0x00000000, e=0x0000000c
> >> #
> >> # JRE version: (16.0) (fastdebug build )
> >> # Java VM: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace, mixed mode, g1 gc, linux-arm)
> >> # Problematic frame:
> >> # V [libjvm.so+0x751404] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63
> >>
> >> Also I don't see how the last commit you mention is related to the issue. The problem seems to be inconsistent assertions in InterpreterMacroAssembler::unlock_object which was not touched since commit [2] in your list:
> >>
> >> https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/arm/interp_masm_arm.cpp#L1000
> >>
> >> In line 990 it is asserted that Rlock == R0, but then in line 1000 the registers should be different:
> >>
> >> assert_different_registers(Robj, Rmark, Rlock, R0, Rtemp);
> >>
> >>
> >> Regards,
> >> -marc
> >>
> >>
> >>
> >>
> >> On 28. Oct 2020, at 23:39, Boris Ulasevich wrote:
> >>
> >> Hi Marc,
> >>
> >> Sorry for being unavailable for too long!
> >>
> >> My understanding of the case is that
> >> - my change [2] fixed the issue introduced in [1]
> >> - with the fix JVM build still crashed on RPi
> >> - the build issue on RPi was introduced by the change [3] and was
> >> fixed by the change [4]
> >>
> >> Can I ask you to check that the current repo build works Ok for you?
> >>
> >> regards,
> >> Boris
> >>
> >>
> >> [1]
> >> commit 77a0f3999afa322b64643afd4a161164440af975
> >> Author: Coleen Phillimore
> >> Date: Mon Sep 28 15:49:02 2020 +0000
> >> 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function
> >> [2]
> >> commit fd0cb98ed03c6214c02ccd3503c1e6d77065a428
> >> Author: Boris Ulasevich
> >> Date: Thu Oct 8 06:52:27 2020 +0000
> >> 8253901: ARM32: SIGSEGV during monitorexit due to incorrect register
> >> use (after JDK-8253540)
> >>
> >> [3]
> >> commit ea27a54bf0ff526effb47f9daaec51ced2d2bb71
> >> Author: Calvin Cheung
> >> Date: Mon Oct 5 16:52:00 2020 +0000
> >> 8224509: Incorrect alignment in CDS related allocation code on 32-bit platforms
> >> [4]
> >> commit 5145bed0282575a580cf3ebc343038c1dc8ddb8d
> >> Author: Ioi Lam
> >> Date: Fri Oct 16 05:14:46 2020 +0000
> >> 8254125: Assertion in cppVtables.cpp during builds on 32bit Windows
> >>
> >>
> >> On Fri, Oct 16, 2020 at 10:19 AM Marc Hoffmann
> >> wrote:
> >>
> >>
> >> Dear Boris,
> >>
> >> if it helps to fix the arm32 build:
> >>
> >> 1) I gave you write access to my JDK fork at GitHub: https://github.com/marchof/jdk
> >> 2) You can (force) push to the branch called "build"
> >> 3) The build results are here: https://pici.beachhub.io/#/jdk-marchof
> >>
> >> The repo is polled every 30 min, the build takes another 30 min until it fails.
> >>
> >> Regards,
> >> -marc
> >>
> >>
> >> On 13. Oct 2020, at 08:48, Boris Ulasevich wrote:
> >>
> >> Hi Marc,
> >>
> >> I created JDK-8254661 for the issue. I would love to fix it, but still
> >> can't reproduce the crash (even on Raspberry Pi).
> >> What configuration do you have?
The following sequence works Ok for me:
> >> pi@raspberrypi $ git clone https://github.com/openjdk/jdk
> >> pi@raspberrypi $ cd jdk
> >> pi@raspberrypi $ bash configure --with-boot-jdk=/home/pi/jdk-15
> >> pi@raspberrypi $ make
> >>
> >> Your debug build shows that I did not fix the
> >> assert_different_registers in
> >> InterpreterMacroAssembler::unlock_object()
> >> body (and the function comment by the way!), though with eyeballing I
> >> do not see what is wrong for Rlock=R0:
> >> https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/arm/interp_masm_arm.cpp#L1000
> >>
> >> regards,
> >> Boris
> >>
> >> On Mon, Oct 12, 2020 at 11:34 PM Marc Hoffmann
> >> wrote:
> >>
> >>
> >> Hi Aleksey, hi Boris,
> >>
> >> for me the crash is always reproducible: Every single build after
> >>
> >> 77a0f3999afa322b64643afd4a161164440af975 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function
> >>
> >> fails on arm32 (build on ubuntu in docker on a raspberry pi 4). Before this commit I haven't encountered any failures.
> >>
> >> Here is the hs_err file with --enable-debug (reproduced with current master c7f00640627eab38b77d23d07876cf0247fa18f3).
> >>
> >> Cheers,
> >> -marc
> >>
> >>
> >> #
> >> # A fatal error has been detected by the Java Runtime Environment:
> >> #
> >> # Internal Error (/workspace/src/hotspot/share/asm/register.hpp:160), pid=14700, tid=14705
> >> # assert(a != b && a != c && a != d && a != e && b != c && b != d && b != e && c != d && c != e && d != e) failed: registers must be different: a=0x00000002, b=0x00000003, c=0x00000000, d=0x00000000, e=0x0000000c
> >> #
> >> # JRE version: (16.0) (fastdebug build )
> >> # Java VM: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace, mixed mode, g1 gc, linux-arm)
> >> # Problematic frame:
> >> # V [libjvm.so+0x7571fc] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63
> >> #
> >> # Core dump will be written.
Default location: /workspace/make/core > >> # > >> # > >> > >> --------------- S U M M A R Y ------------ > >> > >> Command Line: -Xms64M -Xmx768M --add-exports=java.base/jdk.internal.module=ALL-UNNAMED build.tools.jigsaw.AddPackagesAttribute /workspace/build/linux-arm-server-fastdebug/jdk > >> > >> Host: 20431585315d, rev 3 (v7l), 4 cores, 3G, Ubuntu 18.04.3 LTS > >> Time: Mon Oct 12 20:22:15 2020 UTC elapsed time: 0.144243 seconds (0d 0h 0m 0s) > >> > >> --------------- T H R E A D --------------- > >> > >> Current thread (0xb5b16460): JavaThread "Unknown thread" [_thread_in_vm, id=14705, stack(0xb5c6e000,0xb5cbe000)] > >> > >> Stack: [0xb5c6e000,0xb5cbe000], sp=0xb5cbc2d0, free space=312k > >> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) > >> V [libjvm.so+0x7571fc] InterpreterMacroAssembler::unlock_object(RegisterImpl*) [clone .part.34]+0x63 > >> > >> Registers: > >> r0 = 0x00000003 > >> r1 = 0x000000a0 > >> r2 = 0x00000002 > >> r3 = 0x00000000 > >> r4 = 0xb5b168b0 > >> r5 = 0x0000000c > >> r6 = 0x00000000 > >> r7 = 0xb5cbc2e8 > >> r8 = 0xb6db1fa8 > >> r9 = 0xb5cbc760 > >> r10 = 0xe3520000 > >> fp = 0xb6db1fa8 > >> r12 = 0xb6ff8000 > >> sp = 0xb5cbc2d0 > >> lr = 0x00000058 > >> pc = 0xb64961fc > >> cpsr = 0x200f0030 > >> > >> Top of Stack: (sp=0xb5cbc2d0) > >> 0xb5cbc2d0: 00000002 00000003 00000000 00000000 > >> 0xb5cbc2e0: 0000000c 00000048 0000006e 00000000 > >> 0xb5cbc2f0: 00000000 00000000 00000000 0000007c > >> 0xb5cbc300: 00000000 00000077 b5cbc378 00000000 > >> 0xb5cbc310: b5cbc380 0000000f b6db1fa8 b5cbc340 > >> 0xb5cbc320: b5b168b0 b6db1fa8 b5cbc4c4 b6008a2b > >> 0xb5cbc330: b5cbc454 b5cbc348 0000000f b61971cf > >> 0xb5cbc340: b5cbc380 b5cbc3b0 b5b168b0 b5cbc388 > >> > >> Instructions: (pc=0xb64961fc) > >> 0xb64960fc: 440be9c7 a034f8c7 f5e06139 68ebf7ed > >> 0xb649610c: f040689a 46184164 1180f441 68996011 > >> 0xb649611c: f5e03104 4b12f781 46284622 1003f858 > >> 0xb649612c: f998f2e4 f64f68eb 
f2ce7210 6899122f > >> 0xb649613c: 600a4618 31046899 f76ef5e0 46284631 > >> 0xb649614c: f4e6f69f f5e04630 f507f79b 46bd7786 > >> 0xb649615c: 8ff0e8bd 0091bfd0 00006ee4 000059d0 > >> 0xb649616c: 00007d1c 4a084b07 b480447b 589baf00 > >> 0xb649617c: b91b781b f85d46bd 47707b04 f85d46bd > >> 0xb649618c: e7197b04 0091be30 00006a24 bf182900 > >> 0xb649619c: f1a1290c e92d0202 b0f34ff0 2301bf08 > >> 0xb64961ac: bf18af06 f8df2300 2a018268 461abf8c > >> 0xb64961bc: 0201f043 f04f460e 44f833ff 30d0f8c7 > >> 0xb64961cc: f8c74604 23003140 333de9c7 30fcf887 > >> 0xb64961dc: 3359e9c7 316cf887 4a8eb1da 0e58f04f > >> 0xb64961ec: 250c2003 1002f858 f8d12202 21a0c000 > >> 0xb64961fc: e000f88c 2000e9cd 6302e9cd 4b874a86 > >> 0xb649620c: 447a4887 9504447b f58f4478 f00cfaa3 > >> 0xb649621c: 68e2f2d5 4340f44f 33a0f2ce 0a04f04f > >> 0xb649622c: 7980f04f 0b00f04f 46106891 600b2501 > >> 0xb649623c: 44516891 f6f0f5e0 4a7b497a 3001f858 > >> 0xb649624c: 0140f107 a048f8c7 4608643e f8c7681b > >> 0xb649625c: 60fb904c f858647b f8c72002 f8c7b054 > >> 0xb649626c: 60bab058 e9c73208 653abb19 66fd607a > >> 0xb649627c: f732f5e0 689a68e3 4164f040 f4414618 > >> 0xb649628c: 60111181 44516899 f6c6f5e0 0110f107 > >> 0xb649629c: 68fb687a f8c74608 623aa018 6304e9c7 > >> 0xb64962ac: 901cf8c7 bb09e9c7 bb0de9c7 f5e063fd > >> 0xb64962bc: 68e3f713 f0406899 46184264 4240f442 > >> 0xb64962cc: 6899600a f1074451 f5e00ad0 4b57f6a5 > >> 0xb64962dc: 3003f858 2b00781b 8093f040 f04f68bb > >> 0xb64962ec: 68f97c80 33082500 c0acf8c7 0b01f04f > >> > >> > >> > >> --------------- P R O C E S S --------------- > >> > >> uid : 0 euid : 0 gid : 0 egid : 0 > >> > >> umask: 0022 (----w--w-) > >> > >> Threads class SMR info: > >> _java_thread_list=0xb6e56078, length=0, elements={ > >> } > >> _java_thread_list_alloc_cnt=1, _java_thread_list_free_cnt=0, _java_thread_list_max=0, _nested_thread_list_max=0 > >> _delete_lock_wait_cnt=0, _delete_lock_wait_max=0 > >> _to_delete_list_cnt=0, _to_delete_list_max=0 > >> > >> Java Threads: ( => 
current thread ) > >> > >> Other Threads: > >> 0xb5b73188 GCTaskThread "GC Thread#0" [stack: 0x81d00000,0x81d80000] [id=14706] > >> 0xb5b77dc0 ConcurrentGCThread "G1 Main Marker" [stack: 0x81c7e000,0x81cfe000] [id=14707] > >> 0xb5b790c0 ConcurrentGCThread "G1 Conc#0" [stack: 0x81a80000,0x81b00000] [id=14708] > >> 0xb5bde230 ConcurrentGCThread "G1 Refine#0" [stack: 0x81780000,0x81800000] [id=14709] > >> 0xb5bdf488 ConcurrentGCThread "G1 Service" [stack: 0x81580000,0x81600000] [id=14710] > >> > >> =>0xb5b16460 (exited) JavaThread "Unknown thread" [_thread_in_vm, id=14705, stack(0xb5c6e000,0xb5cbe000)] > >> > >> Threads with active compile tasks: > >> > >> VM state: not at safepoint (not fully initialized) > >> > >> VM Mutex/Monitor currently owned by a thread: None > >> > >> GC Precious Log: > >> CPUs: 4 total, 4 available > >> Memory: 3827M > >> Large Page Support: Disabled > >> NUMA Support: Disabled > >> Compressed Oops: Disabled > >> Heap Region Size: 1M > >> Heap Min Capacity: 64M > >> Heap Initial Capacity: 64M > >> Heap Max Capacity: 768M > >> Pre-touch: Disabled > >> Parallel Workers: 4 > >> Concurrent Workers: 1 > >> Concurrent Refinement Workers: 4 > >> Periodic GC: Disabled > >> > >> Heap: > >> garbage-first heap total 65536K, used 0K [0x83a00000, 0xb3a00000) > >> region size 1024K, 1 young (1024K), 0 survivors (0K) > >> Metaspace used 944K, capacity 2200K, committed 2200K, reserved 4400K > >> > >> Heap Regions: E=young(eden), S=young(survivor), O=old, HS=humongous(starts), HC=humongous(continues), CS=collection set, F=free, OA=open archive, CA=closed archive, TAMS=top-at-mark-start (previous, next) > >> | 0|0x83a00000, 0x83a00000, 0x83b00000| 0%| F| |TAMS 0x83a00000, 0x83a00000| Untracked > >> | 1|0x83b00000, 0x83b00000, 0x83c00000| 0%| F| |TAMS 0x83b00000, 0x83b00000| Untracked > >> | 2|0x83c00000, 0x83c00000, 0x83d00000| 0%| F| |TAMS 0x83c00000, 0x83c00000| Untracked > >> | 3|0x83d00000, 0x83d00000, 0x83e00000| 0%| F| |TAMS 0x83d00000, 0x83d00000| 
Untracked > >> | 4|0x83e00000, 0x83e00000, 0x83f00000| 0%| F| |TAMS 0x83e00000, 0x83e00000| Untracked > >> | 5|0x83f00000, 0x83f00000, 0x84000000| 0%| F| |TAMS 0x83f00000, 0x83f00000| Untracked > >> | 6|0x84000000, 0x84000000, 0x84100000| 0%| F| |TAMS 0x84000000, 0x84000000| Untracked > >> | 7|0x84100000, 0x84100000, 0x84200000| 0%| F| |TAMS 0x84100000, 0x84100000| Untracked > >> | 8|0x84200000, 0x84200000, 0x84300000| 0%| F| |TAMS 0x84200000, 0x84200000| Untracked > >> | 9|0x84300000, 0x84300000, 0x84400000| 0%| F| |TAMS 0x84300000, 0x84300000| Untracked > >> | 10|0x84400000, 0x84400000, 0x84500000| 0%| F| |TAMS 0x84400000, 0x84400000| Untracked > >> | 11|0x84500000, 0x84500000, 0x84600000| 0%| F| |TAMS 0x84500000, 0x84500000| Untracked > >> | 12|0x84600000, 0x84600000, 0x84700000| 0%| F| |TAMS 0x84600000, 0x84600000| Untracked > >> | 13|0x84700000, 0x84700000, 0x84800000| 0%| F| |TAMS 0x84700000, 0x84700000| Untracked > >> | 14|0x84800000, 0x84800000, 0x84900000| 0%| F| |TAMS 0x84800000, 0x84800000| Untracked > >> | 15|0x84900000, 0x84900000, 0x84a00000| 0%| F| |TAMS 0x84900000, 0x84900000| Untracked > >> | 16|0x84a00000, 0x84a00000, 0x84b00000| 0%| F| |TAMS 0x84a00000, 0x84a00000| Untracked > >> | 17|0x84b00000, 0x84b00000, 0x84c00000| 0%| F| |TAMS 0x84b00000, 0x84b00000| Untracked > >> | 18|0x84c00000, 0x84c00000, 0x84d00000| 0%| F| |TAMS 0x84c00000, 0x84c00000| Untracked > >> | 19|0x84d00000, 0x84d00000, 0x84e00000| 0%| F| |TAMS 0x84d00000, 0x84d00000| Untracked > >> | 20|0x84e00000, 0x84e00000, 0x84f00000| 0%| F| |TAMS 0x84e00000, 0x84e00000| Untracked > >> | 21|0x84f00000, 0x84f00000, 0x85000000| 0%| F| |TAMS 0x84f00000, 0x84f00000| Untracked > >> | 22|0x85000000, 0x85000000, 0x85100000| 0%| F| |TAMS 0x85000000, 0x85000000| Untracked > >> | 23|0x85100000, 0x85100000, 0x85200000| 0%| F| |TAMS 0x85100000, 0x85100000| Untracked > >> | 24|0x85200000, 0x85200000, 0x85300000| 0%| F| |TAMS 0x85200000, 0x85200000| Untracked > >> | 25|0x85300000, 0x85300000, 
0x85400000| 0%| F| |TAMS 0x85300000, 0x85300000| Untracked > >> | 26|0x85400000, 0x85400000, 0x85500000| 0%| F| |TAMS 0x85400000, 0x85400000| Untracked > >> | 27|0x85500000, 0x85500000, 0x85600000| 0%| F| |TAMS 0x85500000, 0x85500000| Untracked > >> | 28|0x85600000, 0x85600000, 0x85700000| 0%| F| |TAMS 0x85600000, 0x85600000| Untracked > >> | 29|0x85700000, 0x85700000, 0x85800000| 0%| F| |TAMS 0x85700000, 0x85700000| Untracked > >> | 30|0x85800000, 0x85800000, 0x85900000| 0%| F| |TAMS 0x85800000, 0x85800000| Untracked > >> | 31|0x85900000, 0x85900000, 0x85a00000| 0%| F| |TAMS 0x85900000, 0x85900000| Untracked > >> | 32|0x85a00000, 0x85a00000, 0x85b00000| 0%| F| |TAMS 0x85a00000, 0x85a00000| Untracked > >> | 33|0x85b00000, 0x85b00000, 0x85c00000| 0%| F| |TAMS 0x85b00000, 0x85b00000| Untracked > >> | 34|0x85c00000, 0x85c00000, 0x85d00000| 0%| F| |TAMS 0x85c00000, 0x85c00000| Untracked > >> | 35|0x85d00000, 0x85d00000, 0x85e00000| 0%| F| |TAMS 0x85d00000, 0x85d00000| Untracked > >> | 36|0x85e00000, 0x85e00000, 0x85f00000| 0%| F| |TAMS 0x85e00000, 0x85e00000| Untracked > >> | 37|0x85f00000, 0x85f00000, 0x86000000| 0%| F| |TAMS 0x85f00000, 0x85f00000| Untracked > >> | 38|0x86000000, 0x86000000, 0x86100000| 0%| F| |TAMS 0x86000000, 0x86000000| Untracked > >> | 39|0x86100000, 0x86100000, 0x86200000| 0%| F| |TAMS 0x86100000, 0x86100000| Untracked > >> | 40|0x86200000, 0x86200000, 0x86300000| 0%| F| |TAMS 0x86200000, 0x86200000| Untracked > >> | 41|0x86300000, 0x86300000, 0x86400000| 0%| F| |TAMS 0x86300000, 0x86300000| Untracked > >> | 42|0x86400000, 0x86400000, 0x86500000| 0%| F| |TAMS 0x86400000, 0x86400000| Untracked > >> | 43|0x86500000, 0x86500000, 0x86600000| 0%| F| |TAMS 0x86500000, 0x86500000| Untracked > >> | 44|0x86600000, 0x86600000, 0x86700000| 0%| F| |TAMS 0x86600000, 0x86600000| Untracked > >> | 45|0x86700000, 0x86700000, 0x86800000| 0%| F| |TAMS 0x86700000, 0x86700000| Untracked > >> | 46|0x86800000, 0x86800000, 0x86900000| 0%| F| |TAMS 0x86800000, 
0x86800000| Untracked > >> | 47|0x86900000, 0x86900000, 0x86a00000| 0%| F| |TAMS 0x86900000, 0x86900000| Untracked > >> | 48|0x86a00000, 0x86a00000, 0x86b00000| 0%| F| |TAMS 0x86a00000, 0x86a00000| Untracked > >> | 49|0x86b00000, 0x86b00000, 0x86c00000| 0%| F| |TAMS 0x86b00000, 0x86b00000| Untracked > >> | 50|0x86c00000, 0x86c00000, 0x86d00000| 0%| F| |TAMS 0x86c00000, 0x86c00000| Untracked > >> | 51|0x86d00000, 0x86d00000, 0x86e00000| 0%| F| |TAMS 0x86d00000, 0x86d00000| Untracked > >> | 52|0x86e00000, 0x86e00000, 0x86f00000| 0%| F| |TAMS 0x86e00000, 0x86e00000| Untracked > >> | 53|0x86f00000, 0x86f00000, 0x87000000| 0%| F| |TAMS 0x86f00000, 0x86f00000| Untracked > >> | 54|0x87000000, 0x87000000, 0x87100000| 0%| F| |TAMS 0x87000000, 0x87000000| Untracked > >> | 55|0x87100000, 0x87100000, 0x87200000| 0%| F| |TAMS 0x87100000, 0x87100000| Untracked > >> | 56|0x87200000, 0x87200000, 0x87300000| 0%| F| |TAMS 0x87200000, 0x87200000| Untracked > >> | 57|0x87300000, 0x87300000, 0x87400000| 0%| F| |TAMS 0x87300000, 0x87300000| Untracked > >> | 58|0x87400000, 0x87400000, 0x87500000| 0%| F| |TAMS 0x87400000, 0x87400000| Untracked > >> | 59|0x87500000, 0x87500000, 0x87600000| 0%| F| |TAMS 0x87500000, 0x87500000| Untracked > >> | 60|0x87600000, 0x87600000, 0x87700000| 0%| F| |TAMS 0x87600000, 0x87600000| Untracked > >> | 61|0x87700000, 0x87700000, 0x87800000| 0%| F| |TAMS 0x87700000, 0x87700000| Untracked > >> | 62|0x87800000, 0x87800000, 0x87900000| 0%| F| |TAMS 0x87800000, 0x87800000| Untracked > >> | 63|0x87900000, 0x87942908, 0x87a00000| 26%| E| |TAMS 0x87900000, 0x87900000| Complete > >> > >> Card table byte_map: [0x83700000,0x83880000] _byte_map_base: 0x832e3000 > >> > >> Marking Bits (Prev, Next): (CMBitMap*) 0xb5b74324, (CMBitMap*) 0xb5b74344 > >> Prev Bits: [0x82980000, 0x83580000) > >> Next Bits: [0x81d80000, 0x82980000) > >> > >> GC Heap History (0 events): > >> No events > >> > >> Deoptimization events (0 events): > >> No events > >> > >> Classes unloaded (0 
events): > >> No events > >> > >> Classes redefined (0 events): > >> No events > >> > >> Internal exceptions (0 events): > >> No events > >> > >> Events (20 events): > >> Event: 0.113 loading class java/lang/Character > >> Event: 0.114 loading class java/lang/Character done > >> Event: 0.114 loading class java/lang/Float > >> Event: 0.115 loading class java/lang/Number > >> Event: 0.115 loading class java/lang/Number done > >> Event: 0.115 loading class java/lang/Float done > >> Event: 0.115 loading class java/lang/Double > >> Event: 0.116 loading class java/lang/Double done > >> Event: 0.116 loading class java/lang/Byte > >> Event: 0.116 loading class java/lang/Byte done > >> Event: 0.116 loading class java/lang/Short > >> Event: 0.117 loading class java/lang/Short done > >> Event: 0.117 loading class java/lang/Integer > >> Event: 0.118 loading class java/lang/Integer done > >> Event: 0.118 loading class java/lang/Long > >> Event: 0.119 loading class java/lang/Long done > >> Event: 0.119 loading class java/util/Iterator > >> Event: 0.119 loading class java/util/Iterator done > >> Event: 0.119 loading class java/lang/reflect/RecordComponent > >> Event: 0.119 loading class java/lang/reflect/RecordComponent done > >> > >> > >> Dynamic libraries: > >> 00410000-00411000 r-xp 00000000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java > >> 00420000-00421000 r--p 00000000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java > >> 00421000-00422000 rw-p 00001000 b3:02 677726 /workspace/build/linux-arm-server-fastdebug/jdk/bin/java > >> 019b6000-019d7000 rw-p 00000000 00:00 0 [heap] > >> 809c9000-80e00000 rw-p 00000000 00:00 0 > >> 80e00000-80e8e000 rw-p 00000000 00:00 0 > >> 80e8e000-80f00000 ---p 00000000 00:00 0 > >> 80fb4000-811da000 rw-p 00000000 00:00 0 > >> 811da000-81400000 ---p 00000000 00:00 0 > >> 81400000-81421000 rw-p 00000000 00:00 0 > >> 81421000-81500000 ---p 00000000 00:00 0 > >> 8157e000-8157f000 ---p 00000000 00:00 0 > 
>> 8157f000-81600000 rw-p 00000000 00:00 0 > >> 81600000-81621000 rw-p 00000000 00:00 0 > >> 81621000-81700000 ---p 00000000 00:00 0 > >> 8177e000-8177f000 ---p 00000000 00:00 0 > >> 8177f000-81800000 rw-p 00000000 00:00 0 > >> 81800000-81821000 rw-p 00000000 00:00 0 > >> 81821000-81900000 ---p 00000000 00:00 0 > >> 81900000-81921000 rw-p 00000000 00:00 0 > >> 81921000-81a00000 ---p 00000000 00:00 0 > >> 81a7e000-81a7f000 ---p 00000000 00:00 0 > >> 81a7f000-81b00000 rw-p 00000000 00:00 0 > >> 81b00000-81b21000 rw-p 00000000 00:00 0 > >> 81b21000-81c00000 ---p 00000000 00:00 0 > >> 81c21000-81c7c000 rw-p 00000000 00:00 0 > >> 81c7c000-81c7d000 ---p 00000000 00:00 0 > >> 81c7d000-81cfe000 rw-p 00000000 00:00 0 > >> 81cfe000-81cff000 ---p 00000000 00:00 0 > >> 81cff000-81e80000 rw-p 00000000 00:00 0 > >> 81e80000-82980000 ---p 00000000 00:00 0 > >> 82980000-82a80000 rw-p 00000000 00:00 0 > >> 82a80000-83580000 ---p 00000000 00:00 0 > >> 83580000-835a0000 rw-p 00000000 00:00 0 > >> 835a0000-83700000 ---p 00000000 00:00 0 > >> 83700000-83720000 rw-p 00000000 00:00 0 > >> 83720000-83880000 ---p 00000000 00:00 0 > >> 83880000-838a0000 rw-p 00000000 00:00 0 > >> 838a0000-83a00000 ---p 00000000 00:00 0 > >> 83a00000-87a00000 rw-p 00000000 00:00 0 > >> 87a00000-b3a00000 ---p 00000000 00:00 0 > >> b3a25000-b3a76000 rw-p 00000000 00:00 0 > >> b3a76000-b3ab3000 ---p 00000000 00:00 0 > >> b3ab3000-b3c33000 rwxp 00000000 00:00 0 > >> b3c33000-b5ab3000 ---p 00000000 00:00 0 > >> b5ab3000-b5ac8000 r-xp 00000000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so > >> b5ac8000-b5ad8000 ---p 00015000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so > >> b5ad8000-b5ad9000 r--p 00015000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so > >> b5ad9000-b5ada000 rw-p 00016000 b3:02 144091 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjava.so > >> b5ada000-b5ae2000 rw-s 00000000 b3:02 2576900 
/tmp/hsperfdata_root/14700 > >> b5ae2000-b5ae9000 r-xp 00000000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so > >> b5ae9000-b5af8000 ---p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so > >> b5af8000-b5af9000 r--p 00006000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so > >> b5af9000-b5afa000 rw-p 00007000 b3:02 2708515 /lib/arm-linux-gnueabihf/libnss_files-2.27.so > >> b5afa000-b5b00000 rw-p 00000000 00:00 0 > >> b5b00000-b5c00000 rw-p 00000000 00:00 0 > >> b5c00000-b5c0d000 r-xp 00000000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so > >> b5c0d000-b5c1c000 ---p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so > >> b5c1c000-b5c1d000 r--p 0000c000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so > >> b5c1d000-b5c1e000 rw-p 0000d000 b3:02 2708509 /lib/arm-linux-gnueabihf/libnsl-2.27.so > >> b5c1e000-b5c20000 rw-p 00000000 00:00 0 > >> b5c20000-b5c27000 r-xp 00000000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so > >> b5c27000-b5c36000 ---p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so > >> b5c36000-b5c37000 r--p 00006000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so > >> b5c37000-b5c38000 rw-p 00007000 b3:02 2708519 /lib/arm-linux-gnueabihf/libnss_nis-2.27.so > >> b5c38000-b5c3d000 r-xp 00000000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so > >> b5c3d000-b5c4c000 ---p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so > >> b5c4c000-b5c4d000 r--p 00004000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so > >> b5c4d000-b5c4e000 rw-p 00005000 b3:02 2708511 /lib/arm-linux-gnueabihf/libnss_compat-2.27.so > >> b5c4e000-b5c5d000 r-xp 00000000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so > >> b5c5d000-b5c6c000 ---p 0000f000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so > >> b5c6c000-b5c6d000 r--p 0000e000 b3:02 144093 
/workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so > >> b5c6d000-b5c6e000 rw-p 0000f000 b3:02 144093 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjimage.so > >> b5c6e000-b5c71000 ---p 00000000 00:00 0 > >> b5c71000-b5cbe000 rw-p 00000000 00:00 0 > >> b5cbe000-b5d2d000 r-xp 00000000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so > >> b5d2d000-b5d3d000 ---p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so > >> b5d3d000-b5d3e000 r--p 0006f000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so > >> b5d3e000-b5d3f000 rw-p 00070000 b3:02 2708506 /lib/arm-linux-gnueabihf/libm-2.27.so > >> b5d3f000-b6d56000 r-xp 00000000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so > >> b6d56000-b6d65000 ---p 01017000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so > >> b6d65000-b6dba000 r--p 01016000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so > >> b6dba000-b6dd2000 rw-p 0106b000 b3:02 144078 /workspace/build/linux-arm-server-fastdebug/jdk/lib/server/libjvm.so > >> b6dd2000-b6e5e000 rw-p 00000000 00:00 0 > >> b6e5e000-b6e6f000 r-xp 00000000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so > >> b6e6f000-b6e7f000 ---p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so > >> b6e7f000-b6e80000 r--p 00011000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so > >> b6e80000-b6e81000 rw-p 00012000 b3:02 2708524 /lib/arm-linux-gnueabihf/libpthread-2.27.so > >> b6e81000-b6e83000 rw-p 00000000 00:00 0 > >> b6e83000-b6e85000 r-xp 00000000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so > >> b6e85000-b6e94000 ---p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so > >> b6e94000-b6e95000 r--p 00001000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so > >> b6e95000-b6e96000 rw-p 00002000 b3:02 2708497 /lib/arm-linux-gnueabihf/libdl-2.27.so > >> b6e96000-b6eaf000 r-xp 00000000 b3:02 1308274 
/lib/arm-linux-gnueabihf/libz.so.1.2.11 > >> b6eaf000-b6ebe000 ---p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 > >> b6ebe000-b6ebf000 r--p 00018000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 > >> b6ebf000-b6ec0000 rw-p 00019000 b3:02 1308274 /lib/arm-linux-gnueabihf/libz.so.1.2.11 > >> b6ec0000-b6fa2000 r-xp 00000000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so > >> b6fa2000-b6fb2000 ---p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so > >> b6fb2000-b6fb4000 r--p 000e2000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so > >> b6fb4000-b6fb5000 rw-p 000e4000 b3:02 2708489 /lib/arm-linux-gnueabihf/libc-2.27.so > >> b6fb5000-b6fb8000 rw-p 00000000 00:00 0 > >> b6fb8000-b6fc2000 r-xp 00000000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so > >> b6fc2000-b6fd1000 ---p 0000a000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so > >> b6fd1000-b6fd2000 r--p 00009000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so > >> b6fd2000-b6fd3000 rw-p 0000a000 b3:02 144083 /workspace/build/linux-arm-server-fastdebug/jdk/lib/libjli.so > >> b6fd3000-b6feb000 r-xp 00000000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so > >> b6ff2000-b6ff4000 rw-p 00000000 00:00 0 > >> b6ff6000-b6ff7000 ---p 00000000 00:00 0 > >> b6ff7000-b6ff8000 r--p 00000000 00:00 0 > >> b6ff8000-b6ff9000 rwxp 00000000 00:00 0 > >> b6ff9000-b6ffb000 rw-p 00000000 00:00 0 > >> b6ffb000-b6ffc000 r--p 00018000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so > >> b6ffc000-b6ffd000 rw-p 00019000 b3:02 2708477 /lib/arm-linux-gnueabihf/ld-2.27.so > >> bed97000-bedb8000 rw-p 00000000 00:00 0 [stack] > >> beeb0000-beeb1000 r-xp 00000000 00:00 0 [sigpage] > >> beeb1000-beeb2000 r--p 00000000 00:00 0 [vvar] > >> beeb2000-beeb3000 r-xp 00000000 00:00 0 [vdso] > >> ffff0000-ffff1000 r-xp 00000000 00:00 0 [vectors] > >> > >> > >> VM Arguments: > >> jvm_args: -Xms64M -Xmx768M 
--add-exports=java.base/jdk.internal.module=ALL-UNNAMED > >> java_command: build.tools.jigsaw.AddPackagesAttribute /workspace/build/linux-arm-server-fastdebug/jdk > >> java_class_path (initial): /workspace/build/linux-arm-server-fastdebug/buildtools/tools_jigsaw_classes > >> Launcher Type: SUN_STANDARD > >> > >> [Global flags] > >> uint ConcGCThreads = 1 {product} {ergonomic} Number of threads concurrent gc will use > >> uint G1ConcRefinementThreads = 4 {product} {ergonomic} The number of parallel rem set update threads. Will be set ergonomically by default. > >> size_t G1HeapRegionSize = 1048576 {product} {ergonomic} Size of the G1 regions. > >> uintx GCDrainStackTargetSize = 64 {product} {ergonomic} Number of entries we will try to leave on the stack during parallel gc > >> size_t InitialHeapSize = 67108864 {product} {command line} Initial heap size (in bytes); zero means use ergonomics > >> size_t MarkStackSize = 32768 {product} {ergonomic} Size of marking stack > >> size_t MaxHeapSize = 805306368 {product} {command line} Maximum heap size (in bytes) > >> size_t MaxNewSize = 482344960 {product} {ergonomic} Maximum new generation size (in bytes), max_uintx means set ergonomically > >> size_t MinHeapDeltaBytes = 1048576 {product} {ergonomic} The minimum change in heap space due to GC (in bytes) > >> size_t MinHeapSize = 67108864 {product} {command line} Minimum heap size (in bytes); zero means use ergonomics > >> uintx NonProfiledCodeHeapSize = 0 {pd product} {ergonomic} Size of code heap with non-profiled methods (in bytes) > >> uintx ProfiledCodeHeapSize = 0 {pd product} {ergonomic} Size of code heap with profiled methods (in bytes) > >> size_t SoftMaxHeapSize = 805306368 {manageable} {ergonomic} Soft limit for maximum heap size (in bytes) > >> bool UseG1GC = true {product} {ergonomic} Use the Garbage-First garbage collector > >> > >> Logging: > >> Log output configuration: > >> #0: stdout all=warning uptime,level,tags > >> #1: stderr all=off uptime,level,tags > 
>> > >> Environment Variables: > >> JAVA_HOME=/opt/java/openjdk > >> PATH=/opt/java/openjdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin > >> LC_ALL=C > >> > >> Signal Handlers: > >> SIGSEGV: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO > >> SIGBUS: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO > >> SIGFPE: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO > >> SIGPIPE: [libjvm.so+0xc9aa9d], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO > >> SIGXFSZ: [libjvm.so+0xc9aa9d], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO > >> SIGILL: [libjvm.so+0xe19e65], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO > >> SIGUSR2: [libjvm.so+0xc9ad95], sa_mask[0]=00000000000000000000000000000000, sa_flags=SA_RESTART|SA_SIGINFO > >> SIGHUP: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none > >> SIGINT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none > >> SIGTERM: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none > >> SIGQUIT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none > >> > >> > >> --------------- S Y S T E M --------------- > >> > >> OS: > >> DISTRIB_ID=Ubuntu > >> DISTRIB_RELEASE=18.04 > >> DISTRIB_CODENAME=bionic > >> DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS" > >> uname: Linux 20431585315d 5.4.51-v7l+ #1333 SMP Mon Aug 10 16:51:40 BST 2020 armv7l > >> OS uptime: 14 days 7:59 hours > >> libc: glibc 2.27 NPTL 2.27 > >> rlimit (soft/hard): STACK 8192k/infinity , CORE infinity/infinity , NPROC infinity/infinity , NOFILE 1048576/1048576 , AS infinity/infinity , CPU infinity/infinity , DATA infinity/infinity , FSIZE infinity/infinity , MEMLOCK 64k/64k > >> load average: 3.37 3.26 3.09 > >> > >> /proc/meminfo: > >> MemTotal: 3919812 kB > >> MemFree: 
1255688 kB > >> MemAvailable: 3518740 kB > >> Buffers: 134316 kB > >> Cached: 2117828 kB > >> SwapCached: 0 kB > >> Active: 1266624 kB > >> Inactive: 1167412 kB > >> Active(anon): 110360 kB > >> Inactive(anon): 80744 kB > >> Active(file): 1156264 kB > >> Inactive(file): 1086668 kB > >> Unevictable: 16 kB > >> Mlocked: 16 kB > >> HighTotal: 3264512 kB > >> HighFree: 1038848 kB > >> LowTotal: 655300 kB > >> LowFree: 216840 kB > >> SwapTotal: 102396 kB > >> SwapFree: 102396 kB > >> Dirty: 24916 kB > >> Writeback: 0 kB > >> AnonPages: 181884 kB > >> Mapped: 125864 kB > >> Shmem: 16892 kB > >> KReclaimable: 181816 kB > >> Slab: 205164 kB > >> SReclaimable: 181816 kB > >> SUnreclaim: 23348 kB > >> KernelStack: 2240 kB > >> PageTables: 2684 kB > >> NFS_Unstable: 0 kB > >> Bounce: 0 kB > >> WritebackTmp: 0 kB > >> CommitLimit: 2062300 kB > >> Committed_AS: 1125176 kB > >> VmallocTotal: 245760 kB > >> VmallocUsed: 5520 kB > >> VmallocChunk: 0 kB > >> Percpu: 512 kB > >> CmaTotal: 262144 kB > >> CmaFree: 171244 kB > >> > >> /sys/kernel/mm/transparent_hugepage/enabled: > >> /sys/kernel/mm/transparent_hugepage/defrag (defrag/compaction efforts parameter): > >> > >> Process Memory: > >> Virtual Size: 888828K (peak: 888828K) > >> Resident Set Size: 25020K (peak: 25020K) (anon: 11372K, file: 13648K, shmem: 0K) > >> Swapped out: 0K > >> C-Heap outstanding allocations: 1636K > >> > >> /proc/sys/kernel/threads-max (system-wide limit on the number of threads): 57119 > >> /proc/sys/vm/max_map_count (maximum number of memory map areas a process may have): 65530 > >> /proc/sys/kernel/pid_max (system-wide limit on number of process identifiers): 32768 > >> > >> Steal ticks since vm start: 0 > >> Steal ticks percentage since vm start: 0.000 > >> > >> CPU: total 4 (initial active 4) (ARMv7), vfp, vfp3-32, simd, mp_ext > >> /proc/cpuinfo: > >> processor : 0 > >> model name : ARMv7 Processor rev 3 (v7l) > >> BogoMIPS : 270.00 > >> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 
idiva idivt vfpd32 lpae evtstrm crc32 > >> CPU implementer : 0x41 > >> CPU architecture: 7 > >> CPU variant : 0x0 > >> CPU part : 0xd08 > >> CPU revision : 3 > >> > >> processor : 1 > >> model name : ARMv7 Processor rev 3 (v7l) > >> BogoMIPS : 270.00 > >> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 > >> CPU implementer : 0x41 > >> CPU architecture: 7 > >> CPU variant : 0x0 > >> CPU part : 0xd08 > >> CPU revision : 3 > >> > >> processor : 2 > >> model name : ARMv7 Processor rev 3 (v7l) > >> BogoMIPS : 270.00 > >> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 > >> CPU implementer : 0x41 > >> CPU architecture: 7 > >> CPU variant : 0x0 > >> CPU part : 0xd08 > >> CPU revision : 3 > >> > >> processor : 3 > >> model name : ARMv7 Processor rev 3 (v7l) > >> BogoMIPS : 270.00 > >> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 > >> CPU implementer : 0x41 > >> CPU architecture: 7 > >> CPU variant : 0x0 > >> CPU part : 0xd08 > >> CPU revision : 3 > >> > >> Hardware : BCM2711 > >> Revision : c03111 > >> Serial : 100000001c47254f > >> Model : Raspberry Pi 4 Model B Rev 1.1 > >> > >> Online cpus: 0-3 > >> Offline cpus: > >> > >> Memory: 4k page, physical 3919812k(1255688k free), swap 102396k(102396k free) > >> > >> vm_info: OpenJDK Server VM (fastdebug 16-internal+0-adhoc..workspace) for linux-arm JRE (16-internal+0-adhoc..workspace), built on Oct 12 2020 19:49:51 by "" with gcc 7.5.0 > >> > >> END. > >> > >> > >> > >> > >> On 12. Oct 2020, at 20:24, Aleksey Shipilev wrote: > >> > >> Hi, > >> > >> On 10/12/20 8:12 PM, Marc Hoffmann wrote: > >> > >> Please find the build log and the hs_err file for commit fd0cb98ed03c6214c02ccd3503c1e6d77065a428 attached. > >> > >> > >> Please try to build with fastdebug (./configure --enable-debug), so that the JVM asserts meaningfully somewhere?
> >> > >> Is there any additional information I can provide to help get these builds fixed again? > >> > >> > >> I have been seeing plenty of weird x86_32 crashes since last week. Pretty sure some of them would manifest on ARM32 as well. This is why building with fastdebug is the next step: it maps out the bug symptoms. > >> > >> -- > >> Thanks, > >> -Aleksey > >> > >> > >> > >> > From coleenp at openjdk.java.net Thu Oct 29 12:02:45 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 29 Oct 2020 12:02:45 GMT Subject: RFR: 8255582: Introduce SemaphoreLock and SemaphoreLocker In-Reply-To: References: <_JWavDFnX9LXuj9uXsd6RDwRDrsoVb-tG-CVQ6fMA90=.5d37cc4c-b6cf-4dc8-ac47-0372cf368ead@github.com> Message-ID: On Thu, 29 Oct 2020 11:41:55 GMT, David Holmes wrote: >> I don't think we should be using semaphores as a substitute for fixing mutex rankings or using some rankless mutex (like PlatformMutex). So I'm not in favor of this change. > > As I recall one of the reasons Semaphore is used as a lock in places is because of initialization constraints related to Mutex and PlatformMutex. Then there is also the ability to use semaphores in signal handlers. > > The key difference between "locks" and "a binary semaphore used like a lock" is that true locks have a notion of ownership and can generally only be unlocked by their owner. That is not enforced by SemaphoreLock, making it somewhat not-a-lock. That said the name and API at least convey the intent. But it is critical to document why you need to use a semaphore as a lock instead of using a "real" lock like Mutex or PlatformMutex. If it is only to avoid rank issues then I agree with others that that is not sufficient justification. I agree with Kim's and David's comments. We were talking about doing the mutex ranking fix "really soon now".
If we have to invent another mechanism to subvert it, maybe the priority of the work should be higher for that and this feedback should be included in the design of how we would like it to work. Please don't rush to push this, because I haven't had time to review it properly yet. ------------- PR: https://git.openjdk.java.net/jdk/pull/927 From rkennke at openjdk.java.net Thu Oct 29 12:07:00 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 29 Oct 2020 12:07:00 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v23] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity) have been processed during a GC pause. Workloads that make heavy use of such weak references will therefore potentially cause significant GC pauses. > > There are 3 main items that contribute to pause time linear in the number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through the subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap.
This requires extending the marking bitmap to allow for two kinds of reachability: each object can now be strongly or finalizably reachable. Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizable' reachability for all objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. > 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by the Java reference handler thread). In addition to that, we must ensure that no referent is resurrected by calling Reference.get() on its Reference. In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable.
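The two kinds of reachability described in item 1 can be sketched in a few lines of C++. This is a toy model under stated assumptions, not the Shenandoah implementation: objects are plain vector indices, the marking runs single-threaded on one work stack, and the names `Obj`, `RefKind`, `MarkContext`, `MarkResult` and `mark_from_roots` are hypothetical. Only the idea is taken from the description: a strong mark may upgrade a finalizable mark, every Reference is discovered, and only a FinalReference is marked through.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical model, not Shenandoah code: "objects" are indices into a vector.
enum class RefKind { None, Final, Other };  // Other = Soft/Weak/Phantom

struct Obj {
  std::vector<std::size_t> fields;  // ordinary outgoing edges
  RefKind kind = RefKind::None;     // is this object a Reference, and which kind?
  std::size_t referent = 0;         // only meaningful when kind != None
};

// Two mark bits per object: strongly reachable, or only finalizably reachable.
class MarkContext {
  enum : std::uint8_t { kStrong = 1, kFinal = 2 };
  std::vector<std::uint8_t> bits_;
 public:
  explicit MarkContext(std::size_t n) : bits_(n, 0) {}
  // True if the object was not strongly marked before. This includes the
  // "upgrade" case, where only the finalizable bit had been set.
  bool mark_strong(std::size_t o) {
    if (bits_[o] & kStrong) return false;
    bits_[o] |= kStrong;
    return true;
  }
  // True only if the object carried no mark at all.
  bool mark_final(std::size_t o) {
    if (bits_[o] != 0) return false;
    bits_[o] = kFinal;
    return true;
  }
  bool is_strong(std::size_t o) const { return (bits_[o] & kStrong) != 0; }
  bool is_marked(std::size_t o) const { return bits_[o] != 0; }
};

struct MarkResult {
  MarkContext ctx;
  std::vector<std::size_t> discovered;  // every Reference seen, exactly once
};

MarkResult mark_from_roots(const std::vector<Obj>& heap,
                           const std::vector<std::size_t>& roots) {
  MarkResult r{MarkContext(heap.size()), {}};
  std::vector<std::pair<std::size_t, bool> > work;  // (object, marking strongly?)
  for (std::size_t root : roots) work.emplace_back(root, true);
  while (!work.empty()) {
    const std::size_t o = work.back().first;
    const bool strong = work.back().second;
    work.pop_back();
    const bool seen_before = r.ctx.is_marked(o);
    const bool progress = strong ? r.ctx.mark_strong(o) : r.ctx.mark_final(o);
    if (!progress) continue;  // already marked at this strength or stronger
    const Obj& obj = heap[o];
    // Children inherit the current strength; an upgraded object is rescanned.
    for (std::size_t f : obj.fields) work.emplace_back(f, strong);
    if (obj.kind == RefKind::None) continue;
    if (!seen_before) r.discovered.push_back(o);
    // Only a FinalReference is marked through, and only finalizably.
    if (obj.kind == RefKind::Final) work.emplace_back(obj.referent, false);
  }
  return r;
}
```

In this model an object reached first through a FinalReference and later through an ordinary field ends up strongly marked, while the referent of any other Reference kind stays unmarked unless something else reaches it. A real concurrent marker would need atomic updates of the mark bits, which the sketch leaves out.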
> > Testing: hotspot_gc_shenandoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Rename mark_final -> mark_weak and several cleanups (by shade) ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/505/files - new: https://git.openjdk.java.net/jdk/pull/505/files/f2bf4edc..25879840 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=22 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=21-22 Stats: 68 lines in 7 files changed: 2 ins; 18 del; 48 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From shade at openjdk.java.net Thu Oct 29 12:07:01 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 29 Oct 2020 12:07:01 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: On Tue, 27 Oct 2020 14:54:01 GMT, Roman Kennke wrote: >> src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.inline.hpp line 264: >> >>> 262: marked = mark_context->mark_strong(obj, marked_first); >>> 263: } else { >>> 264: marked = mark_context->mark_final(obj, marked_first); >> >> Is this `mark_final` actually `mark_weak`? > > We could name it so, but it really means 'reachable through a FinalReference' so 'finalizably reachable' and 'marked final(izable)' seem the more correct terms. It is weaker than 'strong' though, so yeah we could rename this. WDYT? I sent you a patch with this rename and collateral improvements. This discussion can be resolved.
------------- PR: https://git.openjdk.java.net/jdk/pull/505 From shade at openjdk.java.net Thu Oct 29 12:07:01 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 29 Oct 2020 12:07:01 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: On Tue, 27 Oct 2020 14:56:22 GMT, Aleksey Shipilev wrote: >> Yes, that is true for mark_final(), but not for mark_strong(). I am changing it as you suggested. > > No wait, maybe that's fine then. Let me think about it. Same, patch sent, improvements delivered. Can resolve this discussion. ------------- PR: https://git.openjdk.java.net/jdk/pull/505 From zgu at openjdk.java.net Thu Oct 29 12:19:45 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 29 Oct 2020 12:19:45 GMT Subject: RFR: 8255579: x86: Use cmpq(Register, Address) in safepoint_poll [v2] In-Reply-To: References: Message-ID: <_x0a3fJVHqX8bpTTZZl6US6v9LThRMfyiETmQSvCHPk=.7af35795-ef33-4dd4-a381-38443861f274@github.com> On Thu, 29 Oct 2020 10:08:58 GMT, Aleksey Shipilev wrote: >> JDK-8253180 added a new block in `safepoint_poll` that uses broken `cmpq` (JDK-8255550): it effectively does the comparison with operands swapped. It makes sense to use the non-broken `cmpq`, as to avoid changing the condition code and thus making the code less understandable. See the discussion in #910. >> >> Testing: >> - [x] `tier1` with Z (some SA tests fail, and some other fail with OOME -- seem to be expected/problem-listed) > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains two additional commits since the last revision: > > - Merge branch 'master' into JDK-8255579-safepoint-poll > - 8255579: x86: Use cmpq(Register,Address) in safepoint_poll Tested with Shenandoah + concurrent stack processing on my personal branch, works fine. ------------- Marked as reviewed by zgu (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/924 From kbarrett at openjdk.java.net Thu Oct 29 12:27:46 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 29 Oct 2020 12:27:46 GMT Subject: RFR: 8255232: G1: Make G1BiasedMappedArray freeable [v2] In-Reply-To: References: <966v_gu-j37Do5XhKCgwhCu0DDdwyKPQ04oUQkvzEIs=.96808fbd-a650-448e-96af-58a2bf7b9c2d@github.com> Message-ID: On Thu, 29 Oct 2020 09:56:01 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews for this change that makes G1BiasedMappedArray freeable? >> >> Previously all G1BiasedMappedArray were created as unfreeable i.e. assigned to static variables. However with JDK-8253600 I need one such biased map for the full collector which is created and deleted during full GC. So the biased array should also be freed as necessary to avoid a memory leak. >> >> The alternative would be to statically allocate that map anyway and provide it to the current G1FullCollector instance, but I do not think the single malloc call is perf sensitive compared to full collector work and there is much point in doing something more complicated at this time. In the future I hope that the young gen collector will also be extracted from G1CollectedHeap with the same need. If/when allocation of these helper data structures becomes a problem I would suggest looking into this again. >> >> One option then could be using some ResoureArea for these things in the future. >> >> For this change there should be no change in behavior at all. 
>> >> Testing: tier1-5 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > ayang review Marked as reviewed by kbarrett (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/808 From shade at openjdk.java.net Thu Oct 29 12:28:46 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 29 Oct 2020 12:28:46 GMT Subject: RFR: 8255579: x86: Use cmpq(Register, Address) in safepoint_poll [v2] In-Reply-To: References: Message-ID: On Thu, 29 Oct 2020 10:11:14 GMT, Erik Österlund wrote: >> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: >> >> - Merge branch 'master' into JDK-8255579-safepoint-poll >> - 8255579: x86: Use cmpq(Register,Address) in safepoint_poll > > Marked as reviewed by eosterlund (Reviewer). Tests look clean. Pushing, thanks @fisk, @zhengyu123. ------------- PR: https://git.openjdk.java.net/jdk/pull/924 From shade at openjdk.java.net Thu Oct 29 12:28:46 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 29 Oct 2020 12:28:46 GMT Subject: Integrated: 8255579: x86: Use cmpq(Register, Address) in safepoint_poll In-Reply-To: References: Message-ID: On Thu, 29 Oct 2020 08:14:27 GMT, Aleksey Shipilev wrote: > JDK-8253180 added a new block in `safepoint_poll` that uses broken `cmpq` (JDK-8255550): it effectively does the comparison with operands swapped. It makes sense to use the non-broken `cmpq`, as to avoid changing the condition code and thus making the code less understandable. See the discussion in #910. > > Testing: > - [x] `tier1` with Z (some SA tests fail, and some other fail with OOME -- seem to be expected/problem-listed) This pull request has now been integrated.
Changeset: 4b20e460 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/4b20e460 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8255579: x86: Use cmpq(Register,Address) in safepoint_poll Reviewed-by: eosterlund, zgu ------------- PR: https://git.openjdk.java.net/jdk/pull/924 From eosterlund at openjdk.java.net Thu Oct 29 12:51:48 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 29 Oct 2020 12:51:48 GMT Subject: RFR: 8255452: Doing GC during JVMTI MethodExit event posting breaks return oop Message-ID: The imasm::remove_activation() call does not deal with safepoints very well. However, when the MethodExit JVMTI event is being called, we call into the runtime in the middle of remove_activation(). If the value being returned is an object type, then the top-of-stack contains the oop. However, the GC does not traverse said oop in any oop map, because it is simply not expected that we safepoint in the middle of remove_activation(). The JvmtiExport::post_method_exit() function we end up calling, reads the top-of-stack oop, and puts it in a handle. Then it calls JVMTI callbacks, that eventually call Java and a bunch of stuff that safepoints. So after the JVMTI callback, we can expect the top-of-stack oop to be broken. Unfortunately, when we continue, we therefore end up returning a broken oop. Notably, the fact that InterpreterRuntime::post_method_exit is a JRT_ENTRY, is wrong, as we can safepoint on the way back to Java, which will break the return oop in a similar way. So this patch makes it a JRT_BLOCK_ENTRY, moving the transition to VM and back, into a block of code that is protected against GC. Before the JRT_BLOCK is called, we stash away the return oop, and after the JRT_BLOCK_END, we restore the top-of-stack oop. 
In the path where InterpreterRuntime::post_method_exit is called while throwing an exception, we don't have the same problem of retaining an oop result, and hence the JRT_BLOCK/JRT_BLOCK_END section is not performed in this case; the logic is the same as before for this path.

This is a JVMTI bug that has probably been around for a long time. It crashes with all GCs, but was discovered recently after concurrent stack processing, as StefanK has been running better GC stressing code in JVMTI, and the bug reproduced more easily with concurrent stack processing, as the timings were a bit different. The following reproducer failed pretty much 100% of the time:

while true; do make test JTREG="RETAIN=all" TEST=test/hotspot/jtreg/vmTestbase/nsk/jdi/MethodExitEvent/returnValue/returnValue003/returnValue003.java TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews -Xint" ; done

With my fix I can run this repeatedly without any more failures. I have also sanity checked the patch by running tier 1-5, to confirm that it does not introduce any new issues on its own. I have also used Stefan's nice external GC stressing with jcmd technique that was used to trigger crashes with other GCs, to make sure said crashes no longer reproduce either.
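The stash-and-restore idea described in this message can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyHeap`, `Handle`, and the two `post_method_exit_*` functions are hypothetical stand-ins, not HotSpot code, and the "GC" here simply shifts every object by one slot so that an untracked raw reference goes stale.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy heap: objects live in slots, and a simulated collection may move them.
// Raw slot indices go stale when that happens; registered handles are fixed up.
struct ToyHeap {
    std::vector<int> slots;              // payload of each object
    std::vector<std::size_t*> handles;   // roots the collector knows about

    std::size_t allocate(int value) {
        slots.push_back(value);
        return slots.size() - 1;
    }

    // Simulated relocating collection: shift every object up by one slot,
    // poison the vacated slot with -1, and update only registered roots.
    void collect() {
        slots.insert(slots.begin(), -1);
        for (std::size_t* h : handles) { *h += 1; }
    }
};

// RAII root registration (hypothetical; handles are released in LIFO order).
class Handle {
public:
    Handle(ToyHeap& heap, std::size_t ref) : heap_(heap), ref_(ref) {
        heap_.handles.push_back(&ref_);
    }
    ~Handle() { heap_.handles.pop_back(); }
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;
    std::size_t get() const { return ref_; }
private:
    ToyHeap& heap_;
    std::size_t ref_;
};

// Models the broken path: the return value on the top of the stack is a raw
// reference that no root covers, so it is stale after the callback safepoints.
int post_method_exit_unprotected(ToyHeap& heap, std::size_t top_of_stack) {
    heap.collect();                    // callback runs, objects move
    return heap.slots[top_of_stack];   // reads whatever now sits in the old slot
}

// Models the fixed path: stash the value in a root before the section that can
// safepoint, and restore the top-of-stack value from it afterwards.
int post_method_exit_protected(ToyHeap& heap, std::size_t top_of_stack) {
    Handle stashed(heap, top_of_stack);
    heap.collect();
    top_of_stack = stashed.get();
    return heap.slots[top_of_stack];
}
```

With a single object holding 42, the unprotected variant reads the poisoned slot while the protected variant still finds 42. The real fix works on interpreter frames and JRT_BLOCK sections rather than on indices, so this shows the shape of the problem only.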
------------- Commit messages: - 8255452: Doing GC during JVMTI MethodExit event posting breaks return oop Changes: https://git.openjdk.java.net/jdk/pull/930/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=930&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255452 Stats: 46 lines in 3 files changed: 39 ins; 4 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/930.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/930/head:pull/930 PR: https://git.openjdk.java.net/jdk/pull/930 From rkennke at openjdk.java.net Thu Oct 29 12:58:58 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 29 Oct 2020 12:58:58 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v24] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). Workloads that make heavvy use of such weak references will therefore potentially cause significant GC pauses. > > There are 3 main items that contribute to pause time linear to number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. 
> > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires to extend the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. > 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable. 
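The two-strength marking described above can be sketched as a small single-threaded model. All names here (`Mark`, `Kind`, `Obj`, `mark_heap`) are hypothetical and chosen for illustration; the actual Shenandoah code uses a marking bitmap, task queues and concurrent workers, none of which is modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Reachability strength per object; a higher value is stronger.
enum class Mark { None = 0, Finalizable = 1, Strong = 2 };

enum class Kind { Plain, WeakRef, FinalRef };

struct Obj {
    Kind kind = Kind::Plain;
    std::vector<std::size_t> fields;   // ordinary (strong) edges
    int referent = -1;                 // referent index for reference objects
};

struct MarkResult {
    std::vector<Mark> marks;
    std::vector<std::size_t> discovered;   // Reference objects seen by marking
};

MarkResult mark_heap(const std::vector<Obj>& heap,
                     const std::vector<std::size_t>& roots) {
    MarkResult r{std::vector<Mark>(heap.size(), Mark::None), {}};
    std::vector<std::pair<std::size_t, Mark>> worklist;

    // Mark 'obj' with 'strength'. The object is (re)scanned on first visit and
    // when a finalizably marked object is upgraded to strongly reachable.
    auto visit = [&](std::size_t obj, Mark strength) {
        Mark& m = r.marks[obj];
        if (static_cast<int>(strength) > static_cast<int>(m)) {
            const bool first_visit = (m == Mark::None);
            m = strength;
            if (first_visit && heap[obj].kind != Kind::Plain) {
                r.discovered.push_back(obj);
            }
            worklist.emplace_back(obj, strength);
        }
    };

    for (std::size_t root : roots) { visit(root, Mark::Strong); }

    while (!worklist.empty()) {
        const std::pair<std::size_t, Mark> task = worklist.back();
        worklist.pop_back();
        const Obj& o = heap[task.first];
        for (std::size_t f : o.fields) { visit(f, task.second); }
        if (o.kind == Kind::FinalRef && o.referent >= 0) {
            // Mark through the referent, switching to finalizable strength.
            visit(static_cast<std::size_t>(o.referent), Mark::Finalizable);
        }
        // For other reference kinds marking stops here: no tracing of the referent.
    }
    return r;
}
```

In this model an object reached first through a FinalReference and later through a strong path ends up `Strong` regardless of visiting order, which is the 'upgrade' behaviour the description mentions. A referent reachable only through a WeakReference stays `None`.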
> > Testing: hotspot_gc_shenadoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Pass marking-strength through chunked arrays ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/505/files - new: https://git.openjdk.java.net/jdk/pull/505/files/25879840..f85ab85d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=23 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=22-23 Stats: 11 lines in 2 files changed: 1 ins; 0 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From stefank at openjdk.java.net Thu Oct 29 13:17:43 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 29 Oct 2020 13:17:43 GMT Subject: RFR: 8255582: Introduce SemaphoreLock and SemaphoreLocker In-Reply-To: References: <_JWavDFnX9LXuj9uXsd6RDwRDrsoVb-tG-CVQ6fMA90=.5d37cc4c-b6cf-4dc8-ac47-0372cf368ead@github.com> Message-ID: On Thu, 29 Oct 2020 11:11:10 GMT, Per Liden wrote: > For low-level locks, an alternative could be to use PlatformMutex. Sounds like a good idea. Though, I think that class needs to be brought out to its own header, or out of os::, so that I don't have to include os.hpp whenever a forward declaration would be enough. 
------------- PR: https://git.openjdk.java.net/jdk/pull/927 From rkennke at openjdk.java.net Thu Oct 29 13:18:55 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 29 Oct 2020 13:18:55 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v25] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity). Workloads that make heavvy use of such weak references will therefore potentially cause significant GC pauses. > > There are 3 main items that contribute to pause time linear to number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through subgraph of objects that are reachable only through FinalReference. Notice that this is theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires to extend the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizably' reachability for all objects starting from the referent. 
When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. > 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by Java reference handler thread). In addition to that, we must ensure that no referents become resurrected by accessing Reference.get() on it. In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable. > > Testing: hotspot_gc_shenadoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 81 commits: - Fix merge mistake - Merge branch 'master' into shenandoah-concurrent-weakrefs - Pass marking-strength through chunked arrays - Rename mark_final -> mark_weak and several cleanups (by shade) - Some more ShMarkTask cleanups - Call into native-LRB on unknown oop strenght (i.e. reflection) too - Put in comment about API impedence mismatch around interpreter native LRB - Add missing merge changes of shenandoahTaskQueue.hpp - Merge branch 'master' into shenandoah-concurrent-weakrefs - Consolidate native-LRB invocation - ... 
and 71 more: https://git.openjdk.java.net/jdk/compare/faf23de5...75958efb ------------- Changes: https://git.openjdk.java.net/jdk/pull/505/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=24 Stats: 2425 lines in 55 files changed: 1651 ins; 565 del; 209 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From stefank at openjdk.java.net Thu Oct 29 13:23:42 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 29 Oct 2020 13:23:42 GMT Subject: RFR: 8255582: Introduce SemaphoreLock and SemaphoreLocker In-Reply-To: References: <_JWavDFnX9LXuj9uXsd6RDwRDrsoVb-tG-CVQ6fMA90=.5d37cc4c-b6cf-4dc8-ac47-0372cf368ead@github.com> Message-ID: On Thu, 29 Oct 2020 13:14:45 GMT, Stefan Karlsson wrote: >> For low-level locks, an alternative could be to use PlatformMutex. > >> For low-level locks, an alternative could be to use PlatformMutex. > > Sounds like a good idea. Though, I think that class needs to be brought out to its own header, or out of os::, so that I don't have to include os.hpp whenever a forward declaration would be enough. I'm closing this PR, given the other available alternatives. Though I must say, I don't think it's fair to use an "imminent rewrite of the lock ranking" as a motivation to push back on my wish to use some kind of locking that doesn't force me to fake a lock order when the code intend to be a leaf operation. A lock ranking rewrite has been discussed for over a decade now. 
------------- PR: https://git.openjdk.java.net/jdk/pull/927 From stefank at openjdk.java.net Thu Oct 29 13:23:42 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 29 Oct 2020 13:23:42 GMT Subject: Withdrawn: 8255582: Introduce SemaphoreLock and SemaphoreLocker In-Reply-To: <_JWavDFnX9LXuj9uXsd6RDwRDrsoVb-tG-CVQ6fMA90=.5d37cc4c-b6cf-4dc8-ac47-0372cf368ead@github.com> References: <_JWavDFnX9LXuj9uXsd6RDwRDrsoVb-tG-CVQ6fMA90=.5d37cc4c-b6cf-4dc8-ac47-0372cf368ead@github.com> Message-ID: <4_k5WGPxg-EdmnBwOymHgHjMNDquA7G0-gfKCxB4KLs=.6f6df21a-d295-4369-b389-f8e072971d54@github.com> On Thu, 29 Oct 2020 10:01:22 GMT, Stefan Karlsson wrote: > Semaphores can be used as low-level locks, but the readability of the code using them could be better. I propose that we introduce two new classes: > > SemaphoreLock - which provides the operations lock, unlock, try_lock. > > SemaphoreLocker - Equivalent to MutexLocker. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/927 From rehn at openjdk.java.net Thu Oct 29 13:50:43 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 29 Oct 2020 13:50:43 GMT Subject: RFR: 8255582: Introduce SemaphoreLock and SemaphoreLocker In-Reply-To: References: <_JWavDFnX9LXuj9uXsd6RDwRDrsoVb-tG-CVQ6fMA90=.5d37cc4c-b6cf-4dc8-ac47-0372cf368ead@github.com> Message-ID: On Thu, 29 Oct 2020 13:21:02 GMT, Stefan Karlsson wrote: > I'm closing this PR, given the other available alternatives. > > Though I must say, I don't think it's fair to use an "imminent rewrite of the lock ranking" as a motivation to push back on my wish to use some kind of locking that doesn't force me to fake a lock order when the code intend to be a leaf operation. A lock ranking rewrite has been discussed for over a decade now. The problem is that people think they are leaf, but they are really not. If they truly were leaf, shouldn't setting lock rank to 'event' always work?
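For readers following this thread, the withdrawn proposal (a `SemaphoreLock` offering lock, unlock, try_lock, plus a `SemaphoreLocker` scope guard analogous to MutexLocker) could look roughly like the sketch below. This is not the code from the PR: `ToySemaphore` is a hypothetical stand-in built on the standard library so the example is self-contained, and its `signal`/`wait`/`trywait` interface is an assumption.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Minimal counting semaphore (assumed interface: signal, wait, trywait).
class ToySemaphore {
public:
    explicit ToySemaphore(unsigned initial) : count_(initial) {}

    void signal() {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            ++count_;
        }
        cv_.notify_one();
    }

    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

    bool trywait() {
        std::lock_guard<std::mutex> guard(mutex_);
        if (count_ == 0) { return false; }
        --count_;
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    unsigned count_;
};

// A binary semaphore presented with lock-style operation names.
class SemaphoreLock {
public:
    SemaphoreLock() : sem_(1) {}
    void lock()     { sem_.wait(); }
    void unlock()   { sem_.signal(); }
    bool try_lock() { return sem_.trywait(); }
private:
    ToySemaphore sem_;
};

// Scope guard: acquires in the constructor, releases in the destructor.
class SemaphoreLocker {
public:
    explicit SemaphoreLocker(SemaphoreLock& lock) : lock_(lock) { lock_.lock(); }
    ~SemaphoreLocker() { lock_.unlock(); }
    SemaphoreLocker(const SemaphoreLocker&) = delete;
    SemaphoreLocker& operator=(const SemaphoreLocker&) = delete;
private:
    SemaphoreLock& lock_;
};
```

While a `SemaphoreLocker` is in scope, `try_lock()` on the same lock returns false; once the guard is destroyed it succeeds again. Note that nothing here participates in lock ranking, which is the point of contention in the discussion above.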
------------- PR: https://git.openjdk.java.net/jdk/pull/927 From shade at openjdk.java.net Thu Oct 29 14:09:50 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 29 Oct 2020 14:09:50 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v12] In-Reply-To: References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: On Tue, 27 Oct 2020 14:52:09 GMT, Roman Kennke wrote: >> src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp line 81: >> >>> 79: _heap(ShenandoahHeap::heap()), >>> 80: _mark_context(_heap->marking_context()), >>> 81: _strong(true) >> >> Do we want to turn this to yet another template parameter, like for dedup? That would also resolve passing `true` or `false` to `strong` argument without comments. > > We need to switch strength in ShenandoahConcurrentMark::do_task() and we get passed-in a ready closure there. I am not sure how we could do that with template-args. Template args only make sense for things that don't change during marking. I tried to see whether templating can be done. I think the major hurdle is that weak _tasks_ can be stolen by other workers. That means every worker would have to check for weakness on every task, and would thus need to change the closure "flavor" on the fly. This seems like a no-go. You can resolve this conversation. ------------- PR: https://git.openjdk.java.net/jdk/pull/505 From mcimadamore at openjdk.java.net Thu Oct 29 14:13:03 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 29 Oct 2020 14:13:03 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v18] In-Reply-To: References: Message-ID: > This patch contains the changes associated with the third incubation round of the foreign memory access API incubation (see JEP 393 [1]).
This iteration focus on improving the usability of the API in 3 main ways: > > * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from multiple threads > * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee that the memory will be deallocated, eventually > * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class has been added, which defines several useful dereference routines; these are really just thin wrappers around memory access var handles, but they make the barrier of entry for using this API somewhat lower. > > A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit of dereference. > > This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done by calling `MemoryAddress::asSegmentRestricted`). > > A list of the API, implementation and test changes is provided below. 
If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be happy to point at existing discussions, and/or to provide the feedback required. > > A big thank to Erik Osterlund, Vladimir Ivanov and David Holmes, without whom the work on shared memory segment would not have been possible; also I'd like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. > > Thanks > Maurizio > > Javadoc: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html > > Specdiff: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254163 > > > > ### API Changes > > * `MemorySegment` > * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) > * added a no-arg factory for a native restricted segment representing entire native heap > * rename `withOwnerThread` to `handoff` > * add new `share` method, to create shared segments > * add new `registerCleaner` method, to register a segment against a cleaner > * add more helpers to create arrays from a segment e.g. `toIntArray` > * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) > * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) > * `MemoryAddress` > * drop `segment` accessor > * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative to a given segment > * `MemoryAccess` > * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a carrier is one of the usual suspect (a Java primitive, minus `boolean`); the access mode can be simple (e.g. 
access base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte offset,or a high level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. `getByteAtOffset` vs `getByteAtIndex`). > * `MemoryHandles` > * drop `withOffset` combinator > * drop `withStride` combinator > * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which it is easy to derive all the other handles using plain var handle combinators. > * `Addressable` > * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. > * `MemoryLayouts` > * A new layout, for machine addresses, has been added to the mix. > > > > ### Implementation changes > > There are two main things to discuss here: support for shared segments, and the general simplification of the memory access var handle support. > > #### Shared segments > > The support for shared segments cuts in pretty deep in the VM. Support for shared segments is notoriously hard to achieve, at least in a way that guarantees optimal access performances. This is caused by the fact that, if a segment is shared, it would be possible for a thread to close it while another is accessing it. > > After considering several options (see [3]), we zeroed onto an approach which is inspired by an happy idea that Andrew Haley had (and that he reminded me of at this year OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently to a close operation. 
For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and the access itself (otherwise it would be possible for the accessing thread to stop just right before an unsafe call). It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. > > Sadly, none of these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, we noted that, if we could mark so called *scoped* method with an annotation, it would be very simply to check as to whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). > > The question is, then, once we detect that a thread is accessing the very segment we're about to close, what should happen? We first experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread is accessing the segment being closed. > > As written in the javadoc, this doesn't mean that clients should just catch and try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should be treated as such. > > In terms of gritty implementation, we needed to centralize memory access routines in a single place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in addition, also provided a liveness check. 
This way we could mark all these routines with the special `@Scoped` annotation, which tells the VM that something important is going on. > > To achieve this, we created a new (autogenerated) class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during access (which is important when registering segments against cleaners). > > Of course, to make memory access safe, memory access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead of unsafe, so that a liveness check can be triggered (in case a scope is present). > > `ScopedMemoryAccess` has a `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed successfully. > > The implementation of `MemoryScope` (now significantly simplified from what we had before), has two implementations, one for confined segments and one for shared segments; the main difference between the two is what happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. > > #### Memory access var handles overhaul > > The key realization here was that if all memory access var handles took a coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle form. 
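The idea that every accessor can be derived from one basic form taking a segment and a `long` offset can be shown with a language-neutral toy. The sketch below uses C++ closures rather than Java var handles; `Segment`, `IntGetter`, `put_int`, `base_int_getter`, `with_fixed_offset` and `with_index_stride` are all hypothetical names and are not part of the API under review.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <stdexcept>
#include <vector>

using Segment = std::vector<std::uint8_t>;

// Basic accessor form: (segment, byte offset) -> value.
using IntGetter = std::function<std::int32_t(const Segment&, std::size_t)>;

// Helper to populate a segment for the example.
void put_int(Segment& seg, std::size_t offset, std::int32_t value) {
    if (offset > seg.size() || seg.size() - offset < sizeof value) {
        throw std::out_of_range("write outside segment");
    }
    std::memcpy(seg.data() + offset, &value, sizeof value);
}

// The single primitive accessor; it bounds-checks against the segment.
IntGetter base_int_getter() {
    return [](const Segment& seg, std::size_t offset) -> std::int32_t {
        if (offset > seg.size() || seg.size() - offset < sizeof(std::int32_t)) {
            throw std::out_of_range("read outside segment");
        }
        std::int32_t value;
        std::memcpy(&value, seg.data() + offset, sizeof value);
        return value;
    };
}

// Combinator: inject a fixed extra offset, e.g. the offset of a struct field.
IntGetter with_fixed_offset(IntGetter base, std::size_t extra) {
    return [base, extra](const Segment& seg, std::size_t offset) {
        return base(seg, offset + extra);
    };
}

// Combinator: treat the coordinate as a logical index scaled by a stride.
IntGetter with_index_stride(IntGetter base, std::size_t stride) {
    return [base, stride](const Segment& seg, std::size_t index) {
        return base(seg, index * stride);
    };
}
```

Composing the two combinators over the base accessor gives a "field `b` of element `i`" accessor for an array of two-int structs without any special-purpose accessor generation, which mirrors the simplification this section describes.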
> > This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that e.g. additional offset is injected into a base memory access var handle. > > This also helped in simplifying the implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level access on the innards of the memory access var handle. All that code is now gone. > > #### Test changes > > Not much to see here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared segment case. 
> > [1] - https://openjdk.java.net/jeps/393 > [2] - https://openjdk.java.net/jeps/389 > [3] - https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html > [4] - https://openjdk.java.net/jeps/312 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Fix issues with derived buffers and IO operations ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/548/files - new: https://git.openjdk.java.net/jdk/pull/548/files/b01af093..e3ec6b4c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=17 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=16-17 Stats: 81 lines in 5 files changed: 77 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/548.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548 PR: https://git.openjdk.java.net/jdk/pull/548 From rkennke at openjdk.java.net Thu Oct 29 14:15:04 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 29 Oct 2020 14:15:04 GMT Subject: RFR: 8254315: Shenandoah: Concurrent weak reference processing [v26] In-Reply-To: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> References: <8byaPRNFSF4tG_fA2jxtiDwcEbbMS_Zmk39w86ugIV4=.6a942481-9fd0-44f7-a42a-3668b22bea3e@github.com> Message-ID: > Until now, references (as in java.lang.ref.Reference and its subclasses WeakReference, SoftReference, PhantomReference and the non-public FinalReference - I'll collectively call them weak references for the purpose of clarity) have been processed at a GC pause. Workloads that make heavy use of such weak references will therefore potentially cause significant GC pauses. > > There are 3 main items that contribute to pause time linear in the number of references, or worse: > - We need to scan and consider each reference on the various 'discovered' lists. > - We need to mark through the subgraph of objects that are reachable only through FinalReference.
Notice that this is theoretically only bounded by the live data set size. > - Finally, all no-longer-reachable references need to be enqueued in the 'pending list' > > The problem is somewhat mitigated by pre-cleaning the discovered list: Any weak reference that we find to be strongly reachable will be removed before we go into the final-mark-pause. However, that is only a band-aid. > > The solution to this is two-fold: > 1. Extend concurrent marking to also mark the 'finalizable' subgraph of the heap. This requires extending the marking bitmap to allow for two kinds of reachability: each object can now be strongly and finalizably reachable. Whenever marking encounters a FinalReference, it will mark through the referent and switch to 'finalizable' reachability for all objects starting from the referent. When marking encounters finalizably reachable objects while marking strongly, it will 'upgrade' reachability of such objects to strongly reachable. All of this can be done concurrently. Any encounter of a Reference (or subclass) object will enqueue that object into a thread-local 'discovered' list. Except for FinalReference, marking stops there, and does not mark through the referent. > 2. Concurrent processing is performed after the final-mark pause. GC workers scan all discovered lists that have been collected by concurrent marking, and depending on reachability of the referent, either drop the Reference, or enqueue it into the global 'pending' list (from where it will be processed by the Java reference handler thread). In addition to that, we must ensure that no referents become resurrected by calling Reference.get() on their Reference. In order to achieve this, we employ special barriers in Reference.get() intrinsics that return NULL when the referent is not reachable.
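The two kinds of reachability from item 1 can be illustrated with a small, single-threaded Java model. This is a sketch under assumptions, not Shenandoah code: `TwoLevelMarkSketch`, `Node` and `Mark` are hypothetical names, the marking bitmap is replaced by a field per object, and discovered lists and other Reference subclasses are left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.List;

// Hypothetical toy model of strong vs. finalizable marking.
final class TwoLevelMarkSketch {
    // Ordered by strength, so compareTo() can express "upgrade".
    enum Mark { UNMARKED, FINALIZABLE, STRONG }

    static final class Node {
        final List<Node> strongRefs = new ArrayList<>();
        Node finalReferent;              // non-null if this node models a FinalReference
        Mark mark = Mark.UNMARKED;
    }

    static void mark(Collection<Node> roots) {
        Deque<Node> work = new ArrayDeque<>();
        for (Node root : roots) {
            visit(root, Mark.STRONG, work);
        }
        while (!work.isEmpty()) {
            Node n = work.pop();
            // Ordinary fields propagate the reachability of the holder.
            for (Node child : n.strongRefs) {
                visit(child, n.mark, work);
            }
            // Marking through a FinalReference switches to finalizable reachability.
            if (n.finalReferent != null) {
                visit(n.finalReferent, Mark.FINALIZABLE, work);
            }
        }
    }

    // Marks n at the given level if that is stronger than its current mark;
    // re-queues it so the upgrade also reaches its children.
    private static void visit(Node n, Mark level, Deque<Node> work) {
        if (n.mark.compareTo(level) < 0) {
            n.mark = level;
            work.push(n);
        }
    }
}
```

An object first reached through a FinalReference and later through a strong path ends up marked `STRONG`, which is the "upgrade" described above; an object reachable only through the FinalReference stays `FINALIZABLE`.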
> > Testing: hotspot_gc_shenandoah (release+fastdebug, x86+aarch64), specjvm+specjbb without regressions, tier1, tier2, vmTestbase_vm_metaspace, vmTestbase_nsk_jvmti, with -XX:+UseShenandoahGC without regressions, specjvm with various levels of verification Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Invert strong/weak in marking tasks and related code ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/505/files - new: https://git.openjdk.java.net/jdk/pull/505/files/75958efb..60b39cb0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=25 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=505&range=24-25 Stats: 58 lines in 7 files changed: 4 ins; 4 del; 50 mod Patch: https://git.openjdk.java.net/jdk/pull/505.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/505/head:pull/505 PR: https://git.openjdk.java.net/jdk/pull/505 From mcimadamore at openjdk.java.net Thu Oct 29 14:16:49 2020 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 29 Oct 2020 14:16:49 GMT Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) In-Reply-To: References: Message-ID: On Tue, 27 Oct 2020 14:40:29 GMT, Maurizio Cimadamore wrote: >> This patch contains the changes associated with the third incubation round of the foreign memory access API (see JEP 393 [1]).
This iteration focuses on improving the usability of the API in 3 main ways: >> >> * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from multiple threads >> * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee that the memory will be deallocated, eventually >> * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class has been added, which defines several useful dereference routines; these are really just thin wrappers around memory access var handles, but they make the barrier of entry for using this API somewhat lower. >> >> A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit of dereference. >> >> This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done by calling `MemoryAddress::asSegmentRestricted`). >> >> A list of the API, implementation and test changes is provided below.
If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be happy to point at existing discussions, and/or to provide the feedback required. >> >> A big thanks to Erik Osterlund, Vladimir Ivanov and David Holmes, without whom the work on shared memory segments would not have been possible; also I'd like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. >> >> Thanks >> Maurizio >> >> Javadoc: >> >> http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html >> >> Specdiff: >> >> http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html >> >> CSR: >> >> https://bugs.openjdk.java.net/browse/JDK-8254163 >> >> >> >> ### API Changes >> >> * `MemorySegment` >> * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) >> * added a no-arg factory for a native restricted segment representing entire native heap >> * rename `withOwnerThread` to `handoff` >> * add new `share` method, to create shared segments >> * add new `registerCleaner` method, to register a segment against a cleaner >> * add more helpers to create arrays from a segment e.g. `toIntArray` >> * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) >> * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) >> * `MemoryAddress` >> * drop `segment` accessor >> * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative to a given segment >> * `MemoryAccess` >> * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a carrier is one of the usual suspects (a Java primitive, minus `boolean`); the access mode can be simple (e.g.
access base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte offset, or a high-level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. `getByteAtOffset` vs `getByteAtIndex`). >> * `MemoryHandles` >> * drop `withOffset` combinator >> * drop `withStride` combinator >> * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which it is easy to derive all the other handles using plain var handle combinators. >> * `Addressable` >> * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. >> * `MemoryLayouts` >> * A new layout, for machine addresses, has been added to the mix. >> >> >> >> ### Implementation changes >> >> There are two main things to discuss here: support for shared segments, and the general simplification of the memory access var handle support. >> >> #### Shared segments >> >> The support for shared segments cuts in pretty deep in the VM. Support for shared segments is notoriously hard to achieve, at least in a way that guarantees optimal access performance. This is caused by the fact that, if a segment is shared, it would be possible for a thread to close it while another is accessing it. >> >> After considering several options (see [3]), we zeroed in on an approach which is inspired by a happy idea that Andrew Haley had (and that he reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently to a close operation.
For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and the access itself (otherwise it would be possible for the accessing thread to stop right before an unsafe call). It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints. >> >> Sadly, none of these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]). >> >> The question is, then, once we detect that a thread is accessing the very segment we're about to close, what should happen? We first experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread is accessing the segment being closed. >> >> As written in the javadoc, this doesn't mean that clients should just catch and try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should be treated as such. >> >> In terms of gritty implementation, we needed to centralize memory access routines in a single place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in addition, provided a liveness check.
This way we could mark all these routines with the special `@Scoped` annotation, which tells the VM that something important is going on. >> >> To achieve this, we created a new (autogenerated) class, called `ScopedMemoryAccess`. This class contains all the main memory access primitives (including bulk access, like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during access (which is important when registering segments against cleaners). >> >> Of course, to make memory access safe, memory access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead of unsafe, so that a liveness check can be triggered (in case a scope is present). >> >> `ScopedMemoryAccess` has a `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed successfully. >> >> The implementation of `MemoryScope` (now significantly simplified from what we had before), has two implementations, one for confined segments and one for shared segments; the main difference between the two is what happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail. >> >> #### Memory access var handles overhaul >> >> The key realization here was that if all memory access var handles took a coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle form. 
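The shape of a scoped access routine (liveness check first, then the access, with a reachability fence keeping the scope alive) can be sketched as follows. The names `ScopedAccessSketch` and `Scope` are hypothetical, and a `ByteBuffer` stands in for the raw `Unsafe` access; the `@Scoped` annotation and the VM support that make the check-then-access sequence safe against a concurrent close are not modelled here.

```java
import java.lang.ref.Reference;
import java.nio.ByteBuffer;

// Hypothetical illustration of a check-then-access routine.
final class ScopedAccessSketch {
    static final class Scope {
        private volatile boolean alive = true;

        void checkValidState() {
            if (!alive) {
                throw new IllegalStateException("Already closed");
            }
        }

        void close() {
            alive = false;
        }
    }

    // Reads an int at the given byte offset, testing the scope before access.
    static int getInt(Scope scope, ByteBuffer memory, int offset) {
        try {
            if (scope != null) {
                scope.checkValidState();   // liveness check, only if a scope is present
            }
            return memory.getInt(offset);
        } finally {
            // Keeps the scope reachable until the access has completed, which
            // matters when cleanup is tied to the scope becoming unreachable.
            Reference.reachabilityFence(scope);
        }
    }
}
```

On its own, this Java code does not prevent another thread from closing the scope between the check and the read; per the description above, that guarantee comes from the handshake performed on close.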
>> >> This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that e.g. additional offset is injected into a base memory access var handle. >> >> This also helped in simplifying the implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level access on the innards of the memory access var handle. All that code is now gone. >> >> #### Test changes >> >> Not much to see here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared segment case. >> >> [1] - https://openjdk.java.net/jeps/393 >> [2] - https://openjdk.java.net/jeps/389 >> [3] - https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html >> [4] - https://openjdk.java.net/jeps/312 > >> @mcimadamore, if you pull from current master, you would get the Linux x86_32 tier1 run "for free". > > Just did that - I also removed TestMismatch from the problem list in the latest iteration, and fixed the alignment for long/double layouts, after chatting with the team (https://bugs.openjdk.java.net/browse/JDK-8255350) I've just uploaded another iteration which addresses some comments from @AlanBateman. Basically, there are some operations on Channel and Socket which take ByteBuffer as arguments, and then, if such buffers are *direct*, they get the address and pass it down to some native function. 
This idiom is problematic because there's no way to guarantee that the buffer won't be closed (if obtained from a memory segment) after the address has been obtained. As a stopgap solution, I've introduced checks in the `DirectBuffer::address` method, which is used in around 30 places in the JDK. This method will now throw if (a) the buffer has a shared scope, or (b) if the scope is confined, but already closed. With this extra check, I believe there's no way to misuse the buffer obtained from a segment. We have discussed plans to remove these limitations (which we think will be possible) - but for the time being, it's better to play the conservative card. ------------- PR: https://git.openjdk.java.net/jdk/pull/548 From stefank at openjdk.java.net Thu Oct 29 14:17:46 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 29 Oct 2020 14:17:46 GMT Subject: RFR: 8255582: Introduce SemaphoreLock and SemaphoreLocker In-Reply-To: References: <_JWavDFnX9LXuj9uXsd6RDwRDrsoVb-tG-CVQ6fMA90=.5d37cc4c-b6cf-4dc8-ac47-0372cf368ead@github.com> Message-ID: On Thu, 29 Oct 2020 13:48:28 GMT, Robbin Ehn wrote: > > I'm closing this PR, given the other available alternatives. > > Though I must say, I don't think it's fair to use an "imminent rewrite of the lock ranking" as a motivation to push back on my wish to use some kind of locking that doesn't force me to fake a lock order when the code intends to be a leaf operation. A lock ranking rewrite has been discussed for over a decade now. > > The problem is that people think they are leaf, but they are really not. > If they truly were leaf, shouldn't setting lock rank to 'event' always work? > > EDIT: > What I think we have an issue with is non-reentrant code, you are not a leaf but you know the code paths in the exclusive code region can never call this 'module' again. Which leads to "give me a rank that works" I don't care. > Correct ? Yes, you might be right.
------------- PR: https://git.openjdk.java.net/jdk/pull/927 From eosterlund at openjdk.java.net Thu Oct 29 14:22:56 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Thu, 29 Oct 2020 14:22:56 GMT Subject: Integrated: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 10:25:43 GMT, Erik Österlund wrote: > The escape barrier reallocates scalarized objects potentially deep into the stack of a remote thread. Each allocation can safepoint, causing referenced frames to be invalid. Some sprinklings were added that deal with that, but I believe it was subsequently broken with the integration of the new vector API, which has its own new deoptimization code that did not know about this. Not surprisingly, the integration of the new vector API had no idea about this subtlety, and allocates an object, and then reads an object deep from the stack of a remote thread (using an escape barrier). I suppose the issue is that all these 3 things were integrated at almost the same time. The problematic code sequence is in VectorSupport::allocate_vector() in vectorSupport.cpp, which is called from Deoptimization::realloc_objects(). It first allocates an oop (possibly safepointing), and then reads a vector oop from the stack. This is usually fine, but not through the escape barrier, with concurrent stack scanning. While I have not seen any crashes yet, I can see from code inspection, that there is no way that this works correctly. > > In order to make this less fragile for future changes, we should really have an RAII object that keeps the escape barrier's target thread stack stable and processed across safepoints. This patch fixes that. Then it becomes much easier to reason about its correctness, compared to hoping the various hooks are applied after each safepoint.
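The RAII idea is a C++ idiom in HotSpot; the closest Java analogue is try-with-resources, which gives a rough feel for the shape of the fix. Everything in this sketch is hypothetical (`KeepStackProcessedSketch`, `TargetThread`, the depth counter) and it does not model what the actual patch does to the target thread's stack.

```java
// Hypothetical Java analogue of a scope guard; not the HotSpot code.
final class KeepStackProcessedSketch {
    static final class TargetThread {
        int keepProcessedDepth;          // > 0 while some guard is active

        boolean isKeptProcessed() {
            return keepProcessedDepth > 0;
        }
    }

    // The guard: acquiring it marks the target, closing it releases the mark.
    static final class KeepStackProcessed implements AutoCloseable {
        private final TargetThread target;

        KeepStackProcessed(TargetThread target) {
            this.target = target;
            target.keepProcessedDepth++;
        }

        @Override
        public void close() {
            target.keepProcessedDepth--;
        }
    }

    // Stand-in for a block that allocates (possibly safepointing) and then
    // pokes at the remote stack: the guard covers the whole block.
    static boolean reallocObjects(TargetThread target) {
        try (KeepStackProcessed guard = new KeepStackProcessed(target)) {
            return target.isKeptProcessed();
        }
    }
}
```

The point of the pattern is that the guarantee is tied to a lexical block, so code added inside the block later inherits it without having to remember a hook after each safepoint.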
> > With this new robustness fix, the thread running the escape barrier, keeps the target thread stack processed, straight through safepoints on the requesting thread, making it easy and intuitive to understand why this works correctly. The RAII object basically just has to cover the code block that pokes at the remote stack and goes in and out of safepoints, arbitrarily. Arguably, this escape barrier doesn't need to be blazingly fast, and can afford keeping stacks sane through its operation. This pull request has now been integrated. Changeset: 5b185585 Author: Erik Österlund URL: https://git.openjdk.java.net/jdk/commit/5b185585 Stats: 268 lines in 15 files changed: 164 ins; 72 del; 32 mod 8255243: Reinforce escape barrier interactions with ZGC conc stack processing Co-authored-by: Richard Reingruber Reviewed-by: rrich, sspitsyn ------------- PR: https://git.openjdk.java.net/jdk/pull/832 From gziemski at openjdk.java.net Thu Oct 29 14:49:47 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Thu, 29 Oct 2020 14:49:47 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) [v3] In-Reply-To: References: <6rybIs1odojqcKQ6zzl39wj2IxGxnMCVQXgpeajzqns=.64424fc9-74d7-4e2b-9d89-df6f7da44770@github.com> Message-ID: On Thu, 29 Oct 2020 05:08:59 GMT, David Holmes wrote: >> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: >> >> make UseOSErrorReporting flag Windows only > > src/hotspot/os/windows/globals_windows.hpp line 39: > >> 37: constraint) \ >> 38: \ >> 39: product(bool, UseOSErrorReporting, false \ > > Comma missing after "false" Good catch, thank you.
------------- PR: https://git.openjdk.java.net/jdk/pull/813 From gziemski at openjdk.java.net Thu Oct 29 14:49:48 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Thu, 29 Oct 2020 14:49:48 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) [v3] In-Reply-To: <0peTckbitQpAZsGhPCZsBBh74gmIBgea50MLkLezS94=.f6187eec-ff8c-40e1-aad1-315bc4de70ee@github.com> References: <6rybIs1odojqcKQ6zzl39wj2IxGxnMCVQXgpeajzqns=.64424fc9-74d7-4e2b-9d89-df6f7da44770@github.com> <0peTckbitQpAZsGhPCZsBBh74gmIBgea50MLkLezS94=.f6187eec-ff8c-40e1-aad1-315bc4de70ee@github.com> Message-ID: On Wed, 28 Oct 2020 16:54:03 GMT, Thomas Stuefe wrote: >> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: >> >> make UseOSErrorReporting flag Windows only > > src/hotspot/share/utilities/vmError.cpp line 1437: > >> 1435: } else { >> 1436: #if defined(_WINDOWS) >> 1437: // If UseOsErrorReporting we call this for each level of the call stack > > Could you please change this comment to refer to UseOSErrorReporting? (Note the capital s). Makes it easier to grep for it. Same goes for os_windows.cpp:2357 . Fixed. > src/hotspot/share/utilities/vmError.cpp line 1631: > >> 1629: } >> 1630: >> 1631: #if defined(_WINDOWS) > > If you like you could abbreviate this Hunk with something like > if (WINDOWS_ONLY(!UseOsErrorReporting) NOT_WINDOWS(true)) { > but this is fine too, I leave it up to you. I like it. 
------------- PR: https://git.openjdk.java.net/jdk/pull/813 From tschatzl at openjdk.java.net Thu Oct 29 15:10:48 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 29 Oct 2020 15:10:48 GMT Subject: RFR: 8255232: G1: Make G1BiasedMappedArray freeable [v2] In-Reply-To: References: <966v_gu-j37Do5XhKCgwhCu0DDdwyKPQ04oUQkvzEIs=.96808fbd-a650-448e-96af-58a2bf7b9c2d@github.com> Message-ID: On Thu, 29 Oct 2020 10:04:06 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> ayang review > > Thank you for the change. Thanks @albertnetymk @kimbarrett for your reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/808 From tschatzl at openjdk.java.net Thu Oct 29 15:10:49 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 29 Oct 2020 15:10:49 GMT Subject: Integrated: 8255232: G1: Make G1BiasedMappedArray freeable In-Reply-To: <966v_gu-j37Do5XhKCgwhCu0DDdwyKPQ04oUQkvzEIs=.96808fbd-a650-448e-96af-58a2bf7b9c2d@github.com> References: <966v_gu-j37Do5XhKCgwhCu0DDdwyKPQ04oUQkvzEIs=.96808fbd-a650-448e-96af-58a2bf7b9c2d@github.com> Message-ID: On Thu, 22 Oct 2020 13:47:00 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that makes G1BiasedMappedArray freeable? > > Previously all G1BiasedMappedArray were created as unfreeable i.e. assigned to static variables. However with JDK-8253600 I need one such biased map for the full collector which is created and deleted during full GC. So the biased array should also be freed as necessary to avoid a memory leak. > > The alternative would be to statically allocate that map anyway and provide it to the current G1FullCollector instance, but I do not think the single malloc call is perf sensitive compared to full collector work, nor that there is much point in doing something more complicated at this time.
In the future I hope that the young gen collector will also be extracted from G1CollectedHeap with the same need. If/when allocation of these helper data structures becomes a problem I would suggest looking into this again. > > One option then could be using some ResourceArea for these things in the future. > > For this change there should be no change in behavior at all. > > Testing: tier1-5 > > Thanks, > Thomas This pull request has now been integrated. Changeset: 5c520c3f Author: Thomas Schatzl URL: https://git.openjdk.java.net/jdk/commit/5c520c3f Stats: 35 lines in 4 files changed: 27 ins; 2 del; 6 mod 8255232: G1: Make G1BiasedMappedArray freeable Reviewed-by: ayang, kbarrett ------------- PR: https://git.openjdk.java.net/jdk/pull/808 From tschatzl at openjdk.java.net Thu Oct 29 19:31:03 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 29 Oct 2020 19:31:03 GMT Subject: RFR: 8253600: G1: Fully support pinned regions for full gc [v2] In-Reply-To: References: Message-ID: > Hi all, > > can I get reviews for this change that implements "proper" support for pinned regions in the G1 full collector? > > By proper I mean that at the end of gc, pinned regions contain the correct TAMS and bitmap markings under the TAMS so that dead objects within them are supported. > > Currently all (pinned) regions have their TAMS set to bottom() and their bitmap above TAMS cleared (at least logically :) ). This works as long as objects within these regions can't be dead, as is the case now: > - humongous regions are either live or fully reclaimed. > - all other pinned regions are archive regions at the moment that are always treated as fully live (and do not contain dead objects). > > This change is a requirement for fixing JDK-8253081 as some earlier change made it possible to have dead objects within open archive regions. It also enables supporting removal of gclocker use for g1, i.e. using region pinning.
> > Based on the PR#808 (https://github.com/openjdk/jdk/pull/808). > > Testing: tier1-8, testing with prototype for region pinning, testing with prototype for JDK-8253081. > Performance testing: no regressions > > Some comments for questions that might come up during review: > > - how does this work with the bitmaps now: > - at start of full gc the next bitmap is cleared > - full gc marks the next bitmap > - for all pinned regions, keep TAMS and top() (*), otherwise set TAMS to bottom > - swap bitmaps > - clear next bitmap for next marking > > (*) this means that from a usage POV pinned regions are considered full. This is inaccurate, but sufficient: full gc clears all remembered sets anyway, so we do not need that information for gc efficiency purposes when evacuating later. The next marking before old gen evacuation will update it to the correct values anyway. G1 does not support allocation into "holes" in pinned regions that can be open archive only at this time too, so there is no need to be more exact. > > - use of a region attribute table for phase 2+ only: compared to before we need fast access to information whether a given reference goes into a pinned region (as opposed to an archive region) with respect to adjusting that pointer to avoid doing work for these references. > > Phase 1 marking could have used this information for the do-we-need-to-preserve-the-mark check too: however this would have required g1 to add an extra pass over all regions to update that. This seemed slower than just checking this information "more slowly" for the objects that need mark preservation. Tests showed that this is the case for <0.00% (yeah, these references that need mark preservation are rounding errors in cases it matters) of overall references, so I did not add that pass. > (Additionally g1 full gc is a last-ditch effort, and while marking takes a significant time, it does not completely dominate it). > > I.e.
the second clause in the condition of this hunk is intentionally slower than could be: > @@ -52,7 +52,9 @@ inline bool G1FullGCMarker::mark_object(oop obj) { > // Marked by us, preserve if needed. > markWord mark = obj->mark(); > if (obj->mark_must_be_preserved(mark) && > // It is not necessary to preserve marks for objects in pinned regions because > // we do not change their headers (i.e. forward them). > !G1CollectedHeap::heap()->heap_region_containing(obj)->is_pinned()) { > preserved_stack()->push(obj, mark); > } > - there is no code yet that checks for empty pinned regions yet. Only JDK-8253081 introduces that because still all contents of all archive regions are live forever. > > Also please note that the 51b297b change is from the #808 change. > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: sjohanss review Also remove _archive_allocator_map et al as the new attribute table implements the same functionality also suggested by sjohanss in private. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/824/files - new: https://git.openjdk.java.net/jdk/pull/824/files/02928bc9..68839936 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=824&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=824&range=00-01 Stats: 375 lines in 27 files changed: 117 ins; 192 del; 66 mod Patch: https://git.openjdk.java.net/jdk/pull/824.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/824/head:pull/824 PR: https://git.openjdk.java.net/jdk/pull/824 From tschatzl at openjdk.java.net Thu Oct 29 19:51:55 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 29 Oct 2020 19:51:55 GMT Subject: RFR: 8253600: G1: Fully support pinned regions for full gc [v3] In-Reply-To: References: Message-ID: > Hi all, > > can I get reviews for this change that implements "proper" support for pinned regions in the G1 full collector? 
> > By proper I mean that at the end of gc, pinned regions contain the correct TAMS and bitmap markings under the TAMS so that dead objects within them are supported? > > Currently all (pinned) regions have their TAMS set to bottom() and their bitmap above TAMS cleared (at least logically :) ). This works as long objects within these regions can't be dead as it is the case now: > - humongous regions are either live or fully reclaimed. > - all other pinned regions are archive regions at the moment that are always treated as fully live (and do not contain dead objects). > > This change is a requirement for fixing JDK-8253081 as some earlier change made it possible to have dead objects within open archive regions. It also enables supporting removal of gclocker use for g1, i.e. using region pinning. > > Based on the PR#808 (https://github.com/openjdk/jdk/pull/808). > > Testing: tier1-8, testing with prototype for region pinning, testing with prototype for JDK-8253081. > Performance testing: no regressions > > Some comments for questions that might come up during review: > > - how does this work with the bitmaps now: > - at start of full gc the next bitmap is cleared > - full gc marks the next bitmap > - for all pinned regions, keep TAMS and top() (*), otherwise set TAMS to bottom > - swap bitmaps > - clear next bitmap for next marking > > (*) this means that from a usage POV pinned regions are considered full. This is inaccurate, but sufficient: full gc clears all remembered sets anyway, so we do not need that information for gc efficiency purposes anyway to evacuate later. The next marking before old gen evacuation will update it to the correct values anyway. G1 does not support allocation into "holes" in pinned regions that can be open archive only at this time too, so there is no need to be more exact. 
> > - use of a region attribute table for phase 2+ only: compared to before we need fast access to information whether a given reference goes into a pinned region (as opposed to an archive region) wrt to adjusting that pointer to avoid doing work for these references. > > Phase 1 marking could have used this information for the do-we-need-to-preserve-the-mark check too: however this would have required g1 to add an extra another pass over all regions to update that. This seemed slower than just checking this information "more slowly" for the objects that need mark preservation. Tests showed that this is the case for <0.00% (yeah, these references that need mark preservation are rounding errors in cases it matters) of overall references, so I did not add that pass. > (Additionally g1 full gc is a last-ditch effort, and while marking takes a significant time, it does not completely dominate it). > > I.e. the second clause in the condition of this hunk is intentionally slower than could be: > @@ -52,7 +52,9 @@ inline bool G1FullGCMarker::mark_object(oop obj) { > // Marked by us, preserve if needed. > markWord mark = obj->mark(); > if (obj->mark_must_be_preserved(mark) && > // It is not necessary to preserve marks for objects in pinned regions because > // we do not change their headers (i.e. forward them). > !G1CollectedHeap::heap()->heap_region_containing(obj)->is_pinned()) { > preserved_stack()->push(obj, mark); > } > - there is no code yet that checks for empty pinned regions yet. Only JDK-8253081 introduces that because still all contents of all archive regions are live forever. > > Also please note that the 51b297b change is from the #808 change. > > Thanks, > Thomas Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains five commits: - Merge branch 'master' into 8253600-full-gc-pinned-region-support - Merge branch 'master' into 8253600-full-gc-pinned-region-support - sjohanss review Also remove _archive_allocator_map et al as the new attribute table implements the same functionality also suggested by sjohanss in private. - Initial import - Initial import ------------- Changes: https://git.openjdk.java.net/jdk/pull/824/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=824&range=02 Stats: 507 lines in 29 files changed: 213 ins; 199 del; 95 mod Patch: https://git.openjdk.java.net/jdk/pull/824.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/824/head:pull/824 PR: https://git.openjdk.java.net/jdk/pull/824 From tschatzl at openjdk.java.net Thu Oct 29 20:06:48 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 29 Oct 2020 20:06:48 GMT Subject: RFR: 8253600: G1: Fully support pinned regions for full gc [v3] In-Reply-To: References: Message-ID: On Thu, 29 Oct 2020 10:23:25 GMT, Stefan Johansson wrote: >> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: >> >> - Merge branch 'master' into 8253600-full-gc-pinned-region-support >> - Merge branch 'master' into 8253600-full-gc-pinned-region-support >> - sjohanss review >> >> Also remove _archive_allocator_map et al as the new attribute table >> implements the same functionality also suggested by sjohanss in >> private. >> - Initial import >> - Initial import > > Nice change Thomas, mostly just small comments. The new change (apart from the merge) should fix all the concerns @kstefanj mentioned. In addition to that I looked into completely moving the G1ArchiveAllocator::_archive_region_map into the G1FullCollector::_region_attr_table as they serve the same purpose. 
Other than in full gc, G1ArchiveAllocator::_archive_region_map is only used in non-perf critical code (object dumping, asserts), so the replacements should be more than fast enough. The only drawback is that this adds a new method in CollectedHeap as a start for CDS archive support for other collectors which is thought of. More is needed, but that's the minimum to replace functionality previously provided by G1ArchiveAllocator::_archive_region_map at this time. Its default implementation simply returns false if asked whether a given object on the heap is an archive object. virtual bool is_archived_object(oop object) const { return false; } ------------- PR: https://git.openjdk.java.net/jdk/pull/824 From zgu at openjdk.java.net Thu Oct 29 20:46:50 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 29 Oct 2020 20:46:50 GMT Subject: RFR: 8255606: Enable concurrent stack processing on x86_32 platforms Message-ID: 8255606: Enable concurrent stack processing on x86_32 platforms ------------- Commit messages: - Merge branch 'master' into JDK-8255606-conc-stack-x86_32 - 8255606: Enable concurrent stack processing on x86_32 platforms Changes: https://git.openjdk.java.net/jdk/pull/945/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=945&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255606 Stats: 63 lines in 7 files changed: 34 ins; 14 del; 15 mod Patch: https://git.openjdk.java.net/jdk/pull/945.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/945/head:pull/945 PR: https://git.openjdk.java.net/jdk/pull/945 From akozlov at openjdk.java.net Thu Oct 29 21:13:00 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 29 Oct 2020 21:13:00 GMT Subject: RFR: 8255416: Investigate err_msg to detect unnecessary uses [v4] In-Reply-To: <33_QxzGck_FYstuHHF_usmlcLQIL45p-xKBwAWxBQkA=.bf31affe-035f-4466-a2e7-a7cd13cb2617@github.com> References: <-P5id4x50oiTzG6JHl4GD-HJbqYM4juVm9IpsNvXlV4=.7e07be8c-8389-41e2-a563-1f3be709ac51@github.com> 
<33_QxzGck_FYstuHHF_usmlcLQIL45p-xKBwAWxBQkA=.bf31affe-035f-4466-a2e7-a7cd13cb2617@github.com> Message-ID: On Thu, 29 Oct 2020 08:47:12 GMT, Aleksey Shipilev wrote: >> Oh, now I see it. Smart. >> >> But it occurred to me that someone may want to start a err_msg buffer up with a string literal, only to then add additional content via FormatBuffer::append(). So initializing it with a literal and no arguments may be a valid usecase. >> >> So I'd still prefer keeping this out. Alternatively, if you like to keep it in, could we just not implement the second constructor? Which should give us linker errors if the static check fails. > > +1 to leave this undefined to get a linkage error. There are already precedents to do this in Hotspot code, for example > > Node(const Node&); // not defined; linker error to use these > ... > // should never be used > AdapterHandlerEntry(); This is a good suggestion. Link-time failure is certainly better than runtime one, thanks! Although err_msg was implemented with FormatBuffer since the beginning [1], I assume it should be an implementation detail. Otherwise there would be no function-like err_msg thing. And at the present, there is only a single case of append to err_msg. FormatErrBuffer can inherit FormatBuffer privately. It will completely remove err_msg interface relation with FormatBuffer, which is aligned with the most of current uses. Although I don't clearly understand reasons not to introduce this (beside it may be a bit over-engineered, but without implications to maintainability, I suppose), this may change assumptions about err_msg. 
[1] http://hg.openjdk.java.net/jdk10/jdk10/hotspot/rev/f03d0a26bf83#l32.32 ------------- PR: https://git.openjdk.java.net/jdk/pull/905 From akozlov at openjdk.java.net Thu Oct 29 21:13:00 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 29 Oct 2020 21:13:00 GMT Subject: RFR: 8255416: Investigate err_msg to detect unnecessary uses [v4] In-Reply-To: References: Message-ID: <40uuidQshCOFVs9ptZYLfxiesDU-8sVW6rU_bpLeGa8=.e53d49eb-5b31-4916-8ad7-0d1872947b2d@github.com> > Hi, > > When a single string without formatting arguments is provided to `err_msg`, it is redundant, as the same message could be used without any err_msg. This is a follow-up to the discussion https://github.com/openjdk/jdk/pull/812#discussion_r511784050 > > Please review a change that makes `err_msg` with a single string fail compilation. > > Detected uses of err_msg with a single string were eliminated as well. Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision: - Unlink err_msg interface from FormatBuffer - Remove implementation of the dummy ctor ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/905/files - new: https://git.openjdk.java.net/jdk/pull/905/files/8a99cdcc..47ff851e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=905&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=905&range=02-03 Stats: 11 lines in 4 files changed: 4 ins; 0 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/905.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/905/head:pull/905 PR: https://git.openjdk.java.net/jdk/pull/905 From coleenp at openjdk.java.net Thu Oct 29 21:25:45 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 29 Oct 2020 21:25:45 GMT Subject: RFR: 8255452: Doing GC during JVMTI MethodExit event posting breaks return oop In-Reply-To: References: Message-ID: On Thu, 29 Oct 2020 12:44:58 GMT, Erik Österlund wrote: > The imasm::remove_activation() call
does not deal with safepoints very well. However, when the MethodExit JVMTI event is being called, we call into the runtime in the middle of remove_activation(). If the value being returned is an object type, then the top-of-stack contains the oop. However, the GC does not traverse said oop in any oop map, because it is simply not expected that we safepoint in the middle of remove_activation(). > > The JvmtiExport::post_method_exit() function we end up calling, reads the top-of-stack oop, and puts it in a handle. Then it calls JVMTI callbacks, that eventually call Java and a bunch of stuff that safepoints. So after the JVMTI callback, we can expect the top-of-stack oop to be broken. Unfortunately, when we continue, we therefore end up returning a broken oop. > > Notably, the fact that InterpreterRuntime::post_method_exit is a JRT_ENTRY, is wrong, as we can safepoint on the way back to Java, which will break the return oop in a similar way. So this patch makes it a JRT_BLOCK_ENTRY, moving the transition to VM and back, into a block of code that is protected against GC. Before the JRT_BLOCK is called, we stash away the return oop, and after the JRT_BLOCK_END, we restore the top-of-stack oop. In the path when InterpreterRuntime::post_method_exit is called when throwing an exception, we don't have the same problem of retaining an oop result, and hence the JRT_BLOCK/JRT_BLOCK_END section is not performed in this case; the logic is the same as before for this path. > > This is a JVMTI bug that has probably been around for a long time. It crashes with all GCs, but was discovered recently after concurrent stack processing, as StefanK has been running better GC stressing code in JVMTI, and the bug reproduced more easily with concurrent stack processing, as the timings were a bit different. 
The following reproducer failed pretty much 100% of the time: > while true; do make test JTREG="RETAIN=all" TEST=test/hotspot/jtreg/vmTestbase/nsk/jdi/MethodExitEvent/returnValue/returnValue003/returnValue003.java TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews -Xint" ; done > > With my fix I can run this repeatedly without any more failures. I have also sanity checked the patch by running tier 1-5, so that it does not introduces any new issues on its own. I have also used Stefan's nice external GC stressing with jcmd technique that was used to trigger crashes with other GCs, to make sure said crashes no longer reproduce either. Changes requested by coleenp (Reviewer). src/hotspot/share/prims/jvmtiExport.cpp line 1600: > 1598: > 1599: if (exception_exit) { > 1600: post_method_exit_inner(thread, mh, state, exception_exit, current_frame, result, value); I think for exception exit, you also need JRT_BLOCK because you want the transition to thread_in_VM for this code, since JRT_BLOCK_ENTRY doesn't do the transition. It should be safe for exception exit and retain the old behavior. ------------- PR: https://git.openjdk.java.net/jdk/pull/930 From burban at openjdk.java.net Thu Oct 29 22:18:44 2020 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Thu, 29 Oct 2020 22:18:44 GMT Subject: RFR: 8254072: AArch64: Get rid of --disable-warnings-as-errors on Windows+ARM64 build [v4] In-Reply-To: References: Message-ID: On Tue, 27 Oct 2020 14:04:04 GMT, Andrew Haley wrote: >> Bernhard Urban-Forster has updated the pull request incrementally with two additional commits since the last revision: >> >> - uppercase suffix >> - add assert > > Marked as reviewed by aph (Reviewer). Would you mind sponsor it @theRealAph or @magicus? 
------------- PR: https://git.openjdk.java.net/jdk/pull/530 From eosterlund at openjdk.java.net Thu Oct 29 22:37:54 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Thu, 29 Oct 2020 22:37:54 GMT Subject: RFR: 8255452: Doing GC during JVMTI MethodExit event posting breaks return oop In-Reply-To: References: Message-ID: On Thu, 29 Oct 2020 21:23:12 GMT, Coleen Phillimore wrote: >> The imasm::remove_activation() call does not deal with safepoints very well. However, when the MethodExit JVMTI event is being called, we call into the runtime in the middle of remove_activation(). If the value being returned is an object type, then the top-of-stack contains the oop. However, the GC does not traverse said oop in any oop map, because it is simply not expected that we safepoint in the middle of remove_activation(). >> >> The JvmtiExport::post_method_exit() function we end up calling, reads the top-of-stack oop, and puts it in a handle. Then it calls JVMTI callbacks, that eventually call Java and a bunch of stuff that safepoints. So after the JVMTI callback, we can expect the top-of-stack oop to be broken. Unfortunately, when we continue, we therefore end up returning a broken oop. >> >> Notably, the fact that InterpreterRuntime::post_method_exit is a JRT_ENTRY, is wrong, as we can safepoint on the way back to Java, which will break the return oop in a similar way. So this patch makes it a JRT_BLOCK_ENTRY, moving the transition to VM and back, into a block of code that is protected against GC. Before the JRT_BLOCK is called, we stash away the return oop, and after the JRT_BLOCK_END, we restore the top-of-stack oop. In the path when InterpreterRuntime::post_method_exit is called when throwing an exception, we don't have the same problem of retaining an oop result, and hence the JRT_BLOCK/JRT_BLOCK_END section is not performed in this case; the logic is the same as before for this path.
>> >> This is a JVMTI bug that has probably been around for a long time. It crashes with all GCs, but was discovered recently after concurrent stack processing, as StefanK has been running better GC stressing code in JVMTI, and the bug reproduced more easily with concurrent stack processing, as the timings were a bit different. The following reproducer failed pretty much 100% of the time: >> while true; do make test JTREG="RETAIN=all" TEST=test/hotspot/jtreg/vmTestbase/nsk/jdi/MethodExitEvent/returnValue/returnValue003/returnValue003.java TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews -Xint" ; done >> >> With my fix I can run this repeatedly without any more failures. I have also sanity checked the patch by running tier 1-5, so that it does not introduces any new issues on its own. I have also used Stefan's nice external GC stressing with jcmd technique that was used to trigger crashes with other GCs, to make sure said crashes no longer reproduce either. > > src/hotspot/share/prims/jvmtiExport.cpp line 1600: > >> 1598: >> 1599: if (exception_exit) { >> 1600: post_method_exit_inner(thread, mh, state, exception_exit, current_frame, result, value); > > I think for exception exit, you also need JRT_BLOCK because you want the transition to thread_in_VM for this code, since JRT_BLOCK_ENTRY doesn't do the transition. It should be safe for exception exit and retain the old behavior. Thanks for having a look coleen. In fact, not doing the JRT_BLOCK for the exception entry is intentional, because that entry goes through a different JRT_ENTRY (not JRT_BLOCK_ENTRY), that already transitions. So if I do the JRT_BLOCK for the exception path, it asserts saying hey you are already in VM. 
------------- PR: https://git.openjdk.java.net/jdk/pull/930 From minqi at openjdk.java.net Fri Oct 30 00:24:55 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Fri, 30 Oct 2020 00:24:55 GMT Subject: RFR: 8254309: appcds GCDuringDump.java failed - class must exist Message-ID: Hi, Please review. When CDS initializes the archived heap at dump time, some classes are loaded. If the system runs out of memory at this point, the class will not be loaded. This is what we saw in this bug. The fix checks whether an OOM happened; if so, we print a log message and exit gracefully instead of crashing. Added a test case that exercises the exception/OOM path during this stage. Also check for OOM while preloading classes, and exit the VM with a proper message if it happens. Thanks Yumin ------------- Commit messages: - 8254309: appcds GCDuringDump.java failed - class must exist Changes: https://git.openjdk.java.net/jdk/pull/948/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=948&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254309 Stats: 166 lines in 8 files changed: 152 ins; 10 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/948.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/948/head:pull/948 PR: https://git.openjdk.java.net/jdk/pull/948 From iklam at openjdk.java.net Fri Oct 30 03:49:02 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 30 Oct 2020 03:49:02 GMT Subject: RFR: 8255285: Move JVMFlag origins into a new enum JVMFlagOrigin [v4] In-Reply-To: References: Message-ID: > Many JVM functions take a `JVMFlag::Flags` parameter to indicate the origin of the flag -- i.e., "who is setting this flag". E.g., in arguments.hpp: > > static bool parse_argument(const char* arg, JVMFlag::Flags origin); > > However, `JVMFlag::Flags` contains many other bits that are unrelated to the origin. We should add a new enum `JVMFlagOrigin` that has only the valid values for the origin. This makes it possible to do more type-safety checks at C++ compilation time.
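The type-safety argument above can be sketched in a few lines. This is only an illustration under assumptions: `FlagOrigin`, `FlagKindBits`, their enumerators and values, and `origin_name` are hypothetical stand-ins, not the actual HotSpot declarations. The sketch assumes the origin becomes a scoped enum, in which case neither a plain `int` nor an unrelated bit enum converts to it implicitly.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical origin-only enum. Enumerator names and values are assumptions
// made for illustration, not the actual HotSpot definitions.
enum class FlagOrigin : int {
  DEFAULT      = 0,
  COMMAND_LINE = 1,
  ERGONOMIC    = 2
};

// Hypothetical unrelated attribute bits, of the kind that previously shared
// one enum type with the origin values.
enum FlagKindBits : int {
  KIND_PRODUCT    = 1 << 4,
  KIND_DIAGNOSTIC = 1 << 5
};

// Neither a plain int nor an unrelated enum converts implicitly to the scoped
// enum, so passing one where an origin is expected fails to compile.
static_assert(!std::is_convertible<int, FlagOrigin>::value,
              "int must not convert to FlagOrigin");
static_assert(!std::is_convertible<FlagKindBits, FlagOrigin>::value,
              "kind bits must not convert to FlagOrigin");

// Stand-in for a function that takes "who is setting this flag".
inline const char* origin_name(FlagOrigin origin) {
  switch (origin) {
    case FlagOrigin::DEFAULT:      return "default";
    case FlagOrigin::COMMAND_LINE: return "command line";
    case FlagOrigin::ERGONOMIC:    return "ergonomic";
  }
  return "unknown";
}
```

With these declarations, `origin_name(FlagOrigin::COMMAND_LINE)` compiles, while `origin_name(KIND_PRODUCT)` or `origin_name(1)` is rejected by the compiler rather than misbehaving at runtime.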
> > This patch also renamed the confusing bit `JVMFlag::ORIG_COMMAND_LINE` to `WAS_SET_IN_COMMAND_LINE` and added documentation, so that it won't be confused with `JVMFlagOrigin::COMMAND_LINE`. Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: - Merge branch 'master' of https://github.com/openjdk/jdk into 8255285-new-enum-JVMFlagOrigin - fixed build - Merge branch 'master' into 8255285-new-enum-JVMFlagOrigin - renamed WAS_SET_IN_COMMAND_LINE to WAS_SET_ON_COMMAND_LINE - Removed aliases of JVMFlagOrigin::X as JVMFlag::X - fixed whitespaces - jvmflagorigin ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/823/files - new: https://git.openjdk.java.net/jdk/pull/823/files/b1d53802..2e0552d6 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=823&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=823&range=02-03 Stats: 2283 lines in 60 files changed: 565 ins; 1559 del; 159 mod Patch: https://git.openjdk.java.net/jdk/pull/823.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/823/head:pull/823 PR: https://git.openjdk.java.net/jdk/pull/823 From iklam at openjdk.java.net Fri Oct 30 03:49:03 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 30 Oct 2020 03:49:03 GMT Subject: Integrated: 8255285: Move JVMFlag origins into a new enum JVMFlagOrigin In-Reply-To: References: Message-ID: On Fri, 23 Oct 2020 06:33:06 GMT, Ioi Lam wrote: > Many JVM function take an `JVMFlag::Flags` parameter to indicate the origin of the flag -- i.e., "who is setting this flag". E.g., in arguments.hpp: > > static bool parse_argument(const char* arg, JVMFlag::Flags origin); > > However, `JVMFlag::Flags` contains many other bits that are unrelated to the origin. We should add a new enum `JVMFlagOrigin` that has only the valid values for the origin. 
This makes it possible to do more type-safety checks at C++ compilation time. > > This patch also renamed the confusing bit `JVMFlag::ORIG_COMMAND_LINE` to `WAS_SET_IN_COMMAND_LINE` and added documentation, so that it won't be confused with `JVMFlagOrigin::COMMAND_LINE`. This pull request has now been integrated. Changeset: 1a89d68e Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/1a89d68e Stats: 229 lines in 23 files changed: 39 ins; 14 del; 176 mod 8255285: Move JVMFlag origins into a new enum JVMFlagOrigin Reviewed-by: dholmes, redestad ------------- PR: https://git.openjdk.java.net/jdk/pull/823 From rrich at openjdk.java.net Fri Oct 30 06:58:48 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Fri, 30 Oct 2020 06:58:48 GMT Subject: RFR: 8255452: Doing GC during JVMTI MethodExit event posting breaks return oop In-Reply-To: References: Message-ID: On Thu, 29 Oct 2020 21:23:21 GMT, Coleen Phillimore wrote: >> The imasm::remove_activation() call does not deal with safepoints very well. However, when the MethodExit JVMTI event is being called, we call into the runtime in the middle of remove_activation(). If the value being returned is an object type, then the top-of-stack contains the oop. However, the GC does not traverse said oop in any oop map, because it is simply not expected that we safepoint in the middle of remove_activation(). >> >> The JvmtiExport::post_method_exit() function we end up calling, reads the top-of-stack oop, and puts it in a handle. Then it calls JVMTI callbacks, that eventually call Java and a bunch of stuff that safepoints. So after the JVMTI callback, we can expect the top-of-stack oop to be broken. Unfortunately, when we continue, we therefore end up returning a broken oop. >> >> Notably, the fact that InterpreterRuntime::post_method_exit is a JRT_ENTRY, is wrong, as we can safepoint on the way back to Java, which will break the return oop in a similar way. 
So this patch makes it a JRT_BLOCK_ENTRY, moving the transition to VM and back, into a block of code that is protected against GC. Before the JRT_BLOCK is called, we stash away the return oop, and after the JRT_BLOCK_END, we restore the top-of-stack oop. In the path when InterpreterRuntime::post_method_exit is called when throwing an exception, we don't have the same problem of retaining an oop result, and hence the JRT_BLOCK/JRT_BLOCK_END section is not performed in this case; the logic is the same as before for this path. >> >> This is a JVMTI bug that has probably been around for a long time. It crashes with all GCs, but was discovered recently after concurrent stack processing, as StefanK has been running better GC stressing code in JVMTI, and the bug reproduced more easily with concurrent stack processing, as the timings were a bit different. The following reproducer failed pretty much 100% of the time: >> while true; do make test JTREG="RETAIN=all" TEST=test/hotspot/jtreg/vmTestbase/nsk/jdi/MethodExitEvent/returnValue/returnValue003/returnValue003.java TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews -Xint" ; done >> >> With my fix I can run this repeatedly without any more failures. I have also sanity checked the patch by running tier 1-5, so that it does not introduces any new issues on its own. I have also used Stefan's nice external GC stressing with jcmd technique that was used to trigger crashes with other GCs, to make sure said crashes no longer reproduce either. > > Changes requested by coleenp (Reviewer). Hi Erik, is it possible for GC to mistake a primitive value for a reference when posting the exit event? My understanding is: we are at a random bci of a method that is forced to return early. 
The expression stack is emptied and the return value is pushed on the expression stack then we call into the interpreter runtime to post the JVMTI method exit event during which we come to a safepoint for GC. The oop map for the bci does not cover this forced early return and if the return value is an object then the reference pushed on the expression stack before is not updated by GC. With your fix the value is updated if it is a reference. If this is correct then to me it appears as if GC can also crash because the oop map for the random bci tells there has to be a reference at the stack position of the return value if it actually is a primitive value. ------------- PR: https://git.openjdk.java.net/jdk/pull/930 From dlong at openjdk.java.net Fri Oct 30 08:06:42 2020 From: dlong at openjdk.java.net (Dean Long) Date: Fri, 30 Oct 2020 08:06:42 GMT Subject: RFR: 8255452: Doing GC during JVMTI MethodExit event posting breaks return oop In-Reply-To: References: Message-ID: On Fri, 30 Oct 2020 06:56:13 GMT, Richard Reingruber wrote: >> Changes requested by coleenp (Reviewer). > > Hi Erik, > > is it possible for GC to mistake a primitive value for a reference when posting the exit event? > > My understanding is: we are at a random bci of a method that is forced to return early. The expression stack is emptied and the return value is pushed on the expression stack then we call into the interpreter runtime to post the JVMTI method exit event during which we come to a safepoint for GC. The oop map for the bci does not cover this forced early return and if the return value is an object then the reference pushed on the expression stack before is not updated by GC. With your fix the value is updated if it is a reference. > > If this is correct then to me it appears as if GC can also crash because the oop map for the random bci tells there has to be a reference at the stack position of the return value if it actually is a primitive value. I think you've discovered JDK-6449023. 
------------- PR: https://git.openjdk.java.net/jdk/pull/930 From dlong at openjdk.java.net Fri Oct 30 08:51:47 2020 From: dlong at openjdk.java.net (Dean Long) Date: Fri, 30 Oct 2020 08:51:47 GMT Subject: RFR: 8255452: Doing GC during JVMTI MethodExit event posting breaks return oop In-Reply-To: References: Message-ID: On Thu, 29 Oct 2020 12:44:58 GMT, Erik Österlund wrote: > The imasm::remove_activation() call does not deal with safepoints very well. However, when the MethodExit JVMTI event is being called, we call into the runtime in the middle of remove_activation(). If the value being returned is an object type, then the top-of-stack contains the oop. However, the GC does not traverse said oop in any oop map, because it is simply not expected that we safepoint in the middle of remove_activation(). > > The JvmtiExport::post_method_exit() function we end up calling, reads the top-of-stack oop, and puts it in a handle. Then it calls JVMTI callbacks, that eventually call Java and a bunch of stuff that safepoints. So after the JVMTI callback, we can expect the top-of-stack oop to be broken. Unfortunately, when we continue, we therefore end up returning a broken oop. > > Notably, the fact that InterpreterRuntime::post_method_exit is a JRT_ENTRY, is wrong, as we can safepoint on the way back to Java, which will break the return oop in a similar way. So this patch makes it a JRT_BLOCK_ENTRY, moving the transition to VM and back, into a block of code that is protected against GC. Before the JRT_BLOCK is called, we stash away the return oop, and after the JRT_BLOCK_END, we restore the top-of-stack oop. In the path when InterpreterRuntime::post_method_exit is called when throwing an exception, we don't have the same problem of retaining an oop result, and hence the JRT_BLOCK/JRT_BLOCK_END section is not performed in this case; the logic is the same as before for this path. > > This is a JVMTI bug that has probably been around for a long time.
It crashes with all GCs, but was discovered recently after concurrent stack processing, as StefanK has been running better GC stressing code in JVMTI, and the bug reproduced more easily with concurrent stack processing, as the timings were a bit different. The following reproducer failed pretty much 100% of the time: > while true; do make test JTREG="RETAIN=all" TEST=test/hotspot/jtreg/vmTestbase/nsk/jdi/MethodExitEvent/returnValue/returnValue003/returnValue003.java TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews -Xint" ; done > > With my fix I can run this repeatedly without any more failures. I have also sanity checked the patch by running tier 1-5, so that it does not introduces any new issues on its own. I have also used Stefan's nice external GC stressing with jcmd technique that was used to trigger crashes with other GCs, to make sure said crashes no longer reproduce either. Marked as reviewed by dlong (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/930 From adinn at openjdk.java.net Fri Oct 30 10:09:47 2020 From: adinn at openjdk.java.net (Andrew Dinn) Date: Fri, 30 Oct 2020 10:09:47 GMT Subject: RFR: JDK-8255544: Create a checked cast In-Reply-To: References: Message-ID: <6YaEcdIMOsmooaeubut73cdV91T9to6s--FTMCuRtqc=.ab00c03d-3a36-4fa0-8b7c-03370cdb2072@github.com> On Wed, 28 Oct 2020 15:50:52 GMT, Andrew Haley wrote: > In many places we've added C-style casts to silence compiler warnings, for example when truncating a size_t to an int when we know the size_t is a small struct. Such casts are inherently risky, because they effectively disable useful compiler warnings. We should add a form of cast that checks at runtime that a truncation does not overflow. This is great. All we need now is to get hotspot devs to use it :-) ------------- Marked as reviewed by adinn (Reviewer). 
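For readers following JDK-8255544, a runtime-checked narrowing cast could look roughly like the sketch below. It is not the code under review: `fits_in` and `checked_narrow` are hypothetical names, and the round-trip-plus-sign test is just one way to detect a lossy truncation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: true if `value` is representable in T2, i.e. it
// survives a round trip through T2 and keeps its sign.
template <typename T2, typename T1>
constexpr bool fits_in(T1 value) {
  return static_cast<T1>(static_cast<T2>(value)) == value &&
         ((value < T1(0)) == (static_cast<T2>(value) < T2(0)));
}

// Hypothetical checked cast: asserts in debug builds that nothing is lost,
// then performs the same conversion a C-style cast would have done.
template <typename T2, typename T1>
inline T2 checked_narrow(T1 value) {
  assert(fits_in<T2>(value) && "narrowing cast loses information");
  return static_cast<T2>(value);
}
```

A call such as `checked_narrow<int>(sizeof(some_struct))` keeps the intent of the original C-style cast visible, while a value like `1ULL << 40` would trip the assertion instead of being silently truncated.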
PR: https://git.openjdk.java.net/jdk/pull/904

From rkennke at openjdk.java.net Fri Oct 30 11:13:52 2020
From: rkennke at openjdk.java.net (Roman Kennke)
Date: Fri, 30 Oct 2020 11:13:52 GMT
Subject: RFR: 8255614: Shenandoah: Consolidate/streamline runtime LRBs
Message-ID: <3CaMGD6aXUxD8SxDhn1XxU-c0I1NM6r-iosxLLScZT0=.8a0e96ba-17a8-4d59-bcac-7febc85970a1@github.com>

Currently, our various LRB entry points are a mess, and quite inefficient too.
- We have three entry points, one is checking for null, and calls a non-null version, but that checks for null again
- We don't have to check for null at all: it can be subsumed in the cset-check
- The LRB resolves forwardee even though has_forwarded_objects() and in_cset() have not been checked
- The LRB entry is not inlineable

The proposed change coalesces the 3 entries into one, moves it to shenandoahBarrierSet.inline.hpp and makes it inlineable, and rearranges the impl to allow the cset-check to subsume the NULL-check. As a bonus, it pushes the NULL-check around keep-alive down after the (compile-time) check for weak-ref, so that this path becomes a no-op in the majority of cases.
-------------

Commit messages:
 - 8255614: Shenandoah: Consolidate/streamline runtime LRBs

Changes: https://git.openjdk.java.net/jdk/pull/953/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=953&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8255614
Stats: 105 lines in 7 files changed: 36 ins; 51 del; 18 mod
Patch: https://git.openjdk.java.net/jdk/pull/953.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/953/head:pull/953

PR: https://git.openjdk.java.net/jdk/pull/953

From mcimadamore at openjdk.java.net Fri Oct 30 11:40:58 2020
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Fri, 30 Oct 2020 11:40:58 GMT
Subject: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator) [v19]
In-Reply-To: References: Message-ID:

> This patch contains the changes associated with the third incubation round of the foreign memory access API incubation (see JEP 393 [1]). This iteration focuses on improving the usability of the API in 3 main ways:
>
> * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from multiple threads
> * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee that the memory will be deallocated, eventually
> * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class has been added, which defines several useful dereference routines; these are really just thin wrappers around memory access var handles, but they make the barrier of entry for using this API somewhat lower.
>
> A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit of dereference.
>
> This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done by calling `MemoryAddress::asSegmentRestricted`).
>
> A list of the API, implementation and test changes is provided below. If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be happy to point at existing discussions, and/or to provide the feedback required.
>
> A big thank you to Erik Osterlund, Vladimir Ivanov and David Holmes, without whom the work on shared memory segment would not have been possible; also I'd like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey.
>
> Thanks
> Maurizio
>
> Javadoc:
>
> http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html
>
> Specdiff:
>
> http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html
>
> CSR:
>
> https://bugs.openjdk.java.net/browse/JDK-8254163
>
> ### API Changes
>
> * `MemorySegment`
>   * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below)
>   * added a no-arg factory for a native restricted segment representing entire native heap
>   * rename `withOwnerThread` to `handoff`
>   * add new `share` method, to create shared segments
>   * add new `registerCleaner` method, to register a segment against a cleaner
>   * add more helpers to create arrays from a segment e.g. `toIntArray`
>   * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors)
>   * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`)
> * `MemoryAddress`
>   * drop `segment` accessor
>   * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative to a given segment
> * `MemoryAccess`
>   * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a carrier is one of the usual suspects (a Java primitive, minus `boolean`); the access mode can be simple (e.g. access base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte offset, or a high-level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. `getByteAtOffset` vs `getByteAtIndex`).
> * `MemoryHandles`
>   * drop `withOffset` combinator
>   * drop `withStride` combinator
>   * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which it is easy to derive all the other handles using plain var handle combinators.
> * `Addressable`
>   * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`.
> * `MemoryLayouts`
>   * A new layout, for machine addresses, has been added to the mix.
>
> ### Implementation changes
>
> There are two main things to discuss here: support for shared segments, and the general simplification of the memory access var handle support.
>
> #### Shared segments
>
> The support for shared segments cuts in pretty deep in the VM. Support for shared segments is notoriously hard to achieve, at least in a way that guarantees optimal access performance. This is caused by the fact that, if a segment is shared, it would be possible for a thread to close it while another is accessing it.
>
> After considering several options (see [3]), we zeroed in on an approach which is inspired by a happy idea that Andrew Haley had (and that he reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if we could *freeze* the world (e.g. with a GC pause), while a segment is closed, we could then prevent segments from being accessed concurrently to a close operation. For this to work, it is crucial that no GC safepoints can occur between a segment liveness check and the access itself (otherwise it would be possible for the accessing thread to stop just right before an unsafe call).
> It also relies on the fact that hotspot/C2 should not be able to propagate loads across safepoints.
>
> Sadly, none of these conditions seems to be valid in the current implementation, so we needed to resort to a bit of creativity. First, we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very simple to check whether a thread was in the middle of a scoped method when we stopped the world for a close operation (btw, instead of stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]).
>
> The question is, then, once we detect that a thread is accessing the very segment we're about to close, what should happen? We first experimented with a solution which would install an *asynchronous* exception on the accessing thread, thus making it fail. This solution has some desirable properties, in that a `close` operation always succeeds. Unfortunately the machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async exceptions); to minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds that another thread is accessing the segment being closed.
>
> As written in the javadoc, this doesn't mean that clients should just catch and try again; an exception on `close` is a bug in the user code, likely arising from lack of synchronization, and should be treated as such.
>
> In terms of gritty implementation, we needed to centralize memory access routines in a single place, so that we could have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in addition, also provided a liveness check. This way we could mark all these routines with the special `@Scoped` annotation, which tells the VM that something important is going on.
>
> To achieve this, we created a new (autogenerated) class, called `ScopedMemoryAccess`.
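The strategy described above (a liveness check before every access, plus a `close` that fails rather than invalidating memory under a concurrent accessor) can be modelled in plain Java. This is only a hedged sketch, not the patch's code: `SharedScopeSketch`, `beginAccess`, `endAccess` and the exception messages are hypothetical names, and the in-flight counter merely stands in for the thread-local handshake the VM uses to find threads inside scoped methods.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of a shared scope; this is NOT the JDK's MemoryScope.
final class SharedScopeSketch {
    private static final int CLOSED = -1;

    // >= 0: scope is alive and the value counts accesses in flight;
    // CLOSED: the scope has been closed for good.
    private final AtomicInteger state = new AtomicInteger(0);

    boolean isAlive() {
        return state.get() != CLOSED;
    }

    // Liveness check performed before memory is touched.
    void beginAccess() {
        int s;
        do {
            s = state.get();
            if (s == CLOSED) {
                throw new IllegalStateException("Already closed");
            }
        } while (!state.compareAndSet(s, s + 1));
    }

    void endAccess() {
        state.decrementAndGet();
    }

    // Succeeds only when no access is in flight; otherwise it fails and
    // leaves the scope alive, mirroring the "close might fail" strategy.
    void close() {
        if (!state.compareAndSet(0, CLOSED)) {
            throw new IllegalStateException("Scope is closed or being accessed");
        }
    }

    public static void main(String[] args) {
        SharedScopeSketch scope = new SharedScopeSketch();
        scope.beginAccess();
        try {
            scope.close(); // an access is in flight, so this throws
        } catch (IllegalStateException e) {
            System.out.println("close failed: " + e.getMessage());
        }
        scope.endAccess();
        scope.close(); // no accessor left, so this succeeds
        System.out.println("alive after close: " + scope.isAlive());
    }
}
```

The counter costs an atomic update per access, so it only illustrates the observable behaviour (a failing `close` signals a synchronization bug in the caller); it is not how the patch detects in-flight accesses.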
> This class contains all the main memory access primitives (including bulk access, like `copyMemory`, or `setMemory`), and accepts, in addition to the access coordinates, also a scope object, which is tested before access. A reachability fence is also thrown in the mix to make sure that the scope is kept alive during access (which is important when registering segments against cleaners).
>
> Of course, to make memory access safe, memory access var handles, byte buffer var handles, and byte buffer API should use the new `ScopedMemoryAccess` class instead of unsafe, so that a liveness check can be triggered (in case a scope is present).
>
> `ScopedMemoryAccess` has a `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the handshake completed successfully.
>
> `MemoryScope` (now significantly simplified from what we had before) has two implementations, one for confined segments and one for shared segments; the main difference between the two is what happens when the scope is closed; a confined segment sets a boolean flag to false, and returns, whereas a shared segment goes into a `CLOSING` state, then starts the handshake, and then updates the state again, to either `CLOSED` or `ALIVE` depending on whether the handshake was successful or not. Note that when a shared segment is in the `CLOSING` state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail.
>
> #### Memory access var handles overhaul
>
> The key realization here was that if all memory access var handles took a coordinate pair of `MemorySegment` and `long`, all other access types could be derived from this basic var handle form.
>
> This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var handles (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that e.g.
> additional offset is injected into a base memory access var handle.
>
> This also helped in simplifying the implementation by removing the special `withStride` and `withOffset` combinators, which previously needed low-level access on the innards of the memory access var handle. All that code is now gone.
>
> #### Test changes
>
> Not much to see here - most of the tests needed to be updated because of the API changes. Some were beefed up (like the array test, since now segments can be projected into many different kinds of arrays). A test has been added to test the `Cleaner` functionality, and another stress test has been added for shared segments (`TestHandshake`). Some of the microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the shared segment case.
>
> [1] - https://openjdk.java.net/jeps/393
> [2] - https://openjdk.java.net/jeps/389
> [3] - https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html
> [4] - https://openjdk.java.net/jeps/312

Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits:

 - Merge branch 'master' into 8254162
 - Fix issues with derived buffers and IO operations
 - More 32-bit fixes for TestLayouts
 - * Add final to MappedByteBuffer::SCOPED_MEMORY_ACCESS field
   * Tweak TestLayouts to make it 32-bit friendly after recent MemoryLayouts tweaks
 - Remove TestMismatch from 32-bit problem list
 - Merge branch 'master' into 8254162
 - Tweak javadoc for MemorySegment::mapFromPath
   Tweak alignment for long/double Java layouts on 32 bits platforms
 - Merge branch 'master' into 8254162
 - Address review comment for scoped memory access makefile
 - Address CSR comments
 - ...
and 15 more: https://git.openjdk.java.net/jdk/compare/e48016b1...bd400615

-------------

Changes: https://git.openjdk.java.net/jdk/pull/548/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=548&range=18
Stats: 7618 lines in 80 files changed: 4892 ins; 1537 del; 1189 mod
Patch: https://git.openjdk.java.net/jdk/pull/548.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/548/head:pull/548

PR: https://git.openjdk.java.net/jdk/pull/548

From mcimadamore at openjdk.java.net Fri Oct 30 11:55:05 2020
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Fri, 30 Oct 2020 11:55:05 GMT
Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v13]
In-Reply-To: References: Message-ID:

> This patch contains the changes associated with the first incubation round of the foreign linker access API incubation (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and associated pull request [3]).
>
> The main goal of this API is to provide a way to call native functions from Java code without the need of intermediate JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. I suggest reading the writeup [4] I put together a few weeks ago, which illustrates what the foreign linker support is, and how it should be used by clients.
>
> Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews. For this reason, I'm attaching a webrev which contains only the differences between this PR and the memory access PR. I will be periodically uploading new webrevs, as new iterations come out, to try and make the life of reviewers as simple as possible.
>
> A big thank you to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the hotspot changes you see here, and without their help, the foreign linker support wouldn't be what it is today.
> As usual, a big thank you to Paul Sandoz, who provided many insights (often by trying the bits first hand).
>
> Thanks
> Maurizio
>
> Webrev:
> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev
>
> Javadoc:
>
> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html
>
> Specdiff (relative to [3]):
>
> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html
>
> CSR:
>
> https://bugs.openjdk.java.net/browse/JDK-8254232
>
> ### API Changes
>
> The API changes are actually rather slim:
>
> * `LibraryLookup`
>   * This class allows clients to look up symbols in native libraries; the interface is fairly simple; you can load a library by name, or absolute path, and then look up symbols on that library.
> * `FunctionDescriptor`
>   * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory layouts for the function arguments/return type. A function descriptor is used to describe the signature of a native function.
> * `CLinker`
>   * This is the real star of the show. A `CLinker` has two main methods: `downcallHandle` and `upcallStub`; the first takes a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a `MethodHandle` instance which can be used to call the target native symbol. The second takes an existing method handle, and a `FunctionDescriptor` and returns a new `MemorySegment` corresponding to a code stub allocated by the VM which acts as a trampoline from native code to the user-provided method handle. This is very useful for implementing upcalls.
>   * This class also contains the various layout constants that should be used by clients when describing native signatures (e.g.
>     `C_LONG` and friends); these layouts contain additional ABI classification information (in the form of layout attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the native call to take place.
>   * Finally, this class provides some helper functions e.g. so that clients can convert Java strings into C strings and back.
> * `NativeScope`
>   * This is a helper class which allows clients to group together logically related allocations; that is, rather than allocating separate memory segments using separate *try-with-resource* constructs, a `NativeScope` allows clients to use a _single_ block, and allocate all the required segments there. This is not only a usability boost, but also a performance boost, since not all allocation requests will be turned into `malloc` calls.
> * `MemorySegment`
>   * Only one method added here - namely `handoff(NativeScope)` which allows a segment to be transferred onto an existing native scope.
>
> ### Safety
>
> The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle. For instance, the description of the native signature might be wrong (e.g. have too many arguments) - and the runtime has, in the general case, no way to detect such mismatches. For these reasons, obtaining a `CLinker` instance is a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as is the case for other restricted methods in the foreign memory API).
>
> ### Implementation changes
>
> The Java changes associated with `LibraryLookup` are relatively straightforward; the only interesting thing to note here is that library loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same restrictions which apply to JNI library loading (e.g. same library cannot be loaded by different classloaders).
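The `NativeScope` bullet in the API list above describes one backing block from which related allocations are handed out, so that not every request becomes a `malloc` call. As a rough, hedged illustration of that idea using only `java.nio`, the sketch below carves aligned slices out of a single direct buffer. `ArenaSketch`, `allocate` and `used` are hypothetical names, and the code models only a fixed-size block; it is not the incubator API.

```java
import java.nio.ByteBuffer;

// Hypothetical bump allocator over one direct buffer; NOT the NativeScope API.
final class ArenaSketch {
    private final ByteBuffer block; // the single backing allocation
    private int offset = 0;         // first free byte in the block

    ArenaSketch(int capacity) {
        this.block = ByteBuffer.allocateDirect(capacity);
    }

    // Returns a slice of the backing block; alignment must be a power of two.
    ByteBuffer allocate(int bytes, int alignment) {
        int start = (offset + alignment - 1) & -alignment; // round up
        if (start + bytes > block.capacity()) {
            throw new IllegalStateException("arena exhausted");
        }
        ByteBuffer view = block.duplicate();
        view.position(start);
        view.limit(start + bytes);
        offset = start + bytes;
        return view.slice(); // shares memory with the block, no new allocation
    }

    int used() {
        return offset;
    }

    public static void main(String[] args) {
        ArenaSketch arena = new ArenaSketch(64);
        ByteBuffer anInt = arena.allocate(4, 4);
        ByteBuffer aLong = arena.allocate(8, 8);
        anInt.putInt(0, 7);
        aLong.putLong(0, 42L);
        System.out.println("bytes used: " + arena.used());
    }
}
```

Handing out a slice is just index arithmetic on memory that already exists, which is why grouping allocations this way is cheaper than one native allocation per request.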
>
> As for `NativeScope` the changes are again relatively straightforward; it is an API which sits neatly on top of the foreign memory access API, providing some kind of allocation service which shares the same underlying memory segment(s), and turns an allocation request into a segment slice, which is a much less expensive operation. `NativeScope` comes in two variants: there are native scopes for which the allocation size is known a priori, and native scopes which can grow - these two schemes are implemented by two separate subclasses of `AbstractNativeScopeImpl`.
>
> Of course the bulk of the changes are to support the `CLinker` downcall/upcall routines. These changes cut pretty deep into the JVM; I'll briefly summarize the goal of some of these changes - for further details, Jorn has put together a detailed writeup which explains the rationale behind the VM support, with some references to the code [5].
>
> The main idea behind foreign linker is to infer, given a Java method type (expressed as a `MethodType` instance) and the description of the signature of a native function (expressed as a `FunctionDescriptor` instance) a _recipe_ that can be used to turn a Java call into the corresponding native call targeting the requested native function.
>
> This inference scheme can be defined in a pretty straightforward fashion by looking at the various ABI specifications (for instance, see [6] for the SysV ABI, which is the one used on Linux/Mac). The various `CallArranger` classes, of which we have a flavor for each supported platform, do exactly that kind of inference.
>
> For the inference process to work, we need to attach extra information to memory layouts; it is no longer sufficient to know e.g. that a layout is 32/64 bits - we need to know whether it is meant to represent a floating point value, or an integral value; this knowledge is required because floating point values are passed in different registers by most ABIs.
> For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which contain the required classification attributes (e.g. a `CLinker.TypeKind` enum value). The runtime extracts this attribute, and performs classification accordingly.
>
> A native call is decomposed into a sequence of basic, primitive operations, called `Binding` (see the great javadoc on the `Binding.java` class for more info). There are many such bindings - for instance the `Move` binding is used to move a value into a specific machine register/stack slot. So, the main job of the various `CallArranger` classes is to determine, given a Java `MethodType` and `FunctionDescriptor` what is the set of bindings associated with the downcall/upcall.
>
> At the heart of the foreign linker support is the `ProgrammableInvoker` class. This class effectively generates a `MethodHandle` which follows the steps described by the various bindings obtained by `CallArranger`. There are actually various strategies to interpret these bindings - listed below:
>
> * basic interpreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see `BindingInterpreter`), except for the `Move` bindings. For these bindings, the move is implemented by allocating a *buffer* (whose size is ABI specific) and by moving all the lowered values into positions within this buffer. The buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them in their expected registers/stack slots (note that each position in the buffer corresponds to a different register). This is the most general invocation mode, the more "customizable" one, but also the slowest - since for every call there is some extra allocation which takes place.
>
> * specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based interpreter, we generate a method handle chain which effectively interprets all the bindings (again, except `Move` ones).
>
> * intrinsified mode; this is typically used in combination with the specialized interpreted mode described above (although it can also be used with the Java-based binding interpreter). The goal here is to remove the buffer allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (e.g. the set of bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field), then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go for an intermediate buffer. This gives us back performance that is on par with JNI.
>
> For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there. We plan to add support for intrinsified modes there as well, which should considerably boost performance (probably well beyond what JNI can offer at the moment, since the upcall support in JNI is not very well optimized).
>
> Again, for more readings on the internals of the foreign linker support, please refer to [5].
>
> #### Test changes
>
> Many new tests have been added to validate the foreign linker support; we have high level tests (see `StdLibTest`) which aim at testing the linker from the perspective of code that clients could write. But we also have deeper combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI implementation.
> There are also some great tests (see the `callarranger` folder) which test the various `CallArranger`s for all the possible platforms; these tests adopt more of a white-box approach - that is, instead of treating the linker machinery as a black box and verifying that the support works by checking that the native call returned the results we expected, these tests aim at checking that the set of bindings generated by the call arranger is correct. This also means that we can test the classification logic for Windows, Mac and Linux regardless of the platform we're executing on.
>
> Some additional microbenchmarks have been added to compare the performance of downcall/upcall with JNI.
>
> [1] - https://openjdk.java.net/jeps/389
> [2] - https://openjdk.java.net/jeps/393
> [3] - https://git.openjdk.java.net/jdk/pull/548
> [4] - https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md
> [5] - http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html

Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 58 commits:

 - Merge branch '8254162' into 8254231_linker
 - Merge branch 'master' into 8254162
 - Fix issues with derived buffers and IO operations
 - More 32-bit fixes for TestLayouts
 - * Add final to MappedByteBuffer::SCOPED_MEMORY_ACCESS field
   * Tweak TestLayouts to make it 32-bit friendly after recent MemoryLayouts tweaks
 - Remove TestMismatch from 32-bit problem list
 - Merge branch 'master' into 8254162
 - Tweak javadoc for MemorySegment::mapFromPath
   Tweak alignment for long/double Java layouts on 32 bits platforms
 - Merge branch 'master' into 8254162
 - Address review comment for scoped memory access makefile
 - ...
and 48 more: https://git.openjdk.java.net/jdk/compare/e48016b1...4a2c2240

-------------

Changes: https://git.openjdk.java.net/jdk/pull/634/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=12
Stats: 75299 lines in 271 files changed: 72395 ins; 1615 del; 1289 mod
Patch: https://git.openjdk.java.net/jdk/pull/634.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/634/head:pull/634

PR: https://git.openjdk.java.net/jdk/pull/634

From zgu at openjdk.java.net Fri Oct 30 12:14:55 2020
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Fri, 30 Oct 2020 12:14:55 GMT
Subject: RFR: 8255606: Enable concurrent stack processing on x86_32 platforms [v2]
In-Reply-To: References: Message-ID:

> 8255606: Enable concurrent stack processing on x86_32 platforms

Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision:

  Fix jump direction

-------------

Changes:
 - all: https://git.openjdk.java.net/jdk/pull/945/files
 - new: https://git.openjdk.java.net/jdk/pull/945/files/215f7f7d..c946b816

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=945&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=945&range=00-01

Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod
Patch: https://git.openjdk.java.net/jdk/pull/945.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/945/head:pull/945

PR: https://git.openjdk.java.net/jdk/pull/945

From mcimadamore at openjdk.java.net Fri Oct 30 12:16:02 2020
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Fri, 30 Oct 2020 12:16:02 GMT
Subject: RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v14]
In-Reply-To: References: Message-ID:

> This patch contains the changes associated with the first incubation round of the foreign linker access API incubation (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and associated pull request [3]).
>
> The main goal of this API is to provide a way to call native functions from Java code without the need of intermediate JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. I suggest reading the writeup [4] I put together a few weeks ago, which illustrates what the foreign linker support is, and how it should be used by clients.
>
> Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews. For this reason, I'm attaching a webrev which contains only the differences between this PR and the memory access PR. I will be periodically uploading new webrevs, as new iterations come out, to try and make the life of reviewers as simple as possible.
>
> A big thank you to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the hotspot changes you see here, and without their help, the foreign linker support wouldn't be what it is today. As usual, a big thank you to Paul Sandoz, who provided many insights (often by trying the bits first hand).
>
> Thanks
> Maurizio
>
> Webrev:
> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev
>
> Javadoc:
>
> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html
>
> Specdiff (relative to [3]):
>
> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html
>
> CSR:
>
> https://bugs.openjdk.java.net/browse/JDK-8254232
>
> ### API Changes
>
> The API changes are actually rather slim:
>
> * `LibraryLookup`
>   * This class allows clients to look up symbols in native libraries; the interface is fairly simple; you can load a library by name, or absolute path, and then look up symbols on that library.
> * `FunctionDescriptor`
>   * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory layouts for the function arguments/return type. A function descriptor is used to describe the signature of a native function.
> * `CLinker`
>   * This is the real star of the show. A `CLinker` has two main methods: `downcallHandle` and `upcallStub`; the first takes a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a `MethodHandle` instance which can be used to call the target native symbol. The second takes an existing method handle, and a `FunctionDescriptor` and returns a new `MemorySegment` corresponding to a code stub allocated by the VM which acts as a trampoline from native code to the user-provided method handle. This is very useful for implementing upcalls.
>   * This class also contains the various layout constants that should be used by clients when describing native signatures (e.g. `C_LONG` and friends); these layouts contain additional ABI classification information (in the form of layout attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the native call to take place.
>   * Finally, this class provides some helper functions e.g. so that clients can convert Java strings into C strings and back.
> * `NativeScope`
>   * This is a helper class which allows clients to group together logically related allocations; that is, rather than allocating separate memory segments using separate *try-with-resource* constructs, a `NativeScope` allows clients to use a _single_ block, and allocate all the required segments there. This is not only a usability boost, but also a performance boost, since not all allocation requests will be turned into `malloc` calls.
> * `MemorySegment`
>   * Only one method added here - namely `handoff(NativeScope)` which allows a segment to be transferred onto an existing native scope.
>
> ### Safety
>
> The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle. For instance, the description of the native signature might be wrong (e.g.
have too many arguments) - and the runtime has, in the general case, no way to detect such mismatches. For these reasons, obtaining a `CLinker` instance is a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as is the case for other restricted methods in the foreign memory API). > > ### Implementation changes > > The Java changes associated with `LibraryLookup` are relatively straightforward; the only interesting thing to note here is that library loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same restrictions which apply to JNI library loading (e.g. the same library cannot be loaded by different classloaders). > > As for `NativeScope` the changes are again relatively straightforward; it is an API which sits neatly on top of the foreign memory access API, providing some kind of allocation service which shares the same underlying memory segment(s), and turns an allocation request into a segment slice, which is a much less expensive operation. `NativeScope` comes in two variants: there are native scopes for which the allocation size is known a priori, and native scopes which can grow - these two schemes are implemented by two separate subclasses of `AbstractNativeScopeImpl`. > > Of course the bulk of the changes are to support the `CLinker` downcall/upcall routines. These changes cut pretty deep into the JVM; I'll briefly summarize the goal of some of these changes - for further details, Jorn has put together a detailed writeup which explains the rationale behind the VM support, with some references to the code [5]. > > The main idea behind the foreign linker is to infer, given a Java method type (expressed as a `MethodType` instance) and the description of the signature of a native function (expressed as a `FunctionDescriptor` instance) a _recipe_ that can be used to turn a Java call into the corresponding native call targeting the requested native function.
> > This inference scheme can be defined in a pretty straightforward fashion by looking at the various ABI specifications (for instance, see [6] for the SysV ABI, which is the one used on Linux/Mac). The various `CallArranger` classes, of which we have a flavor for each supported platform, do exactly that kind of inference. > > For the inference process to work, we need to attach extra information to memory layouts; it is no longer sufficient to know e.g. that a layout is 32/64 bits - we need to know whether it is meant to represent a floating point value, or an integral value; this knowledge is required because floating point values are passed in different registers by most ABIs. For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which contain the required classification attributes (e.g. a `CLinker.TypeKind` enum value). The runtime extracts this attribute, and performs classification accordingly. > > A native call is decomposed into a sequence of basic, primitive operations, called `Binding` (see the great javadoc on the `Binding.java` class for more info). There are many such bindings - for instance the `Move` binding is used to move a value into a specific machine register/stack slot. So, the main job of the various `CallArranger` classes is to determine, given a Java `MethodType` and `FunctionDescriptor`, the set of bindings associated with the downcall/upcall. > > At the heart of the foreign linker support is the `ProgrammableInvoker` class. This class effectively generates a `MethodHandle` which follows the steps described by the various bindings obtained by `CallArranger`. There are actually various strategies to interpret these bindings - listed below: > > * basic interpreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see `BindingInterpreter`), except for the `Move` bindings.
For these bindings, the move is implemented by allocating a *buffer* (whose size is ABI specific) and by moving all the lowered values into positions within this buffer. The buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them into their expected registers/stack slots (note that each position in the buffer corresponds to a different register). This is the most general invocation mode, the most "customizable" one, but also the slowest - since for every call there is some extra allocation which takes place. > > * specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based interpreter, we generate a method handle chain which effectively interprets all the bindings (again, except `Move` ones). > > * intrinsified mode; this is typically used in combination with the specialized interpreted mode described above (although it can also be used with the Java-based binding interpreter). The goal here is to remove the buffer allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (e.g. the set of bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field), then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go through an intermediate buffer. This gives us back performance that is on par with JNI. > > For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there. We plan to add support for intrinsified modes there as well, which should considerably boost performance (probably well beyond what JNI can offer at the moment, since the upcall support in JNI is not very well optimized). > > Again, for further reading on the internals of the foreign linker support, please refer to [5].
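The buffer-based handling of `Move` bindings in the basic interpreted mode can be sketched roughly as follows. This is only an illustration under stated assumptions: `MoveBufferSketch`, `Move`, `fillBuffer` and the 6 + 8 slot layout are hypothetical names and numbers picked for the example, not the actual `BindingInterpreter` or `ProgrammableInvoker` code, and the real hand-off to the VM assembly stub is not modeled at all.

```java
import java.util.List;

final class MoveBufferSketch {
    // Hypothetical buffer layout: 6 integer-register slots followed by
    // 8 vector-register slots, one 64-bit word each. Real layouts are ABI specific.
    static final int INT_SLOTS = 6;
    static final int VEC_SLOTS = 8;

    /** Hypothetical stand-in for a Move binding: copy argument argIndex into buffer slot. */
    static final class Move {
        final int argIndex;
        final int slot;

        Move(int argIndex, int slot) {
            this.argIndex = argIndex;
            this.slot = slot;
        }
    }

    /** Interprets the Move bindings by writing each argument's raw bits into its slot. */
    static long[] fillBuffer(List<Move> moves, Object... args) {
        // This per-call allocation is the cost the intrinsified mode tries to remove.
        long[] buffer = new long[INT_SLOTS + VEC_SLOTS];
        for (Move m : moves) {
            Object arg = args[m.argIndex];
            if (arg instanceof Double) {
                buffer[m.slot] = Double.doubleToRawLongBits((Double) arg);
            } else if (arg instanceof Long) {
                buffer[m.slot] = (Long) arg;
            } else {
                throw new IllegalArgumentException("unsupported argument: " + arg);
            }
        }
        return buffer;
    }

    public static void main(String[] args) {
        // Recipe for a native function taking (long, double):
        // first integer slot, then first vector slot.
        List<Move> recipe = List.of(new Move(0, 0), new Move(1, INT_SLOTS));
        long[] buffer = fillBuffer(recipe, 42L, 1.5);
        System.out.println(buffer[0]);                                  // 42
        System.out.println(Double.longBitsToDouble(buffer[INT_SLOTS])); // 1.5
    }
}
```

In this toy model each buffer position stands for one register, which mirrors the note above that every position in the buffer corresponds to a different register; a constant recipe such as `recipe` is what would make the intrinsified mode applicable.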
> > #### Test changes > > Many new tests have been added to validate the foreign linker support; we have high level tests (see `StdLibTest`) which aim at testing the linker from the perspective of code that clients could write. But we also have deeper combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI implementation. There are also some great tests (see the `callarranger` folder) which test the various `CallArranger`s for all the possible platforms; these tests adopt more of a white-box approach - that is, instead of treating the linker machinery as a black box and verifying that the support works by checking that the native call returned the results we expected, these tests aim at checking that the set of bindings generated by the call arranger is correct. This also means that we can test the classification logic for Windows, Mac and Linux regardless of the platform we're executing on. > > Some additional microbenchmarks have been added to compare the performance of downcall/upcall with JNI.
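To illustrate why white-box classification tests can run on any host platform, here is a minimal sketch that assigns argument registers the way the SysV x86-64 and Windows x64 calling conventions do for simple scalar arguments. `ClassificationSketch`, `sysv` and `windows` are hypothetical names; the sketch ignores stack spilling, structs and return values, and is not the code of the actual `CallArranger` classes or of the tests in the `callarranger` folder.

```java
import java.util.ArrayList;
import java.util.List;

final class ClassificationSketch {
    static final String[] SYSV_INT = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"};
    static final String[] SYSV_VEC = {"xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7"};
    static final String[] WIN_INT = {"rcx", "rdx", "r8", "r9"};
    static final String[] WIN_VEC = {"xmm0", "xmm1", "xmm2", "xmm3"};

    static boolean isFloatingPoint(Class<?> type) {
        return type == double.class || type == float.class;
    }

    /** SysV style: integer and floating point arguments use independent register sequences. */
    static List<String> sysv(Class<?>... params) {
        List<String> registers = new ArrayList<>();
        int nextInt = 0;
        int nextVec = 0;
        for (Class<?> p : params) {
            // Only register-passed arguments are modeled; more would overflow the arrays.
            registers.add(isFloatingPoint(p) ? SYSV_VEC[nextVec++] : SYSV_INT[nextInt++]);
        }
        return registers;
    }

    /** Windows x64 style: the argument position selects the register in either bank. */
    static List<String> windows(Class<?>... params) {
        List<String> registers = new ArrayList<>();
        for (int i = 0; i < params.length; i++) {
            registers.add(isFloatingPoint(params[i]) ? WIN_VEC[i] : WIN_INT[i]);
        }
        return registers;
    }

    public static void main(String[] args) {
        // The same signature classifies differently per ABI, and both results
        // can be checked on whatever machine the test happens to run on.
        System.out.println(sysv(long.class, double.class, long.class));    // [rdi, xmm0, rsi]
        System.out.println(windows(long.class, double.class, long.class)); // [rcx, xmm1, r8]
    }
}
```

Because the check is on the computed register assignment rather than on the result of a real native call, no native library for the target platform is needed.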
> > [1] - https://openjdk.java.net/jeps/389 > [2] - https://openjdk.java.net/jeps/393 > [3] - https://git.openjdk.java.net/jdk/pull/548 > [4] - https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md > [5] - http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Fix typo in upcall helper for aarch64 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/634/files - new: https://git.openjdk.java.net/jdk/pull/634/files/4a2c2240..98718866 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=13 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=634&range=12-13 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/634.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/634/head:pull/634 PR: https://git.openjdk.java.net/jdk/pull/634 From coleenp at openjdk.java.net Fri Oct 30 12:33:58 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 30 Oct 2020 12:33:58 GMT Subject: RFR: 8255452: Doing GC during JVMTI MethodExit event posting breaks return oop In-Reply-To: References: Message-ID: On Thu, 29 Oct 2020 22:34:30 GMT, Erik Österlund wrote: >> src/hotspot/share/prims/jvmtiExport.cpp line 1600: >> >>> 1598: >>> 1599: if (exception_exit) { >>> 1600: post_method_exit_inner(thread, mh, state, exception_exit, current_frame, result, value); >> >> I think for exception exit, you also need JRT_BLOCK because you want the transition to thread_in_VM for this code, since JRT_BLOCK_ENTRY doesn't do the transition. It should be safe for exception exit and retain the old behavior. > > Thanks for having a look coleen. In fact, not doing the JRT_BLOCK for the exception entry is intentional, because that entry goes through a different JRT_ENTRY (not JRT_BLOCK_ENTRY), that already transitions.
So if I do the JRT_BLOCK for the exception path, it asserts saying hey you are already in VM. Oh that's actually horrible. I wonder if it's possible to hoist saving the result oop into the InterpreterRuntime entry. And pass the Handle into JvmtiExport::post_method_exit(). ------------- PR: https://git.openjdk.java.net/jdk/pull/930 From zgu at openjdk.java.net Fri Oct 30 13:12:05 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Fri, 30 Oct 2020 13:12:05 GMT Subject: RFR: 8255614: Shenandoah: Consolidate/streamline runtime LRBs In-Reply-To: <3CaMGD6aXUxD8SxDhn1XxU-c0I1NM6r-iosxLLScZT0=.8a0e96ba-17a8-4d59-bcac-7febc85970a1@github.com> References: <3CaMGD6aXUxD8SxDhn1XxU-c0I1NM6r-iosxLLScZT0=.8a0e96ba-17a8-4d59-bcac-7febc85970a1@github.com> Message-ID: On Fri, 30 Oct 2020 11:06:54 GMT, Roman Kennke wrote: > Currently, our various LRB entry points are a mess, and quite inefficient too. > - We have three entry points, one is checking for null, and calls a non-null version, but that checks for null again > - We don't have to check for null at all: it can be subsumed in the cset-check > - The LRB resolves forwardee even though has_forwarded_objects() and in_cset() has not been checked > - The LRB entry is not inlineable > > The proposed change coalesces the 3 entries into one, moves it to shenandoahBarrierSet.inline.hpp and make it inlineable, rearranges the impl to allow cset-check to subsume the NULL-check. As a bonus, it pushes the NULL-check around keep-alive down after the (compile-time) check for weak-ref, so that this path becomes a no-op in the majority of cases. 
src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.inline.hpp line 83: > 81: return obj; > 82: } > 83: if (_heap->has_forwarded_objects() && Please add an assertion for obj != NULL src/hotspot/share/gc/shenandoah/shenandoahCollectionSet.inline.hpp line 49: > 47: bool ShenandoahCollectionSet::is_in_loc(void* p) const { > 48: assert(p == NULL || _heap->is_in(p), "Must be in the heap"); > 49: uintx index = ((uintx) p) >> _region_size_bytes_shift; Is this right? if heap is not zero-based ------------- PR: https://git.openjdk.java.net/jdk/pull/953 From rkennke at openjdk.java.net Fri Oct 30 13:33:12 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 30 Oct 2020 13:33:12 GMT Subject: RFR: 8255614: Shenandoah: Consolidate/streamline runtime LRBs [v2] In-Reply-To: References: <3CaMGD6aXUxD8SxDhn1XxU-c0I1NM6r-iosxLLScZT0=.8a0e96ba-17a8-4d59-bcac-7febc85970a1@github.com> Message-ID: On Fri, 30 Oct 2020 13:05:23 GMT, Zhengyu Gu wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Add null-check > > src/hotspot/share/gc/shenandoah/shenandoahCollectionSet.inline.hpp line 49: > >> 47: bool ShenandoahCollectionSet::is_in_loc(void* p) const { >> 48: assert(p == NULL || _heap->is_in(p), "Must be in the heap"); >> 49: uintx index = ((uintx) p) >> _region_size_bytes_shift; > > Is this right? if heap is not zero-based Yes. The biased cset-map is allocated such that NULL object maps to a special page that always yields false when checking in_cset(NULL). We use that same technique in JIT-compiled code to avoid explicit NULL-checks. 
------------- PR: https://git.openjdk.java.net/jdk/pull/953 From rkennke at openjdk.java.net Fri Oct 30 13:33:11 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 30 Oct 2020 13:33:11 GMT Subject: RFR: 8255614: Shenandoah: Consolidate/streamline runtime LRBs [v2] In-Reply-To: <3CaMGD6aXUxD8SxDhn1XxU-c0I1NM6r-iosxLLScZT0=.8a0e96ba-17a8-4d59-bcac-7febc85970a1@github.com> References: <3CaMGD6aXUxD8SxDhn1XxU-c0I1NM6r-iosxLLScZT0=.8a0e96ba-17a8-4d59-bcac-7febc85970a1@github.com> Message-ID: > Currently, our various LRB entry points are a mess, and quite inefficient too. > - We have three entry points, one is checking for null, and calls a non-null version, but that checks for null again > - We don't have to check for null at all: it can be subsumed in the cset-check > - The LRB resolves forwardee even though has_forwarded_objects() and in_cset() has not been checked > - The LRB entry is not inlineable > > The proposed change coalesces the 3 entries into one, moves it to shenandoahBarrierSet.inline.hpp and make it inlineable, rearranges the impl to allow cset-check to subsume the NULL-check. As a bonus, it pushes the NULL-check around keep-alive down after the (compile-time) check for weak-ref, so that this path becomes a no-op in the majority of cases. 
Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Add null-check ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/953/files - new: https://git.openjdk.java.net/jdk/pull/953/files/f2b87253..99db8d30 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=953&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=953&range=00-01 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/953.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/953/head:pull/953 PR: https://git.openjdk.java.net/jdk/pull/953 From zgu at openjdk.java.net Fri Oct 30 14:12:59 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Fri, 30 Oct 2020 14:12:59 GMT Subject: RFR: 8255614: Shenandoah: Consolidate/streamline runtime LRBs [v2] In-Reply-To: References: <3CaMGD6aXUxD8SxDhn1XxU-c0I1NM6r-iosxLLScZT0=.8a0e96ba-17a8-4d59-bcac-7febc85970a1@github.com> Message-ID: On Fri, 30 Oct 2020 13:33:11 GMT, Roman Kennke wrote: >> Currently, our various LRB entry points are a mess, and quite inefficient too. >> - We have three entry points, one is checking for null, and calls a non-null version, but that checks for null again >> - We don't have to check for null at all: it can be subsumed in the cset-check >> - The LRB resolves forwardee even though has_forwarded_objects() and in_cset() has not been checked >> - The LRB entry is not inlineable >> >> The proposed change coalesces the 3 entries into one, moves it to shenandoahBarrierSet.inline.hpp and make it inlineable, rearranges the impl to allow cset-check to subsume the NULL-check. As a bonus, it pushes the NULL-check around keep-alive down after the (compile-time) check for weak-ref, so that this path becomes a no-op in the majority of cases. > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Add null-check Marked as reviewed by zgu (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/953 From eosterlund at openjdk.java.net Fri Oct 30 14:12:57 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 30 Oct 2020 14:12:57 GMT Subject: RFR: 8255452: Doing GC during JVMTI MethodExit event posting breaks return oop In-Reply-To: References: Message-ID: On Fri, 30 Oct 2020 00:58:06 GMT, Coleen Phillimore wrote: >> Thanks for having a look coleen. In fact, not doing the JRT_BLOCK for the exception entry is intentional, because that entry goes through a different JRT_ENTRY (not JRT_BLOCK_ENTRY), that already transitions. So if I do the JRT_BLOCK for the exception path, it asserts saying hey you are already in VM. > > Oh that's actually horrible. I wonder if it's possible to hoist saving the result oop into the InterpreterRuntime entry. And pass the Handle into JvmtiExport::post_method_exit(). I tried that first, and ended up with a bunch of non-trivial code duplication instead, as reading the oop is done in both paths but for different reasons. One to preserve/restore it (interpreter remove_activation entry), but also inside of JvmtiExport::post_method_exit() so that it can be passed into the MethodExit. I will give it another shot and see if it is possible to refactor it in a better way. ------------- PR: https://git.openjdk.java.net/jdk/pull/930 From eosterlund at openjdk.java.net Fri Oct 30 14:23:55 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 30 Oct 2020 14:23:55 GMT Subject: RFR: 8255452: Doing GC during JVMTI MethodExit event posting breaks return oop In-Reply-To: References: Message-ID: On Fri, 30 Oct 2020 08:49:08 GMT, Dean Long wrote: >> The imasm::remove_activation() call does not deal with safepoints very well. However, when the MethodExit JVMTI event is being called, we call into the runtime in the middle of remove_activation(). 
If the value being returned is an object type, then the top-of-stack contains the oop. However, the GC does not traverse said oop in any oop map, because it is simply not expected that we safepoint in the middle of remove_activation(). >> >> The JvmtiExport::post_method_exit() function we end up calling, reads the top-of-stack oop, and puts it in a handle. Then it calls JVMTI callbacks, that eventually call Java and a bunch of stuff that safepoints. So after the JVMTI callback, we can expect the top-of-stack oop to be broken. Unfortunately, when we continue, we therefore end up returning a broken oop. >> >> Notably, the fact that InterpreterRuntime::post_method_exit is a JRT_ENTRY, is wrong, as we can safepoint on the way back to Java, which will break the return oop in a similar way. So this patch makes it a JRT_BLOCK_ENTRY, moving the transition to VM and back, into a block of code that is protected against GC. Before the JRT_BLOCK is called, we stash away the return oop, and after the JRT_BLOCK_END, we restore the top-of-stack oop. In the path when InterpreterRuntime::post_method_exit is called when throwing an exception, we don't have the same problem of retaining an oop result, and hence the JRT_BLOCK/JRT_BLOCK_END section is not performed in this case; the logic is the same as before for this path. >> >> This is a JVMTI bug that has probably been around for a long time. It crashes with all GCs, but was discovered recently after concurrent stack processing, as StefanK has been running better GC stressing code in JVMTI, and the bug reproduced more easily with concurrent stack processing, as the timings were a bit different. 
The following reproducer failed pretty much 100% of the time: >> while true; do make test JTREG="RETAIN=all" TEST=test/hotspot/jtreg/vmTestbase/nsk/jdi/MethodExitEvent/returnValue/returnValue003/returnValue003.java TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews -Xint" ; done >> >> With my fix I can run this repeatedly without any more failures. I have also sanity checked the patch by running tier 1-5, to make sure that it does not introduce any new issues on its own. I have also used Stefan's nice external GC stressing with jcmd technique that was used to trigger crashes with other GCs, to make sure said crashes no longer reproduce either. > > Marked as reviewed by dlong (Reviewer). > I think you've discovered JDK-6449023. And your fix looks like the workaround I tried: > https://bugs.openjdk.java.net/browse/JDK-6449023?focusedCommentId=14206078&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14206078 Oh wow. I had no idea people have been having issues with this since 2009! Thanks for the pointer. Well, let's hope we can finally close it now after marinating the bug for 11 years. ------------- PR: https://git.openjdk.java.net/jdk/pull/930 From kbarrett at openjdk.java.net Fri Oct 30 14:29:59 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 30 Oct 2020 14:29:59 GMT Subject: RFR: 8255596: Mutex safepoint checking options and flags should be scoped enums Message-ID: Please review this change to some enums in the Mutex class. SafepointCheckFlag and SafepointCheckRequired are changed to scoped enums. Also removed the anonymous enum defining _allow_vm_block_flag and _as_suspend_equivalent_flag, instead defining those as bool constants. To avoid changing all references to the SafepointCheckXXX enumerators (due to the additional scoping introduced by using scoped enums), same named constants are defined at Mutex class scope.
Some renaming might be preferable in the long term, but I didn't want to do that just to get the improved type checking. An X-macro approach to defining the enumerators and hoisting them into class scope could have been taken, but the number of enumerators here doesn't seem to warrant the additional infrastructure to do so. Changing the enum types uncovered a few places in the implementation of Mutex and MutexLocker where enum values were being implicitly converted to bool, with associated assumptions about the order or values of the enumerators. Those have been fixed. Testing: tier1 ------------- Commit messages: - fix assert messages - use scoped enums Changes: https://git.openjdk.java.net/jdk/pull/957/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=957&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255596 Stats: 23 lines in 3 files changed: 12 ins; 2 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/957.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/957/head:pull/957 PR: https://git.openjdk.java.net/jdk/pull/957 From tschatzl at openjdk.java.net Fri Oct 30 15:29:54 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 30 Oct 2020 15:29:54 GMT Subject: RFR: 8255596: Mutex safepoint checking options and flags should be scoped enums In-Reply-To: References: Message-ID: On Fri, 30 Oct 2020 14:20:25 GMT, Kim Barrett wrote: > Please review this change to some enums in the Mutex class. > SafepointCheckFlag and SafepointCheckRequired are changed to scoped > enums. Also removed the anonymous enum defining _allow_vm_block_flag > and _as_suspend_equivalent_flag, instead defining those as bool > constants. > > To avoid changing all references to the SafepointCheckXXX enumerators > (due to the additional scoping introduced by using scoped enums), same > named constants are defined at Mutex class scope. 
Some renaming might > be preferable in the long term, but I didn't want to do that just to > get the improved type checking. An X-macro approach to defining the > enumerators and hoisting them into class scope could have been taken, > but the number of enumerators here doesn't seem to warrant the additional > infrastructure to do so. > > Changing the enum types uncovered a few places in the implementation > of Mutex and MutexLocker where enum values were being implicitly > converted to bool, with associated assumptions about the order or > values of the enumerators. Those have been fixed. > > Testing: > tier1 Lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/957 From gziemski at openjdk.java.net Fri Oct 30 15:56:09 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Fri, 30 Oct 2020 15:56:09 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) [v4] In-Reply-To: References: Message-ID: > hi all, > > Please review this simple fix for POSIX platforms, which addresses a time out that occurs while handling a crash with UseOSErrorReporting turned ON. > > It appears that "UseOSErrorReporting" flag was only ever meant to be used on Windows platform and was mistakenly left available for other platforms. In this fix we make sure to only use the flag on Windows platform and make it a NOP for other platforms. > > Note #1: A similar hang issue occurs today even on Windows, with the only difference being that before a process times out (takes 2 minutes) it runs out of stack space in about 250 loops, so that's the only reason it doesn't linger for that long. 
Windows issue is tracked separately by https://bugs.openjdk.java.net/browse/JDK-8250782 > > Note #2: Creating a native crash log (on macOS) is a non-trivial, research-wise effort that is tracked by https://bugs.openjdk.java.net/browse/JDK-8237727 > > Note #3: Removal of the "UseOSErrorReporting" flag will depend on whether we can do #2; at that time we can decide whether to keep it and implement it for other platforms, or to remove it if #2 cannot be done reliably. Gerard Ziemski has updated the pull request incrementally with two additional commits since the last revision: - Fixed one more leftover UseOsErrorReporting to UseOSErrorReporting - last tweaks and fixes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/813/files - new: https://git.openjdk.java.net/jdk/pull/813/files/b849b3c4..e21976a0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=813&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=813&range=02-03 Stats: 8 lines in 3 files changed: 0 ins; 4 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/813.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/813/head:pull/813 PR: https://git.openjdk.java.net/jdk/pull/813 From eosterlund at openjdk.java.net Fri Oct 30 16:05:55 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 30 Oct 2020 16:05:55 GMT Subject: RFR: 8255452: Doing GC during JVMTI MethodExit event posting breaks return oop In-Reply-To: References: Message-ID: On Fri, 30 Oct 2020 14:20:42 GMT, Erik Österlund wrote: >> Marked as reviewed by dlong (Reviewer). > >> I think you've discovered JDK-6449023. And your fix looks like the workaround I tried: >> https://bugs.openjdk.java.net/browse/JDK-6449023?focusedCommentId=14206078&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14206078 > > Oh wow. I had no idea people have been having issues with this since 2009! Thanks for the pointer.
Well, let's hope we can finally close it now after marinating the bug for 11 years. > Hi Erik, > > is it possible for GC to mistake a primitive value for a reference when posting the exit event? > > My understanding is: we are at a random bci of a method that is forced to return early. The expression stack is emptied and the return value is pushed on the expression stack then we call into the interpreter runtime to post the JVMTI method exit event during which we come to a safepoint for GC. The oop map for the bci does not cover this forced early return and if the return value is an object then the reference pushed on the expression stack before is not updated by GC. With your fix the value is updated if it is a reference. > > If this is correct then to me it appears as if GC can also crash because the oop map for the random bci tells there has to be a reference at the stack position of the return value if it actually is a primitive value. I think what you are saying is true. Note though that the return value of ForceEarlyReturn is installed with a handshake. The handshake polls of the interpreter are emitted in loop backedges and returns. At loop backedges, the expression stack is empty (required by OSR), and at returns the types match correctly. However, if an arbitrary bytecode performs a runtime call with call_VM() while the bottom of the expression stack is an oop, then I think there is an issue. At that call_VM, the early return value could get installed, and when the C++ function returns, we check for early returns, further dispatching to an unwind routine that posts the MethodExit notification. If we GC during this MethodExit notification, then I think you can crash the GC. The GC code generates an oop map for the frame, checking what the types in the expression stack should be. The early return int is pushed on the slot intersecting with the bottom entry in the expression stack. 
That bottom entry could be an oop, and the early return value could be an int. Then the early return int will be passed to the oop closure, which should result in a crash. So I suspect that almost always, the ForceEarlyReturn value is installed with a handshake at a bytecode backedge or at a return, where the interpreter safepoint polls for the fast path code. Then you won't notice the issue. But in the rare scenario that the ForceEarlyReturn value is installed in a slow path call from a random bytecode... I can't see how that would work correctly. ------------- PR: https://git.openjdk.java.net/jdk/pull/930 From eosterlund at openjdk.java.net Fri Oct 30 16:40:53 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 30 Oct 2020 16:40:53 GMT Subject: RFR: 8255452: Doing GC during JVMTI MethodExit event posting breaks return oop In-Reply-To: References: Message-ID: On Fri, 30 Oct 2020 16:02:41 GMT, Erik Österlund wrote: > > Hi Erik, > > is it possible for GC to mistake a primitive value for a reference when posting the exit event? > > My understanding is: we are at a random bci of a method that is forced to return early. The expression stack is emptied and the return value is pushed on the expression stack then we call into the interpreter runtime to post the JVMTI method exit event during which we come to a safepoint for GC. The oop map for the bci does not cover this forced early return and if the return value is an object then the reference pushed on the expression stack before is not updated by GC. With your fix the value is updated if it is a reference. > > If this is correct then to me it appears as if GC can also crash because the oop map for the random bci tells there has to be a reference at the stack position of the return value if it actually is a primitive value. > > I think what you are saying is true. Note though that the return value of ForceEarlyReturn is installed with a handshake.
The handshake polls of the interpreter are emitted in loop backedges and returns. At loop backedges, the expression stack is empty (required by OSR), and at returns the types match correctly. However, if an arbitrary bytecode performs a runtime call with call_VM() while the bottom of the expression stack is an oop, then I think there is an issue. At that call_VM, the early return value could get installed, and when the C++ function returns, we check for early returns, further dispatching to an unwind routine that posts the MethodExit notification. If we GC during this MethodExit notification, then I think you can crash the GC. The GC code generates an oop map for the frame, checking what the types in the expression stack should be. The early return int is pushed on the slot intersecting with the bottom entry in the expression stack. That bottom entry could be an oop, and the early return value could be an int. Then the early return int will be passed to the oop closure, which should result in a crash. > > So I suspect that almost always, the handshake installing the ForceEarlyReturn value is installed with a handshake in a bytecode backedge or at a return, where the interpreter safepoint polls for the fast path code. Then you won't notice the issue. But in the rare scenario that the ForceEarlyReturn value is installed in a slow path call from a random bytecode... I can't see how that would work correctly. Looking more closely, I gotta say I have no idea why there is a call to clear the expression stack at all in TemplateInterpreterGenerator::generate_earlyret_entry_for(). It is unclear to me what problem if any that solves. It does however seem like it introduces this problem of having a forced int return value intersect with an oop in the expression stack for a BCI performing slow-path calls into the VM. Simply removing the code that clears the expression stack, removes the issue, and I can't see that it introduces any other issue. 
Anyway, I think this is a separate bug. Do you mind if I push the fix for that bug, as a different RFR? It will likely involve poking around at more platform dependent code. ------------- PR: https://git.openjdk.java.net/jdk/pull/930 From stuefe at openjdk.java.net Fri Oct 30 16:55:54 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 30 Oct 2020 16:55:54 GMT Subject: RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) [v3] In-Reply-To: <3D8q_g6SYuSs4guk3x1SVKxinDjSuh3J9Oa8CSbs8tQ=.11c0f642-dcd2-4b5f-aca9-64efcba604c3@github.com> References: <6rybIs1odojqcKQ6zzl39wj2IxGxnMCVQXgpeajzqns=.64424fc9-74d7-4e2b-9d89-df6f7da44770@github.com> <3D8q_g6SYuSs4guk3x1SVKxinDjSuh3J9Oa8CSbs8tQ=.11c0f642-dcd2-4b5f-aca9-64efcba604c3@github.com> Message-ID: On Thu, 29 Oct 2020 05:05:21 GMT, David Holmes wrote: >> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: >> >> make UseOSErrorReporting flag Windows only > > Looks fine to me, but will require a trivial CSR request. Looks all still good to me. Thank you for doing this! ------------- PR: https://git.openjdk.java.net/jdk/pull/813 From rkennke at openjdk.java.net Fri Oct 30 17:15:56 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 30 Oct 2020 17:15:56 GMT Subject: Integrated: 8255614: Shenandoah: Consolidate/streamline runtime LRBs In-Reply-To: <3CaMGD6aXUxD8SxDhn1XxU-c0I1NM6r-iosxLLScZT0=.8a0e96ba-17a8-4d59-bcac-7febc85970a1@github.com> References: <3CaMGD6aXUxD8SxDhn1XxU-c0I1NM6r-iosxLLScZT0=.8a0e96ba-17a8-4d59-bcac-7febc85970a1@github.com> Message-ID: On Fri, 30 Oct 2020 11:06:54 GMT, Roman Kennke wrote: > Currently, our various LRB entry points are a mess, and quite inefficient too. 
> - We have three entry points, one is checking for null, and calls a non-null version, but that checks for null again > - We don't have to check for null at all: it can be subsumed in the cset-check > - The LRB resolves forwardee even though has_forwarded_objects() and in_cset() has not been checked > - The LRB entry is not inlineable > > The proposed change coalesces the 3 entries into one, moves it to shenandoahBarrierSet.inline.hpp and make it inlineable, rearranges the impl to allow cset-check to subsume the NULL-check. As a bonus, it pushes the NULL-check around keep-alive down after the (compile-time) check for weak-ref, so that this path becomes a no-op in the majority of cases. This pull request has now been integrated. Changeset: 8600d0d9 Author: Roman Kennke URL: https://git.openjdk.java.net/jdk/commit/8600d0d9 Stats: 106 lines in 7 files changed: 37 ins; 51 del; 18 mod 8255614: Shenandoah: Consolidate/streamline runtime LRBs Reviewed-by: zgu ------------- PR: https://git.openjdk.java.net/jdk/pull/953 From kvn at openjdk.java.net Fri Oct 30 17:47:06 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 30 Oct 2020 17:47:06 GMT Subject: RFR: 8255616: Disable AOT and Graal in Oracle OpenJDK Message-ID: We shipped Ahead-of-Time compilation (the jaotc tool) in JDK 9, as an experimental feature. We shipped Graal as an experimental JIT compiler in JDK 10. We haven't seen much use of these features, and the effort required to support and enhance them is significant. We therefore intend to disable these features in Oracle builds as of JDK 16. We'll leave the sources for these features in the repository, in case any one else is interested in building them. But we will not update or test them. We'll continue to build and ship JVMCI as an experimental feature in Oracle builds. Tested changes in all tiers. 
I verified that with these changes I am still able to build Graal in the open repo and run graalunit testing: `open$ bash test/hotspot/jtreg/compiler/graalunit/downloadLibs.sh /mydir/graalunit_lib/` `open$ bash configure --with-debug-level=fastdebug --with-graalunit-lib=/mydir/graalunit_lib/ --with-jtreg=/mydir/jtreg` `open$ make jdk-image` `open$ make test-image` `open$ make run-test TEST=compiler/graalunit/HotspotTest.java` ------------- Commit messages: - 8255616: Disable AOT and Graal in Oracle OpenJDK Changes: https://git.openjdk.java.net/jdk/pull/960/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=960&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255616 Stats: 36 lines in 4 files changed: 21 ins; 11 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/960.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/960/head:pull/960 PR: https://git.openjdk.java.net/jdk/pull/960 From iignatyev at openjdk.java.net Fri Oct 30 17:54:57 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 30 Oct 2020 17:54:57 GMT Subject: RFR: 8255616: Disable AOT and Graal in Oracle OpenJDK In-Reply-To: References: Message-ID: On Fri, 30 Oct 2020 17:40:51 GMT, Vladimir Kozlov wrote: > We shipped Ahead-of-Time compilation (the jaotc tool) in JDK 9, as an experimental feature. We shipped Graal as an experimental JIT compiler in JDK 10. We haven't seen much use of these features, and the effort required to support and enhance them is significant. We therefore intend to disable these features in Oracle builds as of JDK 16. > > We'll leave the sources for these features in the repository, in case any one else is interested in building them. But we will not update or test them. > > We'll continue to build and ship JVMCI as an experimental feature in Oracle builds. > > Tested changes in all tiers.
> > I verified that with these changes I am still able to build Graal in the open repo and run graalunit testing: > > `open$ bash test/hotspot/jtreg/compiler/graalunit/downloadLibs.sh /mydir/graalunit_lib/` > `open$ bash configure --with-debug-level=fastdebug --with-graalunit-lib=/mydir/graalunit_lib/ --with-jtreg=/mydir/jtreg` > `open$ make jdk-image` > `open$ make test-image` > `open$ make run-test TEST=compiler/graalunit/HotspotTest.java` LGTM ------------- Marked as reviewed by iignatyev (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/960 From ccheung at openjdk.java.net Fri Oct 30 18:02:54 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Fri, 30 Oct 2020 18:02:54 GMT Subject: RFR: 8254309: appcds GCDuringDump.java failed - class must exist In-Reply-To: References: Message-ID: <0iuPSWzYkLtTnOguiCUaD2Al1si40eyJjvcbn7eFzhc=.e430b455-f458-41ce-bbdd-35e2f6d89ab1@github.com> On Fri, 30 Oct 2020 00:17:22 GMT, Yumin Qi wrote: > Hi, Please review > When CDS at dump time initializes archived heap, some classes are loaded. If at this time system runs out of memory the class will not be loaded. This is what we saw in this bug. The fix checks if OOM happened, if so we print out log and exit gracefully not causing a crash. Added a test case for testing purpose when exception/OOM happens during this stage. Also check during preload classes when OOM happens, exit vm with proper message. > > Tests: tier1-4 > > Thanks > Yumin Looks good. src/hotspot/share/memory/archiveUtils.cpp line 327: > 325: if (exception->is_a(SystemDictionary::OutOfMemoryError_klass())) { > 326: vm_exit_during_cds_dumping("Out of memory. Please run with a bigger Java heap"); > 327: } I'd suggest replacing 'bigger' with 'larger'. I think it would be more informative if the message includes the MaxHeapSize setting.
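As a rough illustration of this suggestion, a message that reports the heap limit could be built along the following lines. This is a standalone sketch, not HotSpot code: `cds_dump_oom_message` is a hypothetical helper, and plain `snprintf` stands in for whatever formatting facility the VM would actually use.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper, not HotSpot code: builds an exit message that uses
// 'larger' and reports the configured maximum heap size in megabytes.
std::string cds_dump_oom_message(std::size_t max_heap_size_bytes) {
    const std::size_t M = 1024 * 1024;  // bytes per megabyte
    char buf[160];
    std::snprintf(buf, sizeof(buf),
                  "Out of memory. Please run with a larger Java heap, "
                  "current MaxHeapSize = %zuM",
                  max_heap_size_bytes / M);
    return std::string(buf);
}
```

Reporting the value in megabytes keeps the message short; the exact wording and unit are assumptions made for this sketch.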
test/hotspot/jtreg/runtime/cds/appcds/javaldr/GCDuringDumpTransformer.java line 108: > 106: public static void makeGarbage() { > 107: for (int x=0; x<10; x++) { > 108: Object[] a = new Object[40000]; Any reason for increasing the size by 4 times? ------------- PR: https://git.openjdk.java.net/jdk/pull/948 From epavlova at openjdk.java.net Fri Oct 30 18:03:54 2020 From: epavlova at openjdk.java.net (Ekaterina Pavlova) Date: Fri, 30 Oct 2020 18:03:54 GMT Subject: RFR: 8255616: Disable AOT and Graal in Oracle OpenJDK In-Reply-To: References: Message-ID: <4hKiJTtog9q4y4x7hxaHMWUFdtjzrJe73i8Sys0Ukyg=.a1fdde60-fc8d-476b-9487-c796a668856f@github.com> On Fri, 30 Oct 2020 17:52:09 GMT, Igor Ignatyev wrote: >> We shipped Ahead-of-Time compilation (the jaotc tool) in JDK 9, as an experimental feature. We shipped Graal as an experimental JIT compiler in JDK 10. We haven't seen much use of these features, and the effort required to support and enhance them is significant. We therefore intend to disable these features in Oracle builds as of JDK 16. >> >> We'll leave the sources for these features in the repository, in case any one else is interested in building them. But we will not update or test them. >> >> We'll continue to build and ship JVMCI as an experimental feature in Oracle builds. >> >> Tested changes in all tiers. >> >> I verified that with these changes I am still able to build Graal in the open repo and run graalunit testing: >> >> `open$ bash test/hotspot/jtreg/compiler/graalunit/downloadLibs.sh /mydir/graalunit_lib/` >> `open$ bash configure --with-debug-level=fastdebug --with-graalunit-lib=/mydir/graalunit_lib/ --with-jtreg=/mydir/jtreg` >> `open$ make jdk-image` >> `open$ make test-image` >> `open$ make run-test TEST=compiler/graalunit/HotspotTest.java` > > LGTM Looks good.
------------- PR: https://git.openjdk.java.net/jdk/pull/960 From rrich at openjdk.java.net Fri Oct 30 18:40:56 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Fri, 30 Oct 2020 18:40:56 GMT Subject: RFR: 8255452: Doing GC during JVMTI MethodExit event posting breaks return oop In-Reply-To: References: Message-ID: On Fri, 30 Oct 2020 16:38:40 GMT, Erik Österlund wrote: > I think what you are saying is true. Note though that the return value of > ForceEarlyReturn is installed with a handshake. The handshake polls of the > interpreter are emitted in loop backedges and returns Yes right, I wasn't really aware of this. The thread has to be suspended though before the forced return (with a vm operation). > I gotta say I have no idea why there is a call to clear the expression stack > at all in TemplateInterpreterGenerator::generate_earlyret_entry_for(). It is > unclear to me what problem if any that solves. Don't really know either. Only thing I currently can think of is that the expression stack could overflow, not on x86 though. > Anyway, I think this is a separate bug. Do you mind if I push the fix for that > bug, as a different RFR? Not at all. > It will likely involve poking around at more platform dependent code. Likely. Another solution might be to delay loading the return value until the activation is removed. ------------- PR: https://git.openjdk.java.net/jdk/pull/930 From rrich at openjdk.java.net Fri Oct 30 18:43:57 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Fri, 30 Oct 2020 18:43:57 GMT Subject: RFR: 8255452: Doing GC during JVMTI MethodExit event posting breaks return oop In-Reply-To: References: Message-ID: <2LEXltEp4CX8xzC9-Bufoe0vhsFqwduh6H6R9z2sCqs=.0f8443c1-b2d0-4451-8dfe-f081360b96d2@github.com> On Thu, 29 Oct 2020 12:44:58 GMT, Erik Österlund wrote: > The imasm::remove_activation() call does not deal with safepoints very well.
However, when the MethodExit JVMTI event is being called, we call into the runtime in the middle of remove_activation(). If the value being returned is an object type, then the top-of-stack contains the oop. However, the GC does not traverse said oop in any oop map, because it is simply not expected that we safepoint in the middle of remove_activation(). > > The JvmtiExport::post_method_exit() function we end up calling, reads the top-of-stack oop, and puts it in a handle. Then it calls JVMTI callbacks, that eventually call Java and a bunch of stuff that safepoints. So after the JVMTI callback, we can expect the top-of-stack oop to be broken. Unfortunately, when we continue, we therefore end up returning a broken oop. > > Notably, the fact that InterpreterRuntime::post_method_exit is a JRT_ENTRY, is wrong, as we can safepoint on the way back to Java, which will break the return oop in a similar way. So this patch makes it a JRT_BLOCK_ENTRY, moving the transition to VM and back, into a block of code that is protected against GC. Before the JRT_BLOCK is called, we stash away the return oop, and after the JRT_BLOCK_END, we restore the top-of-stack oop. In the path when InterpreterRuntime::post_method_exit is called when throwing an exception, we don't have the same problem of retaining an oop result, and hence the JRT_BLOCK/JRT_BLOCK_END section is not performed in this case; the logic is the same as before for this path. > > This is a JVMTI bug that has probably been around for a long time. It crashes with all GCs, but was discovered recently after concurrent stack processing, as StefanK has been running better GC stressing code in JVMTI, and the bug reproduced more easily with concurrent stack processing, as the timings were a bit different. 
The following reproducer failed pretty much 100% of the time: > while true; do make test JTREG="RETAIN=all" TEST=test/hotspot/jtreg/vmTestbase/nsk/jdi/MethodExitEvent/returnValue/returnValue003/returnValue003.java TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews -Xint" ; done > > With my fix I can run this repeatedly without any more failures. I have also sanity checked the patch by running tier 1-5, to check that it does not introduce any new issues on its own. I have also used Stefan's nice external GC stressing with jcmd technique that was used to trigger crashes with other GCs, to make sure said crashes no longer reproduce either. Marked as reviewed by rrich (Committer). ------------- PR: https://git.openjdk.java.net/jdk/pull/930 From minqi at openjdk.java.net Fri Oct 30 19:02:09 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Fri, 30 Oct 2020 19:02:09 GMT Subject: RFR: 8254309: appcds GCDuringDump.java failed - class must exist [v2] In-Reply-To: References: Message-ID: <7CDS5Utku-UVFK2G0nzH7emY-9PH2OKxU7-X_Q6n60k=.c5006e08-2e32-445e-9101-d549d9ebb427@github.com> > Hi, Please review > When CDS at dump time initializes archived heap, some classes are loaded. If at this time system runs out of memory the class will not be loaded. This is what we saw in this bug. The fix checks if OOM happened, if so we print out log and exit gracefully not causing a crash. Added a test case for testing purpose when exception/OOM happens during this stage. Also check during preload classes when OOM happens, exit vm with proper message.
> > Tests: tier1-4 > > Thanks > Yumin Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: Revise as review comment, add MaxHeapSize in exit message ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/948/files - new: https://git.openjdk.java.net/jdk/pull/948/files/60d69366..133668e1 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=948&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=948&range=00-01 Stats: 3 lines in 3 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/948.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/948/head:pull/948 PR: https://git.openjdk.java.net/jdk/pull/948 From never at openjdk.java.net Fri Oct 30 19:30:03 2020 From: never at openjdk.java.net (Tom Rodriguez) Date: Fri, 30 Oct 2020 19:30:03 GMT Subject: RFR: 8255578: [JVMCI] be more careful about reflective reads of Class.componentType. Message-ID: cc @vnkozlov ------------- Commit messages: - 8255578: [JVMCI] be more careful about reflective reads of Class.componentType. 
Changes: https://git.openjdk.java.net/jdk/pull/962/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=962&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255578 Stats: 25 lines in 3 files changed: 25 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/962.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/962/head:pull/962 PR: https://git.openjdk.java.net/jdk/pull/962 From minqi at openjdk.java.net Fri Oct 30 19:36:58 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Fri, 30 Oct 2020 19:36:58 GMT Subject: RFR: 8254309: appcds GCDuringDump.java failed - class must exist [v2] In-Reply-To: <0iuPSWzYkLtTnOguiCUaD2Al1si40eyJjvcbn7eFzhc=.e430b455-f458-41ce-bbdd-35e2f6d89ab1@github.com> References: <0iuPSWzYkLtTnOguiCUaD2Al1si40eyJjvcbn7eFzhc=.e430b455-f458-41ce-bbdd-35e2f6d89ab1@github.com> Message-ID: On Fri, 30 Oct 2020 17:57:41 GMT, Calvin Cheung wrote: >> Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: >> >> Revise as review comment, add MaxHeapSize in exit message > > src/hotspot/share/memory/archiveUtils.cpp line 327: > >> 325: if (exception->is_a(SystemDictionary::OutOfMemoryError_klass())) { >> 326: vm_exit_during_cds_dumping("Out of memory. Please run with a bigger Java heap"); >> 327: } > > I'd suggesting replacing 'bigger' with 'larger'. > > I think it would be more informative if the message includes the MaxHeapSize setting. fixed with detail message. > test/hotspot/jtreg/runtime/cds/appcds/javaldr/GCDuringDumpTransformer.java line 108: > >> 106: public static void makeGarbage() { >> 107: for (int x=0; x<10; x++) { >> 108: Object[] a = new Object[40000]; > > Any reason for increasing the size by 4 times? changed back to original value. 
------------- PR: https://git.openjdk.java.net/jdk/pull/948 From rkennke at openjdk.java.net Fri Oct 30 19:39:07 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 30 Oct 2020 19:39:07 GMT Subject: RFR: 8255691: Shenandoah: Invoke native-LRB only on non-strong refs [v3] In-Reply-To: <2O7ZS_-HyPZpm5m2sMTf-dZv8Qkw3tB47hMgORkckoU=.a58d7945-a2ea-4f76-9e01-540521bd77ea@github.com> References: <2O7ZS_-HyPZpm5m2sMTf-dZv8Qkw3tB47hMgORkckoU=.a58d7945-a2ea-4f76-9e01-540521bd77ea@github.com> Message-ID: <2wI5SLQjP_SmJOstjPHk3ct5b1Pr3nZEvi97NWCbo-E=.39961cac-4a3b-48c4-b608-425f024020b0@github.com> > The way that the current native LRB is implemented is wrong (but non-fatal) and misleading. Its purpose is to prevent resurrection of unreachable non-strong references, and it should only be invoked on non-strong references, not all native references. This distinction will become even more important once we get concurrent reference processing: then we also want to invoke this barrier on referent-loads. > > This changes the runtime-part of native-LRB so that it is only invoked when called with a non-strong reference decorator. Otherwise it acts as a regular LRB.
> > Testing: hotspot_gc_shenandoah Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Rename LRB-native -> LRB-weak ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/961/files - new: https://git.openjdk.java.net/jdk/pull/961/files/9911e5b5..86c80228 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=961&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=961&range=01-02 Stats: 65 lines in 13 files changed: 0 ins; 0 del; 65 mod Patch: https://git.openjdk.java.net/jdk/pull/961.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/961/head:pull/961 PR: https://git.openjdk.java.net/jdk/pull/961 From kvn at openjdk.java.net Fri Oct 30 20:01:55 2020 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 30 Oct 2020 20:01:55 GMT Subject: RFR: 8255578: [JVMCI] be more careful about reflective reads of Class.componentType. In-Reply-To: References: Message-ID: On Fri, 30 Oct 2020 19:23:39 GMT, Tom Rodriguez wrote: > cc @vnkozlov Good. Next time do `/label add hotspot-compiler` for our group to see JVMCI changes. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/962 From dlong at openjdk.java.net Fri Oct 30 20:08:59 2020 From: dlong at openjdk.java.net (Dean Long) Date: Fri, 30 Oct 2020 20:08:59 GMT Subject: RFR: 8255578: [JVMCI] be more careful about reflective reads of Class.componentType. In-Reply-To: References: Message-ID: <9nGB-mN82ydGtNaG-Vb7uwHJg4AnsDz_kz2b1U56c3E=.d95f49a4-0eac-4a8b-b5a5-a261136d6bdb@github.com> On Fri, 30 Oct 2020 19:23:39 GMT, Tom Rodriguez wrote: > cc @vnkozlov Marked as reviewed by dlong (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/962 From ccheung at openjdk.java.net Fri Oct 30 20:30:59 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Fri, 30 Oct 2020 20:30:59 GMT Subject: RFR: 8254309: appcds GCDuringDump.java failed - class must exist [v2] In-Reply-To: <7CDS5Utku-UVFK2G0nzH7emY-9PH2OKxU7-X_Q6n60k=.c5006e08-2e32-445e-9101-d549d9ebb427@github.com> References: <7CDS5Utku-UVFK2G0nzH7emY-9PH2OKxU7-X_Q6n60k=.c5006e08-2e32-445e-9101-d549d9ebb427@github.com> Message-ID: On Fri, 30 Oct 2020 19:02:09 GMT, Yumin Qi wrote: >> Hi, Please review >> When CDS at dump time initializes archived heap, some classes are loaded. If at this time system runs out of memory the class will not be loaded. This is what we saw in this bug. The fix checks if OOM happened, if so we print out log and exit gracefully not causing a crash. Added a test case for testing purpose when exception/OOM happens during this stage. Also check during preload classes when OOM happens, exit vm with proper message. >> >> Tests: tier1-4 >> >> Thanks >> Yumin > > Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: > > Revise as review comment, add MaxHeapSize in exit message Just one nit. src/hotspot/share/memory/archiveUtils.cpp line 326: > 324: assert(exception != nullptr, "Sanity check"); > 325: if (exception->is_a(SystemDictionary::OutOfMemoryError_klass())) { > 326: vm_exit_during_cds_dumping(err_msg("Out of memory. Please run with a larger Java heap, current MaxHeapSize = " SIZE_FORMAT "M", MaxHeapSize/M)); Line is too long. Consider breaking this into 2 to 3 lines. ------------- Marked as reviewed by ccheung (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/948 From coleenp at openjdk.java.net Fri Oct 30 20:31:10 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 30 Oct 2020 20:31:10 GMT Subject: RFR: 8212879: Make JVMTI TagMap table not hash on oop address Message-ID: This change turns the HashTable that JVMTI uses for object tagging into a regular Hotspot hashtable - the one in hashtable.hpp with resizing and rehashing. Instead of pointing directly to oops so that GC has to walk the table to follow oops and then to rehash the table, this table points to WeakHandle. GC walks the backing OopStorages concurrently. The hash function for the table is a hash of the lower 32 bits of the address. A flag is set during GC (gc_notification if in a safepoint, and through a call to JvmtiTagMap::needs_processing()) so that the table is rehashed at the next use. The gc_notification mechanism of weak oop processing is used to notify Jvmti to post ObjectFree events. In concurrent GCs there can be a window of time between weak oop marking where the oop is unmarked, so dead (the phantom load in peek returns NULL) but the gc_notification hasn't been done yet. In this window, a heap walk or GetObjectsWithTags call would not find an object before the ObjectFree event is posted. This is dealt with in two ways: 1. In the Heap walk, there's an unconditional table walk to post events if events are needed to post. 2. For GetObjectWithTags, if a dead oop is found in the table and posting is required, we use the VM thread to post the event. Event posting cannot be done in a JavaThread because the posting needs to be done while holding the table lock, so that the JvmtiEnv state doesn't change before posting is done. ObjectFree callbacks are limited in what they can do as per the JVMTI Specification. The allowed callbacks to the VM already have code to allow NonJava threads. 
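The address-based hashing and the rehash-on-next-use flag described above can be sketched roughly as follows. This is a standalone toy model, not the JVMTI TagMap code: `TagTableModel`, `notify_gc` and the multiplicative mixing constant are assumptions made for illustration, and a plain pointer to an address cell stands in for a WeakHandle.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model, not HotSpot code: a tag table keyed by object address.
class TagTableModel {
    struct Entry {
        const std::uintptr_t* cell;  // stand-in for a WeakHandle: holds the current address
        long tag;
    };

    std::vector<std::vector<Entry>> buckets_;
    bool needs_rehash_ = false;

    // Hash of the lower 32 bits of the address. The multiplier is an
    // arbitrary choice for this sketch, not the constant HotSpot uses.
    std::size_t index_for(std::uintptr_t addr) const {
        std::uint32_t low = static_cast<std::uint32_t>(addr);
        std::uint32_t h = low * 2654435761u;
        return h % buckets_.size();
    }

    // Re-bucket every entry by its current address, but only if a GC
    // notification arrived since the last use.
    void rehash_if_needed() {
        if (!needs_rehash_) return;
        std::vector<std::vector<Entry>> fresh(buckets_.size());
        for (const auto& bucket : buckets_) {
            for (const Entry& e : bucket) {
                fresh[index_for(*e.cell)].push_back(e);
            }
        }
        buckets_.swap(fresh);
        needs_rehash_ = false;
    }

public:
    explicit TagTableModel(std::size_t bucket_count) : buckets_(bucket_count) {}

    // Called by the modelled GC after it may have moved objects.
    void notify_gc() { needs_rehash_ = true; }

    void add(const std::uintptr_t* cell, long tag) {
        rehash_if_needed();
        buckets_[index_for(*cell)].push_back(Entry{cell, tag});
    }

    // Returns the tag for the object currently at addr, or 0 if untagged.
    long find(std::uintptr_t addr) {
        rehash_if_needed();
        for (const Entry& e : buckets_[index_for(addr)]) {
            if (*e.cell == addr) return e.tag;
        }
        return 0;
    }
};
```

The point of the flag is that the GC only records that addresses may have changed; the cost of re-bucketing is paid by the next `add` or `find`, not by the collector.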
To avoid rehashing, I also tried to use object->identity_hash() but this breaks because entries can be added to the table during heapwalk, where the objects use marking. The starting markWord is saved and restored. Adding a hashcode during this operation makes restoring the former markWord (locked, inflated, etc) too complicated. Plus we don't want all these objects to have hashcodes because locking operations after tagging would have to always use inflated locks. Much of this change is to remove serial weak oop processing for the weakProcessor, ZGC and Shenandoah. The GCs have been stress tested with jvmti code. It has also been tested with tier1-6. Thank you to Stefan, Erik and Kim for their help with this change. ------------- Commit messages: - 8212879: Make JVMTI TagMap table not hash on oop address Changes: https://git.openjdk.java.net/jdk/pull/967/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=967&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8212879 Stats: 1737 lines in 41 files changed: 631 ins; 990 del; 116 mod Patch: https://git.openjdk.java.net/jdk/pull/967.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/967/head:pull/967 PR: https://git.openjdk.java.net/jdk/pull/967 From iklam at openjdk.java.net Fri Oct 30 20:37:02 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 30 Oct 2020 20:37:02 GMT Subject: RFR: 8254309: appcds GCDuringDump.java failed - class must exist [v2] In-Reply-To: <7CDS5Utku-UVFK2G0nzH7emY-9PH2OKxU7-X_Q6n60k=.c5006e08-2e32-445e-9101-d549d9ebb427@github.com> References: <7CDS5Utku-UVFK2G0nzH7emY-9PH2OKxU7-X_Q6n60k=.c5006e08-2e32-445e-9101-d549d9ebb427@github.com> Message-ID: On Fri, 30 Oct 2020 19:02:09 GMT, Yumin Qi wrote: >> Hi, Please review >> When CDS at dump time initializes archived heap, some classes are loaded. If at this time system runs out of memory the class will not be loaded. This is what we saw in this bug. 
The fix checks if OOM happened, if so we print out log and exit gracefully not causing a crash. Added a test case for testing purpose when exception/OOM happens during this stage. Also check during preload classes when OOM happens, exit vm with proper message. >> >> Tests: tier1-4 >> >> Thanks >> Yumin > > Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: > > Revise as review comment, add MaxHeapSize in exit message Changes requested by iklam (Reviewer). src/hotspot/share/memory/heapShared.cpp line 1055: > 1053: _dump_time_subgraph_info_table = new (ResourceObj::C_HEAP, mtClass)DumpTimeKlassSubGraphInfoTable(); > 1054: > 1055: if (_dump_time_subgraph_info_table == nullptr) { There's no need to check for failure for `new (ResourceObj::C_HEAP, mtClass)`, because it calls this operator: void* ResourceObj::operator new(size_t size, allocation_type type, MEMFLAGS flags) throw() { address res = NULL; switch (type) { case C_HEAP: res = (address)AllocateHeap(size, flags, CALLER_PC); .... which calls char* AllocateHeap(size_t size, MEMFLAGS flags, const NativeCallStack& stack, AllocFailType alloc_failmode /* = AllocFailStrategy::EXIT_OOM*/) { char* p = (char*) os::malloc(size, flags, stack); if (p == NULL && alloc_failmode == AllocFailStrategy::EXIT_OOM) { vm_exit_out_of_memory(size, OOM_MALLOC_ERROR, "AllocateHeap"); } which will exit the VM automatically when we run out of C heap. ------------- PR: https://git.openjdk.java.net/jdk/pull/948 From erikj at openjdk.java.net Fri Oct 30 20:48:57 2020 From: erikj at openjdk.java.net (Erik Joelsson) Date: Fri, 30 Oct 2020 20:48:57 GMT Subject: RFR: 8212879: Make JVMTI TagMap table not hash on oop address In-Reply-To: References: Message-ID: On Fri, 30 Oct 2020 20:23:04 GMT, Coleen Phillimore wrote: > This change turns the HashTable that JVMTI uses for object tagging into a regular Hotspot hashtable - the one in hashtable.hpp with resizing and rehashing. 
Instead of pointing directly to oops so that GC has to walk the table to follow oops and then to rehash the table, this table points to WeakHandle. GC walks the backing OopStorages concurrently. > > The hash function for the table is a hash of the lower 32 bits of the address. A flag is set during GC (gc_notification if in a safepoint, and through a call to JvmtiTagMap::needs_processing()) so that the table is rehashed at the next use. > > The gc_notification mechanism of weak oop processing is used to notify Jvmti to post ObjectFree events. In concurrent GCs there can be a window of time between weak oop marking where the oop is unmarked, so dead (the phantom load in peek returns NULL) but the gc_notification hasn't been done yet. In this window, a heap walk or GetObjectsWithTags call would not find an object before the ObjectFree event is posted. This is dealt with in two ways: > > 1. In the Heap walk, there's an unconditional table walk to post events if events are needed to post. > 2. For GetObjectWithTags, if a dead oop is found in the table and posting is required, we use the VM thread to post the event. > > Event posting cannot be done in a JavaThread because the posting needs to be done while holding the table lock, so that the JvmtiEnv state doesn't change before posting is done. ObjectFree callbacks are limited in what they can do as per the JVMTI Specification. The allowed callbacks to the VM already have code to allow NonJava threads. > > To avoid rehashing, I also tried to use object->identity_hash() but this breaks because entries can be added to the table during heapwalk, where the objects use marking. The starting markWord is saved and restored. Adding a hashcode during this operation makes restoring the former markWord (locked, inflated, etc) too complicated. Plus we don't want all these objects to have hashcodes because locking operations after tagging would have to always use inflated locks. 
> > Much of this change is to remove serial weak oop processing for the weakProcessor, ZGC and Shenandoah. The GCs have been stress tested with jvmti code. > > It has also been tested with tier1-6. > > Thank you to Stefan, Erik and Kim for their help with this change. Build changes look ok. ------------- PR: https://git.openjdk.java.net/jdk/pull/967 From minqi at openjdk.java.net Fri Oct 30 22:55:15 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Fri, 30 Oct 2020 22:55:15 GMT Subject: RFR: 8254309: appcds GCDuringDump.java failed - class must exist [v3] In-Reply-To: References: Message-ID: <2lUqJiNyXwQPxtCPRmGNhBIwfYm2N9Yn41c6bd_e19s=.054c9957-86b1-45b4-9b90-1136c7a9272c@github.com> > Hi, Please review > When CDS at dump time initializes archived heap, some classes are loaded. If at this time system runs out of memory the class will not be loaded. This is what we saw in this bug. The fix checks if OOM happened, if so we print out log and exit gracefully not causing a crash. Added a test case for testing purpose when exception/OOM happens during this stage. Also check during preload classes when OOM happens, exit vm with proper message. 
> > Tests: tier1-4 > > Thanks > Yumin Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: Remove the redundant check for CHeap allocation, roll back to original code ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/948/files - new: https://git.openjdk.java.net/jdk/pull/948/files/133668e1..c95f31d0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=948&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=948&range=01-02 Stats: 9 lines in 1 file changed: 0 ins; 9 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/948.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/948/head:pull/948 PR: https://git.openjdk.java.net/jdk/pull/948 From minqi at openjdk.java.net Fri Oct 30 23:22:11 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Fri, 30 Oct 2020 23:22:11 GMT Subject: RFR: 8254309: appcds GCDuringDump.java failed - class must exist [v4] In-Reply-To: References: Message-ID: > Hi, Please review > When CDS at dump time initializes archived heap, some classes are loaded. If at this time system runs out of memory the class will not be loaded. This is what we saw in this bug. The fix checks if OOM happened, if so we print out log and exit gracefully not causing a crash. Added a test case for testing purpose when exception/OOM happens during this stage. Also check during preload classes when OOM happens, exit vm with proper message. 
> > Tests: tier1-4 > > Thanks > Yumin Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: Break long line into 3 shorter lines ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/948/files - new: https://git.openjdk.java.net/jdk/pull/948/files/c95f31d0..3342f53e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=948&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=948&range=02-03 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/948.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/948/head:pull/948 PR: https://git.openjdk.java.net/jdk/pull/948 From never at openjdk.java.net Fri Oct 30 23:33:56 2020 From: never at openjdk.java.net (Tom Rodriguez) Date: Fri, 30 Oct 2020 23:33:56 GMT Subject: RFR: 8255578: [JVMCI] be more careful about reflective reads of Class.componentType. In-Reply-To: <9nGB-mN82ydGtNaG-Vb7uwHJg4AnsDz_kz2b1U56c3E=.d95f49a4-0eac-4a8b-b5a5-a261136d6bdb@github.com> References: <9nGB-mN82ydGtNaG-Vb7uwHJg4AnsDz_kz2b1U56c3E=.d95f49a4-0eac-4a8b-b5a5-a261136d6bdb@github.com> Message-ID: On Fri, 30 Oct 2020 20:05:47 GMT, Dean Long wrote: >> cc @vnkozlov > > Marked as reviewed by dlong (Reviewer). On the first PR I did, it seems like it added the hotspot-compiler label automatically and complained when I tried to do it manually. How does it decide what labels to apply automatically? ------------- PR: https://git.openjdk.java.net/jdk/pull/962 From iklam at openjdk.java.net Fri Oct 30 23:49:01 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 30 Oct 2020 23:49:01 GMT Subject: RFR: 8254309: appcds GCDuringDump.java failed - class must exist [v4] In-Reply-To: References: Message-ID: On Fri, 30 Oct 2020 23:22:11 GMT, Yumin Qi wrote: >> Hi, Please review >> When CDS at dump time initializes archived heap, some classes are loaded. If at this time system runs out of memory the class will not be loaded. 
This is what we saw in this bug. The fix checks if OOM happened, if so we print out log and exit gracefully not causing a crash. Added a test case for testing purpose when exception/OOM happens during this stage. Also check during preload classes when OOM happens, exit vm with proper message. >> >> Tests: tier1-4 >> >> Thanks >> Yumin > > Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: > > Break long line into 3 shorter lines Marked as reviewed by iklam (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/948 From ccheung at openjdk.java.net Sat Oct 31 00:09:00 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Sat, 31 Oct 2020 00:09:00 GMT Subject: RFR: 8254309: appcds GCDuringDump.java failed - class must exist [v4] In-Reply-To: References: Message-ID: On Fri, 30 Oct 2020 23:22:11 GMT, Yumin Qi wrote: >> Hi, Please review >> When CDS at dump time initializes archived heap, some classes are loaded. If at this time system runs out of memory the class will not be loaded. This is what we saw in this bug. The fix checks if OOM happened, if so we print out log and exit gracefully not causing a crash. Added a test case for testing purpose when exception/OOM happens during this stage. Also check during preload classes when OOM happens, exit vm with proper message. >> >> Tests: tier1-4 >> >> Thanks >> Yumin > > Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: > > Break long line into 3 shorter lines Looks good. ------------- Marked as reviewed by ccheung (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/948 From minqi at openjdk.java.net Sat Oct 31 00:12:58 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Sat, 31 Oct 2020 00:12:58 GMT Subject: Integrated: 8254309: appcds GCDuringDump.java failed - class must exist In-Reply-To: References: Message-ID: On Fri, 30 Oct 2020 00:17:22 GMT, Yumin Qi wrote: > Hi, Please review > When CDS at dump time initializes archived heap, some classes are loaded. If at this time system runs out of memory the class will not be loaded. This is what we saw in this bug. The fix checks if OOM happened, if so we print out log and exit gracefully not causing a crash. Added a test case for testing purpose when exception/OOM happens during this stage. Also check during preload classes when OOM happens, exit vm with proper message. > > Tests: tier1-4 > > Thanks > Yumin This pull request has now been integrated. Changeset: 9d5c9cc7 Author: Yumin Qi URL: https://git.openjdk.java.net/jdk/commit/9d5c9cc7 Stats: 160 lines in 8 files changed: 146 ins; 11 del; 3 mod 8254309: appcds GCDuringDump.java failed - class must exist Reviewed-by: ccheung, iklam ------------- PR: https://git.openjdk.java.net/jdk/pull/948 From iklam at openjdk.java.net Sat Oct 31 03:17:56 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Sat, 31 Oct 2020 03:17:56 GMT Subject: RFR: JDK-8255544: Create a checked cast In-Reply-To: References: Message-ID: On Wed, 28 Oct 2020 15:50:52 GMT, Andrew Haley wrote: > In many places we've added C-style casts to silence compiler warnings, for example when truncating a size_t to an int when we know the size_t is a small struct. Such casts are inherently risky, because they effectively disable useful compiler warnings. We should add a form of cast that checks at runtime that a truncation does not overflow. Can we change one place in HotSpot to use this function (preferably in a frequently-used code path), to verify that it really works? ------------- Changes requested by iklam (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/904 From sspitsyn at openjdk.java.net Sat Oct 31 09:56:55 2020 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Sat, 31 Oct 2020 09:56:55 GMT Subject: RFR: 8255452: Doing GC during JVMTI MethodExit event posting breaks return oop In-Reply-To: References: Message-ID: <5-JnsHKeZztip64W88tRwA9_pcuWze2jftUq0C6oYSM=.79916a5b-5dd1-402f-b1ba-0caa38a39c1b@github.com> On Thu, 29 Oct 2020 12:44:58 GMT, Erik Österlund wrote: > The imasm::remove_activation() call does not deal with safepoints very well. However, when the MethodExit JVMTI event is being called, we call into the runtime in the middle of remove_activation(). If the value being returned is an object type, then the top-of-stack contains the oop. However, the GC does not traverse said oop in any oop map, because it is simply not expected that we safepoint in the middle of remove_activation(). > > The JvmtiExport::post_method_exit() function we end up calling, reads the top-of-stack oop, and puts it in a handle. Then it calls JVMTI callbacks, that eventually call Java and a bunch of stuff that safepoints. So after the JVMTI callback, we can expect the top-of-stack oop to be broken. Unfortunately, when we continue, we therefore end up returning a broken oop. > > Notably, the fact that InterpreterRuntime::post_method_exit is a JRT_ENTRY, is wrong, as we can safepoint on the way back to Java, which will break the return oop in a similar way. So this patch makes it a JRT_BLOCK_ENTRY, moving the transition to VM and back, into a block of code that is protected against GC. Before the JRT_BLOCK is called, we stash away the return oop, and after the JRT_BLOCK_END, we restore the top-of-stack oop. In the path when InterpreterRuntime::post_method_exit is called when throwing an exception, we don't have the same problem of retaining an oop result, and hence the JRT_BLOCK/JRT_BLOCK_END section is not performed in this case; the logic is the same as before for this path. 
> > This is a JVMTI bug that has probably been around for a long time. It crashes with all GCs, but was discovered recently after concurrent stack processing, as StefanK has been running better GC stressing code in JVMTI, and the bug reproduced more easily with concurrent stack processing, as the timings were a bit different. The following reproducer failed pretty much 100% of the time: > while true; do make test JTREG="RETAIN=all" TEST=test/hotspot/jtreg/vmTestbase/nsk/jdi/MethodExitEvent/returnValue/returnValue003/returnValue003.java TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews -Xint" ; done > > With my fix I can run this repeatedly without any more failures. I have also sanity checked the patch by running tier 1-5, so that it does not introduce any new issues on its own. I have also used Stefan's nice external GC stressing with jcmd technique that was used to trigger crashes with other GCs, to make sure said crashes no longer reproduce either. Hi Erik, Nice discovery! Indeed, this is a long-standing issue. It looks good in general. I agree with Coleen, it would be nice if there is an elegant way to move the oop_result saving/restoring code to InterpreterRuntime::post_method_exit. Otherwise, I'm okay with what you have now. It is also a nice discovery of the issue with clearing the expression stack. I think it was my mistake in the initial implementation of the ForceEarlyReturn when I followed the PopFrame implementation pattern. It is good to separate it from the current fix. 
Thanks, Serguei ------------- PR: https://git.openjdk.java.net/jdk/pull/930 From aph at redhat.com Sat Oct 31 10:42:04 2020 From: aph at redhat.com (Andrew Haley) Date: Sat, 31 Oct 2020 10:42:04 +0000 Subject: RFR: JDK-8255544: Create a checked cast In-Reply-To: References: Message-ID: On 31/10/2020 03:17, Ioi Lam wrote: > On Wed, 28 Oct 2020 15:50:52 GMT, Andrew Haley wrote: > >> In many places we've added C-style casts to silence compiler warnings, for example when truncating a size_t to an int when we know the size_t is a small struct. Such casts are inherently risky, because they effectively disable useful compiler warnings. We should add a form of cast that checks at runtime that a truncation does not overflow. > > Can we change one place in HotSpot to use this function (preferably in a frequently-used code path), to verify that it really works? Oh, sure. I have plenty in the AArch64 back end, but I'll look for some others. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at openjdk.java.net Sat Oct 31 14:02:07 2020 From: aph at openjdk.java.net (Andrew Haley) Date: Sat, 31 Oct 2020 14:02:07 GMT Subject: RFR: JDK-8255544: Create a checked cast [v2] In-Reply-To: References: Message-ID: > In many places we've added C-style casts to silence compiler warnings, for example when truncating a size_t to an int when we know the size_t is a small struct. Such casts are inherently risky, because they effectively disable useful compiler warnings. We should add a form of cast that checks at runtime that a truncation does not overflow. 
Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: JDK-8255544: Create a checked cast ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/904/files - new: https://git.openjdk.java.net/jdk/pull/904/files/5f8c47da..a3c9516e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=904&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=904&range=00-01 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/904.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/904/head:pull/904 PR: https://git.openjdk.java.net/jdk/pull/904 From aph at openjdk.java.net Sat Oct 31 14:02:07 2020 From: aph at openjdk.java.net (Andrew Haley) Date: Sat, 31 Oct 2020 14:02:07 GMT Subject: RFR: JDK-8255544: Create a checked cast [v2] In-Reply-To: References: Message-ID: On Sat, 31 Oct 2020 03:14:52 GMT, Ioi Lam wrote: > Can we change one place in HotSpot to use this function (preferably in a frequently-used code path), to verify that it really works? How about this? ------------- PR: https://git.openjdk.java.net/jdk/pull/904 From iklam at openjdk.java.net Sat Oct 31 14:44:55 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Sat, 31 Oct 2020 14:44:55 GMT Subject: RFR: JDK-8255544: Create a checked cast [v2] In-Reply-To: References: Message-ID: On Sat, 31 Oct 2020 14:02:07 GMT, Andrew Haley wrote: >> In many places we've added C-style casts to silence compiler warnings, for example when truncating a size_t to an int when we know the size_t is a small struct. Such casts are inherently risky, because they effectively disable useful compiler warnings. We should add a form of cast that checks at runtime that a truncation does not overflow. > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8255544: Create a checked cast LGTM ------------- Marked as reviewed by iklam (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/904
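For readers following the JDK-8255544 thread, the idea of a cast that checks at runtime that a truncation does not overflow can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the pull request: the names fits_in and checked_cast_sketch are hypothetical, the check is a plain assert (so it disappears when NDEBUG is defined), and out-of-range conversions are assumed to wrap modulo 2^N, which common compilers do and C++20 guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <type_traits>

// Hypothetical helper (not from the PR): returns true when `value` can be
// represented in To without losing information.
template <typename To, typename From>
inline bool fits_in(From value) {
  static_assert(std::is_integral<To>::value && std::is_integral<From>::value,
                "this sketch only handles integral types");
  const To narrowed = static_cast<To>(value);
  // The round trip must reproduce the original value...
  const bool same_value = static_cast<From>(narrowed) == value;
  // ...and the sign must survive, which catches cases such as a huge
  // size_t turning into -1, or -1 turning into a huge unsigned value.
  const bool same_sign = (narrowed < To(0)) == (value < From(0));
  return same_value && same_sign;
}

// Hypothetical checked cast (not from the PR): asserts in debug builds that
// the narrowing conversion is lossless, then performs an ordinary static_cast.
template <typename To, typename From>
inline To checked_cast_sketch(From value) {
  assert(fits_in<To>(value) && "checked cast would lose information");
  return static_cast<To>(value);
}
```

A call site that previously read `(int)some_size` would become `checked_cast_sketch<int>(some_size)`, so the compiler warning stays silenced while an unexpected overflow still trips the assert in debug builds.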