From sviswanathan at openjdk.java.net Thu Apr 1 00:37:24 2021
From: sviswanathan at openjdk.java.net (Sandhya Viswanathan)
Date: Thu, 1 Apr 2021 00:37:24 GMT
Subject: Integrated: 8264054: Bad XMM performance on java.lang.MathBench.sqrtDouble
In-Reply-To:
References:
Message-ID:

On Tue, 30 Mar 2021 00:38:22 GMT, Sandhya Viswanathan wrote:

> For the j.l.Math JMH at https://github.com/openjdk/jmh-jdk-microbenchmarks/blob/master/micros-jdk11/src/main/java/org/openjdk/bench/java/lang/MathBench.java, the performance for sqrt benchmark could be improved. Thanks a lot to Eric Caspole for finding this issue.
>
> Benchmark:
> @Benchmark
> public double sqrtDouble() {
>     return Math.sqrt(double4Dot1);
> }
>
> Current code generated (linux format) by c2 JIT is:
> vsqrtsd 0x50(%r10),%xmm0,%xmm0
>
> The vsqrtsd instruction operation is specified as below:
> VSQRTSD (VEX.128 encoded version)
> DEST[63:0] := SQRT(SRC2[63:0])
> DEST[127:64] := SRC1[127:64]
> DEST[MAXVL-1:128] := 0
>
> The upper 127:64 bits are set from previous contents of xmm0. As the destination xmm0 register was not initialized prior to use by c2 JIT, this causes stall and lower performance.
>
> By adding xmm0 initialization prior to use, the performance of the above benchmark improves significantly.
>
> Code generated after patch:
> vxorpd %xmm0,%xmm0,%xmm0
> vsqrtsd 0x50(%r10),%xmm0,%xmm0
>
> Performance before patch:
> Benchmark             Mode   Cnt  Score       Error     Units
> MathBench.sqrtDouble  thrpt  8    193612.396  ± 95.807  ops/ms
>
> Performance after patch:
> MathBench.sqrtDouble  thrpt  8    276388.024  ± 846.372  ops/ms
>
> Best Regards,
> Sandhya

This pull request has now been integrated.
Changeset: 52d8a229 Author: Sandhya Viswanathan URL: https://git.openjdk.java.net/jdk/commit/52d8a229 Stats: 947 lines in 3 files changed: 879 ins; 51 del; 17 mod 8264054: Bad XMM performance on java.lang.MathBench.sqrtDouble Co-authored-by: Eric Caspole Co-authored-by: Charlie Hunt Reviewed-by: neliasso, kvn, vlivanov ------------- PR: https://git.openjdk.java.net/jdk/pull/3256 From hshi at openjdk.java.net Thu Apr 1 01:04:36 2021 From: hshi at openjdk.java.net (Hui Shi) Date: Thu, 1 Apr 2021 01:04:36 GMT Subject: RFR: 8264223: CodeHeap::verify fails extra_hops assertion in fastdebug test [v3] In-Reply-To: References: <_U973M53ajT2sW5WNh-HMNqjGTmdb6fvusj9NGHS63Y=.70c7f6ca-b68b-4c3d-9b30-5cd673a2386e@github.com> Message-ID: On Wed, 31 Mar 2021 12:50:37 GMT, Lutz Schmidt wrote: > The change looks good to me. > Thank you for digging into the tricky code to understand what's happening. The previous limit (16+2*count) was purely heuristic. Obviously, you have a more pathological test case. Thanks! The entire CodeHeap optimization is great! ------------- PR: https://git.openjdk.java.net/jdk/pull/3212 From jiefu at openjdk.java.net Thu Apr 1 02:00:44 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Thu, 1 Apr 2021 02:00:44 GMT Subject: RFR: 8264557: Incorrect copyright year for test/micro/org/openjdk/bench/java/lang/MathBench.java after JDK-8264054 Message-ID: Hi all, The copyright year for test/micro/org/openjdk/bench/java/lang/MathBench.java [1] is incorrect since it isn't newly added in 2021. Let's fix it. Thanks. 
Best regards, Jie [1] https://github.com/openjdk/jdk/blob/master/test/micro/org/openjdk/bench/java/lang/MathBench.java#L2 ------------- Commit messages: - 8264557: Incorrect copyright year for test/micro/org/openjdk/bench/java/lang/MathBench.java after JDK-8264054 Changes: https://git.openjdk.java.net/jdk/pull/3299/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3299&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264557 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3299.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3299/head:pull/3299 PR: https://git.openjdk.java.net/jdk/pull/3299 From xgong at openjdk.java.net Thu Apr 1 03:39:43 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Thu, 1 Apr 2021 03:39:43 GMT Subject: RFR: 8264109: Add vectorized implementation for VectorMask.andNot() [v2] In-Reply-To: References: Message-ID: > Currently "VectorMask.andNot()" is not vectorized: > public VectorMask andNot(VectorMask m) { > // FIXME: Generate good code here. > return bOp(m, (i, a, b) -> a && !b); > } > This can be implemented with` "and(m.not())" `directly. Since `"VectorMask.and()/not()" `have been vectorized in hotspot, `"andNot"` > can be vectorized as well by calling them. 
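As a side note on the identity the change relies on: lane by lane, `a && !b` gives the same result as applying `and(not(b))` to `a`. A minimal scalar sketch in plain Java, using boolean arrays as a stand-in for masks, might look like the following. The class and helper names are made up for illustration and are not part of the Vector API.

```java
import java.util.Arrays;

public class AndNotSketch {
    // Lane-wise AND of two "masks" of equal length.
    static boolean[] and(boolean[] a, boolean[] b) {
        boolean[] r = new boolean[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = a[i] && b[i];
        }
        return r;
    }

    // Lane-wise NOT of a "mask".
    static boolean[] not(boolean[] a) {
        boolean[] r = new boolean[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = !a[i];
        }
        return r;
    }

    // The scalar fallback shape from the quoted code: a && !b per lane.
    static boolean[] andNotScalar(boolean[] a, boolean[] b) {
        boolean[] r = new boolean[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = a[i] && !b[i];
        }
        return r;
    }

    // The composed form: and(not(b)) applied to a.
    static boolean[] andNotComposed(boolean[] a, boolean[] b) {
        return and(a, not(b));
    }

    public static void main(String[] args) {
        // All four input combinations, one per lane.
        boolean[] a = {false, false, true, true};
        boolean[] b = {false, true, false, true};
        System.out.println(Arrays.equals(andNotScalar(a, b), andNotComposed(a, b)));
    }
}
```

This only checks the logical equivalence of the two forms. It says nothing about how HotSpot intrinsifies `VectorMask.and()` or `VectorMask.not()`.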
> > The performance gain is >100% for such a simple JMH: > @Benchmark > public Object andNot(Blackhole bh) { > boolean[] mask = fm.apply(SPECIES.length()); > boolean[] r = fmt.apply(SPECIES.length()); > VectorMask rm = VectorMask.fromArray(SPECIES, r, 0); > > for (int ic = 0; ic < INVOC_COUNT; ic++) { > for (int i = 0; i < mask.length; i += SPECIES.length()) { > VectorMask vmask = VectorMask.fromArray(SPECIES, mask, i); > rm = rm.andNot(vmask); > } > } > return rm; > } Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: Move the changing to AbstractMask.andNot and revert changes in VectorMask Change-Id: Ie3ec8f53cb24526c8e1ccc68038355d024910818 CustomizedGitHooks: yes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3211/files - new: https://git.openjdk.java.net/jdk/pull/3211/files/ccb73d92..33ac041e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3211&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3211&range=00-01 Stats: 9 lines in 2 files changed: 5 ins; 2 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3211.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3211/head:pull/3211 PR: https://git.openjdk.java.net/jdk/pull/3211 From xgong at openjdk.java.net Thu Apr 1 03:50:18 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Thu, 1 Apr 2021 03:50:18 GMT Subject: RFR: 8264109: Add vectorized implementation for VectorMask.andNot() [v2] In-Reply-To: <4nYUWzXPdlo6tL9f4j7pTj8ArbRtFSZEHRae7P17UBE=.0d1f8bd8-ff98-49a5-b6d7-d5a4856bcaf0@github.com> References: <4nYUWzXPdlo6tL9f4j7pTj8ArbRtFSZEHRae7P17UBE=.0d1f8bd8-ff98-49a5-b6d7-d5a4856bcaf0@github.com> Message-ID: <7-bXxH3_wFxlJaN7-G6HnbWnrAgT8dqurUIYg9nCR-k=.f99cd05a-9ed2-4d74-965d-c81511db354f@github.com> On Wed, 31 Mar 2021 16:42:09 GMT, Paul Sandoz wrote: > Would you mind updating the existing `AbstractMask.andNot` implementation? rather than changing `VectorMask.andNot`. 
That fits in with the current implementation pattern.

Hi @PaulSandoz, thanks for looking at this PR. I've updated the patch according to your comments. Would you mind having a look at it again? Thanks so much!

-------------

PR: https://git.openjdk.java.net/jdk/pull/3211

From lucy at openjdk.java.net Thu Apr 1 07:02:24 2021
From: lucy at openjdk.java.net (Lutz Schmidt)
Date: Thu, 1 Apr 2021 07:02:24 GMT
Subject: RFR: 8264223: CodeHeap::verify fails extra_hops assertion in fastdebug test [v3]
In-Reply-To:
References: <_U973M53ajT2sW5WNh-HMNqjGTmdb6fvusj9NGHS63Y=.70c7f6ca-b68b-4c3d-9b30-5cd673a2386e@github.com>
Message-ID:

On Thu, 1 Apr 2021 01:01:19 GMT, Hui Shi wrote:

>> The change looks good to me.
>> Thank you for digging into the tricky code to understand what's happening. The previous limit (16+2*count) was purely heuristic. Obviously, you have a more pathological test case.
>
>> The change looks good to me.
>> Thank you for digging into the tricky code to understand what's happening. The previous limit (16+2*count) was purely heuristic. Obviously, you have a more pathological test case.
>
> Thanks! The entire CodeHeap optimization is great!

You will need a second review before the fix can be integrated.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3212

From roland at openjdk.java.net Thu Apr 1 07:15:23 2021
From: roland at openjdk.java.net (Roland Westrelin)
Date: Thu, 1 Apr 2021 07:15:23 GMT
Subject: RFR: 8263448: CTW: fatal error: meet not symmetric
In-Reply-To:
References:
Message-ID:

On Wed, 31 Mar 2021 23:31:05 GMT, Vladimir Kozlov wrote:

> In bug's case instance_id should be changed to InstanceBot only if it is real id and not InstanceTop or InstanceBot (similar to Constant in type lattice). Otherwise we incorrectly change InstanceTop to InstanceBot.
In short, we should follow PTR transformation in this code: > if (ptr == Constant) { > ptr = NotNull; > } > - instance_id = InstanceBot; > + if (instance_id > 0) { > + instance_id = InstanceBot; > + } > I noticed that `TypeInstPtr::xmeet_helper()` has the same code as `TypeAryPtr::xmeet_helper()` for case when `InstPtr` meets `AryPtr`. So instead of fixing two places I decided to call TypeAryPtr::xmeet_helper() from `TypeInstPtr::xmeet_helper()` for this case. We do similar thing in `TypeOopPtr::xmeet()`: > https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/type.cpp#L3283 > > Fixed code style in touched code. > > Tested failing test from bug report, hs-tier1-4, hs-comp. Looks good to me. ------------- Marked as reviewed by roland (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3298 From shade at openjdk.java.net Thu Apr 1 08:04:19 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 1 Apr 2021 08:04:19 GMT Subject: RFR: 8264223: CodeHeap::verify fails extra_hops assertion in fastdebug test [v3] In-Reply-To: References: <_U973M53ajT2sW5WNh-HMNqjGTmdb6fvusj9NGHS63Y=.70c7f6ca-b68b-4c3d-9b30-5cd673a2386e@github.com> Message-ID: On Fri, 26 Mar 2021 08:28:40 GMT, Hui Shi wrote: >> When test with -XX:+VerifyCodeCache, many tests fail due to extra_hops assertion in CodeHeap::verify. See full dump in JBS. >> >> # Internal Error (src/hotspot/share/memory/heap.cpp:838), pid=1525697, tid=1525715 >> # assert((count == 0) || (extra_hops < (16 + 2*count))) failed: CodeHeap: many extra hops due to optimization. blocks: 234, extra hops: 484. >> Discussion in https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-October/035508.html doesn't tell where assertion (extra_hops < (16 + 2*count) comes from. >> >> In CodeHeap::mark_segmap_as_used wehn is_FreeBlock_join is true, it creates extra hops in _segmap. _fragmentation_count in inced and trigger defrag_segmap when reach threshold. 
In my understanding, extra hop can not guarantee under (16 + 2*count). >> >> In following extreme case, before HeapBlock free, segmap is all 0 (each blob use 1 smallest segment), suppose free action is applied from right to left. This increase 9 unnecessary hop for 1 continous HeapBlock. assertion (extra_hops < (16 + 2*count) is not safe. >> |0|0|0|0|0|0|0|0|0|0| >> after free, it will be >> |0|1|1|1|1|1|1|1|1|1| >> >> Proposed fix is assert extra hops not exceed _fragmentation_count. And if it exceeds (16 + 2 * count), give warning on two many extra hops. >> >> fastdebug tier1, tier2 with VerifyCodeCache passed on X86_64 linux, no extra assertion found. > > Hui Shi has updated the pull request incrementally with one additional commit since the last revision: > > update copyright year It looks fine. But I have a question: are we reasonably sure that `extra_hops <= _fragmentation_count` always. More specifically, when `_fragmentation_count` drops to `0`, are `extra_hops` guaranteed to be `0` as well? ------------- PR: https://git.openjdk.java.net/jdk/pull/3212 From njian at openjdk.java.net Thu Apr 1 08:05:44 2021 From: njian at openjdk.java.net (Ningsheng Jian) Date: Thu, 1 Apr 2021 08:05:44 GMT Subject: RFR: 8264409: AArch64: generate better code for Vector API allTrue Message-ID: In Vector API NEON implementation, we use a vector register to represent vector mask, where an element value of -1 is a true mask and an element value of 0 is a false mask. The allTrue() api is used to check whether all the elements of current mask are set. Currently, the AArch64 NEON allTrue implementation looks like: andr $tmp, T16B $src1, $src2\t# src2 is maskAllTrue notr $tmp, T16B, $tmp addv $tmp, T16B, $tmp umov $dst, $tmp, B, 0 cmp $dst, 0 cset $dst where $src2 is a preset all true (-1) constant. 
We can optimize it to the code sequence like below, to check whether all bits are set: uminv $tmp, T16B, $src1 umov $dst, $tmp, B, 0 cmp $dst, 0xff cset $dst With this codegen improvement, we can see about 8%~70% performance uplift on different machines for Alibaba's Vector API bigdata benchmarks [1][2]. Tested with tier1 and vector api jtreg tests. [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/bigdata/BooleanArrayCheck.java#L61 [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/bigdata/ValueRangeCheckAndCastL2I.java#L93 ------------- Commit messages: - 8264409: AArch64: generate better code for Vector API allTrue Changes: https://git.openjdk.java.net/jdk/pull/3302/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3302&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264409 Stats: 409 lines in 5 files changed: 13 ins; 12 del; 384 mod Patch: https://git.openjdk.java.net/jdk/pull/3302.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3302/head:pull/3302 PR: https://git.openjdk.java.net/jdk/pull/3302 From adinn at openjdk.java.net Thu Apr 1 09:05:26 2021 From: adinn at openjdk.java.net (Andrew Dinn) Date: Thu, 1 Apr 2021 09:05:26 GMT Subject: RFR: 8264409: AArch64: generate better code for Vector API allTrue In-Reply-To: References: Message-ID: On Thu, 1 Apr 2021 07:58:07 GMT, Ningsheng Jian wrote: > In Vector API NEON implementation, we use a vector register to represent vector mask, where an element value of -1 is a true mask and an element value of 0 is a false mask. The allTrue() api is used to check whether all the elements of current mask are set. 
> > Currently, the AArch64 NEON allTrue implementation looks like: > > andr $tmp, T16B $src1, $src2\t# src2 is maskAllTrue > notr $tmp, T16B, $tmp > addv $tmp, T16B, $tmp > umov $dst, $tmp, B, 0 > cmp $dst, 0 > cset $dst > > where $src2 is a preset all true (-1) constant. We can optimize it to the code sequence like below, to check whether all bits are set: > > uminv $tmp, T16B, $src1 > umov $dst, $tmp, B, 0 > cmp $dst, 0xff > cset $dst > > With this codegen improvement, we can see about 8%~70% performance uplift on different machines for Alibaba's Vector API bigdata benchmarks [1][2]. > > Tested with tier1 and vector api jtreg tests. > > [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/bigdata/BooleanArrayCheck.java#L61 > [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/bigdata/ValueRangeCheckAndCastL2I.java#L93 It's a clever trick to use uminv for this specific case. The patch looks good to me. ------------- Marked as reviewed by adinn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3302 From vlivanov at openjdk.java.net Thu Apr 1 09:11:35 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 1 Apr 2021 09:11:35 GMT Subject: RFR: 8264548: Dependencies: ClassHierarchyWalker::is_witness() cleanups In-Reply-To: <7gwnAv5IEEhj8je8965QoyZXFOhdZ1EgpHBop6nlmzE=.7410cbc8-52da-4d51-8663-aaea9b630a11@github.com> References: <7gwnAv5IEEhj8je8965QoyZXFOhdZ1EgpHBop6nlmzE=.7410cbc8-52da-4d51-8663-aaea9b630a11@github.com> Message-ID: On Wed, 31 Mar 2021 22:05:20 GMT, Vladimir Kozlov wrote: >> Simplify `ClassHierarchyWalker::is_witness()`. >> >> Testing: hs-tier1 - hs-tier4. > > Looks like it has changes from: https://git.openjdk.java.net/jdk/pull/3293 I created it as a dependent PR. 
I think that's the way it works (https://github.com/openjdk/skara/pull/1087 which was used as an example in [1] looks similar).

[1] https://mail.openjdk.java.net/pipermail/jdk-dev/2021-March/005232.html

-------------

PR: https://git.openjdk.java.net/jdk/pull/3297

From lucy at openjdk.java.net Thu Apr 1 09:58:34 2021
From: lucy at openjdk.java.net (Lutz Schmidt)
Date: Thu, 1 Apr 2021 09:58:34 GMT
Subject: RFR: 8264223: CodeHeap::verify fails extra_hops assertion in fastdebug test [v3]
In-Reply-To:
References: <_U973M53ajT2sW5WNh-HMNqjGTmdb6fvusj9NGHS63Y=.70c7f6ca-b68b-4c3d-9b30-5cd673a2386e@github.com>
Message-ID:

On Thu, 1 Apr 2021 08:01:16 GMT, Aleksey Shipilev wrote:

>> Hui Shi has updated the pull request incrementally with one additional commit since the last revision:
>>
>> update copyright year
>
> It looks fine. But I have a question: are we reasonably sure that `extra_hops <= _fragmentation_count` always. More specifically, when `_fragmentation_count` drops to `0`, are `extra_hops` guaranteed to be `0` as well?

@shipilev
Short answer: yes.
Long answer: _fragmentation_count is incremented every time the segment map becomes _potentially_ more fragmented by introducing an additional chunk, see mark_segmap_as_used(). Therefore, _fragmentation_count overestimates the actual segmap fragmentation. Once _fragmentation_count hits the limit, defragmentation is triggered (defrag_segmap(true)) and the counter is set to zero. After defragmentation, segmap should not contain any extra hops - that's the purpose of defragmentation. If it does, I would classify that as a bug.
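
To make the notion of hops concrete, here is a small sketch of a simplified segment-map model in plain Java. It is not the HotSpot code: the class and method names are hypothetical, and it assumes only that an entry of 0 marks the start of a block while a positive entry says how far to step back towards that start. It ignores details such as the limited value range of a real segmap entry. The input is the `|0|1|1|1|1|1|1|1|1|1|` map from the bug description.

```java
import java.util.Arrays;

public class SegmapHops {
    // Walk backwards from segment idx until an entry of 0 (the block start)
    // is reached, counting how many steps that takes.
    static int hopsToBlockStart(int[] segmap, int idx) {
        int hops = 0;
        while (segmap[idx] > 0) {
            idx -= segmap[idx];
            hops++;
        }
        return hops;
    }

    // Rewrite the entries of one block [start, start + len) so that every
    // segment points directly at the block start (a model of defragmentation).
    static int[] defragment(int[] segmap, int start, int len) {
        int[] out = Arrays.copyOf(segmap, segmap.length);
        for (int i = 0; i < len; i++) {
            out[start + i] = i;
        }
        return out;
    }

    public static void main(String[] args) {
        // One block of ten segments whose entries each step back by only one.
        int[] fragmented = {0, 1, 1, 1, 1, 1, 1, 1, 1, 1};
        int[] compact = defragment(fragmented, 0, fragmented.length);

        System.out.println("fragmented: " + hopsToBlockStart(fragmented, 9) + " hops");
        System.out.println("compact:    " + hopsToBlockStart(compact, 9) + " hops");
    }
}
```

In this model the last segment needs nine hops to reach the block start in the fragmented map and one hop after the rewrite, which is the kind of difference the extra_hops check is counting.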
------------- PR: https://git.openjdk.java.net/jdk/pull/3212 From shade at openjdk.java.net Thu Apr 1 10:03:34 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 1 Apr 2021 10:03:34 GMT Subject: RFR: 8264223: CodeHeap::verify fails extra_hops assertion in fastdebug test [v3] In-Reply-To: References: <_U973M53ajT2sW5WNh-HMNqjGTmdb6fvusj9NGHS63Y=.70c7f6ca-b68b-4c3d-9b30-5cd673a2386e@github.com> Message-ID: On Fri, 26 Mar 2021 08:28:40 GMT, Hui Shi wrote: >> When test with -XX:+VerifyCodeCache, many tests fail due to extra_hops assertion in CodeHeap::verify. See full dump in JBS. >> >> # Internal Error (src/hotspot/share/memory/heap.cpp:838), pid=1525697, tid=1525715 >> # assert((count == 0) || (extra_hops < (16 + 2*count))) failed: CodeHeap: many extra hops due to optimization. blocks: 234, extra hops: 484. >> Discussion in https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-October/035508.html doesn't tell where assertion (extra_hops < (16 + 2*count) comes from. >> >> In CodeHeap::mark_segmap_as_used wehn is_FreeBlock_join is true, it creates extra hops in _segmap. _fragmentation_count in inced and trigger defrag_segmap when reach threshold. In my understanding, extra hop can not guarantee under (16 + 2*count). >> >> In following extreme case, before HeapBlock free, segmap is all 0 (each blob use 1 smallest segment), suppose free action is applied from right to left. This increase 9 unnecessary hop for 1 continous HeapBlock. assertion (extra_hops < (16 + 2*count) is not safe. >> |0|0|0|0|0|0|0|0|0|0| >> after free, it will be >> |0|1|1|1|1|1|1|1|1|1| >> >> Proposed fix is assert extra hops not exceed _fragmentation_count. And if it exceeds (16 + 2 * count), give warning on two many extra hops. >> >> fastdebug tier1, tier2 with VerifyCodeCache passed on X86_64 linux, no extra assertion found. 
> > Hui Shi has updated the pull request incrementally with one additional commit since the last revision: > > update copyright year Okay then, thanks. ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3212 From hshi at openjdk.java.net Thu Apr 1 10:48:22 2021 From: hshi at openjdk.java.net (Hui Shi) Date: Thu, 1 Apr 2021 10:48:22 GMT Subject: Integrated: 8264223: CodeHeap::verify fails extra_hops assertion in fastdebug test In-Reply-To: <_U973M53ajT2sW5WNh-HMNqjGTmdb6fvusj9NGHS63Y=.70c7f6ca-b68b-4c3d-9b30-5cd673a2386e@github.com> References: <_U973M53ajT2sW5WNh-HMNqjGTmdb6fvusj9NGHS63Y=.70c7f6ca-b68b-4c3d-9b30-5cd673a2386e@github.com> Message-ID: On Fri, 26 Mar 2021 02:50:59 GMT, Hui Shi wrote: > When test with -XX:+VerifyCodeCache, many tests fail due to extra_hops assertion in CodeHeap::verify. See full dump in JBS. > > # Internal Error (src/hotspot/share/memory/heap.cpp:838), pid=1525697, tid=1525715 > # assert((count == 0) || (extra_hops < (16 + 2*count))) failed: CodeHeap: many extra hops due to optimization. blocks: 234, extra hops: 484. > Discussion in https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-October/035508.html doesn't tell where assertion (extra_hops < (16 + 2*count) comes from. > > In CodeHeap::mark_segmap_as_used wehn is_FreeBlock_join is true, it creates extra hops in _segmap. _fragmentation_count in inced and trigger defrag_segmap when reach threshold. In my understanding, extra hop can not guarantee under (16 + 2*count). > > In following extreme case, before HeapBlock free, segmap is all 0 (each blob use 1 smallest segment), suppose free action is applied from right to left. This increase 9 unnecessary hop for 1 continous HeapBlock. assertion (extra_hops < (16 + 2*count) is not safe. > |0|0|0|0|0|0|0|0|0|0| > after free, it will be > |0|1|1|1|1|1|1|1|1|1| > > Proposed fix is assert extra hops not exceed _fragmentation_count. 
And if it exceeds (16 + 2 * count), give warning on two many extra hops. > > fastdebug tier1, tier2 with VerifyCodeCache passed on X86_64 linux, no extra assertion found. This pull request has now been integrated. Changeset: 011f6d13 Author: Hui Shi Committer: Lutz Schmidt URL: https://git.openjdk.java.net/jdk/commit/011f6d13 Stats: 13 lines in 2 files changed: 11 ins; 0 del; 2 mod 8264223: CodeHeap::verify fails extra_hops assertion in fastdebug test Reviewed-by: lucy, shade ------------- PR: https://git.openjdk.java.net/jdk/pull/3212 From vlivanov at openjdk.java.net Thu Apr 1 13:00:27 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 1 Apr 2021 13:00:27 GMT Subject: RFR: 8264546: Dependencies: Context class is always an InstanceKlass In-Reply-To: References: Message-ID: On Wed, 31 Mar 2021 22:02:54 GMT, Vladimir Kozlov wrote: >> A trivial refactoring/cleanup: dependency context is always an InstanceKlass, but `Dependencies` uses `Klass*`. >> Migrate` Dependencies` from `Klass*` to `InstanceKlass*`. >> >> Testing: hs-tier1 - hs-tier2 > > Good. Thanks, Vladimir. ------------- PR: https://git.openjdk.java.net/jdk/pull/3293 From vlivanov at openjdk.java.net Thu Apr 1 13:00:28 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 1 Apr 2021 13:00:28 GMT Subject: Integrated: 8264546: Dependencies: Context class is always an InstanceKlass In-Reply-To: References: Message-ID: On Wed, 31 Mar 2021 21:27:14 GMT, Vladimir Ivanov wrote: > A trivial refactoring/cleanup: dependency context is always an InstanceKlass, but `Dependencies` uses `Klass*`. > Migrate` Dependencies` from `Klass*` to `InstanceKlass*`. > > Testing: hs-tier1 - hs-tier2 This pull request has now been integrated. 
Changeset: 80681b54 Author: Vladimir Ivanov URL: https://git.openjdk.java.net/jdk/commit/80681b54 Stats: 33 lines in 4 files changed: 1 ins; 6 del; 26 mod 8264546: Dependencies: Context class is always an InstanceKlass Reviewed-by: kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/3293 From vlivanov at openjdk.java.net Thu Apr 1 14:03:48 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 1 Apr 2021 14:03:48 GMT Subject: RFR: 8264548: Dependencies: ClassHierarchyWalker::is_witness() cleanups [v2] In-Reply-To: References: Message-ID: > Simplify `ClassHierarchyWalker::is_witness()`. > > Testing: hs-tier1 - hs-tier4. Vladimir Ivanov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains one additional commit since the last revision: Refactor ClassHierarchyWalker::is_witness() ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3297/files - new: https://git.openjdk.java.net/jdk/pull/3297/files/fc92da2e..b7f31326 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3297&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3297&range=00-01 Stats: 3498 lines in 125 files changed: 2719 ins; 306 del; 473 mod Patch: https://git.openjdk.java.net/jdk/pull/3297.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3297/head:pull/3297 PR: https://git.openjdk.java.net/jdk/pull/3297 From vlivanov at openjdk.java.net Thu Apr 1 14:03:49 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 1 Apr 2021 14:03:49 GMT Subject: RFR: 8264548: Dependencies: ClassHierarchyWalker::is_witness() cleanups [v2] In-Reply-To: References: <7gwnAv5IEEhj8je8965QoyZXFOhdZ1EgpHBop6nlmzE=.7410cbc8-52da-4d51-8663-aaea9b630a11@github.com> Message-ID: On Thu, 1 Apr 2021 09:08:01 GMT, Vladimir Ivanov wrote: >> Looks like it has changes from: https://git.openjdk.java.net/jdk/pull/3293 > > I 
created it as a dependent PR. > > I think that's the way it works (https://github.com/openjdk/skara/pull/1087 which was used as an example in [1] looks similar). > > [1] https://mail.openjdk.java.net/pipermail/jdk-dev/2021-March/005232.html Anyway, I rebased and force-pushed the patch since #3293 is integrated now. ------------- PR: https://git.openjdk.java.net/jdk/pull/3297 From neliasso at openjdk.java.net Thu Apr 1 15:04:19 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Thu, 1 Apr 2021 15:04:19 GMT Subject: RFR: 8264557: Incorrect copyright year for test/micro/org/openjdk/bench/java/lang/MathBench.java after JDK-8264054 In-Reply-To: References: Message-ID: <6IBF3cgrWKR65rtbyYA9g8m76E1oW_vWtd7NNMq6fR0=.f865574e-82ad-46d0-8967-cf1779efdcb1@github.com> On Thu, 1 Apr 2021 01:54:31 GMT, Jie Fu wrote: > Hi all, > > The copyright year for test/micro/org/openjdk/bench/java/lang/MathBench.java [1] is incorrect since it isn't newly added in 2021. > Let's fix it. > > Thanks. > Best regards, > Jie > > > [1] https://github.com/openjdk/jdk/blob/master/test/micro/org/openjdk/bench/java/lang/MathBench.java#L2 Looks good. Change is trivial and needs no further approval. ------------- Marked as reviewed by neliasso (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3299 From psandoz at openjdk.java.net Thu Apr 1 15:16:26 2021 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Thu, 1 Apr 2021 15:16:26 GMT Subject: RFR: 8264109: Add vectorized implementation for VectorMask.andNot() [v2] In-Reply-To: References: Message-ID: On Thu, 1 Apr 2021 03:39:43 GMT, Xiaohong Gong wrote: >> Currently "VectorMask.andNot()" is not vectorized: >> public VectorMask andNot(VectorMask m) { >> // FIXME: Generate good code here. >> return bOp(m, (i, a, b) -> a && !b); >> } >> This can be implemented with` "and(m.not())" `directly. Since `"VectorMask.and()/not()" `have been vectorized in hotspot, `"andNot"` >> can be vectorized as well by calling them. 
>> >> The performance gain is >100% for such a simple JMH: >> @Benchmark >> public Object andNot(Blackhole bh) { >> boolean[] mask = fm.apply(SPECIES.length()); >> boolean[] r = fmt.apply(SPECIES.length()); >> VectorMask rm = VectorMask.fromArray(SPECIES, r, 0); >> >> for (int ic = 0; ic < INVOC_COUNT; ic++) { >> for (int i = 0; i < mask.length; i += SPECIES.length()) { >> VectorMask vmask = VectorMask.fromArray(SPECIES, mask, i); >> rm = rm.andNot(vmask); >> } >> } >> return rm; >> } > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Move the changing to AbstractMask.andNot and revert changes in VectorMask > > Change-Id: Ie3ec8f53cb24526c8e1ccc68038355d024910818 > CustomizedGitHooks: yes Looks good. ------------- Marked as reviewed by psandoz (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3211 From jiefu at openjdk.java.net Thu Apr 1 15:16:30 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Thu, 1 Apr 2021 15:16:30 GMT Subject: RFR: 8264557: Incorrect copyright year for test/micro/org/openjdk/bench/java/lang/MathBench.java after JDK-8264054 In-Reply-To: <6IBF3cgrWKR65rtbyYA9g8m76E1oW_vWtd7NNMq6fR0=.f865574e-82ad-46d0-8967-cf1779efdcb1@github.com> References: <6IBF3cgrWKR65rtbyYA9g8m76E1oW_vWtd7NNMq6fR0=.f865574e-82ad-46d0-8967-cf1779efdcb1@github.com> Message-ID: On Thu, 1 Apr 2021 15:01:37 GMT, Nils Eliasson wrote: > Looks good. > > Change is trivial and needs no further approval. Thanks @neliasso . 
------------- PR: https://git.openjdk.java.net/jdk/pull/3299 From jiefu at openjdk.java.net Thu Apr 1 15:16:31 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Thu, 1 Apr 2021 15:16:31 GMT Subject: Integrated: 8264557: Incorrect copyright year for test/micro/org/openjdk/bench/java/lang/MathBench.java after JDK-8264054 In-Reply-To: References: Message-ID: On Thu, 1 Apr 2021 01:54:31 GMT, Jie Fu wrote: > Hi all, > > The copyright year for test/micro/org/openjdk/bench/java/lang/MathBench.java [1] is incorrect since it isn't newly added in 2021. > Let's fix it. > > Thanks. > Best regards, > Jie > > > [1] https://github.com/openjdk/jdk/blob/master/test/micro/org/openjdk/bench/java/lang/MathBench.java#L2 This pull request has now been integrated. Changeset: c04a743b Author: Jie Fu URL: https://git.openjdk.java.net/jdk/commit/c04a743b Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8264557: Incorrect copyright year for test/micro/org/openjdk/bench/java/lang/MathBench.java after JDK-8264054 Reviewed-by: neliasso ------------- PR: https://git.openjdk.java.net/jdk/pull/3299 From kvn at openjdk.java.net Thu Apr 1 16:20:17 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 1 Apr 2021 16:20:17 GMT Subject: RFR: 8263448: CTW: fatal error: meet not symmetric In-Reply-To: References: Message-ID: On Thu, 1 Apr 2021 07:12:38 GMT, Roland Westrelin wrote: >> In bug's case instance_id should be changed to InstanceBot only if it is real id and not InstanceTop or InstanceBot (similar to Constant in type lattice). Otherwise we incorrectly change InstanceTop to InstanceBot. In short, we should follow PTR transformation in this code: >> if (ptr == Constant) { >> ptr = NotNull; >> } >> - instance_id = InstanceBot; >> + if (instance_id > 0) { >> + instance_id = InstanceBot; >> + } >> I noticed that `TypeInstPtr::xmeet_helper()` has the same code as `TypeAryPtr::xmeet_helper()` for case when `InstPtr` meets `AryPtr`. 
So instead of fixing two places I decided to call TypeAryPtr::xmeet_helper() from `TypeInstPtr::xmeet_helper()` for this case. We do similar thing in `TypeOopPtr::xmeet()`: >> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/type.cpp#L3283 >> >> Fixed code style in touched code. >> >> Tested failing test from bug report, hs-tier1-4, hs-comp. > > Looks good to me. Thank you, Roland. ------------- PR: https://git.openjdk.java.net/jdk/pull/3298 From kvn at openjdk.java.net Thu Apr 1 16:35:17 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 1 Apr 2021 16:35:17 GMT Subject: RFR: 8264548: Dependencies: ClassHierarchyWalker::is_witness() cleanups [v2] In-Reply-To: References: Message-ID: On Thu, 1 Apr 2021 14:03:48 GMT, Vladimir Ivanov wrote: >> Simplify `ClassHierarchyWalker::is_witness()`. >> >> Testing: hs-tier1 - hs-tier4. > > Vladimir Ivanov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains one additional commit since the last revision: > > Refactor ClassHierarchyWalker::is_witness() Good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3297 From kvn at openjdk.java.net Thu Apr 1 17:06:14 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 1 Apr 2021 17:06:14 GMT Subject: Integrated: 8263448: CTW: fatal error: meet not symmetric In-Reply-To: References: Message-ID: On Wed, 31 Mar 2021 23:31:05 GMT, Vladimir Kozlov wrote: > In bug's case instance_id should be changed to InstanceBot only if it is real id and not InstanceTop or InstanceBot (similar to Constant in type lattice). Otherwise we incorrectly change InstanceTop to InstanceBot. 
> In short, we should follow PTR transformation in this code:
>      if (ptr == Constant) {
>        ptr = NotNull;
>      }
> -    instance_id = InstanceBot;
> +    if (instance_id > 0) {
> +      instance_id = InstanceBot;
> +    }
> I noticed that `TypeInstPtr::xmeet_helper()` has the same code as `TypeAryPtr::xmeet_helper()` for case when `InstPtr` meets `AryPtr`. So instead of fixing two places I decided to call TypeAryPtr::xmeet_helper() from `TypeInstPtr::xmeet_helper()` for this case. We do similar thing in `TypeOopPtr::xmeet()`:
> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/type.cpp#L3283
>
> Fixed code style in touched code.
>
> Tested failing test from bug report, hs-tier1-4, hs-comp.

This pull request has now been integrated.

Changeset: 6e0da996
Author: Vladimir Kozlov
URL: https://git.openjdk.java.net/jdk/commit/6e0da996
Stats: 67 lines in 1 file changed: 9 ins; 42 del; 16 mod

8263448: CTW: fatal error: meet not symmetric

Reviewed-by: roland

-------------

PR: https://git.openjdk.java.net/jdk/pull/3298

From kvn at openjdk.java.net Thu Apr 1 17:27:22 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Thu, 1 Apr 2021 17:27:22 GMT
Subject: RFR: 8264548: Dependencies: ClassHierarchyWalker::is_witness() cleanups [v2]
In-Reply-To:
References:
Message-ID:

On Thu, 1 Apr 2021 16:32:31 GMT, Vladimir Kozlov wrote:

>> Vladimir Ivanov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains one additional commit since the last revision:
>>
>>   Refactor ClassHierarchyWalker::is_witness()
>
> Good.

> I created it as a dependent PR.
>
> I think that's the way it works ([openjdk/skara#1087](https://github.com/openjdk/skara/pull/1087) which was used as an example in [1] looks similar).
>
> [1] https://mail.openjdk.java.net/pipermail/jdk-dev/2021-March/005232.html

As I understand it, when you create a PR which is based on another PR you have to change `base:` from `master` to the corresponding `pr/3293` so that only the differences of the current PR are seen.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3297

From kvn at openjdk.java.net Thu Apr 1 17:47:17 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Thu, 1 Apr 2021 17:47:17 GMT
Subject: RFR: 8261137: Optimization of Box nodes in uncommon_trap [v13]
In-Reply-To:
References: <8Riu9VCQLM7_vDp5DOMtLZK3yMLQzAkwlIKo4ab0F7Q=.662dbffe-c320-47ea-bc67-508e2c382b12@github.com>
Message-ID:

On Tue, 30 Mar 2021 01:53:12 GMT, Wang Huang wrote:

>> JDK-8075052 has removed useless autobox. However, in some cases, the box is still saved. For instance:
>>     @Benchmark
>>     public void testMethod(Blackhole bh) {
>>       int sum = 0;
>>       for (int i = 0; i < data.length; i++) {
>>         Integer ii = Integer.valueOf(data[i]);
>>         if (i < data.length) {
>>           sum += ii.intValue();
>>         }
>>       }
>>       bh.consume(sum);
>>     }
>> Although the variable ii is only used at ii.intValue(), it cannot be eliminated as a result of being used by a hidden uncommon_trap.
>> The uncommon_trap is generated by the optimized "if", because its condition is always true.
>>
>> We can postpone box in uncommon_trap in this situation. We treat box as a scalarized object by adding a SafePointScalarObjectNode in the uncommon_trap node,
>> and deleting the use of box:
>>
>> There is no additional fail/error(s) of jtreg after this patch.
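
To make the pattern in the description concrete, here is a minimal plain-Java sketch (no JMH) that puts the boxed loop next to a hand-unboxed equivalent. The names `BoxSketch`, `sumBoxed` and `sumUnboxed` are hypothetical, and the second method is only a source-level analogy for what removing the box buys; it is not what C2 actually emits.

```java
// Hypothetical illustration; not part of the patch or its tests.
final class BoxSketch {
    // Mirrors the loop from the description: the Integer is created
    // and then immediately unboxed again.
    static int sumBoxed(int[] data) {
        int sum = 0;
        for (int i = 0; i < data.length; i++) {
            Integer ii = Integer.valueOf(data[i]);
            if (i < data.length) { // always true inside the loop
                sum += ii.intValue();
            }
        }
        return sum;
    }

    // Source-level analogy of the optimized shape: no box on the fast path.
    static int sumUnboxed(int[] data) {
        int sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = new int[10000];
        for (int i = 0; i < data.length; ++i) {
            data[i] = i * 1337 % 7331;
        }
        // Both variants must agree; only the allocation behaviour differs.
        System.out.println(sumBoxed(data) == sumUnboxed(data));
    }
}
```

Because the two methods are observably equivalent, the only reason to keep the box alive is the deoptimization path, which is the case the patch addresses.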
>>
>> I adjust my codes and add a new benchmark
>>
>> public class MyBenchmark {
>>
>>   static int[] data = new int[10000];
>>
>>   static {
>>     for(int i = 0; i < data.length; ++i) {
>>       data[i] = i * 1337 % 7331;
>>     }
>>   }
>>
>>   @Benchmark
>>   public void testMethod(Blackhole bh) {
>>     int sum = 0;
>>     for (int i = 0; i < data.length; i++) {
>>       Integer ii = Integer.valueOf(data[i]);
>>       black();
>>       if (i < 100000) {
>>         sum += ii.intValue();
>>       }
>>     }
>>     bh.consume(sum);
>>   }
>>
>>   public void black(){}
>> }
>>
>>
>> aarch64:
>> base line:
>> Benchmark                   Mode  Samples   Score  Score error  Units
>> o.s.MyBenchmark.testMethod  avgt       30  88.513        1.111  us/op
>>
>> opt:
>> Benchmark                   Mode  Samples   Score  Score error  Units
>> o.s.MyBenchmark.testMethod  avgt       30  52.776        0.096  us/op
>>
>> x86:
>> base line:
>> Benchmark                   Mode  Samples   Score  Score error  Units
>> o.s.MyBenchmark.testMethod  avgt       30  81.066        3.156  us/op
>>
>> opt:
>> Benchmark                   Mode  Samples   Score  Score error  Units
>> o.s.MyBenchmark.testMethod  avgt       30  55.596        0.775  us/op
>
> Wang Huang has updated the pull request incrementally with one additional commit since the last revision:
>
>   trivial fix

Good.

-------------

Marked as reviewed by kvn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2401

From kvn at openjdk.java.net Thu Apr 1 17:47:17 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Thu, 1 Apr 2021 17:47:17 GMT
Subject: RFR: 8261137: Optimization of Box nodes in uncommon_trap [v11]
In-Reply-To:
References: <8Riu9VCQLM7_vDp5DOMtLZK3yMLQzAkwlIKo4ab0F7Q=.662dbffe-c320-47ea-bc67-508e2c382b12@github.com>
Message-ID:

On Wed, 31 Mar 2021 02:29:33 GMT, Wang Huang wrote:

>>> It was my suggestion to pass call node as allocation so that we could trace back for what node SafePointScalarObject was created because you may have several Box objects for which we create SafePointScalarObject nodes.
>>
>> The problem with that is `alloc` quickly becomes stale since allocations/calls are removed shortly after scalarization takes place.
>> The only usage of `alloc()` is in `PhaseMacroExpand::scalar_replacement`:
>>   assert(scobj->alloc() == alloc, "sanity");
>>
>> Regarding the assert, frankly speaking, it looks repetitive and verbose. I'd prefer to just see it folded into:
>>   assert(alloc == NULL || alloc->is_Allocate() || alloc->as_CallStaticJava()->is_boxing_method(), "unexpected call node");
>>
>> Or introduce a helper method to dump relevant info about the problematic node:
>>   assert(alloc == NULL || alloc->is_Allocate() || alloc->as_CallStaticJava()->is_boxing_method(), "unexpected call node: %s", dump_node(alloc));
>>
>> Also, `alloc == NULL` case can be eliminated by avoiding passing `NULL` in `PhaseVector::scalarize_vbox_node()`.
>
>> Okay. Lets do as @iwanowww suggesting.
>
> Should we implement `dump_node` or should we just use `assert(alloc == NULL || alloc->is_Allocate() || alloc->as_CallStaticJava()->is_boxing_method(), "unexpected call node");` without dumping?

Creating node's dump string in separate method is complicated.
We can just use `false` in assert:

#ifdef ASSERT
    if (alloc != nullptr && !alloc->is_Allocate() && !alloc->as_CallStaticJava()->is_boxing_method()) {
      alloc->dump();
      assert(false, "unexpected call node for scalar replacement");
    }
#endif

-------------

PR: https://git.openjdk.java.net/jdk/pull/2401

From kvn at openjdk.java.net Thu Apr 1 17:47:17 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Thu, 1 Apr 2021 17:47:17 GMT
Subject: RFR: 8261137: Optimization of Box nodes in uncommon_trap [v11]
In-Reply-To:
References: <8Riu9VCQLM7_vDp5DOMtLZK3yMLQzAkwlIKo4ab0F7Q=.662dbffe-c320-47ea-bc67-508e2c382b12@github.com>
Message-ID:

On Thu, 1 Apr 2021 17:41:31 GMT, Vladimir Kozlov wrote:

>>> Okay. Lets do as @iwanowww suggesting.
>>
>> Should we implement `dump_node` or should we just use `assert(alloc == NULL || alloc->is_Allocate() || alloc->as_CallStaticJava()->is_boxing_method(), "unexpected call node");` without dumping?
>
> Creating node's dump string in separate method is complicated.
> We can just use `false` in assert:
> #ifdef ASSERT
>     if (alloc != nullptr && !alloc->is_Allocate() && !alloc->as_CallStaticJava()->is_boxing_method()) {
>       alloc->dump();
>       assert(false, "unexpected call node for scalar replacement");
>     }
> #endif

I did not see your latest changes before answering your question. You did exactly as I suggested. Good.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2401

From neliasso at openjdk.java.net Thu Apr 1 20:48:44 2021
From: neliasso at openjdk.java.net (Nils Eliasson)
Date: Thu, 1 Apr 2021 20:48:44 GMT
Subject: RFR: 8264626: C1 should be able to inline excluded methods
Message-ID:

I noticed a behavioral difference between c1 and c2. In c2 excluded methods can still be inlined, which is the desired behaviour. Inlining is controlled separately.

I propose a small change to c1 inlining that makes it work in the same way.

-------------

Commit messages:
 - c1: allow inlining of excluded methods

Changes: https://git.openjdk.java.net/jdk/pull/3315/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3315&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8264626
  Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3315.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3315/head:pull/3315

PR: https://git.openjdk.java.net/jdk/pull/3315

From neliasso at openjdk.java.net Thu Apr 1 21:13:37 2021
From: neliasso at openjdk.java.net (Nils Eliasson)
Date: Thu, 1 Apr 2021 21:13:37 GMT
Subject: RFR: 8264359: Compiler directives should enable DebugNonSafepoints when PrintAssembly is requested
Message-ID:

DebugNonSafepoints should be set when PrintAssembly is requested. This only happened for compile commands but not for compiler directives. This PR moves the check to compiler directives - that code path is used for both directives and commands.
I am leaving the check and update in arguments.cpp - there might be some need for using flag PrintAssembly for stubs or wrappers, in a code path that doesn't use commands or directives.

-------------

Commit messages:
 - Remove update from compilerOracle
 - enable DebugNonSafepoints when directives sets printasm

Changes: https://git.openjdk.java.net/jdk/pull/3316/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3316&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8264359
  Stats: 8 lines in 2 files changed: 5 ins; 3 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3316.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3316/head:pull/3316

PR: https://git.openjdk.java.net/jdk/pull/3316

From kvn at openjdk.java.net Thu Apr 1 22:02:26 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Thu, 1 Apr 2021 22:02:26 GMT
Subject: RFR: 8264359: Compiler directives should enable DebugNonSafepoints when PrintAssembly is requested
In-Reply-To:
References:
Message-ID:

On Thu, 1 Apr 2021 20:57:04 GMT, Nils Eliasson wrote:

> DebugNonSafepoints should be set when PrintAssembly is requested. This only happened for compile commands but not for compiler directives. This PR moves the check to compiler directives - that code path is used for both directives and commands. I am leaving the check and update in arguments.cpp - there might be some need for using flag PrintAssembly for stubs or wrappers, in a code path that doesn't use commands or directives.

Good.

-------------

Marked as reviewed by kvn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3316

From whuang at openjdk.java.net Fri Apr 2 01:36:34 2021
From: whuang at openjdk.java.net (Wang Huang)
Date: Fri, 2 Apr 2021 01:36:34 GMT
Subject: RFR: 8261137: Optimization of Box nodes in uncommon_trap [v13]
In-Reply-To:
References: <8Riu9VCQLM7_vDp5DOMtLZK3yMLQzAkwlIKo4ab0F7Q=.662dbffe-c320-47ea-bc67-508e2c382b12@github.com>
Message-ID:

On Thu, 1 Apr 2021 17:44:15 GMT, Vladimir Kozlov wrote:

> Good.

Thank you for your approval. ;-)

-------------

PR: https://git.openjdk.java.net/jdk/pull/2401

From whuang at openjdk.java.net Fri Apr 2 01:36:34 2021
From: whuang at openjdk.java.net (Wang Huang)
Date: Fri, 2 Apr 2021 01:36:34 GMT
Subject: RFR: 8261137: Optimization of Box nodes in uncommon_trap [v11]
In-Reply-To:
References: <8Riu9VCQLM7_vDp5DOMtLZK3yMLQzAkwlIKo4ab0F7Q=.662dbffe-c320-47ea-bc67-508e2c382b12@github.com>
Message-ID:

On Thu, 1 Apr 2021 17:43:02 GMT, Vladimir Kozlov wrote:

>> Creating node's dump string in separate method is complicated.
>> We can just use `false` in assert:
>> #ifdef ASSERT
>>     if (alloc != nullptr && !alloc->is_Allocate() && !alloc->as_CallStaticJava()->is_boxing_method()) {
>>       alloc->dump();
>>       assert(false, "unexpected call node for scalar replacement");
>>     }
>> #endif
>
> I did not see your latest changes before answering your question. You did exactly as I suggested. Good.

Thank you for your review.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2401

From pli at openjdk.java.net Fri Apr 2 02:30:24 2021
From: pli at openjdk.java.net (Pengfei Li)
Date: Fri, 2 Apr 2021 02:30:24 GMT
Subject: RFR: 8264409: AArch64: generate better code for Vector API allTrue
In-Reply-To:
References:
Message-ID: <2OZrusN7mizjpB0a0qOs2WiL4Bi5rxGmMGertHeYxwY=.118d860c-6e46-48af-8767-3df113014e78@github.com>

On Thu, 1 Apr 2021 07:58:07 GMT, Ningsheng Jian wrote:

> In Vector API NEON implementation, we use a vector register to represent vector mask, where an element value of -1 is a true mask and an element value of 0 is a false mask. The allTrue() api is used to check whether all the elements of current mask are set.
>
> Currently, the AArch64 NEON allTrue implementation looks like:
>
>   andr $tmp, T16B $src1, $src2\t# src2 is maskAllTrue
>   notr $tmp, T16B, $tmp
>   addv $tmp, T16B, $tmp
>   umov $dst, $tmp, B, 0
>   cmp $dst, 0
>   cset $dst
>
> where $src2 is a preset all true (-1) constant. We can optimize it to the code sequence like below, to check whether all bits are set:
>
>   uminv $tmp, T16B, $src1
>   umov $dst, $tmp, B, 0
>   cmp $dst, 0xff
>   cset $dst
>
> With this codegen improvement, we can see about 8%~70% performance uplift on different machines for Alibaba's Vector API bigdata benchmarks [1][2].
>
> Tested with tier1 and vector api jtreg tests.
>
> [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/bigdata/BooleanArrayCheck.java#L61
> [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/bigdata/ValueRangeCheckAndCastL2I.java#L93

Overall looks good to me. (not a reviewer)

src/hotspot/cpu/aarch64/aarch64_neon.ad line 3571:

> 3569:   format %{ "uminv $tmp, T8B, $src1\n\t"
> 3570:             "umov $dst, $tmp, B, 0\n\t"
> 3571:             "cmp $dst, 0xff\n\t"

I think we should write "#0xff" here.
But it looks that all other immediates in format field of aarch64_neon.ad lose the number sign as well.

-------------

Marked as reviewed by pli (Committer).

PR: https://git.openjdk.java.net/jdk/pull/3302

From njian at openjdk.java.net Fri Apr 2 03:00:22 2021
From: njian at openjdk.java.net (Ningsheng Jian)
Date: Fri, 2 Apr 2021 03:00:22 GMT
Subject: RFR: 8264409: AArch64: generate better code for Vector API allTrue
In-Reply-To:
References:
Message-ID:

On Thu, 1 Apr 2021 09:02:39 GMT, Andrew Dinn wrote:

> It's a clever trick to use uminv for this specific case.
> The patch looks good to me.

Thank you for the review! @adinn

-------------

PR: https://git.openjdk.java.net/jdk/pull/3302

From njian at openjdk.java.net Fri Apr 2 03:00:23 2021
From: njian at openjdk.java.net (Ningsheng Jian)
Date: Fri, 2 Apr 2021 03:00:23 GMT
Subject: RFR: 8264409: AArch64: generate better code for Vector API allTrue
In-Reply-To: <2OZrusN7mizjpB0a0qOs2WiL4Bi5rxGmMGertHeYxwY=.118d860c-6e46-48af-8767-3df113014e78@github.com>
References: <2OZrusN7mizjpB0a0qOs2WiL4Bi5rxGmMGertHeYxwY=.118d860c-6e46-48af-8767-3df113014e78@github.com>
Message-ID:

On Fri, 2 Apr 2021 02:26:52 GMT, Pengfei Li wrote:

>> In Vector API NEON implementation, we use a vector register to represent vector mask, where an element value of -1 is a true mask and an element value of 0 is a false mask. The allTrue() api is used to check whether all the elements of current mask are set.
>>
>> Currently, the AArch64 NEON allTrue implementation looks like:
>>
>>   andr $tmp, T16B $src1, $src2\t# src2 is maskAllTrue
>>   notr $tmp, T16B, $tmp
>>   addv $tmp, T16B, $tmp
>>   umov $dst, $tmp, B, 0
>>   cmp $dst, 0
>>   cset $dst
>>
>> where $src2 is a preset all true (-1) constant.
>> We can optimize it to the code sequence like below, to check whether all bits are set:
>>
>>   uminv $tmp, T16B, $src1
>>   umov $dst, $tmp, B, 0
>>   cmp $dst, 0xff
>>   cset $dst
>>
>> With this codegen improvement, we can see about 8%~70% performance uplift on different machines for Alibaba's Vector API bigdata benchmarks [1][2].
>>
>> Tested with tier1 and vector api jtreg tests.
>>
>> [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/bigdata/BooleanArrayCheck.java#L61
>> [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/bigdata/ValueRangeCheckAndCastL2I.java#L93
>
> src/hotspot/cpu/aarch64/aarch64_neon.ad line 3571:
>
>> 3569:   format %{ "uminv $tmp, T8B, $src1\n\t"
>> 3570:             "umov $dst, $tmp, B, 0\n\t"
>> 3571:             "cmp $dst, 0xff\n\t"
>
> I think we should write "#0xff" here. But it looks that all other immediates in format field of aarch64_neon.ad lose the number sign as well.

Thanks for the review, but I think both are ok.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3302

From dongbo at openjdk.java.net Fri Apr 2 03:10:57 2021
From: dongbo at openjdk.java.net (Dong Bo)
Date: Fri, 2 Apr 2021 03:10:57 GMT
Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v3]
In-Reply-To:
References:
Message-ID: <96N9BNz7s4JH99-5lQio5uEP8iAa4YEmn9NK-dUyvCQ=.c663f383-f207-45c3-97bf-b5309b624315@github.com>

> In JDK-8248188, IntrinsicCandidate and API is added for Base64 decoding.
> Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3, a basic idea can be found at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords.
>
> Patch passed jtreg tier1-3 tests with linux-aarch64-server-fastdebug build.
> Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` were also run specifically to check the correctness of the implementation.
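
As background for the raw versus MIME distinction that this thread keeps returning to, here is a small sketch using only the public `java.util.Base64` API. The basic decoder rejects bytes outside the Base64 alphabet, while the MIME decoder skips them, so MIME input can legitimately contain such bytes. The class name `DecoderFlavours` and the sample strings are made up for illustration; the sketch exercises the Java-level API only and says nothing about the AArch64 stub itself.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical illustration; not part of the patch or its tests.
final class DecoderFlavours {
    // Basic (RFC 4648) decoding: any byte outside the alphabet is an error.
    static String decodeBasic(String s) {
        return new String(Base64.getDecoder().decode(s), StandardCharsets.UTF_8);
    }

    // MIME decoding: line separators and other non-alphabet bytes are ignored.
    static String decodeMime(String s) {
        return new String(Base64.getMimeDecoder().decode(s), StandardCharsets.UTF_8);
    }

    // Returns true when the basic decoder refuses the input.
    static boolean basicRejects(String s) {
        try {
            Base64.getDecoder().decode(s);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String clean = "aGVsbG8=";             // Base64 for "hello"
        String withCrLf = "\r\naGVs\r\nbG8=";  // same payload with CRLF noise
        System.out.println(decodeBasic(clean));
        System.out.println(decodeMime(withCrLf));
        System.out.println(basicRejects(withCrLf));
    }
}
```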
>
> There can be illegal characters at the start of the input if the data is MIME encoded.
> There would be no benefit in using SIMD for this case, so the stub uses non-SIMD instructions for MIME encoded data now.
>
> A JMH micro, Base64Decode.java, is added for performance test.
> With different input length (upper-bounded by parameter `maxNumBytes` in the JMH micro),
> we witness ~2.5x improvements with long inputs and no regression with short inputs for raw base64 decoding, minor improvements (~10.95%) for MIME on Kunpeng916.
>
> The Base64Decode.java JMH micro-benchmark results:
>
> Benchmark                          (lineSize)  (maxNumBytes)  Mode  Cnt       Score      Error  Units
>
> # Kunpeng916, intrinsic
> Base64Decode.testBase64Decode               4              1  avgt    5      48.614 ±    0.609  ns/op
> Base64Decode.testBase64Decode               4              3  avgt    5      58.199 ±    1.650  ns/op
> Base64Decode.testBase64Decode               4              7  avgt    5      69.400 ±    0.931  ns/op
> Base64Decode.testBase64Decode               4             32  avgt    5      96.818 ±    1.687  ns/op
> Base64Decode.testBase64Decode               4             64  avgt    5     122.856 ±    9.217  ns/op
> Base64Decode.testBase64Decode               4             80  avgt    5     130.935 ±    1.667  ns/op
> Base64Decode.testBase64Decode               4             96  avgt    5     143.627 ±    1.751  ns/op
> Base64Decode.testBase64Decode               4            112  avgt    5     152.311 ±    1.178  ns/op
> Base64Decode.testBase64Decode               4            512  avgt    5     342.631 ±    0.584  ns/op
> Base64Decode.testBase64Decode               4           1000  avgt    5     573.635 ±    1.050  ns/op
> Base64Decode.testBase64Decode               4          20000  avgt    5    9534.136 ±   45.172  ns/op
> Base64Decode.testBase64Decode               4          50000  avgt    5   22718.726 ±  192.070  ns/op
> Base64Decode.testBase64MIMEDecode           4              1  avgt   10      63.558 ±    0.336  ns/op
> Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.504 ±    0.848  ns/op
> Base64Decode.testBase64MIMEDecode           4              7  avgt   10     120.591 ±    0.608  ns/op
> Base64Decode.testBase64MIMEDecode           4             32  avgt   10     324.314 ±    6.236  ns/op
> Base64Decode.testBase64MIMEDecode           4             64  avgt   10     532.678 ±    4.670  ns/op
> Base64Decode.testBase64MIMEDecode           4             80  avgt   10     678.126 ±    4.324  ns/op
> Base64Decode.testBase64MIMEDecode           4             96  avgt   10     771.603 ±    6.393  ns/op
> Base64Decode.testBase64MIMEDecode           4            112  avgt   10     889.608 ±    0.759  ns/op
> Base64Decode.testBase64MIMEDecode           4            512  avgt   10    3663.557 ±    3.422  ns/op
> Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7017.784 ±    9.128  ns/op
> Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  128670.660 ± 7951.521  ns/op
> Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  317113.667 ±  161.758  ns/op
>
> # Kunpeng916, default
> Base64Decode.testBase64Decode               4              1  avgt    5      48.455 ±    0.571  ns/op
> Base64Decode.testBase64Decode               4              3  avgt    5      57.937 ±    0.505  ns/op
> Base64Decode.testBase64Decode               4              7  avgt    5      73.823 ±    1.452  ns/op
> Base64Decode.testBase64Decode               4             32  avgt    5     106.484 ±    1.243  ns/op
> Base64Decode.testBase64Decode               4             64  avgt    5     141.004 ±    1.188  ns/op
> Base64Decode.testBase64Decode               4             80  avgt    5     156.284 ±    0.572  ns/op
> Base64Decode.testBase64Decode               4             96  avgt    5     174.137 ±    0.177  ns/op
> Base64Decode.testBase64Decode               4            112  avgt    5     188.445 ±    0.572  ns/op
> Base64Decode.testBase64Decode               4            512  avgt    5     610.847 ±    1.559  ns/op
> Base64Decode.testBase64Decode               4           1000  avgt    5    1155.368 ±    0.813  ns/op
> Base64Decode.testBase64Decode               4          20000  avgt    5   19751.477 ±   24.669  ns/op
> Base64Decode.testBase64Decode               4          50000  avgt    5   50046.586 ±  523.155  ns/op
> Base64Decode.testBase64MIMEDecode           4              1  avgt   10      64.130 ±    0.238  ns/op
> Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.096 ±    0.205  ns/op
> Base64Decode.testBase64MIMEDecode           4              7  avgt   10     118.849 ±    0.610  ns/op
> Base64Decode.testBase64MIMEDecode           4             32  avgt   10     331.177 ±    4.732  ns/op
> Base64Decode.testBase64MIMEDecode           4             64  avgt   10     549.117 ±    0.177  ns/op
> Base64Decode.testBase64MIMEDecode           4             80  avgt   10     702.951 ±    4.572  ns/op
> Base64Decode.testBase64MIMEDecode           4             96  avgt   10     799.566 ±    0.301  ns/op
> Base64Decode.testBase64MIMEDecode           4            112  avgt   10     923.749 ±    0.389  ns/op
> Base64Decode.testBase64MIMEDecode           4            512  avgt   10    4000.725 ±    2.519  ns/op
> Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7674.994 ±    9.281  ns/op
> Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  142059.001 ±  157.920  ns/op
> Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  355698.369 ±  216.542  ns/op

Dong Bo has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:

 - Merge branch 'master' into aarch64.base64.decode
 - copyright
 - trivial fixes
 - Handling error in SIMD case with loops, combining two non-SIMD cases into one code blob, addressing other comments
 - Merge branch 'master' into aarch64.base64.decode
 - 8256245: AArch64: Implement Base64 decoding intrinsic

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3228/files
  - new: https://git.openjdk.java.net/jdk/pull/3228/files/e658ebf4..16ebc471

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3228&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3228&range=01-02

  Stats: 7270 lines in 287 files changed: 5225 ins; 950 del; 1095 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3228.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3228/head:pull/3228

PR: https://git.openjdk.java.net/jdk/pull/3228

From jbhateja at openjdk.java.net Fri Apr 2 03:19:45 2021
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Fri, 2 Apr 2021 03:19:45 GMT
Subject: RFR: 8262355: Support for AVX-512 opmask register allocation. [v18]
In-Reply-To: <3NqvqAfKOiHvDo7gvwLvi5_U_9Rz8DFBijVVf1wpXWk=.90d51fb9-c6d0-45be-89b7-60851c7a6681@github.com>
References: <3NqvqAfKOiHvDo7gvwLvi5_U_9Rz8DFBijVVf1wpXWk=.90d51fb9-c6d0-45be-89b7-60851c7a6681@github.com>
Message-ID:

> AVX-512 added 8 new 64 bit opmask registers[1] . These registers allow conditional execution and efficient merging of destination operands. At present cross instruction mask propagation is being done either using a GPR (e.g. vmask_gen patterns in x86.ad) or a vector register (for propagating results of a vector comparison or vector load mask operations).
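
As a rough mental model of the "conditional execution and efficient merging of destination operands" mentioned in the description, the following scalar Java sketch applies an add only in lanes whose mask bit is set and leaves the other destination lanes untouched. `OpmaskModel` and `maskedAdd` are hypothetical names; the sketch models the semantics of merge-masking only and makes no claim about how HotSpot allocates or encodes opmask registers.

```java
import java.util.Arrays;

// Hypothetical illustration; not part of the patch or its tests.
final class OpmaskModel {
    // Merge-masking: lane i is written only when bit i of 'mask' is set;
    // otherwise the previous value of dst[i] is kept.
    // Assumes all arrays have the same length, at most 64 lanes.
    static void maskedAdd(int[] dst, int[] a, int[] b, long mask) {
        for (int lane = 0; lane < dst.length; lane++) {
            if (((mask >>> lane) & 1L) != 0) {
                dst[lane] = a[lane] + b[lane];
            }
        }
    }

    public static void main(String[] args) {
        int[] dst = {9, 9, 9, 9};
        int[] a = {1, 2, 3, 4};
        int[] b = {10, 20, 30, 40};
        maskedAdd(dst, a, b, 0b0101L); // lanes 0 and 2 are active
        System.out.println(Arrays.toString(dst));
    }
}
```

Keeping the mask in a dedicated register class, rather than in a GPR or a full vector register, is what lets a single masked instruction express this loop body without a separate blend step.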
>
> This base patch extends the register allocator to support allocation of opmask registers. This will facilitate mask propagation across instructions and thus enable emitting efficient instruction sequence over X86 targets supporting AVX-512 feature.
>
> We intend to build a robust optimization framework[2] based on this patch to emit optimized instruction sequence for masked/predicated vector operation for X86 targets supporting AVX-512.
>
> Please review and share your feedback.
>
> Summary of changes:
>
> 1) AD side changes: New register definitions, register classes, allocation classes, operand definitions and spill code handling for opmask registers.
>
> 2) Runtime: Save/restoration for opmask registers in 32 and 64 bit JVM.
>    a) For 64 bit JVM we were anyways reserving the space in the frame layout but earlier were not saving and restoring at designated offset(1088), hence no extra space overhead apart from save/restore cost.
>    b) For 32 bit JVM: Additional 64 byte are allocated apart from FXSTORE area on the lines of storage for ZMM(16-31) and YMM-Hi bank. There are few regressions due to extra space allocation which we are investigating.
>
> 3) Replacing all the hard-coded opmask references from macro-assembly routines: Pulling out the opmask occurrences all the way up to instruction pattern and adding an unbounded opmask operand for them. This exposes these operands to RA and scheduler; this will automatically facilitate spilling of live opmask registers across call sites.
>
> 4) Register class initializations related to Op_RegVMask during matcher startup.
>
> 5) Handling for mask generating node: Currently VectorMaskGen node uses a GPR to propagate mask across mask generating DEF instruction to its USER instructions. There are other mask generating nodes like VectorCmpMask, VectorLoadMask which are not handled as the part of this patch.
> Conditional overriding of two routines, ideal_reg and bottom_type for mask generating IDEAL nodes and modifying the instruction patterns to have new opmask operands enables instruction selector to associate opmask register class with USE/DEF operands for such MachNodes. This will constrain the allocation set for these operands to opmask registers(K1-K7).
>
> 6) Creation of a new concrete type TypeVectMask for mask generation nodes and a convenience routine Type::makemask which creates a regular vector types (TypeVect[SDXYZ]) for non-AVX-512 targets and TypeVectMask for an AVX-512 targets.
>
>
> [1] : Section 15.1.3 : https://software.intel.com/content/www/us/en/develop/download/intel-64-and-ia-32-architectures-software-developers-manual-volume-1-basic-architecture.html
> [2] : http://cr.openjdk.java.net/~jbhateja/avx512_masked_operation_optimization/AVX-512_RA_Opmask_Support_VectorMask_Optimizations.pdf

Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 20 commits:

 - 8262355: Review comments resolutions.
 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8262355
 - 8262355: Updating copywriter for edited files.
 - 8262355: Adding missed safety check.
 - 8262355: Review comments resolution.
 - 8262355: Extending Type::isa_vect and Type::is_vect routines to TypeVectMask since its a valid vector type.
 - 8262355: Review comments resolution
 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8262355
 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8262355
 - 8262355: Review comments resolutions.
 - ... and 10 more: https://git.openjdk.java.net/jdk/compare/8cf1c62c...5aa07306

-------------

Changes: https://git.openjdk.java.net/jdk/pull/2768/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2768&range=17
  Stats: 1560 lines in 41 files changed: 1284 ins; 20 del; 256 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2768.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2768/head:pull/2768

PR: https://git.openjdk.java.net/jdk/pull/2768

From jbhateja at openjdk.java.net Fri Apr 2 03:23:36 2021
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Fri, 2 Apr 2021 03:23:36 GMT
Subject: RFR: 8262355: Support for AVX-512 opmask register allocation. [v17]
In-Reply-To:
References: <3NqvqAfKOiHvDo7gvwLvi5_U_9Rz8DFBijVVf1wpXWk=.90d51fb9-c6d0-45be-89b7-60851c7a6681@github.com>
Message-ID:

On Mon, 29 Mar 2021 23:02:29 GMT, Vladimir Kozlov wrote:

>> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   8262355: Updating copywriter for edited files.
>
> Few notes.
> 1. In `sharedRuntime_*.cpp` you have name `opmask_state_*` and `.ad` files also `opmask` and `opmask_reg_k*'. But in C2 code you call them `vectmask` which is confusing. I prefer to use the same name everywhere - `vectmask`:
>   const RegMask* Matcher::predicate_reg_mask(void) {
>     return &_OPMASK_REG_mask;
>   }
>
>   const TypeVect* Matcher::predicate_reg_type(const Type* elemTy, int length) {
>     return new TypeVectMask(TypeInt::BOOL, length);
>   }
> 2. I assume that they are used only for vectors in compiled code and in other cases they don't need to be saved/restored at safepoints.
> 3. In C2 shared code names are mess too: `VectorM, VMASK, RegVMask, TypeVectMask, VectorMaskCmp`. I suggest to use `VectorMask, VECTMASK, RegVectMask`.

Hi @vnkozlov , @iwanowww , your comments have been resolved, kindly share the results of tier4/tier5 testing.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2768

From xgong at openjdk.java.net Fri Apr 2 03:26:16 2021
From: xgong at openjdk.java.net (Xiaohong Gong)
Date: Fri, 2 Apr 2021 03:26:16 GMT
Subject: RFR: 8264109: Add vectorized implementation for VectorMask.andNot() [v2]
In-Reply-To:
References:
Message-ID:

On Thu, 1 Apr 2021 15:13:31 GMT, Paul Sandoz wrote:

>> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Move the changing to AbstractMask.andNot and revert changes in VectorMask
>>
>>   Change-Id: Ie3ec8f53cb24526c8e1ccc68038355d024910818
>>   CustomizedGitHooks: yes
>
> Looks good.

Thanks for your review @PaulSandoz !

-------------

PR: https://git.openjdk.java.net/jdk/pull/3211

From jbhateja at openjdk.java.net Fri Apr 2 03:29:30 2021
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Fri, 2 Apr 2021 03:29:30 GMT
Subject: RFR: 8262355: Support for AVX-512 opmask register allocation. [v17]
In-Reply-To:
References: <3NqvqAfKOiHvDo7gvwLvi5_U_9Rz8DFBijVVf1wpXWk=.90d51fb9-c6d0-45be-89b7-60851c7a6681@github.com>
Message-ID:

On Fri, 2 Apr 2021 03:20:41 GMT, Jatin Bhateja wrote:

>> Few notes.
>> 1. In `sharedRuntime_*.cpp` you have name `opmask_state_*` and `.ad` files also `opmask` and `opmask_reg_k*'. But in C2 code you call them `vectmask` which is confusing. I prefer to use the same name everywhere - `vectmask`:
>>   const RegMask* Matcher::predicate_reg_mask(void) {
>>     return &_OPMASK_REG_mask;
>>   }
>>
>>   const TypeVect* Matcher::predicate_reg_type(const Type* elemTy, int length) {
>>     return new TypeVectMask(TypeInt::BOOL, length);
>>   }
>> 2. I assume that they are used only for vectors in compiled code and in other cases they don't need to be saved/restored at safepoints.
>> 3. In C2 shared code names are mess too: `VectorM, VMASK, RegVMask, TypeVectMask, VectorMaskCmp`. I suggest to use `VectorMask, VECTMASK, RegVectMask`.
>
> Hi @vnkozlov , @iwanowww , your comments have been resolved, kindly share the results of tier4/tier5 testing.

> Few notes.
>
> 1. In `sharedRuntime_*.cpp` you have name `opmask_state_*` and `.ad` files also `opmask` and `opmask_reg_k*'. But in C2 code you call them `vectmask` which is confusing. I prefer to use the same name everywhere - `vectmask`:
>
> ```
> const RegMask* Matcher::predicate_reg_mask(void) {
>   return &_OPMASK_REG_mask;
> }
>
> const TypeVect* Matcher::predicate_reg_type(const Type* elemTy, int length) {
>   return new TypeVectMask(TypeInt::BOOL, length);
> }
> ```
>
> 1. I assume that they are used only for vectors in compiled code and in other cases they don't need to be saved/restored at safepoints.

There are mask manipulation APIs exposed by jdk.incubator.vector.VectorMask class. This base patch is using new mask operands (used to propagate mask values through opmask register) for VectorMaskGen node used during small copy partial in-lining optimization.

> 2. In C2 shared code names are mess too: `VectorM, VMASK, RegVMask, TypeVectMask, VectorMaskCmp`. I suggest to use `VectorMask, VECTMASK, RegVectMask`.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2768

From xgong at openjdk.java.net Fri Apr 2 03:47:21 2021
From: xgong at openjdk.java.net (Xiaohong Gong)
Date: Fri, 2 Apr 2021 03:47:21 GMT
Subject: RFR: 8262355: Support for AVX-512 opmask register allocation. [v17]
In-Reply-To:
References: <3NqvqAfKOiHvDo7gvwLvi5_U_9Rz8DFBijVVf1wpXWk=.90d51fb9-c6d0-45be-89b7-60851c7a6681@github.com>
Message-ID:

On Fri, 2 Apr 2021 03:25:55 GMT, Jatin Bhateja wrote:

>> Hi @vnkozlov , @iwanowww , your comments have been resolved, kindly share the results of tier4/tier5 testing.
>
>> Few notes.
>>
>> 1. In `sharedRuntime_*.cpp` you have name `opmask_state_*` and `.ad` files also `opmask` and `opmask_reg_k*'. But in C2 code you call them `vectmask` which is confusing.
I prefer to use the same name everywhere - `vectmask`: >> >> ``` >> const RegMask* Matcher::predicate_reg_mask(void) { >> return &_OPMASK_REG_mask; >> } >> >> const TypeVect* Matcher::predicate_reg_type(const Type* elemTy, int length) { >> return new TypeVectMask(TypeInt::BOOL, length); >> } >> ``` >> >> 2. I assume that they are used only for vectors in compiled code and in other cases they don't need to be saved/restored at safepoints. > There are mask manipulation APIs exposed by the jdk.incubator.vector.VectorMask class. This base patch uses the new mask operands (used to propagate mask values through opmask registers) for the VectorMaskGen node used during the small copy partial in-lining optimization. > >> 3. In C2 shared code the names are a mess too: `VectorM, VMASK, RegVMask, TypeVectMask, VectorMaskCmp`. I suggest using `VectorMask, VECTMASK, RegVectMask`. Hi @jatin-bhateja , could you please also rename `RegVMask` to `RegVectMask` in `src/hotspot/cpu/aarch64/aarch64.ad` ? The current name causes the following failure when building the JDK image on aarch64: Ideal node missing: RegVMask assert fails /home/xiagon01/code/jdk/src/hotspot/share/adlc/archDesc.cpp 484: Failed lookup of ideal node Thanks, Xiaohong Gong ------------- PR: https://git.openjdk.java.net/jdk/pull/2768 From dongbo at openjdk.java.net Fri Apr 2 07:08:33 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Fri, 2 Apr 2021 07:08:33 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic In-Reply-To: References: <_ZrhnM9OyXLckhtT27laLzWPZbCFZTPjm6ePbZdbyOs=.fcc6aaba-1578-443a-aa57-8141a99231f6@github.com> Message-ID: On Tue, 30 Mar 2021 03:24:16 GMT, Dong Bo wrote: >>> I think I can rewrite this part as loops. >>> With an initial implementation, we can have almost half of the code size reduced (1312B -> 748B). Sounds OK to you? >> >> Sounds great, but I'm still somewhat concerned that the non-SIMD case only offers 3-12% performance gain.
Make it just 748 bytes, and therefore not icache-hostile, then perhaps the balance of risk and reward is justified. > >> > With an initial implementation, we can have almost half of the code size reduced (1312B -> 748B). Sounds OK to you? >> >> Sounds great, but I'm still somewhat concerned that the non-SIMD case only offers 3-12% performance gain. Make it just 748 bytes, and therefore not icache-hostile, then perhaps the balance of risk and reward is justified. > > Hi, @theRealAph @nick-arm > > The code is updated. The error handling in the SIMD case was rewritten as loops. > > Also combined the two non-SIMD code blocks into one. > Since we have only one non-SIMD loop now, it is moved into `generate_base64_decodeBlock`. > The size of the stub is 692 bytes; the non-SIMD loop takes about 92 bytes if my calculation is right. > > Verified with tests `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java`. > Compared with the previous implementation, the performance changes are negligible. > > Other comments are addressed too. Thanks. PING... Any suggestions on the updated commit? ------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From aph at openjdk.java.net Fri Apr 2 08:41:34 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 2 Apr 2021 08:41:34 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v3] In-Reply-To: <96N9BNz7s4JH99-5lQio5uEP8iAa4YEmn9NK-dUyvCQ=.c663f383-f207-45c3-97bf-b5309b624315@github.com> References: <96N9BNz7s4JH99-5lQio5uEP8iAa4YEmn9NK-dUyvCQ=.c663f383-f207-45c3-97bf-b5309b624315@github.com> Message-ID: On Fri, 2 Apr 2021 03:10:57 GMT, Dong Bo wrote: >> In JDK-8248188, IntrinsicCandidate and API is added for Base64 decoding. >> Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3, a basic idea can be found at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords. >> >> Patch passed jtreg tier1-3 tests with linux-aarch64-server-fastdebug build.
>> Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` runned specially for the correctness of the implementation. >> >> There can be illegal characters at the start of the input if the data is MIME encoded. >> It would be no benefits to use SIMD for this case, so the stub use no-simd instructions for MIME encoded data now. >> >> A JMH micro, Base64Decode.java, is added for performance test. >> With different input length (upper-bounded by parameter `maxNumBytes` in the JMH micro), >> we witness ~2.5x improvements with long inputs and no regression with short inputs for raw base64 decodeing, minor improvements (~10.95%) for MIME on Kunpeng916. >> >> The Base64Decode.java JMH micro-benchmark results: >> >> Benchmark (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> >> # Kunpeng916, intrinsic >> Base64Decode.testBase64Decode 4 1 avgt 5 48.614 ? 0.609 ns/op >> Base64Decode.testBase64Decode 4 3 avgt 5 58.199 ? 1.650 ns/op >> Base64Decode.testBase64Decode 4 7 avgt 5 69.400 ? 0.931 ns/op >> Base64Decode.testBase64Decode 4 32 avgt 5 96.818 ? 1.687 ns/op >> Base64Decode.testBase64Decode 4 64 avgt 5 122.856 ? 9.217 ns/op >> Base64Decode.testBase64Decode 4 80 avgt 5 130.935 ? 1.667 ns/op >> Base64Decode.testBase64Decode 4 96 avgt 5 143.627 ? 1.751 ns/op >> Base64Decode.testBase64Decode 4 112 avgt 5 152.311 ? 1.178 ns/op >> Base64Decode.testBase64Decode 4 512 avgt 5 342.631 ? 0.584 ns/op >> Base64Decode.testBase64Decode 4 1000 avgt 5 573.635 ? 1.050 ns/op >> Base64Decode.testBase64Decode 4 20000 avgt 5 9534.136 ? 45.172 ns/op >> Base64Decode.testBase64Decode 4 50000 avgt 5 22718.726 ? 192.070 ns/op >> Base64Decode.testBase64MIMEDecode 4 1 avgt 10 63.558 ? 0.336 ns/op >> Base64Decode.testBase64MIMEDecode 4 3 avgt 10 82.504 ? 0.848 ns/op >> Base64Decode.testBase64MIMEDecode 4 7 avgt 10 120.591 ? 0.608 ns/op >> Base64Decode.testBase64MIMEDecode 4 32 avgt 10 324.314 ? 6.236 ns/op >> Base64Decode.testBase64MIMEDecode 4 64 avgt 10 532.678 ? 
4.670 ns/op >> Base64Decode.testBase64MIMEDecode 4 80 avgt 10 678.126 ? 4.324 ns/op >> Base64Decode.testBase64MIMEDecode 4 96 avgt 10 771.603 ? 6.393 ns/op >> Base64Decode.testBase64MIMEDecode 4 112 avgt 10 889.608 ? 0.759 ns/op >> Base64Decode.testBase64MIMEDecode 4 512 avgt 10 3663.557 ? 3.422 ns/op >> Base64Decode.testBase64MIMEDecode 4 1000 avgt 10 7017.784 ? 9.128 ns/op >> Base64Decode.testBase64MIMEDecode 4 20000 avgt 10 128670.660 ? 7951.521 ns/op >> Base64Decode.testBase64MIMEDecode 4 50000 avgt 10 317113.667 ? 161.758 ns/op >> >> # Kunpeng916, default >> Base64Decode.testBase64Decode 4 1 avgt 5 48.455 ? 0.571 ns/op >> Base64Decode.testBase64Decode 4 3 avgt 5 57.937 ? 0.505 ns/op >> Base64Decode.testBase64Decode 4 7 avgt 5 73.823 ? 1.452 ns/op >> Base64Decode.testBase64Decode 4 32 avgt 5 106.484 ? 1.243 ns/op >> Base64Decode.testBase64Decode 4 64 avgt 5 141.004 ? 1.188 ns/op >> Base64Decode.testBase64Decode 4 80 avgt 5 156.284 ? 0.572 ns/op >> Base64Decode.testBase64Decode 4 96 avgt 5 174.137 ? 0.177 ns/op >> Base64Decode.testBase64Decode 4 112 avgt 5 188.445 ? 0.572 ns/op >> Base64Decode.testBase64Decode 4 512 avgt 5 610.847 ? 1.559 ns/op >> Base64Decode.testBase64Decode 4 1000 avgt 5 1155.368 ? 0.813 ns/op >> Base64Decode.testBase64Decode 4 20000 avgt 5 19751.477 ? 24.669 ns/op >> Base64Decode.testBase64Decode 4 50000 avgt 5 50046.586 ? 523.155 ns/op >> Base64Decode.testBase64MIMEDecode 4 1 avgt 10 64.130 ? 0.238 ns/op >> Base64Decode.testBase64MIMEDecode 4 3 avgt 10 82.096 ? 0.205 ns/op >> Base64Decode.testBase64MIMEDecode 4 7 avgt 10 118.849 ? 0.610 ns/op >> Base64Decode.testBase64MIMEDecode 4 32 avgt 10 331.177 ? 4.732 ns/op >> Base64Decode.testBase64MIMEDecode 4 64 avgt 10 549.117 ? 0.177 ns/op >> Base64Decode.testBase64MIMEDecode 4 80 avgt 10 702.951 ? 4.572 ns/op >> Base64Decode.testBase64MIMEDecode 4 96 avgt 10 799.566 ? 0.301 ns/op >> Base64Decode.testBase64MIMEDecode 4 112 avgt 10 923.749 ? 
0.389 ns/op >> Base64Decode.testBase64MIMEDecode 4 512 avgt 10 4000.725 ? 2.519 ns/op >> Base64Decode.testBase64MIMEDecode 4 1000 avgt 10 7674.994 ? 9.281 ns/op >> Base64Decode.testBase64MIMEDecode 4 20000 avgt 10 142059.001 ? 157.920 ns/op >> Base64Decode.testBase64MIMEDecode 4 50000 avgt 10 355698.369 ? 216.542 ns/op > > Dong Bo has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: > > - Merge branch 'master' into aarch64.base64.decode > - copyright > - trivial fixes > - Handling error in SIMD case with loops, combining two non-SIMD cases into one code blob, addressing other comments > - Merge branch 'master' into aarch64.base64.decode > - 8256245: AArch64: Implement Base64 decoding intrinsic test/micro/org/openjdk/bench/java/util/Base64Decode.java line 85: > 83: } > 84: } > 85: Are there any existing test cases for failing inputs? ------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From aph at openjdk.java.net Fri Apr 2 08:49:29 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 2 Apr 2021 08:49:29 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v3] In-Reply-To: <96N9BNz7s4JH99-5lQio5uEP8iAa4YEmn9NK-dUyvCQ=.c663f383-f207-45c3-97bf-b5309b624315@github.com> References: <96N9BNz7s4JH99-5lQio5uEP8iAa4YEmn9NK-dUyvCQ=.c663f383-f207-45c3-97bf-b5309b624315@github.com> Message-ID: <8yiJSbnHVrZTDGmKi7oQygWvCR8Gyvv4iSLdEAAez7I=.fa47a77d-54f1-4b60-94af-ad1aefb40470@github.com> On Fri, 2 Apr 2021 03:10:57 GMT, Dong Bo wrote: >> In JDK-8248188, IntrinsicCandidate and API is added for Base64 decoding. >> Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3, a basic idea can be found at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords. >> >> Patch passed jtreg tier1-3 tests with linux-aarch64-server-fastdebug build. 
>> Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` runned specially for the correctness of the implementation. >> >> There can be illegal characters at the start of the input if the data is MIME encoded. >> It would be no benefits to use SIMD for this case, so the stub use no-simd instructions for MIME encoded data now. >> >> A JMH micro, Base64Decode.java, is added for performance test. >> With different input length (upper-bounded by parameter `maxNumBytes` in the JMH micro), >> we witness ~2.5x improvements with long inputs and no regression with short inputs for raw base64 decodeing, minor improvements (~10.95%) for MIME on Kunpeng916. >> >> The Base64Decode.java JMH micro-benchmark results: >> >> Benchmark (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> >> # Kunpeng916, intrinsic >> Base64Decode.testBase64Decode 4 1 avgt 5 48.614 ? 0.609 ns/op >> Base64Decode.testBase64Decode 4 3 avgt 5 58.199 ? 1.650 ns/op >> Base64Decode.testBase64Decode 4 7 avgt 5 69.400 ? 0.931 ns/op >> Base64Decode.testBase64Decode 4 32 avgt 5 96.818 ? 1.687 ns/op >> Base64Decode.testBase64Decode 4 64 avgt 5 122.856 ? 9.217 ns/op >> Base64Decode.testBase64Decode 4 80 avgt 5 130.935 ? 1.667 ns/op >> Base64Decode.testBase64Decode 4 96 avgt 5 143.627 ? 1.751 ns/op >> Base64Decode.testBase64Decode 4 112 avgt 5 152.311 ? 1.178 ns/op >> Base64Decode.testBase64Decode 4 512 avgt 5 342.631 ? 0.584 ns/op >> Base64Decode.testBase64Decode 4 1000 avgt 5 573.635 ? 1.050 ns/op >> Base64Decode.testBase64Decode 4 20000 avgt 5 9534.136 ? 45.172 ns/op >> Base64Decode.testBase64Decode 4 50000 avgt 5 22718.726 ? 192.070 ns/op >> Base64Decode.testBase64MIMEDecode 4 1 avgt 10 63.558 ? 0.336 ns/op >> Base64Decode.testBase64MIMEDecode 4 3 avgt 10 82.504 ? 0.848 ns/op >> Base64Decode.testBase64MIMEDecode 4 7 avgt 10 120.591 ? 0.608 ns/op >> Base64Decode.testBase64MIMEDecode 4 32 avgt 10 324.314 ? 6.236 ns/op >> Base64Decode.testBase64MIMEDecode 4 64 avgt 10 532.678 ? 
4.670 ns/op >> Base64Decode.testBase64MIMEDecode 4 80 avgt 10 678.126 ? 4.324 ns/op >> Base64Decode.testBase64MIMEDecode 4 96 avgt 10 771.603 ? 6.393 ns/op >> Base64Decode.testBase64MIMEDecode 4 112 avgt 10 889.608 ? 0.759 ns/op >> Base64Decode.testBase64MIMEDecode 4 512 avgt 10 3663.557 ? 3.422 ns/op >> Base64Decode.testBase64MIMEDecode 4 1000 avgt 10 7017.784 ? 9.128 ns/op >> Base64Decode.testBase64MIMEDecode 4 20000 avgt 10 128670.660 ? 7951.521 ns/op >> Base64Decode.testBase64MIMEDecode 4 50000 avgt 10 317113.667 ? 161.758 ns/op >> >> # Kunpeng916, default >> Base64Decode.testBase64Decode 4 1 avgt 5 48.455 ? 0.571 ns/op >> Base64Decode.testBase64Decode 4 3 avgt 5 57.937 ? 0.505 ns/op >> Base64Decode.testBase64Decode 4 7 avgt 5 73.823 ? 1.452 ns/op >> Base64Decode.testBase64Decode 4 32 avgt 5 106.484 ? 1.243 ns/op >> Base64Decode.testBase64Decode 4 64 avgt 5 141.004 ? 1.188 ns/op >> Base64Decode.testBase64Decode 4 80 avgt 5 156.284 ? 0.572 ns/op >> Base64Decode.testBase64Decode 4 96 avgt 5 174.137 ? 0.177 ns/op >> Base64Decode.testBase64Decode 4 112 avgt 5 188.445 ? 0.572 ns/op >> Base64Decode.testBase64Decode 4 512 avgt 5 610.847 ? 1.559 ns/op >> Base64Decode.testBase64Decode 4 1000 avgt 5 1155.368 ? 0.813 ns/op >> Base64Decode.testBase64Decode 4 20000 avgt 5 19751.477 ? 24.669 ns/op >> Base64Decode.testBase64Decode 4 50000 avgt 5 50046.586 ? 523.155 ns/op >> Base64Decode.testBase64MIMEDecode 4 1 avgt 10 64.130 ? 0.238 ns/op >> Base64Decode.testBase64MIMEDecode 4 3 avgt 10 82.096 ? 0.205 ns/op >> Base64Decode.testBase64MIMEDecode 4 7 avgt 10 118.849 ? 0.610 ns/op >> Base64Decode.testBase64MIMEDecode 4 32 avgt 10 331.177 ? 4.732 ns/op >> Base64Decode.testBase64MIMEDecode 4 64 avgt 10 549.117 ? 0.177 ns/op >> Base64Decode.testBase64MIMEDecode 4 80 avgt 10 702.951 ? 4.572 ns/op >> Base64Decode.testBase64MIMEDecode 4 96 avgt 10 799.566 ? 0.301 ns/op >> Base64Decode.testBase64MIMEDecode 4 112 avgt 10 923.749 ? 
0.389 ns/op >> Base64Decode.testBase64MIMEDecode 4 512 avgt 10 4000.725 ? 2.519 ns/op >> Base64Decode.testBase64MIMEDecode 4 1000 avgt 10 7674.994 ? 9.281 ns/op >> Base64Decode.testBase64MIMEDecode 4 20000 avgt 10 142059.001 ? 157.920 ns/op >> Base64Decode.testBase64MIMEDecode 4 50000 avgt 10 355698.369 ? 216.542 ns/op > > Dong Bo has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: > > - Merge branch 'master' into aarch64.base64.decode > - copyright > - trivial fixes > - Handling error in SIMD case with loops, combining two non-SIMD cases into one code blob, addressing other comments > - Merge branch 'master' into aarch64.base64.decode > - 8256245: AArch64: Implement Base64 decoding intrinsic src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5811: > 5809: __ ldrb(r12, __ post(src, 1)); > 5810: __ ldrb(r13, __ post(src, 1)); > 5811: // get the de-code Four loads and four post increments rather than one load and a few BFMs? Why? ------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From aph at openjdk.java.net Fri Apr 2 08:49:24 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 2 Apr 2021 08:49:24 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic In-Reply-To: <_ZrhnM9OyXLckhtT27laLzWPZbCFZTPjm6ePbZdbyOs=.fcc6aaba-1578-443a-aa57-8141a99231f6@github.com> References: <_ZrhnM9OyXLckhtT27laLzWPZbCFZTPjm6ePbZdbyOs=.fcc6aaba-1578-443a-aa57-8141a99231f6@github.com> Message-ID: On Mon, 29 Mar 2021 03:28:54 GMT, Dong Bo wrote: > > Please consider losing the non-SIMD case. It doesn't result in any significant gain. > > The non-SIMD case is useful for MIME decoding performance. Your test results suggest that it isn't useful for that, surely?
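For reference, the size of the MIME gain under discussion can be worked out from three rows of the Kunpeng916 tables quoted in this thread (maxNumBytes 112, 1000 and 50000). The following is only an illustrative sketch: the class and method names are made up here and are not part of the patch or the JMH micro.

```java
public class MimeGainSketch {
    // Percentage by which the intrinsic lowers ns/op relative to the default build.
    static double reductionPercent(double defaultNs, double intrinsicNs) {
        return (defaultNs - intrinsicNs) / defaultNs * 100.0;
    }

    public static void main(String[] args) {
        // {maxNumBytes, default ns/op, intrinsic ns/op}, copied from the
        // testBase64MIMEDecode rows of the quoted benchmark tables.
        double[][] rows = {
            {112, 923.749, 889.608},
            {1000, 7674.994, 7017.784},
            {50000, 355698.369, 317113.667},
        };
        for (double[] r : rows) {
            System.out.println("maxNumBytes=" + (int) r[0] + " reduction%="
                    + reductionPercent(r[1], r[2]));
        }
    }
}
```

All three values fall inside the 3-12% range mentioned earlier in the thread.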
------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From jbhateja at openjdk.java.net Fri Apr 2 09:12:54 2021 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Fri, 2 Apr 2021 09:12:54 GMT Subject: RFR: 8262355: Support for AVX-512 opmask register allocation. [v19] In-Reply-To: <3NqvqAfKOiHvDo7gvwLvi5_U_9Rz8DFBijVVf1wpXWk=.90d51fb9-c6d0-45be-89b7-60851c7a6681@github.com> References: <3NqvqAfKOiHvDo7gvwLvi5_U_9Rz8DFBijVVf1wpXWk=.90d51fb9-c6d0-45be-89b7-60851c7a6681@github.com> Message-ID: <3Y2wKFJbWyH2VosIFD2UGusR-pCls2S0Y19w4lG2zRk=.79873d5b-0821-4eeb-a231-622c4c1a6367@github.com> > AVX-512 added 8 new 64 bit opmask registers[1] . These registers allow conditional execution and efficient merging of destination operands. At present cross instruction mask propagation is being done either using a GPR (e.g. vmask_gen patterns in x86.ad) or a vector register (for propagating results of a vector comparison or vector load mask operations). > > This base patch extends the register allocator to support allocation of opmask registers. This will facilitate mask propagation across instructions and thus enable emitting efficient instruction sequences on X86 targets supporting the AVX-512 feature. > > We intend to build a robust optimization framework[2] based on this patch to emit optimized instruction sequences for masked/predicated vector operations on X86 targets supporting AVX-512. > > Please review and share your feedback. > > Summary of changes: > > 1) AD side changes: New register definitions, register classes, allocation classes, operand definitions and spill code handling for opmask registers. > > 2) Runtime: Save/restore for opmask registers in 32 and 64 bit JVM. > a) For 64 bit JVM we were already reserving the space in the frame layout but were not previously saving and restoring at the designated offset (1088), hence there is no extra space overhead apart from the save/restore cost.
> b) For 32 bit JVM: An additional 64 bytes are allocated apart from the FXSTORE area, along the lines of the storage for ZMM(16-31) and the YMM-Hi bank. There are a few regressions due to the extra space allocation, which we are investigating. > > 3) Replacing all the hard-coded opmask references in macro-assembly routines: Pulling the opmask occurrences all the way up to the instruction pattern and adding an unbounded opmask operand for them. This exposes these operands to the RA and scheduler, which automatically facilitates spilling of live opmask registers across call sites. > > 4) Register class initializations related to Op_RegVMask during matcher startup. > > 5) Handling for mask generating nodes: Currently the VectorMaskGen node uses a GPR to propagate the mask from the mask generating DEF instruction to its USER instructions. There are other mask generating nodes, like VectorCmpMask and VectorLoadMask, which are not handled as part of this patch. Conditionally overriding two routines, ideal_reg and bottom_type, for mask generating IDEAL nodes and modifying the instruction patterns to have new opmask operands enables the instruction selector to associate the opmask register class with USE/DEF operands for such MachNodes. This constrains the allocation set for these operands to opmask registers (K1-K7). > > 6) Creation of a new concrete type TypeVectMask for mask generation nodes and a convenience routine Type::makemask, which creates a regular vector type (TypeVect[SDXYZ]) for non-AVX-512 targets and TypeVectMask for AVX-512 targets.
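As a rough scalar model of the "conditional execution and efficient merging of destination operands" that an opmask provides (an illustration only, not code from this patch): lanes whose mask bit is set receive the new result, and the other lanes keep the destination's previous value. The class name, method name and the use of an `int` as the mask are assumptions made for this sketch.

```java
import java.util.Arrays;

public class OpmaskSketch {
    // Merge-masked add: lane i becomes a[i] + b[i] when bit i of kmask is set,
    // otherwise it keeps the old destination value dst[i].
    static int[] maskedAdd(int[] dst, int[] a, int[] b, int kmask) {
        int[] out = dst.clone();
        for (int i = 0; i < out.length; i++) {
            if (((kmask >>> i) & 1) != 0) {
                out[i] = a[i] + b[i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] dst = {9, 9, 9, 9};
        int[] a = {1, 2, 3, 4};
        int[] b = {10, 20, 30, 40};
        // Bits 0 and 2 are set, so only lanes 0 and 2 are updated.
        System.out.println(Arrays.toString(maskedAdd(dst, a, b, 0b0101)));
    }
}
```

In hardware the mask lives in one of the k registers rather than a general-purpose register, which is why the allocator needs a register class for it.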
> > > [1] : Section 15.1.3 : https://software.intel.com/content/www/us/en/develop/download/intel-64-and-ia-32-architectures-software-developers-manual-volume-1-basic-architecture.html > [2] : http://cr.openjdk.java.net/~jbhateja/avx512_masked_operation_optimization/AVX-512_RA_Opmask_Support_VectorMask_Optimizations.pdf Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: 8262355: Fix AARCH64 build issue ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2768/files - new: https://git.openjdk.java.net/jdk/pull/2768/files/5aa07306..366641ab Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2768&range=18 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2768&range=17-18 Stats: 32 lines in 3 files changed: 11 ins; 2 del; 19 mod Patch: https://git.openjdk.java.net/jdk/pull/2768.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2768/head:pull/2768 PR: https://git.openjdk.java.net/jdk/pull/2768 From njian at openjdk.java.net Fri Apr 2 09:35:29 2021 From: njian at openjdk.java.net (Ningsheng Jian) Date: Fri, 2 Apr 2021 09:35:29 GMT Subject: Integrated: 8264409: AArch64: generate better code for Vector API allTrue In-Reply-To: References: Message-ID: On Thu, 1 Apr 2021 07:58:07 GMT, Ningsheng Jian wrote: > In Vector API NEON implementation, we use a vector register to represent vector mask, where an element value of -1 is a true mask and an element value of 0 is a false mask. The allTrue() api is used to check whether all the elements of current mask are set. > > Currently, the AArch64 NEON allTrue implementation looks like: > > andr $tmp, T16B $src1, $src2\t# src2 is maskAllTrue > notr $tmp, T16B, $tmp > addv $tmp, T16B, $tmp > umov $dst, $tmp, B, 0 > cmp $dst, 0 > cset $dst > > where $src2 is a preset all true (-1) constant. 
We can optimize it to the code sequence like below, to check whether all bits are set: > > uminv $tmp, T16B, $src1 > umov $dst, $tmp, B, 0 > cmp $dst, 0xff > cset $dst > > With this codegen improvement, we can see about 8%~70% performance uplift on different machines for Alibaba's Vector API bigdata benchmarks [1][2]. > > Tested with tier1 and vector api jtreg tests. > > [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/bigdata/BooleanArrayCheck.java#L61 > [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/bigdata/ValueRangeCheckAndCastL2I.java#L93 This pull request has now been integrated. Changeset: 0935eaa4 Author: Ningsheng Jian URL: https://git.openjdk.java.net/jdk/commit/0935eaa4 Stats: 409 lines in 5 files changed: 13 ins; 12 del; 384 mod 8264409: AArch64: generate better code for Vector API allTrue Reviewed-by: adinn, pli ------------- PR: https://git.openjdk.java.net/jdk/pull/3302 From njian at openjdk.java.net Fri Apr 2 10:04:17 2021 From: njian at openjdk.java.net (Ningsheng Jian) Date: Fri, 2 Apr 2021 10:04:17 GMT Subject: RFR: 8264109: Add vectorized implementation for VectorMask.andNot() [v2] In-Reply-To: References: Message-ID: On Thu, 1 Apr 2021 03:39:43 GMT, Xiaohong Gong wrote: >> Currently "VectorMask.andNot()" is not vectorized: >> public VectorMask andNot(VectorMask m) { >> // FIXME: Generate good code here. >> return bOp(m, (i, a, b) -> a && !b); >> } >> This can be implemented with` "and(m.not())" `directly. Since `"VectorMask.and()/not()" `have been vectorized in hotspot, `"andNot"` >> can be vectorized as well by calling them. 
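The equivalence this change relies on, that the lane-wise `a && !b` equals `and(m.not())`, can be checked with a plain scalar model. The sketch below uses `boolean[]` arrays in place of `VectorMask` so it runs without the incubator module; the class and helper names are hypothetical and are not the `jdk.incubator.vector` implementation.

```java
import java.util.Arrays;

public class AndNotSketch {
    // Lane-wise logical AND of two masks of equal length.
    static boolean[] and(boolean[] a, boolean[] b) {
        boolean[] r = new boolean[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = a[i] && b[i];
        }
        return r;
    }

    // Lane-wise logical NOT.
    static boolean[] not(boolean[] a) {
        boolean[] r = new boolean[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = !a[i];
        }
        return r;
    }

    // The original per-lane formulation: a && !b.
    static boolean[] andNotLanewise(boolean[] a, boolean[] b) {
        boolean[] r = new boolean[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = a[i] && !b[i];
        }
        return r;
    }

    public static void main(String[] args) {
        boolean[] a = {true, true, false, false};
        boolean[] b = {true, false, true, false};
        // Both formulations cover all four input combinations here.
        System.out.println(Arrays.equals(andNotLanewise(a, b), and(a, not(b))));
    }
}
```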
>> >> The performance gain is >100% for such a simple JMH: >> @Benchmark >> public Object andNot(Blackhole bh) { >> boolean[] mask = fm.apply(SPECIES.length()); >> boolean[] r = fmt.apply(SPECIES.length()); >> VectorMask rm = VectorMask.fromArray(SPECIES, r, 0); >> >> for (int ic = 0; ic < INVOC_COUNT; ic++) { >> for (int i = 0; i < mask.length; i += SPECIES.length()) { >> VectorMask vmask = VectorMask.fromArray(SPECIES, mask, i); >> rm = rm.andNot(vmask); >> } >> } >> return rm; >> } > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Move the changing to AbstractMask.andNot and revert changes in VectorMask > > Change-Id: Ie3ec8f53cb24526c8e1ccc68038355d024910818 > CustomizedGitHooks: yes LGTM ------------- Marked as reviewed by njian (Committer). PR: https://git.openjdk.java.net/jdk/pull/3211 From xgong at openjdk.java.net Fri Apr 2 10:04:17 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Fri, 2 Apr 2021 10:04:17 GMT Subject: RFR: 8264109: Add vectorized implementation for VectorMask.andNot() [v2] In-Reply-To: References: Message-ID: <7bsAUC0QJ7brBAXu7bTwWuHV4mV4KKCnuPNH8ymY94k=.e545103d-b5f8-46af-92bf-4a236c75d6b6@github.com> On Fri, 2 Apr 2021 09:59:31 GMT, Ningsheng Jian wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: >> >> Move the changing to AbstractMask.andNot and revert changes in VectorMask >> >> Change-Id: Ie3ec8f53cb24526c8e1ccc68038355d024910818 >> CustomizedGitHooks: yes > > LGTM Thanks for the review @nsjian ! 
------------- PR: https://git.openjdk.java.net/jdk/pull/3211 From xgong at openjdk.java.net Fri Apr 2 10:04:18 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Fri, 2 Apr 2021 10:04:18 GMT Subject: Integrated: 8264109: Add vectorized implementation for VectorMask.andNot() In-Reply-To: References: Message-ID: On Fri, 26 Mar 2021 01:50:33 GMT, Xiaohong Gong wrote: > Currently "VectorMask.andNot()" is not vectorized: > public VectorMask andNot(VectorMask m) { > // FIXME: Generate good code here. > return bOp(m, (i, a, b) -> a && !b); > } > This can be implemented with` "and(m.not())" `directly. Since `"VectorMask.and()/not()" `have been vectorized in hotspot, `"andNot"` > can be vectorized as well by calling them. > > The performance gain is >100% for such a simple JMH: > @Benchmark > public Object andNot(Blackhole bh) { > boolean[] mask = fm.apply(SPECIES.length()); > boolean[] r = fmt.apply(SPECIES.length()); > VectorMask rm = VectorMask.fromArray(SPECIES, r, 0); > > for (int ic = 0; ic < INVOC_COUNT; ic++) { > for (int i = 0; i < mask.length; i += SPECIES.length()) { > VectorMask vmask = VectorMask.fromArray(SPECIES, mask, i); > rm = rm.andNot(vmask); > } > } > return rm; > } This pull request has now been integrated. 
Changeset: 7d0a0bad Author: Xiaohong Gong Committer: Ningsheng Jian URL: https://git.openjdk.java.net/jdk/commit/7d0a0bad Stats: 3 lines in 1 file changed: 0 ins; 1 del; 2 mod 8264109: Add vectorized implementation for VectorMask.andNot() Reviewed-by: psandoz, njian ------------- PR: https://git.openjdk.java.net/jdk/pull/3211 From aph at openjdk.java.net Fri Apr 2 10:20:25 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 2 Apr 2021 10:20:25 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v3] In-Reply-To: <96N9BNz7s4JH99-5lQio5uEP8iAa4YEmn9NK-dUyvCQ=.c663f383-f207-45c3-97bf-b5309b624315@github.com> References: <96N9BNz7s4JH99-5lQio5uEP8iAa4YEmn9NK-dUyvCQ=.c663f383-f207-45c3-97bf-b5309b624315@github.com> Message-ID: On Fri, 2 Apr 2021 03:10:57 GMT, Dong Bo wrote: >> In JDK-8248188, IntrinsicCandidate and API is added for Base64 decoding. >> Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3, a basic idea can be found at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords. >> >> Patch passed jtreg tier1-3 tests with linux-aarch64-server-fastdebug build. >> Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` runned specially for the correctness of the implementation. >> >> There can be illegal characters at the start of the input if the data is MIME encoded. >> It would be no benefits to use SIMD for this case, so the stub use no-simd instructions for MIME encoded data now. >> >> A JMH micro, Base64Decode.java, is added for performance test. >> With different input length (upper-bounded by parameter `maxNumBytes` in the JMH micro), >> we witness ~2.5x improvements with long inputs and no regression with short inputs for raw base64 decodeing, minor improvements (~10.95%) for MIME on Kunpeng916. 
>> >> The Base64Decode.java JMH micro-benchmark results: >> >> Benchmark (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> >> # Kunpeng916, intrinsic >> Base64Decode.testBase64Decode 4 1 avgt 5 48.614 ? 0.609 ns/op >> Base64Decode.testBase64Decode 4 3 avgt 5 58.199 ? 1.650 ns/op >> Base64Decode.testBase64Decode 4 7 avgt 5 69.400 ? 0.931 ns/op >> Base64Decode.testBase64Decode 4 32 avgt 5 96.818 ? 1.687 ns/op >> Base64Decode.testBase64Decode 4 64 avgt 5 122.856 ? 9.217 ns/op >> Base64Decode.testBase64Decode 4 80 avgt 5 130.935 ? 1.667 ns/op >> Base64Decode.testBase64Decode 4 96 avgt 5 143.627 ? 1.751 ns/op >> Base64Decode.testBase64Decode 4 112 avgt 5 152.311 ? 1.178 ns/op >> Base64Decode.testBase64Decode 4 512 avgt 5 342.631 ? 0.584 ns/op >> Base64Decode.testBase64Decode 4 1000 avgt 5 573.635 ? 1.050 ns/op >> Base64Decode.testBase64Decode 4 20000 avgt 5 9534.136 ? 45.172 ns/op >> Base64Decode.testBase64Decode 4 50000 avgt 5 22718.726 ? 192.070 ns/op >> Base64Decode.testBase64MIMEDecode 4 1 avgt 10 63.558 ? 0.336 ns/op >> Base64Decode.testBase64MIMEDecode 4 3 avgt 10 82.504 ? 0.848 ns/op >> Base64Decode.testBase64MIMEDecode 4 7 avgt 10 120.591 ? 0.608 ns/op >> Base64Decode.testBase64MIMEDecode 4 32 avgt 10 324.314 ? 6.236 ns/op >> Base64Decode.testBase64MIMEDecode 4 64 avgt 10 532.678 ? 4.670 ns/op >> Base64Decode.testBase64MIMEDecode 4 80 avgt 10 678.126 ? 4.324 ns/op >> Base64Decode.testBase64MIMEDecode 4 96 avgt 10 771.603 ? 6.393 ns/op >> Base64Decode.testBase64MIMEDecode 4 112 avgt 10 889.608 ? 0.759 ns/op >> Base64Decode.testBase64MIMEDecode 4 512 avgt 10 3663.557 ? 3.422 ns/op >> Base64Decode.testBase64MIMEDecode 4 1000 avgt 10 7017.784 ? 9.128 ns/op >> Base64Decode.testBase64MIMEDecode 4 20000 avgt 10 128670.660 ? 7951.521 ns/op >> Base64Decode.testBase64MIMEDecode 4 50000 avgt 10 317113.667 ? 161.758 ns/op >> >> # Kunpeng916, default >> Base64Decode.testBase64Decode 4 1 avgt 5 48.455 ? 0.571 ns/op >> Base64Decode.testBase64Decode 4 3 avgt 5 57.937 ? 
0.505 ns/op >> Base64Decode.testBase64Decode 4 7 avgt 5 73.823 ? 1.452 ns/op >> Base64Decode.testBase64Decode 4 32 avgt 5 106.484 ? 1.243 ns/op >> Base64Decode.testBase64Decode 4 64 avgt 5 141.004 ? 1.188 ns/op >> Base64Decode.testBase64Decode 4 80 avgt 5 156.284 ? 0.572 ns/op >> Base64Decode.testBase64Decode 4 96 avgt 5 174.137 ? 0.177 ns/op >> Base64Decode.testBase64Decode 4 112 avgt 5 188.445 ? 0.572 ns/op >> Base64Decode.testBase64Decode 4 512 avgt 5 610.847 ? 1.559 ns/op >> Base64Decode.testBase64Decode 4 1000 avgt 5 1155.368 ? 0.813 ns/op >> Base64Decode.testBase64Decode 4 20000 avgt 5 19751.477 ? 24.669 ns/op >> Base64Decode.testBase64Decode 4 50000 avgt 5 50046.586 ? 523.155 ns/op >> Base64Decode.testBase64MIMEDecode 4 1 avgt 10 64.130 ? 0.238 ns/op >> Base64Decode.testBase64MIMEDecode 4 3 avgt 10 82.096 ? 0.205 ns/op >> Base64Decode.testBase64MIMEDecode 4 7 avgt 10 118.849 ? 0.610 ns/op >> Base64Decode.testBase64MIMEDecode 4 32 avgt 10 331.177 ? 4.732 ns/op >> Base64Decode.testBase64MIMEDecode 4 64 avgt 10 549.117 ? 0.177 ns/op >> Base64Decode.testBase64MIMEDecode 4 80 avgt 10 702.951 ? 4.572 ns/op >> Base64Decode.testBase64MIMEDecode 4 96 avgt 10 799.566 ? 0.301 ns/op >> Base64Decode.testBase64MIMEDecode 4 112 avgt 10 923.749 ? 0.389 ns/op >> Base64Decode.testBase64MIMEDecode 4 512 avgt 10 4000.725 ? 2.519 ns/op >> Base64Decode.testBase64MIMEDecode 4 1000 avgt 10 7674.994 ? 9.281 ns/op >> Base64Decode.testBase64MIMEDecode 4 20000 avgt 10 142059.001 ? 157.920 ns/op >> Base64Decode.testBase64MIMEDecode 4 50000 avgt 10 355698.369 ? 216.542 ns/op > > Dong Bo has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: > > - Merge branch 'master' into aarch64.base64.decode > - copyright > - trivial fixes > - Handling error in SIMD case with loops, combining two non-SIMD cases into one code blob, addressing other comments > - Merge branch 'master' into aarch64.base64.decode > - 8256245: AArch64: Implement Base64 decoding intrinsic src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5802: > 5800: // The 1st character of the input can be illegal if the data is MIME encoded. > 5801: // We cannot benefits from SIMD for this case. The max line size of MIME > 5802: // encoding is 76, with the PreProcess80B blob, we actually use no-simd "cannot benefit" ------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From aph at openjdk.java.net Fri Apr 2 10:20:20 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 2 Apr 2021 10:20:20 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic In-Reply-To: References: <_ZrhnM9OyXLckhtT27laLzWPZbCFZTPjm6ePbZdbyOs=.fcc6aaba-1578-443a-aa57-8141a99231f6@github.com> Message-ID: On Fri, 2 Apr 2021 07:05:26 GMT, Dong Bo wrote: > PING... Any suggestions on the updated commit? Once you reply to the comments, sure. ------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From jbhateja at openjdk.java.net Fri Apr 2 13:16:53 2021 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Fri, 2 Apr 2021 13:16:53 GMT Subject: RFR: 8262355: Support for AVX-512 opmask register allocation. [v20] In-Reply-To: <3NqvqAfKOiHvDo7gvwLvi5_U_9Rz8DFBijVVf1wpXWk=.90d51fb9-c6d0-45be-89b7-60851c7a6681@github.com> References: <3NqvqAfKOiHvDo7gvwLvi5_U_9Rz8DFBijVVf1wpXWk=.90d51fb9-c6d0-45be-89b7-60851c7a6681@github.com> Message-ID: > AVX-512 added 8 new 64 bit opmask registers[1] . These registers allow conditional execution and efficient merging of destination operands. At present cross instruction mask propagation is being done either using a GPR (e.g. 
vmask_gen patterns in x86.ad) or a vector register (for propagating results of a vector comparison or vector load mask operations). > > This base patch extends the register allocator to support allocation of opmask registers. This will facilitate mask propagation across instructions and thus enable emitting efficient instruction sequence over X86 targets supporting AVX-512 feature. > > We intend to build a robust optimization framework[2] based on this patch to emit optimized instruction sequence for masked/predicated vector operation for X86 targets supporting AVX-512. > > Please review and share your feedback. > > Summary of changes: > > 1) AD side changes: New register definitions, register classes, allocation classes, operand definitions and spill code handling for opmask registers. > > 2) Runtime: Save/restoration for opmask registers in 32 and 64 bit JVM. > a) For 64 bit JVM we were anyways reserving the space in the frame layout but earlier were not saving and restoring at designated offset(1088), hence no extra space overhead apart from save/restore cost. > b) For 32 bit JVM: Additional 64 byte are allocated apart from FXSTORE area on the lines of storage for ZMM(16-31) and YMM-Hi bank. There are few regressions due to extra space allocation which we are investigating. > > 3) Replacing all the hard-coded opmask references from macro-assembly routines: Pulling out the opmask occurrences all the way up to instruction pattern and adding an unbounded opmask operand for them. This exposes these operands to RA and scheduler; this will automatically facilitate spilling of live opmask registers across call sites. > > 4) Register class initializations related to Op_RegVMask during matcher startup. > > 5) Handling for mask generating node: Currently VectorMaskGen node uses a GPR to propagate mask across mask generating DEF instruction to its USER instructions. 
There are other mask generating nodes like VectorCmpMask, VectorLoadMask which are not handled as the part of this patch. Conditional overriding of two routines, ideal_reg and bottom_type for mask generating IDEAL nodes and modifying the instruction patterns to have new opmask operands enables instruction selector to associate opmask register class with USE/DEF operands for such MachNodes. This will constrain the allocation set for these operands to opmask registers(K1-K7). > > 6) Creation of a new concrete type TypeVectMask for mask generation nodes and a convenience routine Type::makemask which creates regular vector types (TypeVect[SDXYZ]) for non-AVX-512 targets and TypeVectMask for AVX-512 targets. > > > [1] : Section 15.1.3 : https://software.intel.com/content/www/us/en/develop/download/intel-64-and-ia-32-architectures-software-developers-manual-volume-1-basic-architecture.html > [2] : http://cr.openjdk.java.net/~jbhateja/avx512_masked_operation_optimization/AVX-512_RA_Opmask_Support_VectorMask_Optimizations.pdf Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 23 commits: - 8262355: Rebasing patch, 32bit clean-up. - Merge http://github.com/openjdk/jdk into JDK-8262355 - 8262355: Fix AARCH64 build issue - 8262355: Review comments resolutions. - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8262355 - 8262355: Updating copywriter for edited files. - 8262355: Adding missed safety check. - 8262355: Review comments resolution. - 8262355: Extending Type::isa_vect and Type::is_vect routines to TypeVectMask since its a valid vector type. - 8262355: Review comments resolution - ... 
and 13 more: https://git.openjdk.java.net/jdk/compare/7d0a0bad...b9810d20 ------------- Changes: https://git.openjdk.java.net/jdk/pull/2768/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2768&range=19 Stats: 1588 lines in 41 files changed: 1294 ins; 19 del; 275 mod Patch: https://git.openjdk.java.net/jdk/pull/2768.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2768/head:pull/2768 PR: https://git.openjdk.java.net/jdk/pull/2768 From vlivanov at openjdk.java.net Fri Apr 2 16:12:30 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Fri, 2 Apr 2021 16:12:30 GMT Subject: RFR: 8264548: Dependencies: ClassHierarchyWalker::is_witness() cleanups [v2] In-Reply-To: References: Message-ID: On Thu, 1 Apr 2021 17:23:57 GMT, Vladimir Kozlov wrote: >> Good. > >> I created it as a dependent PR. >> >> I think that's the way it works ([openjdk/skara#1087](https://github.com/openjdk/skara/pull/1087) which was used as an example in [1] looks similar). >> >> [1] https://mail.openjdk.java.net/pipermail/jdk-dev/2021-March/005232.html > > As I understand when you are creating PR which based on an other PR you have to change `base:` from `master` to corresponding `pr/3293` so that only difference of current PR are seen. Thanks, Vladimir. ------------- PR: https://git.openjdk.java.net/jdk/pull/3297 From vlivanov at openjdk.java.net Fri Apr 2 16:12:31 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Fri, 2 Apr 2021 16:12:31 GMT Subject: Integrated: 8264548: Dependencies: ClassHierarchyWalker::is_witness() cleanups In-Reply-To: References: Message-ID: On Wed, 31 Mar 2021 21:47:53 GMT, Vladimir Ivanov wrote: > Simplify `ClassHierarchyWalker::is_witness()`. > > Testing: hs-tier1 - hs-tier4. This pull request has now been integrated. 
Changeset: f60e81bf Author: Vladimir Ivanov URL: https://git.openjdk.java.net/jdk/commit/f60e81bf Stats: 35 lines in 1 file changed: 16 ins; 6 del; 13 mod 8264548: Dependencies: ClassHierarchyWalker::is_witness() cleanups Reviewed-by: kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/3297 From dcubed at openjdk.java.net Fri Apr 2 20:48:33 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 2 Apr 2021 20:48:33 GMT Subject: RFR: 8264662: ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java on win-x64 with ZGC Message-ID: A trivial fix to ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java on win-x64 with ZGC ------------- Commit messages: - 8264662: ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java on win-x64 with ZGC Changes: https://git.openjdk.java.net/jdk/pull/3331/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3331&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264662 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3331.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3331/head:pull/3331 PR: https://git.openjdk.java.net/jdk/pull/3331 From hseigel at openjdk.java.net Fri Apr 2 20:52:17 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Fri, 2 Apr 2021 20:52:17 GMT Subject: RFR: 8264662: ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java on win-x64 with ZGC In-Reply-To: References: Message-ID: On Fri, 2 Apr 2021 20:41:07 GMT, Daniel D. Daugherty wrote: > A trivial fix to ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java > on win-x64 with ZGC Looks good and trivial. Thanks, Harold ------------- Marked as reviewed by hseigel (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3331 From dcubed at openjdk.java.net Fri Apr 2 21:35:20 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 2 Apr 2021 21:35:20 GMT Subject: RFR: 8264662: ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java on win-x64 with ZGC In-Reply-To: References: Message-ID: On Fri, 2 Apr 2021 20:49:35 GMT, Harold Seigel wrote: >> A trivial fix to ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java >> on win-x64 with ZGC > > Looks good and trivial. > Thanks, Harold @hseigel - Thanks for the review! ------------- PR: https://git.openjdk.java.net/jdk/pull/3331 From dcubed at openjdk.java.net Fri Apr 2 21:35:21 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 2 Apr 2021 21:35:21 GMT Subject: Integrated: 8264662: ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java on win-x64 with ZGC In-Reply-To: References: Message-ID: On Fri, 2 Apr 2021 20:41:07 GMT, Daniel D. Daugherty wrote: > A trivial fix to ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java > on win-x64 with ZGC This pull request has now been integrated. Changeset: 9c283da1 Author: Daniel D. Daugherty URL: https://git.openjdk.java.net/jdk/commit/9c283da1 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod 8264662: ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java on win-x64 with ZGC Reviewed-by: hseigel ------------- PR: https://git.openjdk.java.net/jdk/pull/3331 From iveresov at openjdk.java.net Sat Apr 3 22:48:21 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Sat, 3 Apr 2021 22:48:21 GMT Subject: RFR: 8264626: C1 should be able to inline excluded methods In-Reply-To: References: Message-ID: On Thu, 1 Apr 2021 20:32:53 GMT, Nils Eliasson wrote: > I noticed a behavioral difference between c1 and c2. 
In c2 excluded methods can still be inlined, which is the desired behaviour. Inlining is controlled separately. I propose a small change to c1 inlining that makes it work in the same way. Looks good. ------------- Marked as reviewed by iveresov (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3315 From kvn at openjdk.java.net Sun Apr 4 03:33:54 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Sun, 4 Apr 2021 03:33:54 GMT Subject: RFR: 8262355: Support for AVX-512 opmask register allocation. [v20] In-Reply-To: References: <3NqvqAfKOiHvDo7gvwLvi5_U_9Rz8DFBijVVf1wpXWk=.90d51fb9-c6d0-45be-89b7-60851c7a6681@github.com> Message-ID: On Fri, 2 Apr 2021 13:16:53 GMT, Jatin Bhateja wrote: >> AVX-512 added 8 new 64 bit opmask registers[1] . These registers allow conditional execution and efficient merging of destination operands. At present cross instruction mask propagation is being done either using a GPR (e.g. vmask_gen patterns in x86.ad) or a vector register (for propagating results of a vector comparison or vector load mask operations). >> >> This base patch extends the register allocator to support allocation of opmask registers. This will facilitate mask propagation across instructions and thus enable emitting efficient instruction sequence over X86 targets supporting AVX-512 feature. >> >> We intend to build a robust optimization framework[2] based on this patch to emit optimized instruction sequence for masked/predicated vector operation for X86 targets supporting AVX-512. >> >> Please review and share your feedback. >> >> Summary of changes: >> >> 1) AD side changes: New register definitions, register classes, allocation classes, operand definitions and spill code handling for opmask registers. >> >> 2) Runtime: Save/restoration for opmask registers in 32 and 64 bit JVM. 
>> a) For 64 bit JVM we were anyways reserving the space in the frame layout but earlier were not saving and restoring at designated offset(1088), hence no extra space overhead apart from save/restore cost. >> b) For 32 bit JVM: Additional 64 byte are allocated apart from FXSTORE area on the lines of storage for ZMM(16-31) and YMM-Hi bank. There are few regressions due to extra space allocation which we are investigating. >> >> 3) Replacing all the hard-coded opmask references from macro-assembly routines: Pulling out the opmask occurrences all the way up to instruction pattern and adding an unbounded opmask operand for them. This exposes these operands to RA and scheduler; this will automatically facilitate spilling of live opmask registers across call sites. >> >> 4) Register class initializations related to Op_RegVMask during matcher startup. >> >> 5) Handling for mask generating node: Currently VectorMaskGen node uses a GPR to propagate mask across mask generating DEF instruction to its USER instructions. There are other mask generating nodes like VectorCmpMask, VectorLoadMask which are not handled as the part of this patch. Conditional overriding of two routines, ideal_reg and bottom_type for mask generating IDEAL nodes and modifying the instruction patterns to have new opmask operands enables instruction selector to associate opmask register class with USE/DEF operands for such MachNodes. This will constrain the allocation set for these operands to opmask registers(K1-K7). >> >> 6) Creation of a new concrete type TypeVectMask for mask generation nodes and a convenience routine Type::makemask which creates regular vector types (TypeVect[SDXYZ]) for non-AVX-512 targets and TypeVectMask for AVX-512 targets. 
>> >> >> [1] : Section 15.1.3 : https://software.intel.com/content/www/us/en/develop/download/intel-64-and-ia-32-architectures-software-developers-manual-volume-1-basic-architecture.html >> [2] : http://cr.openjdk.java.net/~jbhateja/avx512_masked_operation_optimization/AVX-512_RA_Opmask_Support_VectorMask_Optimizations.pdf > > Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 23 commits: > > - 8262355: Rebasing patch, 32bit clean-up. > - Merge http://github.com/openjdk/jdk into JDK-8262355 > - 8262355: Fix AARCH64 build issue > - 8262355: Review comments resolutions. > - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8262355 > - 8262355: Updating copywriter for edited files. > - 8262355: Adding missed safety check. > - 8262355: Review comments resolution. > - 8262355: Extending Type::isa_vect and Type::is_vect routines to TypeVectMask since its a valid vector type. > - 8262355: Review comments resolution > - ... and 13 more: https://git.openjdk.java.net/jdk/compare/7d0a0bad...b9810d20 In general looks better. But it seems you added new instructions into .ad files for ClearArray and String compare which was not in original changes and should not be. ------------- Changes requested by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2768 From jbhateja at openjdk.java.net Sun Apr 4 10:50:39 2021 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Sun, 4 Apr 2021 10:50:39 GMT Subject: RFR: 8262355: Support for AVX-512 opmask register allocation. [v20] In-Reply-To: References: <3NqvqAfKOiHvDo7gvwLvi5_U_9Rz8DFBijVVf1wpXWk=.90d51fb9-c6d0-45be-89b7-60851c7a6681@github.com> Message-ID: On Sun, 4 Apr 2021 03:30:58 GMT, Vladimir Kozlov wrote: > In general looks better. > But it seems you added new instructions into .ad files for ClearArray and String compare which was not in original changes and should not be. 
Hi @vnkozlov, Thanks for reviewing the patch. The new instruction patterns hold a temporary opmask operand needed only for the AVX512 flavour of the instruction. Earlier, the same pattern was being used for the non-AVX512 instruction, which could have posed issues while accessing the encoding for these temporaries in emit routines using as_KRegister() (as pointed out by @iwanowww), which internally asserts is_KRegister(), which has a UseAVX > 2 check. We can look at opportunities to reduce redundancies around these new instruction patterns (if any) in subsequent patches. ------------- PR: https://git.openjdk.java.net/jdk/pull/2768 From kvn at openjdk.java.net Sun Apr 4 17:26:04 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Sun, 4 Apr 2021 17:26:04 GMT Subject: RFR: 8262355: Support for AVX-512 opmask register allocation. [v20] In-Reply-To: References: <3NqvqAfKOiHvDo7gvwLvi5_U_9Rz8DFBijVVf1wpXWk=.90d51fb9-c6d0-45be-89b7-60851c7a6681@github.com> Message-ID: On Fri, 2 Apr 2021 13:16:53 GMT, Jatin Bhateja wrote: >> AVX-512 added 8 new 64 bit opmask registers[1] . These registers allow conditional execution and efficient merging of destination operands. At present cross instruction mask propagation is being done either using a GPR (e.g. vmask_gen patterns in x86.ad) or a vector register (for propagating results of a vector comparison or vector load mask operations). >> >> This base patch extends the register allocator to support allocation of opmask registers. This will facilitate mask propagation across instructions and thus enable emitting efficient instruction sequence over X86 targets supporting AVX-512 feature. >> >> We intend to build a robust optimization framework[2] based on this patch to emit optimized instruction sequence for masked/predicated vector operation for X86 targets supporting AVX-512. >> >> Please review and share your feedback. 
>> >> Summary of changes: >> >> 1) AD side changes: New register definitions, register classes, allocation classes, operand definitions and spill code handling for opmask registers. >> >> 2) Runtime: Save/restoration for opmask registers in 32 and 64 bit JVM. >> a) For 64 bit JVM we were anyways reserving the space in the frame layout but earlier were not saving and restoring at designated offset(1088), hence no extra space overhead apart from save/restore cost. >> b) For 32 bit JVM: Additional 64 byte are allocated apart from FXSTORE area on the lines of storage for ZMM(16-31) and YMM-Hi bank. There are few regressions due to extra space allocation which we are investigating. >> >> 3) Replacing all the hard-coded opmask references from macro-assembly routines: Pulling out the opmask occurrences all the way up to instruction pattern and adding an unbounded opmask operand for them. This exposes these operands to RA and scheduler; this will automatically facilitate spilling of live opmask registers across call sites. >> >> 4) Register class initializations related to Op_RegVMask during matcher startup. >> >> 5) Handling for mask generating node: Currently VectorMaskGen node uses a GPR to propagate mask across mask generating DEF instruction to its USER instructions. There are other mask generating nodes like VectorCmpMask, VectorLoadMask which are not handled as the part of this patch. Conditional overriding of two routines, ideal_reg and bottom_type for mask generating IDEAL nodes and modifying the instruction patterns to have new opmask operands enables instruction selector to associate opmask register class with USE/DEF operands for such MachNodes. This will constrain the allocation set for these operands to opmask registers(K1-K7). 
>> >> 6) Creation of a new concrete type TypeVectMask for mask generation nodes and a convenience routine Type::makemask which creates regular vector types (TypeVect[SDXYZ]) for non-AVX-512 targets and TypeVectMask for AVX-512 targets. >> >> >> [1] : Section 15.1.3 : https://software.intel.com/content/www/us/en/develop/download/intel-64-and-ia-32-architectures-software-developers-manual-volume-1-basic-architecture.html >> [2] : http://cr.openjdk.java.net/~jbhateja/avx512_masked_operation_optimization/AVX-512_RA_Opmask_Support_VectorMask_Optimizations.pdf > > Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 23 commits: > > - 8262355: Rebasing patch, 32bit clean-up. > - Merge http://github.com/openjdk/jdk into JDK-8262355 > - 8262355: Fix AARCH64 build issue > - 8262355: Review comments resolutions. > - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8262355 > - 8262355: Updating copywriter for edited files. > - 8262355: Adding missed safety check. > - 8262355: Review comments resolution. > - 8262355: Extending Type::isa_vect and Type::is_vect routines to TypeVectMask since its a valid vector type. > - 8262355: Review comments resolution > - ... and 13 more: https://git.openjdk.java.net/jdk/compare/7d0a0bad...b9810d20 Marked as reviewed by kvn (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/2768 From kvn at openjdk.java.net Sun Apr 4 17:26:05 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Sun, 4 Apr 2021 17:26:05 GMT Subject: RFR: 8262355: Support for AVX-512 opmask register allocation. [v20] In-Reply-To: References: <3NqvqAfKOiHvDo7gvwLvi5_U_9Rz8DFBijVVf1wpXWk=.90d51fb9-c6d0-45be-89b7-60851c7a6681@github.com> Message-ID: On Sun, 4 Apr 2021 10:47:02 GMT, Jatin Bhateja wrote: >> In general looks better. 
>> But it seems you added new instructions into .ad files for ClearArray and String compare which was not in original changes and should not be. > >> In general looks better. >> But it seems you added new instructions into .ad files for ClearArray and String compare which was not in original changes and should not be. > > Hi @vnkozlov, > > Thanks for reviewing the patch. The new instruction patterns hold a temporary opmask operand needed only for the AVX512 flavour of the instruction. > Earlier, the same pattern was being used for the non-AVX512 instruction, which could have posed issues while accessing the encoding for these temporaries in emit routines using as_KRegister() (as pointed out by @iwanowww), which internally asserts is_KRegister(), which has a UseAVX > 2 check. > We can look at opportunities to reduce redundancies around these new instruction patterns (if any) in subsequent patches. Got it. ------------- PR: https://git.openjdk.java.net/jdk/pull/2768 From jbhateja at openjdk.java.net Sun Apr 4 17:52:17 2021 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Sun, 4 Apr 2021 17:52:17 GMT Subject: Integrated: 8262355: Support for AVX-512 opmask register allocation. In-Reply-To: <3NqvqAfKOiHvDo7gvwLvi5_U_9Rz8DFBijVVf1wpXWk=.90d51fb9-c6d0-45be-89b7-60851c7a6681@github.com> References: <3NqvqAfKOiHvDo7gvwLvi5_U_9Rz8DFBijVVf1wpXWk=.90d51fb9-c6d0-45be-89b7-60851c7a6681@github.com> Message-ID: On Sun, 28 Feb 2021 18:35:11 GMT, Jatin Bhateja wrote: > AVX-512 added 8 new 64 bit opmask registers[1] . These registers allow conditional execution and efficient merging of destination operands. At present cross instruction mask propagation is being done either using a GPR (e.g. vmask_gen patterns in x86.ad) or a vector register (for propagating results of a vector comparison or vector load mask operations). > > This base patch extends the register allocator to support allocation of opmask registers. 
This will facilitate mask propagation across instructions and thus enable emitting efficient instruction sequence over X86 targets supporting AVX-512 feature. > > We intend to build a robust optimization framework[2] based on this patch to emit optimized instruction sequence for masked/predicated vector operation for X86 targets supporting AVX-512. > > Please review and share your feedback. > > Summary of changes: > > 1) AD side changes: New register definitions, register classes, allocation classes, operand definitions and spill code handling for opmask registers. > > 2) Runtime: Save/restoration for opmask registers in 32 and 64 bit JVM. > a) For 64 bit JVM we were anyways reserving the space in the frame layout but earlier were not saving and restoring at designated offset(1088), hence no extra space overhead apart from save/restore cost. > b) For 32 bit JVM: Additional 64 byte are allocated apart from FXSTORE area on the lines of storage for ZMM(16-31) and YMM-Hi bank. There are few regressions due to extra space allocation which we are investigating. > > 3) Replacing all the hard-coded opmask references from macro-assembly routines: Pulling out the opmask occurrences all the way up to instruction pattern and adding an unbounded opmask operand for them. This exposes these operands to RA and scheduler; this will automatically facilitate spilling of live opmask registers across call sites. > > 4) Register class initializations related to Op_RegVMask during matcher startup. > > 5) Handling for mask generating node: Currently VectorMaskGen node uses a GPR to propagate mask across mask generating DEF instruction to its USER instructions. There are other mask generating nodes like VectorCmpMask, VectorLoadMask which are not handled as the part of this patch. 
Conditional overriding of two routines, ideal_reg and bottom_type for mask generating IDEAL nodes and modifying the instruction patterns to have new opmask operands enables instruction selector to associate opmask register class with USE/DEF operands for such MachNodes. This will constrain the allocation set for these operands to opmask registers(K1-K7). > > 6) Creation of a new concrete type TypeVectMask for mask generation nodes and a convenience routine Type::makemask which creates regular vector types (TypeVect[SDXYZ]) for non-AVX-512 targets and TypeVectMask for AVX-512 targets. > > > [1] : Section 15.1.3 : https://software.intel.com/content/www/us/en/develop/download/intel-64-and-ia-32-architectures-software-developers-manual-volume-1-basic-architecture.html > [2] : http://cr.openjdk.java.net/~jbhateja/avx512_masked_operation_optimization/AVX-512_RA_Opmask_Support_VectorMask_Optimizations.pdf This pull request has now been integrated. Changeset: f084bd2f Author: Jatin Bhateja URL: https://git.openjdk.java.net/jdk/commit/f084bd2f Stats: 1588 lines in 41 files changed: 1294 ins; 19 del; 275 mod 8262355: Support for AVX-512 opmask register allocation. Reviewed-by: vlivanov, njian, kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/2768 From ogatak at openjdk.java.net Mon Apr 5 03:14:05 2021 From: ogatak at openjdk.java.net (Kazunori Ogata) Date: Mon, 5 Apr 2021 03:14:05 GMT Subject: RFR: 8259822: [PPC64] Support the prefixed instruction format added in POWER10 [v8] In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 10:50:45 GMT, Martin Doerr wrote: >> Kazunori Ogata has updated the pull request incrementally with one additional commit since the last revision: >> >> Clean up compute_padding() and fix grammatical errors > > Looks good to me, now. Is the latest version substantially tested? We need to rely on IBM's testing because nobody else has Power10. @TheRealMDoerr @CoreyAshford Thank you for your review. 
I think this is now ready to be integrated. @TheRealMDoerr Could you sponsor this pull request? ------------- PR: https://git.openjdk.java.net/jdk/pull/2095 From thartmann at openjdk.java.net Tue Apr 6 05:58:18 2021 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Tue, 6 Apr 2021 05:58:18 GMT Subject: RFR: 8264359: Compiler directives should enable DebugNonSafepoints when PrintAssembly is requested In-Reply-To: References: Message-ID: On Thu, 1 Apr 2021 20:57:04 GMT, Nils Eliasson wrote: > DebugNonSafepoints should be set when PrintAssembly is requested. This only happened for compile commands but not for compiler directives. This PR moves the check to compiler directives - that code path is used for both directives and commands. I am leaving the check and update in arguments.cpp - there might be some need for using flag PrintAssembly for stubs or wrappers, in a code path that doesn't use commands or directives. Looks good. ------------- Marked as reviewed by thartmann (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3316 From thartmann at openjdk.java.net Tue Apr 6 05:59:23 2021 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Tue, 6 Apr 2021 05:59:23 GMT Subject: RFR: 8264626: C1 should be able to inline excluded methods In-Reply-To: References: Message-ID: On Thu, 1 Apr 2021 20:32:53 GMT, Nils Eliasson wrote: > I noticed a behavioral difference between c1 and c2. In c2 excluded methods can still be inlined, which is the desired behaviour. Inlining is controlled separately. I propose a small change to c1 inlining that makes it work in the same way. Looks good to me. ------------- Marked as reviewed by thartmann (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3315 From xgong at openjdk.java.net Tue Apr 6 06:00:18 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Tue, 6 Apr 2021 06:00:18 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask In-Reply-To: References: Message-ID: <4LM1zt2vONG29YFtBYQ9nzi6UJqfQP_22vNULK3JbV0=.92a9e621-ca59-4ad0-8a84-3bb6719dfa0b@github.com> On Mon, 29 Mar 2021 06:00:46 GMT, Xiaohong Gong wrote: > The Vector API defines different element types for floating point VectorMask. For example, the bitwise related APIs use "`long/int`", while data related APIs use "`double/float`". As a result, some optimizations that are based on type checking do not work well. > > For example, the VectorBox/Unbox elimination like `"VectorUnbox (VectorBox v) ==> v"` requires the types of the output and > input to be equal. Normally this is necessary. However, due to the different element types for floating point VectorMask with the same species, the VectorBox/Unbox pattern is optimized to: > VectorLoadMask (VectorStoreMask vmask) > Actually the types can be treated as the same one for such cases. And considering the vector mask representation is the same for > vectors with the same element size and vector length, it's safe to do the optimization: > VectorLoadMask (VectorStoreMask vmask) ==> vmask > The same issue exists for GVN, which is based on the type of a node. Making the mask node's `hash()/cmp()` methods depend on the element size instead of the element type can fix it. Hi, could anyone please help to look at this PR? Thanks so much! 
------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From neliasso at openjdk.java.net Tue Apr 6 06:49:33 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Tue, 6 Apr 2021 06:49:33 GMT Subject: RFR: 8264626: C1 should be able to inline excluded methods In-Reply-To: References: Message-ID: On Tue, 6 Apr 2021 05:56:34 GMT, Tobias Hartmann wrote: >> I noticed a behavioral difference between c1 and c2. In c2 excluded methods can still be inlined, which is the desired behaviour. Inlining is controlled separately. I propose a small change to c1 inlining that makes it work in the same way. > > Looks good to me. Thanks for the reviews Igor and Tobias! ------------- PR: https://git.openjdk.java.net/jdk/pull/3315 From neliasso at openjdk.java.net Tue Apr 6 06:49:34 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Tue, 6 Apr 2021 06:49:34 GMT Subject: Integrated: 8264626: C1 should be able to inline excluded methods In-Reply-To: References: Message-ID: On Thu, 1 Apr 2021 20:32:53 GMT, Nils Eliasson wrote: > I noticed a behavioral difference between c1 and c2. In c2 excluded methods can still be inlined, which is the desired behaviour. Inlining is controlled separately. I propose a small change to c1 inlining that makes it work in the same way. This pull request has now been integrated. 
Changeset: ec7b0028 Author: Nils Eliasson URL: https://git.openjdk.java.net/jdk/commit/ec7b0028 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod 8264626: C1 should be able to inline excluded methods Reviewed-by: iveresov, thartmann ------------- PR: https://git.openjdk.java.net/jdk/pull/3315 From neliasso at openjdk.java.net Tue Apr 6 06:50:33 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Tue, 6 Apr 2021 06:50:33 GMT Subject: RFR: 8264359: Compiler directives should enable DebugNonSafepoints when PrintAssembly is requested In-Reply-To: References: Message-ID: <25Ge_E5uerbg2dfWiYKbYOoIgkNkjWNH9L6IxqxouIc=.9847bcfd-12d4-456e-9499-d6c66ae11e9b@github.com> On Tue, 6 Apr 2021 05:55:08 GMT, Tobias Hartmann wrote: >> DebugNonSafepoints should be set when PrintAssembly is requested. This only happened for compile commands but not for compiler directives. This PR moves the check to compiler directives - that code path is used for both directives and commands. I am leaving the check and update in arguments.cpp - there might be some need for using flag PrintAssembly for stubs or wrappers, in a code path that doesn't use commands or directives. > > Looks good. Thanks for the reviews Vladimir and Tobias! ------------- PR: https://git.openjdk.java.net/jdk/pull/3316 From neliasso at openjdk.java.net Tue Apr 6 06:50:34 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Tue, 6 Apr 2021 06:50:34 GMT Subject: Integrated: 8264359: Compiler directives should enable DebugNonSafepoints when PrintAssembly is requested In-Reply-To: References: Message-ID: On Thu, 1 Apr 2021 20:57:04 GMT, Nils Eliasson wrote: > DebugNonSafepoints should be set when PrintAssembly is requested. This only happened for compile commands but not for compiler directives. This PR moves the check to compiler directives - that code path is used for both directives and commands. 
I am leaving the check and update in arguments.cpp - there might be some need for using flag PrintAssembly for stubs or wrappers, in a code path that doesn't use commands or directives. This pull request has now been integrated. Changeset: 81325483 Author: Nils Eliasson URL: https://git.openjdk.java.net/jdk/commit/81325483 Stats: 8 lines in 2 files changed: 5 ins; 3 del; 0 mod 8264359: Compiler directives should enable DebugNonSafepoints when PrintAssembly is requested Reviewed-by: kvn, thartmann ------------- PR: https://git.openjdk.java.net/jdk/pull/3316 From dongbo at openjdk.java.net Tue Apr 6 06:58:03 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Tue, 6 Apr 2021 06:58:03 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v4] In-Reply-To: References: Message-ID: > In JDK-8248188, IntrinsicCandidate and API is added for Base64 decoding. > Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3, a basic idea can be found at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords. > > Patch passed jtreg tier1-3 tests with linux-aarch64-server-fastdebug build. > Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` runned specially for the correctness of the implementation. > > There can be illegal characters at the start of the input if the data is MIME encoded. > It would be no benefits to use SIMD for this case, so the stub use no-simd instructions for MIME encoded data now. > > A JMH micro, Base64Decode.java, is added for performance test. > With different input length (upper-bounded by parameter `maxNumBytes` in the JMH micro), > we witness ~2.5x improvements with long inputs and no regression with short inputs for raw base64 decodeing, minor improvements (~10.95%) for MIME on Kunpeng916. 
> > The Base64Decode.java JMH micro-benchmark results: > > Benchmark (lineSize) (maxNumBytes) Mode Cnt Score Error Units > > # Kunpeng916, intrinsic > Base64Decode.testBase64Decode 4 1 avgt 5 48.614 ? 0.609 ns/op > Base64Decode.testBase64Decode 4 3 avgt 5 58.199 ? 1.650 ns/op > Base64Decode.testBase64Decode 4 7 avgt 5 69.400 ? 0.931 ns/op > Base64Decode.testBase64Decode 4 32 avgt 5 96.818 ? 1.687 ns/op > Base64Decode.testBase64Decode 4 64 avgt 5 122.856 ? 9.217 ns/op > Base64Decode.testBase64Decode 4 80 avgt 5 130.935 ? 1.667 ns/op > Base64Decode.testBase64Decode 4 96 avgt 5 143.627 ? 1.751 ns/op > Base64Decode.testBase64Decode 4 112 avgt 5 152.311 ? 1.178 ns/op > Base64Decode.testBase64Decode 4 512 avgt 5 342.631 ? 0.584 ns/op > Base64Decode.testBase64Decode 4 1000 avgt 5 573.635 ? 1.050 ns/op > Base64Decode.testBase64Decode 4 20000 avgt 5 9534.136 ? 45.172 ns/op > Base64Decode.testBase64Decode 4 50000 avgt 5 22718.726 ? 192.070 ns/op > Base64Decode.testBase64MIMEDecode 4 1 avgt 10 63.558 ? 0.336 ns/op > Base64Decode.testBase64MIMEDecode 4 3 avgt 10 82.504 ? 0.848 ns/op > Base64Decode.testBase64MIMEDecode 4 7 avgt 10 120.591 ? 0.608 ns/op > Base64Decode.testBase64MIMEDecode 4 32 avgt 10 324.314 ? 6.236 ns/op > Base64Decode.testBase64MIMEDecode 4 64 avgt 10 532.678 ? 4.670 ns/op > Base64Decode.testBase64MIMEDecode 4 80 avgt 10 678.126 ? 4.324 ns/op > Base64Decode.testBase64MIMEDecode 4 96 avgt 10 771.603 ? 6.393 ns/op > Base64Decode.testBase64MIMEDecode 4 112 avgt 10 889.608 ? 0.759 ns/op > Base64Decode.testBase64MIMEDecode 4 512 avgt 10 3663.557 ? 3.422 ns/op > Base64Decode.testBase64MIMEDecode 4 1000 avgt 10 7017.784 ? 9.128 ns/op > Base64Decode.testBase64MIMEDecode 4 20000 avgt 10 128670.660 ? 7951.521 ns/op > Base64Decode.testBase64MIMEDecode 4 50000 avgt 10 317113.667 ? 161.758 ns/op > > # Kunpeng916, default > Base64Decode.testBase64Decode 4 1 avgt 5 48.455 ? 0.571 ns/op > Base64Decode.testBase64Decode 4 3 avgt 5 57.937 ? 
0.505 ns/op > Base64Decode.testBase64Decode 4 7 avgt 5 73.823 ? 1.452 ns/op > Base64Decode.testBase64Decode 4 32 avgt 5 106.484 ? 1.243 ns/op > Base64Decode.testBase64Decode 4 64 avgt 5 141.004 ? 1.188 ns/op > Base64Decode.testBase64Decode 4 80 avgt 5 156.284 ? 0.572 ns/op > Base64Decode.testBase64Decode 4 96 avgt 5 174.137 ? 0.177 ns/op > Base64Decode.testBase64Decode 4 112 avgt 5 188.445 ? 0.572 ns/op > Base64Decode.testBase64Decode 4 512 avgt 5 610.847 ? 1.559 ns/op > Base64Decode.testBase64Decode 4 1000 avgt 5 1155.368 ? 0.813 ns/op > Base64Decode.testBase64Decode 4 20000 avgt 5 19751.477 ? 24.669 ns/op > Base64Decode.testBase64Decode 4 50000 avgt 5 50046.586 ? 523.155 ns/op > Base64Decode.testBase64MIMEDecode 4 1 avgt 10 64.130 ? 0.238 ns/op > Base64Decode.testBase64MIMEDecode 4 3 avgt 10 82.096 ? 0.205 ns/op > Base64Decode.testBase64MIMEDecode 4 7 avgt 10 118.849 ? 0.610 ns/op > Base64Decode.testBase64MIMEDecode 4 32 avgt 10 331.177 ? 4.732 ns/op > Base64Decode.testBase64MIMEDecode 4 64 avgt 10 549.117 ? 0.177 ns/op > Base64Decode.testBase64MIMEDecode 4 80 avgt 10 702.951 ? 4.572 ns/op > Base64Decode.testBase64MIMEDecode 4 96 avgt 10 799.566 ? 0.301 ns/op > Base64Decode.testBase64MIMEDecode 4 112 avgt 10 923.749 ? 0.389 ns/op > Base64Decode.testBase64MIMEDecode 4 512 avgt 10 4000.725 ? 2.519 ns/op > Base64Decode.testBase64MIMEDecode 4 1000 avgt 10 7674.994 ? 9.281 ns/op > Base64Decode.testBase64MIMEDecode 4 20000 avgt 10 142059.001 ? 157.920 ns/op > Base64Decode.testBase64MIMEDecode 4 50000 avgt 10 355698.369 ? 
216.542 ns/op Dong Bo has updated the pull request incrementally with one additional commit since the last revision: load data with one ldrw, add JMH tests for error inputs ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3228/files - new: https://git.openjdk.java.net/jdk/pull/3228/files/16ebc471..54a75f05 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3228&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3228&range=02-03 Stats: 37 lines in 2 files changed: 30 ins; 0 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/3228.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3228/head:pull/3228 PR: https://git.openjdk.java.net/jdk/pull/3228 From dongbo at openjdk.java.net Tue Apr 6 07:25:57 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Tue, 6 Apr 2021 07:25:57 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v5] In-Reply-To: References: Message-ID: <0aGgD88Mxj7nICTvKhNtkEYlYlP7TUMG082EgaEHU04=.ba1d4468-5c74-4a06-b8c6-d66c7b0c394d@github.com> > In JDK-8248188, IntrinsicCandidate and API is added for Base64 decoding. > Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3, a basic idea can be found at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords. > > Patch passed jtreg tier1-3 tests with linux-aarch64-server-fastdebug build. > Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` runned specially for the correctness of the implementation. > > There can be illegal characters at the start of the input if the data is MIME encoded. > It would be no benefits to use SIMD for this case, so the stub use no-simd instructions for MIME encoded data now. > > A JMH micro, Base64Decode.java, is added for performance test. 
> With different input length (upper-bounded by parameter `maxNumBytes` in the JMH micro), > we witness ~2.5x improvements with long inputs and no regression with short inputs for raw base64 decodeing, minor improvements (~10.95%) for MIME on Kunpeng916. > > The Base64Decode.java JMH micro-benchmark results: > > Benchmark (lineSize) (maxNumBytes) Mode Cnt Score Error Units > > # Kunpeng916, intrinsic > Base64Decode.testBase64Decode 4 1 avgt 5 48.614 ? 0.609 ns/op > Base64Decode.testBase64Decode 4 3 avgt 5 58.199 ? 1.650 ns/op > Base64Decode.testBase64Decode 4 7 avgt 5 69.400 ? 0.931 ns/op > Base64Decode.testBase64Decode 4 32 avgt 5 96.818 ? 1.687 ns/op > Base64Decode.testBase64Decode 4 64 avgt 5 122.856 ? 9.217 ns/op > Base64Decode.testBase64Decode 4 80 avgt 5 130.935 ? 1.667 ns/op > Base64Decode.testBase64Decode 4 96 avgt 5 143.627 ? 1.751 ns/op > Base64Decode.testBase64Decode 4 112 avgt 5 152.311 ? 1.178 ns/op > Base64Decode.testBase64Decode 4 512 avgt 5 342.631 ? 0.584 ns/op > Base64Decode.testBase64Decode 4 1000 avgt 5 573.635 ? 1.050 ns/op > Base64Decode.testBase64Decode 4 20000 avgt 5 9534.136 ? 45.172 ns/op > Base64Decode.testBase64Decode 4 50000 avgt 5 22718.726 ? 192.070 ns/op > Base64Decode.testBase64MIMEDecode 4 1 avgt 10 63.558 ? 0.336 ns/op > Base64Decode.testBase64MIMEDecode 4 3 avgt 10 82.504 ? 0.848 ns/op > Base64Decode.testBase64MIMEDecode 4 7 avgt 10 120.591 ? 0.608 ns/op > Base64Decode.testBase64MIMEDecode 4 32 avgt 10 324.314 ? 6.236 ns/op > Base64Decode.testBase64MIMEDecode 4 64 avgt 10 532.678 ? 4.670 ns/op > Base64Decode.testBase64MIMEDecode 4 80 avgt 10 678.126 ? 4.324 ns/op > Base64Decode.testBase64MIMEDecode 4 96 avgt 10 771.603 ? 6.393 ns/op > Base64Decode.testBase64MIMEDecode 4 112 avgt 10 889.608 ? 0.759 ns/op > Base64Decode.testBase64MIMEDecode 4 512 avgt 10 3663.557 ? 3.422 ns/op > Base64Decode.testBase64MIMEDecode 4 1000 avgt 10 7017.784 ? 9.128 ns/op > Base64Decode.testBase64MIMEDecode 4 20000 avgt 10 128670.660 ? 
7951.521 ns/op > Base64Decode.testBase64MIMEDecode 4 50000 avgt 10 317113.667 ? 161.758 ns/op > > # Kunpeng916, default > Base64Decode.testBase64Decode 4 1 avgt 5 48.455 ? 0.571 ns/op > Base64Decode.testBase64Decode 4 3 avgt 5 57.937 ? 0.505 ns/op > Base64Decode.testBase64Decode 4 7 avgt 5 73.823 ? 1.452 ns/op > Base64Decode.testBase64Decode 4 32 avgt 5 106.484 ? 1.243 ns/op > Base64Decode.testBase64Decode 4 64 avgt 5 141.004 ? 1.188 ns/op > Base64Decode.testBase64Decode 4 80 avgt 5 156.284 ? 0.572 ns/op > Base64Decode.testBase64Decode 4 96 avgt 5 174.137 ? 0.177 ns/op > Base64Decode.testBase64Decode 4 112 avgt 5 188.445 ? 0.572 ns/op > Base64Decode.testBase64Decode 4 512 avgt 5 610.847 ? 1.559 ns/op > Base64Decode.testBase64Decode 4 1000 avgt 5 1155.368 ? 0.813 ns/op > Base64Decode.testBase64Decode 4 20000 avgt 5 19751.477 ? 24.669 ns/op > Base64Decode.testBase64Decode 4 50000 avgt 5 50046.586 ? 523.155 ns/op > Base64Decode.testBase64MIMEDecode 4 1 avgt 10 64.130 ? 0.238 ns/op > Base64Decode.testBase64MIMEDecode 4 3 avgt 10 82.096 ? 0.205 ns/op > Base64Decode.testBase64MIMEDecode 4 7 avgt 10 118.849 ? 0.610 ns/op > Base64Decode.testBase64MIMEDecode 4 32 avgt 10 331.177 ? 4.732 ns/op > Base64Decode.testBase64MIMEDecode 4 64 avgt 10 549.117 ? 0.177 ns/op > Base64Decode.testBase64MIMEDecode 4 80 avgt 10 702.951 ? 4.572 ns/op > Base64Decode.testBase64MIMEDecode 4 96 avgt 10 799.566 ? 0.301 ns/op > Base64Decode.testBase64MIMEDecode 4 112 avgt 10 923.749 ? 0.389 ns/op > Base64Decode.testBase64MIMEDecode 4 512 avgt 10 4000.725 ? 2.519 ns/op > Base64Decode.testBase64MIMEDecode 4 1000 avgt 10 7674.994 ? 9.281 ns/op > Base64Decode.testBase64MIMEDecode 4 20000 avgt 10 142059.001 ? 157.920 ns/op > Base64Decode.testBase64MIMEDecode 4 50000 avgt 10 355698.369 ? 216.542 ns/op Dong Bo has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 10 commits: - conflicts resolved - Merge branch 'master' of https://git.openjdk.java.net/jdk into aarch64.base64.decode - resovling conflicts - load data with one ldrw, add JMH tests for error inputs - Merge branch 'master' into aarch64.base64.decode - copyright - trivial fixes - Handling error in SIMD case with loops, combining two non-SIMD cases into one code blob, addressing other comments - Merge branch 'master' into aarch64.base64.decode - 8256245: AArch64: Implement Base64 decoding intrinsic ------------- Changes: https://git.openjdk.java.net/jdk/pull/3228/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3228&range=04 Stats: 438 lines in 3 files changed: 438 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3228.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3228/head:pull/3228 PR: https://git.openjdk.java.net/jdk/pull/3228 From rcastanedalo at openjdk.java.net Tue Apr 6 07:57:42 2021 From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano) Date: Tue, 6 Apr 2021 07:57:42 GMT Subject: RFR: 8263227: C2: inconsistent spilling due to dead nodes in exception block Message-ID: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com> This change eliminates dead multi-nodes created by call-catch cleanup after GCM. Eliminating all dead code created by call-catch cleanup avoids potential issues when splitting the live range of call result values, see the analysis in the [bug report](https://bugs.openjdk.java.net/browse/JDK-8263227) for details. This solution is the least invasive of the three alternatives proposed in the bug report (the other two are constraining global code motion and extending live-range splitting). The change also extends the control-flow graph verification pass to catch similar live-range splitting issues earlier (with `+VerifyRegisterAllocator`).
Tested on: - original bug reproducer - hs-tier1-5 (windows-x64, linux-x64, linux-aarch64, and macosx-x64) with `+VerifyRegisterAllocator` - hs-tier1-3 (windows-x64, linux-x64, linux-aarch64, and macosx-x64) with `+VerifyRegisterAllocator` and `+StressGCM` (5 repetitions) ------------- Commit messages: - Update comments in test case - Exclude MachMerge nodes from dominance assertion - Remove cloned multi-nodes in call-catch cleanup - Document assumptions in live-range splitting - Add basic definition dominance assertion - Add test case Changes: https://git.openjdk.java.net/jdk/pull/3303/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3303&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263227 Stats: 102 lines in 4 files changed: 95 ins; 0 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/3303.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3303/head:pull/3303 PR: https://git.openjdk.java.net/jdk/pull/3303 From dongbo at openjdk.java.net Tue Apr 6 08:04:12 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Tue, 6 Apr 2021 08:04:12 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic In-Reply-To: References: <_ZrhnM9OyXLckhtT27laLzWPZbCFZTPjm6ePbZdbyOs=.fcc6aaba-1578-443a-aa57-8141a99231f6@github.com> Message-ID: On Fri, 2 Apr 2021 10:17:57 GMT, Andrew Haley wrote: >> PING... Any suggestions on the updated commit? > >> PING... Any suggestions on the updated commit? > > Once you reply to the comments, sure. > > Are there any existing test cases for failing inputs? > I added one; the error character is injected at the parameterized index of the encoded data. There is no big difference when the error is injected at a small index; it seems most of the time is taken by exception handling. With the error at index 20000, we see the expected ~2x performance improvement.
The JMH tests:
### Kunpeng 916, intrinsic, tested with `-jar benchmarks.jar testBase64WithErrorInputsDecode -p errorIndex=3,64,144,208,272,1000,20000 -p maxNumBytes=1`
Base64Decode.testBase64WithErrorInputsDecode      3  4  1  avgt  10   3696.151 ± 202.783  ns/op
Base64Decode.testBase64WithErrorInputsDecode     64  4  1  avgt  10   3899.269 ± 178.289  ns/op
Base64Decode.testBase64WithErrorInputsDecode    144  4  1  avgt  10   3902.022 ± 163.611  ns/op
Base64Decode.testBase64WithErrorInputsDecode    208  4  1  avgt  10   3982.423 ± 256.638  ns/op
Base64Decode.testBase64WithErrorInputsDecode    272  4  1  avgt  10   3984.545 ± 144.282  ns/op
Base64Decode.testBase64WithErrorInputsDecode   1000  4  1  avgt  10   4532.959 ± 310.068  ns/op
Base64Decode.testBase64WithErrorInputsDecode  20000  4  1  avgt  10  17578.148 ± 631.600  ns/op
### Kunpeng 916, default, tested with `-XX:-UseBASE64Intrinsics -jar benchmarks.jar testBase64WithErrorInputsDecode -p errorIndex=3,64,144,208,272,1000,20000 -p maxNumBytes=1`
Base64Decode.testBase64WithErrorInputsDecode      3  4  1  avgt  10   3760.330 ± 261.672  ns/op
Base64Decode.testBase64WithErrorInputsDecode     64  4  1  avgt  10   3900.326 ± 121.632  ns/op
Base64Decode.testBase64WithErrorInputsDecode    144  4  1  avgt  10   4041.428 ± 174.435  ns/op
Base64Decode.testBase64WithErrorInputsDecode    208  4  1  avgt  10   4177.670 ± 214.433  ns/op
Base64Decode.testBase64WithErrorInputsDecode    272  4  1  avgt  10   4324.020 ± 106.826  ns/op
Base64Decode.testBase64WithErrorInputsDecode   1000  4  1  avgt  10   5476.469 ± 171.647  ns/op
Base64Decode.testBase64WithErrorInputsDecode  20000  4  1  avgt  10  34163.743 ± 162.263  ns/op
>
> Your test results suggest that it isn't useful for that, surely?
>
The results suggest non-SIMD code provides a ~11.9% improvement for MIME decoding. Furthermore, according to local tests, we may see a performance regression of about 30% for MIME decoding without the non-SIMD code. In the worst case, a MIME line has only 4 base64-encoded characters followed by a line separator made up of illegal characters, e.g. `\r\n`.
When the intrinsic encounters an illegal character (`\r`), it has to exit. Then the Java code will pass the next illegal source byte (`\n`) to the intrinsic. With only SIMD code, it will execute too many wasted instructions before it can detect the error, while with non-SIMD code, the intrinsic will execute only one non-SIMD round for this error input. > > Four loads and four post increments rather than one load and a few BFMs? Why? > Nice suggestion. Done, thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From whuang at openjdk.java.net Tue Apr 6 09:01:32 2021 From: whuang at openjdk.java.net (Wang Huang) Date: Tue, 6 Apr 2021 09:01:32 GMT Subject: RFR: 8261137: Optimization of Box nodes in uncommon_trap [v13] In-Reply-To: References: <8Riu9VCQLM7_vDp5DOMtLZK3yMLQzAkwlIKo4ab0F7Q=.662dbffe-c320-47ea-bc67-508e2c382b12@github.com> Message-ID: <0NVZ-QNbbnPhWLtP8QyXeHAa7NIV50nStefQ6hNCc9w=.0cb67ad5-7076-461e-8261-fd9f0d69d647@github.com> On Thu, 1 Apr 2021 17:44:15 GMT, Vladimir Kozlov wrote: >> Wang Huang has updated the pull request incrementally with one additional commit since the last revision: >> >> trivial fix > > Good. @vnkozlov Could you do me a favor and `/sponsor` this patch?
------------- PR: https://git.openjdk.java.net/jdk/pull/2401 From shade at openjdk.java.net Tue Apr 6 09:36:45 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 6 Apr 2021 09:36:45 GMT Subject: RFR: 8264759: x86_32 Minimal VM build failure after JDK-8262355 Message-ID: * For target hotspot_variant-minimal_libjvm_objs_sharedRuntime_x86_32.o: ------------- Commit messages: - 8264759: x86_32 Minimal VM build failure after JDK-8262355 Changes: https://git.openjdk.java.net/jdk/pull/3353/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3353&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264759 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3353.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3353/head:pull/3353 PR: https://git.openjdk.java.net/jdk/pull/3353 From neliasso at openjdk.java.net Tue Apr 6 09:45:40 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Tue, 6 Apr 2021 09:45:40 GMT Subject: RFR: 8263227: C2: inconsistent spilling due to dead nodes in exception block In-Reply-To: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com> References: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com> Message-ID: <5JuMKHvQeAR147_GiEJFmWNFxpU00DKY3b9Wo1YelQM=.d45c40cd-dd32-498f-a1fe-99aab5163cec@github.com> On Thu, 1 Apr 2021 10:02:39 GMT, Roberto Castañeda Lozano wrote: > This change eliminates dead multi-nodes created by call-catch cleanup after GCM. Eliminating all dead code created by call-catch cleanup avoids potential issues when splitting the live range of call result values, see the analysis in the [bug report](https://bugs.openjdk.java.net/browse/JDK-8263227) for details. This solution is the least invasive of the three alternatives proposed in the bug report (the other two are constraining global code motion and extending live-range splitting).
> > The change also extends the control-flow graph verification pass to catch similar live-range splitting issues earlier (with `+VerifyRegisterAllocator`). > > Tested on: > - original bug reproducer > - hs-tier1-5 (windows-x64, linux-x64, linux-aarch64, and macosx-x64) with `+VerifyRegisterAllocator` > - hs-tier1-3 (windows-x64, linux-x64, linux-aarch64, and macosx-x64) with `+VerifyRegisterAllocator` and `+StressGCM` (5 repetitions) Looks good! ------------- Marked as reviewed by neliasso (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3303 From aph at openjdk.java.net Tue Apr 6 09:47:31 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 6 Apr 2021 09:47:31 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic In-Reply-To: References: Message-ID: <2JT6v2dKf3XicURmOEsdociN7j0SgmVwYdQMr8PaU-c=.9963e17e-44de-4576-9726-c1b15bd7f4e7@github.com> On Sat, 27 Mar 2021 08:58:03 GMT, Dong Bo wrote: > There can be illegal characters at the start of the input if the data is MIME encoded. > It would be no benefits to use SIMD for this case, so the stub use no-simd instructions for MIME encoded data now. What is the reasoning here? Sure, there can be illegal characters at the start, but what if there are not? The generic logic uses decodeBlock() even in the MIME case, because we don't know that there certainly will be illegal characters. 
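As background for the MIME question above, here is a minimal sketch that uses only the public `java.util.Base64` API, not the AArch64 stub under review. It illustrates that MIME-encoded output is split into lines of at most 76 characters separated by `\r\n`, that the MIME decoder skips those separators, and that the basic decoder rejects them as illegal characters. The class and helper names (`MimeLineSketch`, `sampleData`, `encodeMime`, `longestLine`, `mimeRoundTrips`, `basicDecoderRejects`) are made up for illustration.

```java
import java.util.Arrays;
import java.util.Base64;

public class MimeLineSketch {

    // Hypothetical helper: n bytes of deterministic sample data.
    static byte[] sampleData(int n) {
        byte[] data = new byte[n];
        for (int i = 0; i < n; i++) {
            data[i] = (byte) i;
        }
        return data;
    }

    // Encode n sample bytes with the standard MIME encoder.
    static String encodeMime(int n) {
        return Base64.getMimeEncoder().encodeToString(sampleData(n));
    }

    // Length of the longest line in a MIME-encoded string.
    static int longestLine(String mime) {
        int max = 0;
        for (String line : mime.split("\r\n")) {
            max = Math.max(max, line.length());
        }
        return max;
    }

    // True if the MIME decoder recovers the original bytes,
    // i.e. it skips the \r\n separators instead of failing on them.
    static boolean mimeRoundTrips(int n) {
        byte[] decoded = Base64.getMimeDecoder().decode(encodeMime(n));
        return Arrays.equals(sampleData(n), decoded);
    }

    // True if the basic decoder throws when it meets the line separators.
    static boolean basicDecoderRejects(String mime) {
        try {
            Base64.getDecoder().decode(mime);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        String mime = encodeMime(100);
        System.out.println("longest line: " + longestLine(mime));
        System.out.println("MIME round trip: " + mimeRoundTrips(100));
        System.out.println("basic decoder rejects: " + basicDecoderRejects(mime));
    }
}
```

With 100 input bytes the encoder produces 136 Base64 characters, so the output is one 76-character line followed by one 60-character line. How the intrinsic should split work between SIMD and non-SIMD paths for such input is exactly what the thread is discussing; this sketch only shows the input shape.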
------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From neliasso at openjdk.java.net Tue Apr 6 09:54:21 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Tue, 6 Apr 2021 09:54:21 GMT Subject: RFR: 8263227: C2: inconsistent spilling due to dead nodes in exception block In-Reply-To: <5JuMKHvQeAR147_GiEJFmWNFxpU00DKY3b9Wo1YelQM=.d45c40cd-dd32-498f-a1fe-99aab5163cec@github.com> References: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com> <5JuMKHvQeAR147_GiEJFmWNFxpU00DKY3b9Wo1YelQM=.d45c40cd-dd32-498f-a1fe-99aab5163cec@github.com> Message-ID: On Tue, 6 Apr 2021 09:42:01 GMT, Nils Eliasson wrote: >> This change eliminates dead multi-nodes created by call-catch cleanup after GCM. Eliminating all dead code created by call-catch cleanup avoids potential issues when splitting the live range of call result values, see the analysis in the [bug report](https://bugs.openjdk.java.net/browse/JDK-8263227) for details. This solution is the least invasive of the three alternatives proposed in the bug report (the other two are constraining global code motion and extending live-range splitting). >> >> The change also extends the control-flow graph verification pass to catch similar live-range splitting issues earlier (with `+VerifyRegisterAllocator`). >> >> Tested on: >> - original bug reproducer >> - hs-tier1-5 (windows-x64, linux-x64, linux-aarch64, and macosx-x64) with `+VerifyRegisterAllocator` >> - hs-tier1-3 (windows-x64, linux-x64, linux-aarch64, and macosx-x64) with `+VerifyRegisterAllocator` and `+StressGCM` (5 repetitions) > > Looks good! And thanks for the PDF with the excellent walk through! 
------------- PR: https://git.openjdk.java.net/jdk/pull/3303 From rcastanedalo at openjdk.java.net Tue Apr 6 09:54:21 2021 From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano) Date: Tue, 6 Apr 2021 09:54:21 GMT Subject: RFR: 8263227: C2: inconsistent spilling due to dead nodes in exception block In-Reply-To: <5JuMKHvQeAR147_GiEJFmWNFxpU00DKY3b9Wo1YelQM=.d45c40cd-dd32-498f-a1fe-99aab5163cec@github.com> References: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com> <5JuMKHvQeAR147_GiEJFmWNFxpU00DKY3b9Wo1YelQM=.d45c40cd-dd32-498f-a1fe-99aab5163cec@github.com> Message-ID: On Tue, 6 Apr 2021 09:42:01 GMT, Nils Eliasson wrote: > Looks good! Thanks for reviewing, Nils! ------------- PR: https://git.openjdk.java.net/jdk/pull/3303 From thartmann at openjdk.java.net Tue Apr 6 10:15:24 2021 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Tue, 6 Apr 2021 10:15:24 GMT Subject: RFR: 8264759: x86_32 Minimal VM build failure after JDK-8262355 In-Reply-To: References: Message-ID: <85S5rOUs-6I2mblRBp-NnPlYVqM3q-zW8b37C7LAsO0=.8e027e0a-a2ed-4cd0-ad75-332544cf5432@github.com> On Tue, 6 Apr 2021 09:29:33 GMT, Aleksey Shipilev wrote: > * For target hotspot_variant-minimal_libjvm_objs_sharedRuntime_x86_32.o: Looks good and trivial to me. ------------- Marked as reviewed by thartmann (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/3353 From dongbo at openjdk.java.net Tue Apr 6 11:19:36 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Tue, 6 Apr 2021 11:19:36 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic In-Reply-To: <2JT6v2dKf3XicURmOEsdociN7j0SgmVwYdQMr8PaU-c=.9963e17e-44de-4576-9726-c1b15bd7f4e7@github.com> References: <2JT6v2dKf3XicURmOEsdociN7j0SgmVwYdQMr8PaU-c=.9963e17e-44de-4576-9726-c1b15bd7f4e7@github.com> Message-ID: On Tue, 6 Apr 2021 09:44:28 GMT, Andrew Haley wrote: > > It would be no benefits to use SIMD for this case, so the stub use no-simd instructions for MIME encoded data now. > > What is the reasoning here? Sure, there can be illegal characters at the start, but what if there are not? The generic logic uses decodeBlock() even in the MIME case, because we don't know that there certainly will be illegal characters. This code block only processes 80B of the input. If no illegal characters are found, the stub uses SIMD instructions to process the rest of the input, provided the data length is large enough (>= 64B) to form at least one SIMD round. ------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From thartmann at openjdk.java.net Tue Apr 6 12:34:26 2021 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Tue, 6 Apr 2021 12:34:26 GMT Subject: RFR: 8261137: Optimization of Box nodes in uncommon_trap [v13] In-Reply-To: References: <8Riu9VCQLM7_vDp5DOMtLZK3yMLQzAkwlIKo4ab0F7Q=.662dbffe-c320-47ea-bc67-508e2c382b12@github.com> Message-ID: On Tue, 30 Mar 2021 01:53:12 GMT, Wang Huang wrote: >> JDK-8075052 has removed useless autobox. However, in some cases, the box is still saved.
For instance:
>>     @Benchmark
>>     public void testMethod(Blackhole bh) {
>>         int sum = 0;
>>         for (int i = 0; i < data.length; i++) {
>>             Integer ii = Integer.valueOf(data[i]);
>>             if (i < data.length) {
>>                 sum += ii.intValue();
>>             }
>>         }
>>         bh.consume(sum);
>>     }
>> Although the variable ii is only used at ii.intValue(), it cannot be eliminated as a result of being used by a hidden uncommon_trap.
>> The uncommon_trap is generated by the optimized "if", because its condition is always true.
>>
>> We can postpone box in uncommon_trap in this situation. We treat box as a scalarized object by adding a SafePointScalarObjectNode in the uncommon_trap node,
>> and deleting the use of box:
>>
>> There is no additional fail/error(s) of jtreg after this patch.
>>
>> I adjust my codes and add a new benchmark
>>
>> public class MyBenchmark {
>>
>>     static int[] data = new int[10000];
>>
>>     static {
>>         for(int i = 0; i < data.length; ++i) {
>>             data[i] = i * 1337 % 7331;
>>         }
>>     }
>>
>>     @Benchmark
>>     public void testMethod(Blackhole bh) {
>>         int sum = 0;
>>         for (int i = 0; i < data.length; i++) {
>>             Integer ii = Integer.valueOf(data[i]);
>>             black();
>>             if (i < 100000) {
>>                 sum += ii.intValue();
>>             }
>>         }
>>         bh.consume(sum);
>>     }
>>
>>     public void black(){}
>> }
>>
>>
>> aarch64:
>> base line:
>> Benchmark                    Mode  Samples   Score  Score error  Units
>> o.s.MyBenchmark.testMethod   avgt       30  88.513        1.111  us/op
>>
>> opt:
>> Benchmark                    Mode  Samples   Score  Score error  Units
>> o.s.MyBenchmark.testMethod   avgt       30  52.776        0.096  us/op
>>
>> x86:
>> base line:
>> Benchmark                    Mode  Samples   Score  Score error  Units
>> o.s.MyBenchmark.testMethod   avgt       30  81.066        3.156  us/op
>>
>> opt:
>> Benchmark                    Mode  Samples   Score  Score error  Units
>> o.s.MyBenchmark.testMethod   avgt       30  55.596        0.775  us/op
>
> Wang Huang has updated the pull request incrementally with one additional commit since the last revision:
>
>   trivial fix

Looks good to me too.

-------------

Marked as reviewed by thartmann (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/2401 From aph at openjdk.java.net Tue Apr 6 14:07:26 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 6 Apr 2021 14:07:26 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v5] In-Reply-To: <0aGgD88Mxj7nICTvKhNtkEYlYlP7TUMG082EgaEHU04=.ba1d4468-5c74-4a06-b8c6-d66c7b0c394d@github.com> References: <0aGgD88Mxj7nICTvKhNtkEYlYlP7TUMG082EgaEHU04=.ba1d4468-5c74-4a06-b8c6-d66c7b0c394d@github.com> Message-ID: On Tue, 6 Apr 2021 07:25:57 GMT, Dong Bo wrote: >> In JDK-8248188, IntrinsicCandidate and API is added for Base64 decoding. >> Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3, a basic idea can be found at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords. >> >> Patch passed jtreg tier1-3 tests with linux-aarch64-server-fastdebug build. >> Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` runned specially for the correctness of the implementation. >> >> There can be illegal characters at the start of the input if the data is MIME encoded. >> It would be no benefits to use SIMD for this case, so the stub use no-simd instructions for MIME encoded data now. >> >> A JMH micro, Base64Decode.java, is added for performance test. >> With different input length (upper-bounded by parameter `maxNumBytes` in the JMH micro), >> we witness ~2.5x improvements with long inputs and no regression with short inputs for raw base64 decodeing, minor improvements (~10.95%) for MIME on Kunpeng916. >> >> The Base64Decode.java JMH micro-benchmark results: >> >> Benchmark (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> >> # Kunpeng916, intrinsic >> Base64Decode.testBase64Decode 4 1 avgt 5 48.614 ? 0.609 ns/op >> Base64Decode.testBase64Decode 4 3 avgt 5 58.199 ? 1.650 ns/op >> Base64Decode.testBase64Decode 4 7 avgt 5 69.400 ? 0.931 ns/op >> Base64Decode.testBase64Decode 4 32 avgt 5 96.818 ? 
1.687 ns/op >> Base64Decode.testBase64Decode 4 64 avgt 5 122.856 ? 9.217 ns/op >> Base64Decode.testBase64Decode 4 80 avgt 5 130.935 ? 1.667 ns/op >> Base64Decode.testBase64Decode 4 96 avgt 5 143.627 ? 1.751 ns/op >> Base64Decode.testBase64Decode 4 112 avgt 5 152.311 ? 1.178 ns/op >> Base64Decode.testBase64Decode 4 512 avgt 5 342.631 ? 0.584 ns/op >> Base64Decode.testBase64Decode 4 1000 avgt 5 573.635 ? 1.050 ns/op >> Base64Decode.testBase64Decode 4 20000 avgt 5 9534.136 ? 45.172 ns/op >> Base64Decode.testBase64Decode 4 50000 avgt 5 22718.726 ? 192.070 ns/op >> Base64Decode.testBase64MIMEDecode 4 1 avgt 10 63.558 ? 0.336 ns/op >> Base64Decode.testBase64MIMEDecode 4 3 avgt 10 82.504 ? 0.848 ns/op >> Base64Decode.testBase64MIMEDecode 4 7 avgt 10 120.591 ? 0.608 ns/op >> Base64Decode.testBase64MIMEDecode 4 32 avgt 10 324.314 ? 6.236 ns/op >> Base64Decode.testBase64MIMEDecode 4 64 avgt 10 532.678 ? 4.670 ns/op >> Base64Decode.testBase64MIMEDecode 4 80 avgt 10 678.126 ? 4.324 ns/op >> Base64Decode.testBase64MIMEDecode 4 96 avgt 10 771.603 ? 6.393 ns/op >> Base64Decode.testBase64MIMEDecode 4 112 avgt 10 889.608 ? 0.759 ns/op >> Base64Decode.testBase64MIMEDecode 4 512 avgt 10 3663.557 ? 3.422 ns/op >> Base64Decode.testBase64MIMEDecode 4 1000 avgt 10 7017.784 ? 9.128 ns/op >> Base64Decode.testBase64MIMEDecode 4 20000 avgt 10 128670.660 ? 7951.521 ns/op >> Base64Decode.testBase64MIMEDecode 4 50000 avgt 10 317113.667 ? 161.758 ns/op >> >> # Kunpeng916, default >> Base64Decode.testBase64Decode 4 1 avgt 5 48.455 ? 0.571 ns/op >> Base64Decode.testBase64Decode 4 3 avgt 5 57.937 ? 0.505 ns/op >> Base64Decode.testBase64Decode 4 7 avgt 5 73.823 ? 1.452 ns/op >> Base64Decode.testBase64Decode 4 32 avgt 5 106.484 ? 1.243 ns/op >> Base64Decode.testBase64Decode 4 64 avgt 5 141.004 ? 1.188 ns/op >> Base64Decode.testBase64Decode 4 80 avgt 5 156.284 ? 0.572 ns/op >> Base64Decode.testBase64Decode 4 96 avgt 5 174.137 ? 0.177 ns/op >> Base64Decode.testBase64Decode 4 112 avgt 5 188.445 ? 
0.572 ns/op >> Base64Decode.testBase64Decode 4 512 avgt 5 610.847 ? 1.559 ns/op >> Base64Decode.testBase64Decode 4 1000 avgt 5 1155.368 ? 0.813 ns/op >> Base64Decode.testBase64Decode 4 20000 avgt 5 19751.477 ? 24.669 ns/op >> Base64Decode.testBase64Decode 4 50000 avgt 5 50046.586 ? 523.155 ns/op >> Base64Decode.testBase64MIMEDecode 4 1 avgt 10 64.130 ? 0.238 ns/op >> Base64Decode.testBase64MIMEDecode 4 3 avgt 10 82.096 ? 0.205 ns/op >> Base64Decode.testBase64MIMEDecode 4 7 avgt 10 118.849 ? 0.610 ns/op >> Base64Decode.testBase64MIMEDecode 4 32 avgt 10 331.177 ? 4.732 ns/op >> Base64Decode.testBase64MIMEDecode 4 64 avgt 10 549.117 ? 0.177 ns/op >> Base64Decode.testBase64MIMEDecode 4 80 avgt 10 702.951 ? 4.572 ns/op >> Base64Decode.testBase64MIMEDecode 4 96 avgt 10 799.566 ? 0.301 ns/op >> Base64Decode.testBase64MIMEDecode 4 112 avgt 10 923.749 ? 0.389 ns/op >> Base64Decode.testBase64MIMEDecode 4 512 avgt 10 4000.725 ? 2.519 ns/op >> Base64Decode.testBase64MIMEDecode 4 1000 avgt 10 7674.994 ? 9.281 ns/op >> Base64Decode.testBase64MIMEDecode 4 20000 avgt 10 142059.001 ? 157.920 ns/op >> Base64Decode.testBase64MIMEDecode 4 50000 avgt 10 355698.369 ? 216.542 ns/op > > Dong Bo has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 10 commits: > > - conflicts resolved > - Merge branch 'master' of https://git.openjdk.java.net/jdk into aarch64.base64.decode > - resovling conflicts > - load data with one ldrw, add JMH tests for error inputs > - Merge branch 'master' into aarch64.base64.decode > - copyright > - trivial fixes > - Handling error in SIMD case with loops, combining two non-SIMD cases into one code blob, addressing other comments > - Merge branch 'master' into aarch64.base64.decode > - 8256245: AArch64: Implement Base64 decoding intrinsic src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5800: > 5798: __ br(Assembler::LT, Process4B); > 5799: > 5800: // The 1st character of the input can be illegal if the data is MIME encoded. Why is this sentence here? It is very misleading. ------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From aph at openjdk.java.net Tue Apr 6 14:07:28 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 6 Apr 2021 14:07:28 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v3] In-Reply-To: References: <96N9BNz7s4JH99-5lQio5uEP8iAa4YEmn9NK-dUyvCQ=.c663f383-f207-45c3-97bf-b5309b624315@github.com> Message-ID: On Fri, 2 Apr 2021 10:01:27 GMT, Andrew Haley wrote: >> Dong Bo has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits: >> >> - Merge branch 'master' into aarch64.base64.decode >> - copyright >> - trivial fixes >> - Handling error in SIMD case with loops, combining two non-SIMD cases into one code blob, addressing other comments >> - Merge branch 'master' into aarch64.base64.decode >> - 8256245: AArch64: Implement Base64 decoding intrinsic > > src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5802: > >> 5800: // The 1st character of the input can be illegal if the data is MIME encoded. >> 5801: // We cannot benefits from SIMD for this case. 
The max line size of MIME >> 5802: // encoding is 76, with the PreProcess80B blob, we actually use no-simd > > "cannot benefit" OK, so I now understand what is actually going on here, and it has nothing to do with illegal first characters. The problem is that the maximum block length the decode will be supplied with is 76 bytes, and there isn't enough time for the SIMD to be worthwhile. So the comment should be "In the MIME case, the line length cannot be more than 76 bytes (see RFC 2045.) This is too short a block for SIMD to be worthwhile, so we use non-SIMD here." ------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From kvn at openjdk.java.net Tue Apr 6 15:57:20 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 6 Apr 2021 15:57:20 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v2] In-Reply-To: References: Message-ID: On Mon, 29 Mar 2021 06:28:42 GMT, Xiaohong Gong wrote: >> The Vector API defines different element types for floating point VectorMask. For example, the bitwise related APIs use "`long/int`", while data related APIs use "`double/float`". This makes some optimizations that based on the type checking cannot work well. >> >> For example, the VectorBox/Unbox elimination like `"VectorUnbox (VectorBox v) ==> v"` requires the types of output and >> input are equal. Normally this is necessary. However, due to the different element type for floating point VectorMask with the same species, the VectorBox/Unbox pattern is optimized to: >> VectorLoadMask (VectorStoreMask vmask) >> Actually the types can be treated as the same one for such cases. And considering the vector mask representation is the same for >> vectors with the same element size and vector length, it's safe to do the optimization: >> VectorLoadMask (VectorStoreMask vmask) ==> vmask >> The same issue exists for GVN which is based on the type of a node. 
Making the mask node's `hash()/cmp()` methods depend on the element size instead of the element type can fix it. > > Xiaohong Gong has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask @iwanowww should confirm the correctness of this optimization. Regarding the changes: they seem fine to me. I notice that VectorNode and its subclasses do not check for TOP inputs. Since the Vector API introduces vectors into the graph before the SuperWord transformation, their inputs could become dead. How are such cases handled? And why have we not hit them yet? is_vect() should hit an assert. ------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From kvn at openjdk.java.net Tue Apr 6 16:11:32 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 6 Apr 2021 16:11:32 GMT Subject: RFR: 8261137: Optimization of Box nodes in uncommon_trap [v13] In-Reply-To: References: <8Riu9VCQLM7_vDp5DOMtLZK3yMLQzAkwlIKo4ab0F7Q=.662dbffe-c320-47ea-bc67-508e2c382b12@github.com> Message-ID: On Tue, 6 Apr 2021 12:31:22 GMT, Tobias Hartmann wrote: >> Wang Huang has updated the pull request incrementally with one additional commit since the last revision: >> >> trivial fix > > Looks good to me too. I will wait for the results of the testing Vladimir I. is running. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2401 From kvn at openjdk.java.net Tue Apr 6 17:58:34 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 6 Apr 2021 17:58:34 GMT Subject: RFR: 8263227: C2: inconsistent spilling due to dead nodes in exception block In-Reply-To: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com> References: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com> Message-ID: On Thu, 1 Apr 2021 10:02:39 GMT, Roberto Castañeda Lozano wrote: > This change eliminates dead multi-nodes created by call-catch cleanup after GCM. Eliminating all dead code created by call-catch cleanup avoids potential issues when splitting the live range of call result values, see the analysis in the [bug report](https://bugs.openjdk.java.net/browse/JDK-8263227) for details. This solution is the least invasive of the three alternatives proposed in the bug report (the other two are constraining global code motion and extending live-range splitting). > > The change also extends the control-flow graph verification pass to catch similar live-range splitting issues earlier (with `+VerifyRegisterAllocator`). > > Tested on: > - original bug reproducer > - hs-tier1-5 (windows-x64, linux-x64, linux-aarch64, and macosx-x64) with `+VerifyRegisterAllocator` > - hs-tier1-3 (windows-x64, linux-x64, linux-aarch64, and macosx-x64) with `+VerifyRegisterAllocator` and `+StressGCM` (5 repetitions) src/hotspot/share/opto/lcm.cpp line 1415: > 1413: if (dead) { > 1414: // Remove projections if n is a dead multi-node. > 1415: for (uint k = j + n->outcnt(); sb->get_node(k)->is_Proj(); k--) { I don't get this logic. The loop is not executed if sb->get_node(j + n->outcnt()) is not a Proj node. 
src/hotspot/share/opto/lcm.cpp line 1419: > 1417: "dead projection should correspond to current node"); > 1418: sb->get_node(k)->disconnect_inputs(C); > 1419: sb->remove_node(k); If you remove the node here, then `j` could be incorrect in `sb->remove_node(j)` at line #1424. ------------- PR: https://git.openjdk.java.net/jdk/pull/3303 From kvn at openjdk.java.net Tue Apr 6 18:01:26 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 6 Apr 2021 18:01:26 GMT Subject: RFR: 8263227: C2: inconsistent spilling due to dead nodes in exception block In-Reply-To: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com> References: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com> Message-ID: On Thu, 1 Apr 2021 10:02:39 GMT, Roberto Castañeda Lozano wrote: > This change eliminates dead multi-nodes created by call-catch cleanup after GCM. Eliminating all dead code created by call-catch cleanup avoids potential issues when splitting the live range of call result values, see the analysis in the [bug report](https://bugs.openjdk.java.net/browse/JDK-8263227) for details. This solution is the least invasive of the three alternatives proposed in the bug report (the other two are constraining global code motion and extending live-range splitting). > > The change also extends the control-flow graph verification pass to catch similar live-range splitting issues earlier (with `+VerifyRegisterAllocator`). > > Tested on: > - original bug reproducer > - hs-tier1-5 (windows-x64, linux-x64, linux-aarch64, and macosx-x64) with `+VerifyRegisterAllocator` > - hs-tier1-3 (windows-x64, linux-x64, linux-aarch64, and macosx-x64) with `+VerifyRegisterAllocator` and `+StressGCM` (5 repetitions) Also, where is the guarantee that all Proj users are in the same block? 
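
As a side note on the index concern raised for lcm.cpp in this review: the general hazard of removing elements from an indexed sequence while still holding an index into it can be sketched in plain Java. This is only an illustration with a hypothetical list of node names (`RemoveIndexDemo`, `elementAtAfterRemoving` and the strings are made up here); it does not model HotSpot's Block or Node classes, and whether the hazard applies to the patch depends on where the removed projections sit relative to `j`.

```java
import java.util.ArrayList;
import java.util.List;

public class RemoveIndexDemo {
    // Hypothetical stand-in for the nodes scheduled in one block.
    static List<String> block() {
        return new ArrayList<>(List.of("a", "call", "proj1", "proj2", "b"));
    }

    // Removes the element at removeIdx, then reports what index j refers to.
    static String elementAtAfterRemoving(int j, int removeIdx) {
        List<String> nodes = block();
        nodes.remove(removeIdx); // remove(int) removes by position
        return nodes.get(j);
    }

    public static void main(String[] args) {
        int j = 1; // initially refers to "call"
        // Removing a position above j leaves j pointing at the same element.
        System.out.println(elementAtAfterRemoving(j, 3));
        // Removing a position below j shifts everything down, so j now
        // refers to a different element.
        System.out.println(elementAtAfterRemoving(j, 0));
    }
}
```

In this sketch a removal at a higher position keeps the lower index valid, while a removal at a lower position silently changes what the index names.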
------------- PR: https://git.openjdk.java.net/jdk/pull/3303 From kvn at openjdk.java.net Tue Apr 6 18:02:29 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 6 Apr 2021 18:02:29 GMT Subject: RFR: 8264759: x86_32 Minimal VM build failure after JDK-8262355 In-Reply-To: References: Message-ID: On Tue, 6 Apr 2021 09:29:33 GMT, Aleksey Shipilev wrote: > * For target hotspot_variant-minimal_libjvm_objs_sharedRuntime_x86_32.o: Trivial. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3353 From shade at openjdk.java.net Tue Apr 6 18:09:58 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 6 Apr 2021 18:09:58 GMT Subject: RFR: 8264759: x86_32 Minimal VM build failure after JDK-8262355 [v2] In-Reply-To: References: Message-ID: > * For target hotspot_variant-minimal_libjvm_objs_sharedRuntime_x86_32.o: Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Avoid sweeping vzeroupper into save_vectors block ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3353/files - new: https://git.openjdk.java.net/jdk/pull/3353/files/86a5d8fd..4cf15f25 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3353&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3353&range=00-01 Stats: 3 lines in 1 file changed: 2 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3353.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3353/head:pull/3353 PR: https://git.openjdk.java.net/jdk/pull/3353 From shade at openjdk.java.net Tue Apr 6 18:10:00 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 6 Apr 2021 18:10:00 GMT Subject: RFR: 8264759: x86_32 Minimal VM build failure after JDK-8262355 [v2] In-Reply-To: References: Message-ID: On Tue, 6 Apr 2021 17:59:11 GMT, Vladimir Kozlov wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: 
>> >> Avoid sweeping vzeroupper into save_vectors block > > Trivial. Thanks! I realized that moving `vzeroupper` into `COMPILER2` block was not quite right, as we "handle" only the `save_vectors` branch with that `#ifdef`. Reverted that part in new commit. ------------- PR: https://git.openjdk.java.net/jdk/pull/3353 From kvn at openjdk.java.net Tue Apr 6 18:17:45 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 6 Apr 2021 18:17:45 GMT Subject: RFR: 8264759: x86_32 Minimal VM build failure after JDK-8262355 [v2] In-Reply-To: References: Message-ID: On Tue, 6 Apr 2021 18:09:58 GMT, Aleksey Shipilev wrote: >> * For target hotspot_variant-minimal_libjvm_objs_sharedRuntime_x86_32.o: > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Avoid sweeping vzeroupper into save_vectors block Good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3353 From shade at openjdk.java.net Tue Apr 6 18:17:46 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 6 Apr 2021 18:17:46 GMT Subject: Integrated: 8264759: x86_32 Minimal VM build failure after JDK-8262355 In-Reply-To: References: Message-ID: <-d-WgBnrBJaCk90mldezidDgHr6mtv2CmAwpb2UNdWE=.48bb9cef-a21c-44d6-9043-e81d2e3405e9@github.com> On Tue, 6 Apr 2021 09:29:33 GMT, Aleksey Shipilev wrote: > * For target hotspot_variant-minimal_libjvm_objs_sharedRuntime_x86_32.o: This pull request has now been integrated. 
Changeset: a756d8d7 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/a756d8d7 Stats: 5 lines in 1 file changed: 5 ins; 0 del; 0 mod 8264759: x86_32 Minimal VM build failure after JDK-8262355 Reviewed-by: thartmann, kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/3353 From rcastanedalo at openjdk.java.net Tue Apr 6 19:17:35 2021 From: rcastanedalo at openjdk.java.net (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Tue, 6 Apr 2021 19:17:35 GMT Subject: RFR: 8264795: IGV: Upgrade NetBeans platform Message-ID: This change upgrades the NetBeans platform on which IGV is based to its latest version ([12.3](https://netbeans.apache.org/download/nb123/index.html)) and switches IGV's build system from Ant to Maven. The upgrade introduces support for a wide range of JDK versions (from 8 to 15, the latest version supported by NetBeans 12.3), and the switch from Ant to Maven makes the IGV build simpler, faster (first-time build is approximately 5x faster), and more stable (all dependencies are fetched directly from the Maven central repository). The change also fixes broken unit tests in the Data module and runs them by default when building. 
#### Testing Regression-tested the following use cases manually on all combinations of (Linux, Windows, macOS) and (JDK 8, JDK 11, JDK 15): - build - open graph file (small.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip)) - import graphs via network (localhost) - expand groups in outline - open a graph - open a clone - zoom in and out - search a node - apply filters - extract a node - show all nodes - select nodes corresponding to a bytecode - view control flow - select nodes corresponding to a basic block - cluster nodes - show satellite view - compute the difference of two graphs - change node text - remove a group - save groups into a file - open a large graph file (large.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip)) - open a large graph ("After Escape Analysis" in large.xml) Thanks to Vladimir Ivanov for helping with testing on macOS. ------------- Commit messages: - Update TemplatesAction name in layer file - Remove template integration test - Turn off schema validation, since the main input files are not specifying it anyway - Fix test errors - Configure Maven to run Data unit tests - Update copyright years of touched files - Indent all XML files consistently - Convert all files to UNIX format - Add copyright headers to Maven files - Remove unnecessary comments - ... 
and 35 more: https://git.openjdk.java.net/jdk/compare/011f6d13...11fb43fa Changes: https://git.openjdk.java.net/jdk/pull/3361/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3361&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264795 Stats: 5950 lines in 499 files changed: 2737 ins; 3110 del; 103 mod Patch: https://git.openjdk.java.net/jdk/pull/3361.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3361/head:pull/3361 PR: https://git.openjdk.java.net/jdk/pull/3361 From kvn at openjdk.java.net Tue Apr 6 20:05:33 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 6 Apr 2021 20:05:33 GMT Subject: RFR: 8264795: IGV: Upgrade NetBeans platform In-Reply-To: References: Message-ID: On Tue, 6 Apr 2021 18:34:54 GMT, Roberto Casta?eda Lozano wrote: > This change upgrades the NetBeans platform on which IGV is based to its latest version ([12.3](https://netbeans.apache.org/download/nb123/index.html)) and switches IGV's build system from Ant to Maven. The upgrade introduces support for a wide range of JDK versions (from 8 to 15, the latest version supported by NetBeans 12.3), and the switch from Ant to Maven makes the IGV build simpler, faster (first-time build is approximately 5x faster), and more stable (all dependencies are fetched directly from the Maven central repository). > > The change also fixes broken unit tests in the Data module and runs them by default when building. 
> > #### Testing > > Regression-tested the following use cases manually on all combinations of (Linux, Windows, macOS) and (JDK 8, JDK 11, JDK 15): > > - build > - open graph file (small.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip)) > - import graphs via network (localhost) > - expand groups in outline > - open a graph > - open a clone > - zoom in and out > - search a node > - apply filters > - extract a node > - show all nodes > - select nodes corresponding to a bytecode > - view control flow > - select nodes corresponding to a basic block > - cluster nodes > - show satellite view > - compute the difference of two graphs > - change node text > - remove a group > - save groups into a file > - open a large graph file (large.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip)) > - open a large graph ("After Escape Analysis" in large.xml) > > Thanks to Vladimir Ivanov for helping with testing on macOS. Great work! After this is pushed consider to update wiki page too: https://wiki.openjdk.java.net/display/HotSpot/IdealGraphVisualizer ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3361 From kvn at openjdk.java.net Wed Apr 7 02:36:42 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 7 Apr 2021 02:36:42 GMT Subject: RFR: 8264063: Outer Safepoint poll load should not reference the head of inner strip mined loop. Message-ID: When loop is "strip mined" polling address load ([parse1.cpp#L2280](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/parse1.cpp#L2280)) should be cloned together with safepoint node and pinned outside inner loop. Otherwise we have issues like [8263352](https://bugs.openjdk.java.net/browse/JDK-8263352) I also remove leftover (unused needs_polling_address_input() method) from [8220051](https://bugs.openjdk.java.net/browse/JDK-8220051) changes. 
Tested hs-tier1-4 ------------- Commit messages: - 8264063: Outer Safepoint poll load should not reference the head of inner strip mined loop. Changes: https://git.openjdk.java.net/jdk/pull/3365/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3365&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264063 Stats: 47 lines in 6 files changed: 10 ins; 36 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3365.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3365/head:pull/3365 PR: https://git.openjdk.java.net/jdk/pull/3365 From pli at openjdk.java.net Wed Apr 7 03:51:26 2021 From: pli at openjdk.java.net (Pengfei Li) Date: Wed, 7 Apr 2021 03:51:26 GMT Subject: RFR: 8264063: Outer Safepoint poll load should not reference the head of inner strip mined loop. In-Reply-To: References: Message-ID: On Wed, 7 Apr 2021 02:29:20 GMT, Vladimir Kozlov wrote: > When loop is "strip mined" polling address load ([parse1.cpp#L2280](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/parse1.cpp#L2280)) should be cloned together with safepoint node and pinned outside inner loop. Otherwise we have issues like [8263352](https://bugs.openjdk.java.net/browse/JDK-8263352) > > I also remove leftover (unused needs_polling_address_input() method) from [8220051](https://bugs.openjdk.java.net/browse/JDK-8220051) changes. > > Tested hs-tier1-4 Should we also remove below part of code in `loopTransform.cpp` in this patch? 3704 if (n->is_CountedLoop() && n->as_CountedLoop()->is_strip_mined()) { 3705 // In strip-mined counted loops, the CountedLoopNode may be 3706 // used by the address polling node of the outer safepoint. 3707 // Skip this use because it's safe. 
3708 Node* sfpt = n->as_CountedLoop()->outer_safepoint(); 3709 Node* polladr = sfpt->in(TypeFunc::Parms+0); 3710 if (use == polladr) { 3711 continue; 3712 } 3713 } ------------- PR: https://git.openjdk.java.net/jdk/pull/3365 From kvn at openjdk.java.net Wed Apr 7 04:56:29 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 7 Apr 2021 04:56:29 GMT Subject: RFR: 8264063: Outer Safepoint poll load should not reference the head of inner strip mined loop. In-Reply-To: References: Message-ID: <7YL-VL_1xzDmZH5jqlr5ZKnJK2tZBFtD_lyM5QUSfTo=.b65bfa0c-37b7-4885-bce7-7517ee6fa0fc@github.com> On Wed, 7 Apr 2021 03:48:15 GMT, Pengfei Li wrote: > Should we also remove below part of code in `loopTransform.cpp` in this patch? > > ``` > 3704 if (n->is_CountedLoop() && n->as_CountedLoop()->is_strip_mined()) { > 3705 // In strip-mined counted loops, the CountedLoopNode may be > 3706 // used by the address polling node of the outer safepoint. > 3707 // Skip this use because it's safe. > 3708 Node* sfpt = n->as_CountedLoop()->outer_safepoint(); > 3709 Node* polladr = sfpt->in(TypeFunc::Parms+0); > 3710 if (use == polladr) { > 3711 continue; > 3712 } > 3713 } > ``` I thought that it does not harm to have this code but on other hand it will be not executed anymore. I will removed it and I will run ArrayFill.java to make sure optimization works with strip mined loops. ------------- PR: https://git.openjdk.java.net/jdk/pull/3365 From dongbo at openjdk.java.net Wed Apr 7 05:51:02 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Wed, 7 Apr 2021 05:51:02 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v6] In-Reply-To: References: Message-ID: > In JDK-8248188, IntrinsicCandidate and API is added for Base64 decoding. > Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3, a basic idea can be found at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords. > > Patch passed jtreg tier1-3 tests with linux-aarch64-server-fastdebug build. 
> Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` were run specifically to check the correctness of the implementation. > > There can be illegal characters at the start of the input if the data is MIME encoded. > There would be no benefit in using SIMD for this case, so the stub now uses non-SIMD instructions for MIME-encoded data. > > A JMH micro, Base64Decode.java, is added for performance testing. > With different input lengths (upper-bounded by parameter `maxNumBytes` in the JMH micro), > we see ~2.5x improvement with long inputs and no regression with short inputs for raw Base64 decoding, and minor improvement (~10.95%) for MIME on Kunpeng916. > > The Base64Decode.java JMH micro-benchmark results: > > Benchmark (lineSize) (maxNumBytes) Mode Cnt Score Error Units > > # Kunpeng916, intrinsic > Base64Decode.testBase64Decode 4 1 avgt 5 48.614 ? 0.609 ns/op > Base64Decode.testBase64Decode 4 3 avgt 5 58.199 ? 1.650 ns/op > Base64Decode.testBase64Decode 4 7 avgt 5 69.400 ? 0.931 ns/op > Base64Decode.testBase64Decode 4 32 avgt 5 96.818 ? 1.687 ns/op > Base64Decode.testBase64Decode 4 64 avgt 5 122.856 ? 9.217 ns/op > Base64Decode.testBase64Decode 4 80 avgt 5 130.935 ? 1.667 ns/op > Base64Decode.testBase64Decode 4 96 avgt 5 143.627 ? 1.751 ns/op > Base64Decode.testBase64Decode 4 112 avgt 5 152.311 ? 1.178 ns/op > Base64Decode.testBase64Decode 4 512 avgt 5 342.631 ? 0.584 ns/op > Base64Decode.testBase64Decode 4 1000 avgt 5 573.635 ? 1.050 ns/op > Base64Decode.testBase64Decode 4 20000 avgt 5 9534.136 ? 45.172 ns/op > Base64Decode.testBase64Decode 4 50000 avgt 5 22718.726 ? 192.070 ns/op > Base64Decode.testBase64MIMEDecode 4 1 avgt 10 63.558 ? 0.336 ns/op > Base64Decode.testBase64MIMEDecode 4 3 avgt 10 82.504 ? 0.848 ns/op > Base64Decode.testBase64MIMEDecode 4 7 avgt 10 120.591 ? 0.608 ns/op > Base64Decode.testBase64MIMEDecode 4 32 avgt 10 324.314 ? 6.236 ns/op > Base64Decode.testBase64MIMEDecode 4 64 avgt 10 532.678 ? 
4.670 ns/op > Base64Decode.testBase64MIMEDecode 4 80 avgt 10 678.126 ? 4.324 ns/op > Base64Decode.testBase64MIMEDecode 4 96 avgt 10 771.603 ? 6.393 ns/op > Base64Decode.testBase64MIMEDecode 4 112 avgt 10 889.608 ? 0.759 ns/op > Base64Decode.testBase64MIMEDecode 4 512 avgt 10 3663.557 ? 3.422 ns/op > Base64Decode.testBase64MIMEDecode 4 1000 avgt 10 7017.784 ? 9.128 ns/op > Base64Decode.testBase64MIMEDecode 4 20000 avgt 10 128670.660 ? 7951.521 ns/op > Base64Decode.testBase64MIMEDecode 4 50000 avgt 10 317113.667 ? 161.758 ns/op > > # Kunpeng916, default > Base64Decode.testBase64Decode 4 1 avgt 5 48.455 ? 0.571 ns/op > Base64Decode.testBase64Decode 4 3 avgt 5 57.937 ? 0.505 ns/op > Base64Decode.testBase64Decode 4 7 avgt 5 73.823 ? 1.452 ns/op > Base64Decode.testBase64Decode 4 32 avgt 5 106.484 ? 1.243 ns/op > Base64Decode.testBase64Decode 4 64 avgt 5 141.004 ? 1.188 ns/op > Base64Decode.testBase64Decode 4 80 avgt 5 156.284 ? 0.572 ns/op > Base64Decode.testBase64Decode 4 96 avgt 5 174.137 ? 0.177 ns/op > Base64Decode.testBase64Decode 4 112 avgt 5 188.445 ? 0.572 ns/op > Base64Decode.testBase64Decode 4 512 avgt 5 610.847 ? 1.559 ns/op > Base64Decode.testBase64Decode 4 1000 avgt 5 1155.368 ? 0.813 ns/op > Base64Decode.testBase64Decode 4 20000 avgt 5 19751.477 ? 24.669 ns/op > Base64Decode.testBase64Decode 4 50000 avgt 5 50046.586 ? 523.155 ns/op > Base64Decode.testBase64MIMEDecode 4 1 avgt 10 64.130 ? 0.238 ns/op > Base64Decode.testBase64MIMEDecode 4 3 avgt 10 82.096 ? 0.205 ns/op > Base64Decode.testBase64MIMEDecode 4 7 avgt 10 118.849 ? 0.610 ns/op > Base64Decode.testBase64MIMEDecode 4 32 avgt 10 331.177 ? 4.732 ns/op > Base64Decode.testBase64MIMEDecode 4 64 avgt 10 549.117 ? 0.177 ns/op > Base64Decode.testBase64MIMEDecode 4 80 avgt 10 702.951 ? 4.572 ns/op > Base64Decode.testBase64MIMEDecode 4 96 avgt 10 799.566 ? 0.301 ns/op > Base64Decode.testBase64MIMEDecode 4 112 avgt 10 923.749 ? 0.389 ns/op > Base64Decode.testBase64MIMEDecode 4 512 avgt 10 4000.725 ? 
2.519 ns/op > Base64Decode.testBase64MIMEDecode 4 1000 avgt 10 7674.994 ? 9.281 ns/op > Base64Decode.testBase64MIMEDecode 4 20000 avgt 10 142059.001 ? 157.920 ns/op > Base64Decode.testBase64MIMEDecode 4 50000 avgt 10 355698.369 ? 216.542 ns/op Dong Bo has updated the pull request incrementally with one additional commit since the last revision: fix misleading annotations ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3228/files - new: https://git.openjdk.java.net/jdk/pull/3228/files/faa830cf..a342ad1e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3228&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3228&range=04-05 Stats: 4 lines in 1 file changed: 0 ins; 1 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/3228.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3228/head:pull/3228 PR: https://git.openjdk.java.net/jdk/pull/3228 From dongbo at openjdk.java.net Wed Apr 7 06:00:45 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Wed, 7 Apr 2021 06:00:45 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v5] In-Reply-To: References: <0aGgD88Mxj7nICTvKhNtkEYlYlP7TUMG082EgaEHU04=.ba1d4468-5c74-4a06-b8c6-d66c7b0c394d@github.com> Message-ID: On Tue, 6 Apr 2021 14:04:07 GMT, Andrew Haley wrote: >> Dong Bo has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 10 commits: >> >> - conflicts resolved >> - Merge branch 'master' of https://git.openjdk.java.net/jdk into aarch64.base64.decode >> - resovling conflicts >> - load data with one ldrw, add JMH tests for error inputs >> - Merge branch 'master' into aarch64.base64.decode >> - copyright >> - trivial fixes >> - Handling error in SIMD case with loops, combining two non-SIMD cases into one code blob, addressing other comments >> - Merge branch 'master' into aarch64.base64.decode >> - 8256245: AArch64: Implement Base64 decoding intrinsic > > src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5800: > >> 5798: __ br(Assembler::LT, Process4B); >> 5799: >> 5800: // The 1st character of the input can be illegal if the data is MIME encoded. > > Why is this sentence here? It is very misleading. This sentence was used to describe the worst case observed frequently so that readers can understand more easily why the pre-processing non-SIMD code is necessary. I apologize for being unclear and misleading. The annotations have been modified as suggested. 
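
For readers following the MIME discussion in this review thread, the 76-character line limit can be observed from plain Java. The sketch below is only an illustration using the standard `java.util.Base64` API; the class and method names (`MimeLineLengthDemo`, `longestMimeLine`) are made up here, and it says nothing about how the AArch64 stub itself is structured.

```java
import java.util.Base64;

public class MimeLineLengthDemo {
    // Encodes a zero-filled payload with the MIME encoder and returns the
    // length of the longest output line (lines are separated by CRLF).
    static int longestMimeLine(int payloadBytes) {
        byte[] payload = new byte[payloadBytes];
        String encoded = Base64.getMimeEncoder().encodeToString(payload);
        int longest = 0;
        for (String line : encoded.split("\r\n")) {
            longest = Math.max(longest, line.length());
        }
        return longest;
    }

    public static void main(String[] args) {
        // A long payload is wrapped, so no line exceeds 76 characters.
        System.out.println(longestMimeLine(1000));
        // A short payload fits on a single, shorter line.
        System.out.println(longestMimeLine(10));
    }
}
```

Because the MIME encoder wraps its output this way, a decoder working line by line on such data only ever sees short blocks, which is the situation the review comment describes.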
------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From xgong at openjdk.java.net Wed Apr 7 06:01:02 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Wed, 7 Apr 2021 06:01:02 GMT Subject: RFR: 8264352: AArch64: Optimize vector "not/andNot" for NEON and SVE Message-ID: <2RRMDFhq5Eo8gVfh1Mrn343KFLKPJh08oHx4TMGgbcw=.d8ea6de7-8efa-424b-abb2-8b7135e4675d@github.com> Since the vector bitwise `"andNot"` is implemented with `"v1.and(v2.xor(-1))"`, the generated codes with SVE look like: mov z16.b, #-1 eor z17.d, z20.d, z16.d and z18.d, z18.d, z17.d This could be improved with a single instruction: bic z16.d, z16.d, z18.d Similarly, the following optimization for NEON is also needed: not v21.16b, v21.16b and v21.16b, v21.16b, v18.16b ==> bic v21.16b, v18.16b, v21.16b This patch also adds the following optimization to vector` "not"` for SVE which has already been added for NEON: mov z16.b, #-1 eor z17.d, z20.d, z16.d ==> not z17.d, p7/m, z20.d The performance can improve about `16% ~ 36%` with NEON for the `"AND_NOT"` benchmark [1]. [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/ByteMaxVector.java#L343 Tested tier1 and jdk:tier3. 
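
The identity behind this optimization can be checked with scalar Java. This is a minimal sketch, not Vector API code and not a reproduction of what C2 emits: it only shows that `a & (b ^ -1)` and `a & ~b` compute the same value, which is why the xor/and pair can be replaced by a single bit-clear instruction. The class and method names are hypothetical.

```java
public class AndNotDemo {
    // The form described in the message: and(a, xor(b, -1)).
    static long andNotViaXor(long a, long b) {
        return a & (b ^ -1L);
    }

    // The direct form: a with every bit that is set in b cleared.
    static long andNotDirect(long a, long b) {
        return a & ~b;
    }

    public static void main(String[] args) {
        long[] samples = {0L, -1L, 0b1100L, 0b1010L, Long.MIN_VALUE, 0x0123456789ABCDEFL};
        for (long a : samples) {
            for (long b : samples) {
                if (andNotViaXor(a, b) != andNotDirect(a, b)) {
                    throw new AssertionError("mismatch for " + a + ", " + b);
                }
            }
        }
        System.out.println(Long.toBinaryString(andNotDirect(0b1100L, 0b1010L)));
    }
}
```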
------------- Commit messages: - 8264352: AArch64: Optimize vector "not/andNot" for NEON and SVE Changes: https://git.openjdk.java.net/jdk/pull/3370/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3370&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264352 Stats: 219 lines in 7 files changed: 185 ins; 0 del; 34 mod Patch: https://git.openjdk.java.net/jdk/pull/3370.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3370/head:pull/3370 PR: https://git.openjdk.java.net/jdk/pull/3370 From eliu at openjdk.java.net Wed Apr 7 07:37:51 2021 From: eliu at openjdk.java.net (Eric Liu) Date: Wed, 7 Apr 2021 07:37:51 GMT Subject: RFR: 8262916: Merge LShiftCntV and RShiftCntV into a single node Message-ID: The vector shift count was defined by two separate nodes(LShiftCntV and RShiftCntV), which would prevent them from being shared when the shift counts are the same. public static void test_shiftv(int sh) { for (int i = 0; i < N; i+=1) { a0[i] = a1[i] << sh; b0[i] = b1[i] >> sh; } } Given the example above, by merging the same shift counts into one node, they could be shared by shift nodes(RShiftV or LShiftV) like below: Before: 1184 LShiftCntV === _ 1189 [[ 1185 ... ]] 1190 RShiftCntV === _ 1189 [[ 1191 ... ]] 1185 LShiftVI === _ 1181 1184 [[ 1186 ]] 1191 RShiftVI === _ 1187 1190 [[ 1192 ]] After: 1190 ShiftCntV === _ 1189 [[ 1191 1204 ... ]] 1204 LShiftVI === _ 1211 1190 [[ 1203 ]] 1191 RShiftVI === _ 1187 1190 [[ 1192 ]] The final code could remove one redundant ?dup?(scalar->vector), with one register saved. Before: dup v16.16b, w12 dup v17.16b, w12 ... ldr q18, [x13, #16] sshl v18.4s, v18.4s, v16.4s add x18, x16, x12 ; iaload add x4, x15, x12 str q18, [x4, #16] ; iastore ldr q18, [x18, #16] add x12, x14, x12 neg v19.16b, v17.16b sshl v18.4s, v18.4s, v19.4s str q18, [x12, #16] ; iastore After: dup v16.16b, w11 ... 
ldr q17, [x13, #16] sshl v17.4s, v17.4s, v16.4s add x2, x22, x11 ; iaload add x4, x16, x11 str q17, [x4, #16] ; iastore ldr q17, [x2, #16] add x11, x21, x11 neg v18.16b, v16.16b sshl v17.4s, v17.4s, v18.4s str q17, [x11, #16] ; iastore ------------- Commit messages: - 8262916: Merge LShiftCntV and RShiftCntV into a single node Changes: https://git.openjdk.java.net/jdk/pull/3371/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3371&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8262916 Stats: 1494 lines in 14 files changed: 988 ins; 282 del; 224 mod Patch: https://git.openjdk.java.net/jdk/pull/3371.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3371/head:pull/3371 PR: https://git.openjdk.java.net/jdk/pull/3371 From rcastanedalo at openjdk.java.net Wed Apr 7 07:39:26 2021 From: rcastanedalo at openjdk.java.net (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Wed, 7 Apr 2021 07:39:26 GMT Subject: RFR: 8264795: IGV: Upgrade NetBeans platform In-Reply-To: References: Message-ID: On Tue, 6 Apr 2021 20:02:10 GMT, Vladimir Kozlov wrote: > Great work! > After this is pushed consider to update wiki page too: https://wiki.openjdk.java.net/display/HotSpot/IdealGraphVisualizer Thanks for reviewing, Vladimir! Yes, I plan to do that. ------------- PR: https://git.openjdk.java.net/jdk/pull/3361 From chagedorn at openjdk.java.net Wed Apr 7 08:00:32 2021 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Wed, 7 Apr 2021 08:00:32 GMT Subject: RFR: 8264795: IGV: Upgrade NetBeans platform In-Reply-To: References: Message-ID: On Tue, 6 Apr 2021 18:34:54 GMT, Roberto Casta?eda Lozano wrote: > This change upgrades the NetBeans platform on which IGV is based to its latest version ([12.3](https://netbeans.apache.org/download/nb123/index.html)) and switches IGV's build system from Ant to Maven. 
> The upgrade introduces support for a wide range of JDK versions (from 8 to 15, the latest version supported by NetBeans 12.3), and the switch from Ant to Maven makes the IGV build simpler, faster (first-time build is approximately 5x faster), and more stable (all dependencies are fetched directly from the Maven central repository).
>
> The change also fixes broken unit tests in the Data module and runs them by default when building.
>
> #### Testing
>
> Regression-tested the following use cases manually on all combinations of (Linux, Windows, macOS) and (JDK 8, JDK 11, JDK 15):
>
> - build
> - open graph file (small.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip))
> - import graphs via network (localhost)
> - expand groups in outline
> - open a graph
> - open a clone
> - zoom in and out
> - search a node
> - apply filters
> - extract a node
> - show all nodes
> - select nodes corresponding to a bytecode
> - view control flow
> - select nodes corresponding to a basic block
> - cluster nodes
> - show satellite view
> - compute the difference of two graphs
> - change node text
> - remove a group
> - save groups into a file
> - open a large graph file (large.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip))
> - open a large graph ("After Escape Analysis" in large.xml)
>
> Thanks to Vladimir Ivanov for helping with testing on macOS.

Very nice work!

src/utils/IdealGraphVisualizer/Data/src/main/java/com/sun/hotspot/igv/data/serialization/Parser.java line 154:

> 152: parent.addElement(group);
> 153: }
> 154: };

You could directly use `Runnable addToParent = () -> parent.addElement(group);` and also for the other ones below.

src/utils/IdealGraphVisualizer/README.md line 10:

> 8: elements.
> 9:
> 10: The tool is built on top of the NetBeans Platform, and requires Java 8 or later.

Maybe also mention here that the current NetBeans version supports up to JDK 15.
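
For readers unfamiliar with the rewrite suggested for `Parser.java`, here is a minimal sketch of replacing an anonymous `Runnable` with a lambda. The class `RunnableStyles`, the list `parent` and the string `group` are hypothetical stand-ins chosen for this illustration; they are not the actual IGV types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the parser code: both Runnables add an element to "parent".
final class RunnableStyles {
    static List<String> run() {
        List<String> parent = new ArrayList<>();
        String group = "group";

        // Style before the suggestion: an anonymous inner class.
        Runnable anonymous = new Runnable() {
            @Override
            public void run() {
                parent.add(group + "-anonymous");
            }
        };

        // Style after the suggestion: a lambda with the same effect.
        Runnable lambda = () -> parent.add(group + "-lambda");

        anonymous.run();
        lambda.run();
        return parent;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Both forms capture the effectively final locals `parent` and `group`, so the change is purely syntactic.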
-------------

Marked as reviewed by chagedorn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3361

From aph at openjdk.java.net Wed Apr 7 08:34:26 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Wed, 7 Apr 2021 08:34:26 GMT
Subject: RFR: 8264352: AArch64: Optimize vector "not/andNot" for NEON and SVE
In-Reply-To: <2RRMDFhq5Eo8gVfh1Mrn343KFLKPJh08oHx4TMGgbcw=.d8ea6de7-8efa-424b-abb2-8b7135e4675d@github.com>
References: <2RRMDFhq5Eo8gVfh1Mrn343KFLKPJh08oHx4TMGgbcw=.d8ea6de7-8efa-424b-abb2-8b7135e4675d@github.com>
Message-ID:

On Wed, 7 Apr 2021 05:53:46 GMT, Xiaohong Gong wrote:

> Since the vector bitwise `"andNot"` is implemented with `"v1.and(v2.xor(-1))"`, the generated code with SVE looks like:
>     mov z16.b, #-1
>     eor z17.d, z20.d, z16.d
>     and z18.d, z18.d, z17.d
> This could be improved with a single instruction:
>     bic z16.d, z16.d, z18.d
> Similarly, the following optimization for NEON is also needed:
>     not v21.16b, v21.16b
>     and v21.16b, v21.16b, v18.16b ==> bic v21.16b, v18.16b, v21.16b
> This patch also adds the following optimization to vector `"not"` for SVE which has already been added for NEON:
>     mov z16.b, #-1
>     eor z17.d, z20.d, z16.d ==> not z17.d, p7/m, z20.d
> The performance can improve about `16% ~ 36%` with NEON for the `"AND_NOT"` benchmark [1].
>
> [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/ByteMaxVector.java#L343
>
> Tested tier1 and jdk:tier3.

Looks OK. Is there any test code for this in mainline?
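
The pattern match in the quoted patch rests on the identity `a & (b ^ -1) == a & ~b`. A small scalar sketch, assuming plain `byte` lanes rather than the Vector API (the class and method names are made up for this illustration), checks it exhaustively:

```java
// Scalar illustration only; the real patch matches C2 vector nodes, not Java expressions.
final class AndNotIdentity {
    // Shape produced by v1.and(v2.xor(-1)): XOR with all ones, then AND.
    static byte andNotViaXor(byte a, byte b) {
        return (byte) (a & (b ^ -1));
    }

    // Shape a single "bit clear" style instruction computes: a AND (NOT b).
    static byte andNotDirect(byte a, byte b) {
        return (byte) (a & ~b);
    }

    // Compare both shapes for every pair of byte values.
    static boolean holdsForAllBytes() {
        for (int a = Byte.MIN_VALUE; a <= Byte.MAX_VALUE; a++) {
            for (int b = Byte.MIN_VALUE; b <= Byte.MAX_VALUE; b++) {
                if (andNotViaXor((byte) a, (byte) b) != andNotDirect((byte) a, (byte) b)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(holdsForAllBytes());
    }
}
```

Because XOR with all ones flips every bit, the two shapes agree lane by lane, which is what allows the three-instruction sequence to collapse into one.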
-------------

PR: https://git.openjdk.java.net/jdk/pull/3370

From neliasso at openjdk.java.net Wed Apr 7 08:43:29 2021
From: neliasso at openjdk.java.net (Nils Eliasson)
Date: Wed, 7 Apr 2021 08:43:29 GMT
Subject: RFR: 8264795: IGV: Upgrade NetBeans platform
In-Reply-To:
References:
Message-ID:

On Tue, 6 Apr 2021 18:34:54 GMT, Roberto Castañeda Lozano wrote:

> This change upgrades the NetBeans platform on which IGV is based to its latest version ([12.3](https://netbeans.apache.org/download/nb123/index.html)) and switches IGV's build system from Ant to Maven. The upgrade introduces support for a wide range of JDK versions (from 8 to 15, the latest version supported by NetBeans 12.3), and the switch from Ant to Maven makes the IGV build simpler, faster (first-time build is approximately 5x faster), and more stable (all dependencies are fetched directly from the Maven central repository).
>
> The change also fixes broken unit tests in the Data module and runs them by default when building.
>
> #### Testing
>
> Regression-tested the following use cases manually on all combinations of (Linux, Windows, macOS) and (JDK 8, JDK 11, JDK 15):
>
> - build
> - open graph file (small.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip))
> - import graphs via network (localhost)
> - expand groups in outline
> - open a graph
> - open a clone
> - zoom in and out
> - search a node
> - apply filters
> - extract a node
> - show all nodes
> - select nodes corresponding to a bytecode
> - view control flow
> - select nodes corresponding to a basic block
> - cluster nodes
> - show satellite view
> - compute the difference of two graphs
> - change node text
> - remove a group
> - save groups into a file
> - open a large graph file (large.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip))
> - open a large graph ("After Escape Analysis" in large.xml)
>
> Thanks to Vladimir Ivanov for helping with testing on macOS.

I have tested IGV on a High-DPI screen (4K). The screens are attached to the bug-report.

Compared to baseline the new version improved DPI scaling on both JDK8 and JDK15. Some elements still don't scale.

Remaining issues identified:
1) The default zoom level is too small
2) The filter view has broken scaling - the font scales but not the line height.
3) The buttons above the graph view and file list don't scale - they are still tiny.
4) The splash screen doesn't scale

Still this is a big improvement and the remaining issues can be solved in separate PRs.

Looks good - Reviewed!

-------------

Marked as reviewed by neliasso (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3361

From xgong at openjdk.java.net Wed Apr 7 08:48:32 2021
From: xgong at openjdk.java.net (Xiaohong Gong)
Date: Wed, 7 Apr 2021 08:48:32 GMT
Subject: RFR: 8264352: AArch64: Optimize vector "not/andNot" for NEON and SVE
In-Reply-To:
References: <2RRMDFhq5Eo8gVfh1Mrn343KFLKPJh08oHx4TMGgbcw=.d8ea6de7-8efa-424b-abb2-8b7135e4675d@github.com>
Message-ID:

On Wed, 7 Apr 2021 08:31:19 GMT, Andrew Haley wrote:

> Looks OK. Is there any test code for this in mainline?

Hi @theRealAph , thanks for looking at this PR. Yes, there are Vector API jtreg tests that cover the opcodes `NOT/AND_NOT`. Please see the tests for byte vector:
https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/incubator/vector/ByteMaxVectorTests.java#L1708
and
https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/incubator/vector/ByteMaxVectorTests.java#L4602

-------------

PR: https://git.openjdk.java.net/jdk/pull/3370

From aph at openjdk.java.net Wed Apr 7 09:07:22 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Wed, 7 Apr 2021 09:07:22 GMT
Subject: RFR: 8264352: AArch64: Optimize vector "not/andNot" for NEON and SVE
In-Reply-To: <2RRMDFhq5Eo8gVfh1Mrn343KFLKPJh08oHx4TMGgbcw=.d8ea6de7-8efa-424b-abb2-8b7135e4675d@github.com>
References: <2RRMDFhq5Eo8gVfh1Mrn343KFLKPJh08oHx4TMGgbcw=.d8ea6de7-8efa-424b-abb2-8b7135e4675d@github.com>
Message-ID:

On Wed, 7 Apr 2021 05:53:46 GMT, Xiaohong Gong wrote:

> Since the vector bitwise `"andNot"` is implemented with `"v1.and(v2.xor(-1))"`, the generated code with SVE looks like:
>     mov z16.b, #-1
>     eor z17.d, z20.d, z16.d
>     and z18.d, z18.d, z17.d
> This could be improved with a single instruction:
>     bic z16.d, z16.d, z18.d
> Similarly, the following optimization for NEON is also needed:
>     not v21.16b, v21.16b
>     and v21.16b, v21.16b, v18.16b ==> bic v21.16b, v18.16b, v21.16b
> This patch also adds the following optimization to vector `"not"` for SVE which has already been added for NEON:
>     mov z16.b, #-1
>     eor z17.d, z20.d, z16.d ==> not z17.d, p7/m, z20.d
> The performance can improve about `16% ~ 36%` with NEON for the `"AND_NOT"` benchmark [1].
>
> [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/ByteMaxVector.java#L343
>
> Tested tier1 and jdk:tier3.

Marked as reviewed by aph (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/3370

From aph at openjdk.java.net Wed Apr 7 09:56:20 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Wed, 7 Apr 2021 09:56:20 GMT
Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v6]
In-Reply-To:
References:
Message-ID:

On Wed, 7 Apr 2021 05:51:02 GMT, Dong Bo wrote:

>> In JDK-8248188, IntrinsicCandidate and API were added for Base64 decoding.
>> Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3, a basic idea can be found at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords.
>>
>> Patch passed jtreg tier1-3 tests with linux-aarch64-server-fastdebug build.
>> Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` were run specifically to check the correctness of the implementation.
>>
>> There can be illegal characters at the start of the input if the data is MIME encoded.
>> There would be no benefit in using SIMD for this case, so the stub uses non-SIMD instructions for MIME encoded data now.
>>
>> A JMH micro, Base64Decode.java, is added for performance test.
>> With different input length (upper-bounded by parameter `maxNumBytes` in the JMH micro),
>> we witness ~2.5x improvements with long inputs and no regression with short inputs for raw base64 decoding, minor improvements (~10.95%) for MIME on Kunpeng916.
>>
>> The Base64Decode.java JMH micro-benchmark results:
>>
>> Benchmark                          (lineSize)  (maxNumBytes)  Mode  Cnt       Score      Error  Units
>>
>> # Kunpeng916, intrinsic
>> Base64Decode.testBase64Decode               4              1  avgt    5      48.614 ±    0.609  ns/op
>> Base64Decode.testBase64Decode               4              3  avgt    5      58.199 ±    1.650  ns/op
>> Base64Decode.testBase64Decode               4              7  avgt    5      69.400 ±    0.931  ns/op
>> Base64Decode.testBase64Decode               4             32  avgt    5      96.818 ±    1.687  ns/op
>> Base64Decode.testBase64Decode               4             64  avgt    5     122.856 ±    9.217  ns/op
>> Base64Decode.testBase64Decode               4             80  avgt    5     130.935 ±    1.667  ns/op
>> Base64Decode.testBase64Decode               4             96  avgt    5     143.627 ±    1.751  ns/op
>> Base64Decode.testBase64Decode               4            112  avgt    5     152.311 ±    1.178  ns/op
>> Base64Decode.testBase64Decode               4            512  avgt    5     342.631 ±    0.584  ns/op
>> Base64Decode.testBase64Decode               4           1000  avgt    5     573.635 ±    1.050  ns/op
>> Base64Decode.testBase64Decode               4          20000  avgt    5    9534.136 ±   45.172  ns/op
>> Base64Decode.testBase64Decode               4          50000  avgt    5   22718.726 ±  192.070  ns/op
>> Base64Decode.testBase64MIMEDecode           4              1  avgt   10      63.558 ±    0.336  ns/op
>> Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.504 ±    0.848  ns/op
>> Base64Decode.testBase64MIMEDecode           4              7  avgt   10     120.591 ±    0.608  ns/op
>> Base64Decode.testBase64MIMEDecode           4             32  avgt   10     324.314 ±    6.236  ns/op
>> Base64Decode.testBase64MIMEDecode           4             64  avgt   10     532.678 ±    4.670  ns/op
>> Base64Decode.testBase64MIMEDecode           4             80  avgt   10     678.126 ±    4.324  ns/op
>> Base64Decode.testBase64MIMEDecode           4             96  avgt   10     771.603 ±    6.393  ns/op
>> Base64Decode.testBase64MIMEDecode           4            112  avgt   10     889.608 ±    0.759  ns/op
>> Base64Decode.testBase64MIMEDecode           4            512  avgt   10    3663.557 ±    3.422  ns/op
>> Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7017.784 ±    9.128  ns/op
>> Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  128670.660 ± 7951.521  ns/op
>> Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  317113.667 ±  161.758  ns/op
>>
>> # Kunpeng916, default
>> Base64Decode.testBase64Decode               4              1  avgt    5      48.455 ±    0.571  ns/op
>> Base64Decode.testBase64Decode               4              3  avgt    5      57.937 ±    0.505  ns/op
>> Base64Decode.testBase64Decode               4              7  avgt    5      73.823 ±    1.452  ns/op
>> Base64Decode.testBase64Decode               4             32  avgt    5     106.484 ±    1.243  ns/op
>> Base64Decode.testBase64Decode               4             64  avgt    5     141.004 ±    1.188  ns/op
>> Base64Decode.testBase64Decode               4             80  avgt    5     156.284 ±    0.572  ns/op
>> Base64Decode.testBase64Decode               4             96  avgt    5     174.137 ±    0.177  ns/op
>> Base64Decode.testBase64Decode               4            112  avgt    5     188.445 ±    0.572  ns/op
>> Base64Decode.testBase64Decode               4            512  avgt    5     610.847 ±    1.559  ns/op
>> Base64Decode.testBase64Decode               4           1000  avgt    5    1155.368 ±    0.813  ns/op
>> Base64Decode.testBase64Decode               4          20000  avgt    5   19751.477 ±   24.669  ns/op
>> Base64Decode.testBase64Decode               4          50000  avgt    5   50046.586 ±  523.155  ns/op
>> Base64Decode.testBase64MIMEDecode           4              1  avgt   10      64.130 ±    0.238  ns/op
>> Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.096 ±    0.205  ns/op
>> Base64Decode.testBase64MIMEDecode           4              7  avgt   10     118.849 ±    0.610  ns/op
>> Base64Decode.testBase64MIMEDecode           4             32  avgt   10     331.177 ±    4.732  ns/op
>> Base64Decode.testBase64MIMEDecode           4             64  avgt   10     549.117 ±    0.177  ns/op
>> Base64Decode.testBase64MIMEDecode           4             80  avgt   10     702.951 ±    4.572  ns/op
>> Base64Decode.testBase64MIMEDecode           4             96  avgt   10     799.566 ±    0.301  ns/op
>> Base64Decode.testBase64MIMEDecode           4            112  avgt   10     923.749 ±    0.389  ns/op
>> Base64Decode.testBase64MIMEDecode           4            512  avgt   10    4000.725 ±    2.519  ns/op
>> Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7674.994 ±    9.281  ns/op
>> Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  142059.001 ±  157.920  ns/op
>> Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  355698.369 ±  216.542  ns/op
>
> Dong Bo has updated the pull request incrementally with one additional commit since the last revision:
>
>   fix misleading annotations

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5829:

> 5827: __ strb(r14, __ post(dst, 1));
> 5828: __ strb(r15, __ post(dst, 1));
> 5829: __ strb(r13, __ post(dst, 1));

I think this sequence should be 4 BFMs, STRW, BFM, STRW. That's the best we can do, I think.
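
As a rough Java-level illustration of the "assemble in a register, then store" idea behind this comment (it is not the stub's instruction sequence, and the class and method names are invented for this sketch), four 6-bit Base64 values can be packed into one 24-bit word before any byte is written:

```java
import java.util.Arrays;
import java.util.Base64;

// Sketch only: packs one 4-character Base64 group into a single int, then splits it into 3 bytes.
final class SextetPacking {
    // Map one character of the standard Base64 alphabet to its 6-bit value (no padding handling).
    static int sextet(char c) {
        if (c >= 'A' && c <= 'Z') return c - 'A';
        if (c >= 'a' && c <= 'z') return c - 'a' + 26;
        if (c >= '0' && c <= '9') return c - '0' + 52;
        if (c == '+') return 62;
        if (c == '/') return 63;
        throw new IllegalArgumentException("not a Base64 character: " + c);
    }

    // Combine four sextets into the low 24 bits of one word, first character most significant.
    static int pack(String quad) {
        if (quad.length() != 4) {
            throw new IllegalArgumentException("expected exactly 4 characters");
        }
        int word = 0;
        for (int i = 0; i < 4; i++) {
            word = (word << 6) | sextet(quad.charAt(i));
        }
        return word;
    }

    // Split the 24-bit word into its three decoded bytes, most significant first.
    static byte[] unpack(int word) {
        return new byte[] { (byte) (word >>> 16), (byte) (word >>> 8), (byte) word };
    }

    public static void main(String[] args) {
        byte[] viaWord = unpack(pack("TWFu"));
        byte[] viaLibrary = Base64.getDecoder().decode("TWFu");
        System.out.println(Arrays.equals(viaWord, viaLibrary));
    }
}
```

Once the bytes sit in one word, a code generator is free to write them with fewer, wider stores, which is the memory-traffic concern raised in the review.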
-------------

PR: https://git.openjdk.java.net/jdk/pull/3228

From aph at openjdk.java.net Wed Apr 7 09:56:21 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Wed, 7 Apr 2021 09:56:21 GMT
Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v6]
In-Reply-To:
References:
Message-ID:

On Wed, 7 Apr 2021 09:50:45 GMT, Andrew Haley wrote:

>> Dong Bo has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   fix misleading annotations
>
> src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5829:
>
>> 5827: __ strb(r14, __ post(dst, 1));
>> 5828: __ strb(r15, __ post(dst, 1));
>> 5829: __ strb(r13, __ post(dst, 1));
>
> I think this sequence should be 4 BFMs, STRW, BFM, STRW. That's the best we can do, I think.

Sorry, that's not quite right, but you get the idea: let's not generate unnecessary memory traffic.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3228

From vlivanov at openjdk.java.net Wed Apr 7 10:51:47 2021
From: vlivanov at openjdk.java.net (Vladimir Ivanov)
Date: Wed, 7 Apr 2021 10:51:47 GMT
Subject: RFR: 8261137: Optimization of Box nodes in uncommon_trap [v13]
In-Reply-To:
References: <8Riu9VCQLM7_vDp5DOMtLZK3yMLQzAkwlIKo4ab0F7Q=.662dbffe-c320-47ea-bc67-508e2c382b12@github.com>
Message-ID:

On Tue, 6 Apr 2021 16:08:30 GMT, Vladimir Kozlov wrote:

>> Looks good to me too.
>
> I will wait results of testing Vladimir I. is running.

Test results (hs-tier1 - hs-tier4) are clean.
-------------

PR: https://git.openjdk.java.net/jdk/pull/2401

From whuang at openjdk.java.net Wed Apr 7 10:51:52 2021
From: whuang at openjdk.java.net (Wang Huang)
Date: Wed, 7 Apr 2021 10:51:52 GMT
Subject: Integrated: 8261137: Optimization of Box nodes in uncommon_trap
In-Reply-To: <8Riu9VCQLM7_vDp5DOMtLZK3yMLQzAkwlIKo4ab0F7Q=.662dbffe-c320-47ea-bc67-508e2c382b12@github.com>
References: <8Riu9VCQLM7_vDp5DOMtLZK3yMLQzAkwlIKo4ab0F7Q=.662dbffe-c320-47ea-bc67-508e2c382b12@github.com>
Message-ID:

On Thu, 4 Feb 2021 08:43:35 GMT, Wang Huang wrote:

> JDK-8075052 has removed useless autobox. However, in some cases, the box is still saved. For instance:
>     @Benchmark
>     public void testMethod(Blackhole bh) {
>       int sum = 0;
>       for (int i = 0; i < data.length; i++) {
>         Integer ii = Integer.valueOf(data[i]);
>         if (i < data.length) {
>           sum += ii.intValue();
>         }
>       }
>       bh.consume(sum);
>     }
> Although the variable ii is only used at ii.intValue(), it cannot be eliminated as a result of being used by a hidden uncommon_trap.
> The uncommon_trap is generated by the optimized "if", because its condition is always true.
>
> We can postpone box in uncommon_trap in this situation. We treat box as a scalarized object by adding a SafePointScalarObjectNode in the uncommon_trap node,
> and deleting the use of box:
>
> There is no additional fail/error(s) of jtreg after this patch.
>
> I adjusted my code and added a new benchmark
>
>     public class MyBenchmark {
>
>       static int[] data = new int[10000];
>
>       static {
>         for(int i = 0; i < data.length; ++i) {
>           data[i] = i * 1337 % 7331;
>         }
>       }
>
>       @Benchmark
>       public void testMethod(Blackhole bh) {
>         int sum = 0;
>         for (int i = 0; i < data.length; i++) {
>           Integer ii = Integer.valueOf(data[i]);
>           black();
>           if (i < 100000) {
>             sum += ii.intValue();
>           }
>         }
>         bh.consume(sum);
>       }
>
>       public void black(){}
>     }
>
>
> aarch64:
> base line:
> Benchmark                   Mode  Samples   Score  Score error  Units
> o.s.MyBenchmark.testMethod  avgt       30  88.513        1.111  us/op
>
> opt:
> Benchmark                   Mode  Samples   Score  Score error  Units
> o.s.MyBenchmark.testMethod  avgt       30  52.776        0.096  us/op
>
> x86:
> base line:
> Benchmark                   Mode  Samples   Score  Score error  Units
> o.s.MyBenchmark.testMethod  avgt       30  81.066        3.156  us/op
>
> opt:
> Benchmark                   Mode  Samples   Score  Score error  Units
> o.s.MyBenchmark.testMethod  avgt       30  55.596        0.775  us/op

This pull request has now been integrated.

Changeset: eab84554
Author:    Wang Huang
Committer: Vladimir Ivanov
URL:       https://git.openjdk.java.net/jdk/commit/eab84554
Stats:     308 lines in 10 files changed: 284 ins; 2 del; 22 mod

8261137: Optimization of Box nodes in uncommon_trap

Co-authored-by: Wu Yan
Co-authored-by: Ai Jiaming
Reviewed-by: kvn, vlivanov, thartmann

-------------

PR: https://git.openjdk.java.net/jdk/pull/2401

From thartmann at openjdk.java.net Wed Apr 7 11:33:37 2021
From: thartmann at openjdk.java.net (Tobias Hartmann)
Date: Wed, 7 Apr 2021 11:33:37 GMT
Subject: RFR: 8264649: runtime/InternalApi/ThreadCpuTimesDeadlock.java crash in fastdebug C2 with -XX:-UseTLAB
In-Reply-To:
References:
Message-ID: <6BN6FtOuI_DuZS8Zpuy2k5xTSu10uXZjrfxMEt6e978=.b5e1ee6b-7dba-413c-9c04-bbf66098b588@github.com>

On Sat, 3 Apr 2021 00:53:54 GMT, Hui Shi wrote:

> Please help review this fix. Detailed analysis in https://bugs.openjdk.java.net/browse/JDK-8264649
>
> Tier1/2 release /fastdebug passed

Changes requested by thartmann (Reviewer).

src/hotspot/share/opto/phaseX.cpp line 1481:

> 1479: // Smash all inputs to 'old', isolating him completely
> 1480: Node *temp = new Node(1);
> 1481: temp->init_req(0,nn); // Add a use to nn to prevent him from dying

Just wondering, why do we even need this? Without that code, `remove_dead_node(old)` would kill `nn` as well if it becomes dead.

test/hotspot/jtreg/runtime/InternalApi/ThreadCpuTimesDeadlock.java line 36:

> 34: * @test
> 35: * @bug 8264649
> 36: * @summary OSR compiled metthod crash when UseTLAB is off

Typo `metthod`.
-------------

PR: https://git.openjdk.java.net/jdk/pull/3336

From rcastanedalo at openjdk.java.net Wed Apr 7 11:59:28 2021
From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano)
Date: Wed, 7 Apr 2021 11:59:28 GMT
Subject: RFR: 8264795: IGV: Upgrade NetBeans platform [v2]
In-Reply-To:
References:
Message-ID:

> This change upgrades the NetBeans platform on which IGV is based to its latest version ([12.3](https://netbeans.apache.org/download/nb123/index.html)) and switches IGV's build system from Ant to Maven. The upgrade introduces support for a wide range of JDK versions (from 8 to 15, the latest version supported by NetBeans 12.3), and the switch from Ant to Maven makes the IGV build simpler, faster (first-time build is approximately 5x faster), and more stable (all dependencies are fetched directly from the Maven central repository).
>
> The change also fixes broken unit tests in the Data module and runs them by default when building.
>
> #### Testing
>
> Regression-tested the following use cases manually on all combinations of (Linux, Windows, macOS) and (JDK 8, JDK 11, JDK 15):
>
> - build
> - open graph file (small.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip))
> - import graphs via network (localhost)
> - expand groups in outline
> - open a graph
> - open a clone
> - zoom in and out
> - search a node
> - apply filters
> - extract a node
> - show all nodes
> - select nodes corresponding to a bytecode
> - view control flow
> - select nodes corresponding to a basic block
> - cluster nodes
> - show satellite view
> - compute the difference of two graphs
> - change node text
> - remove a group
> - save groups into a file
> - open a large graph file (large.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip))
> - open a large graph ("After Escape Analysis" in large.xml)
>
> Thanks to Vladimir Ivanov for helping with testing on macOS.

Roberto Castañeda Lozano has updated the pull request incrementally with three additional commits since the last revision:

 - Document how to build and run on a specific JDK
 - Use lambdas to define runnables
 - Document latest JDK version supported

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3361/files
  - new: https://git.openjdk.java.net/jdk/pull/3361/files/11fb43fa..e84d171e

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3361&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3361&range=00-01

  Stats: 24 lines in 2 files changed: 5 ins; 15 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3361.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3361/head:pull/3361

PR: https://git.openjdk.java.net/jdk/pull/3361

From rcastanedalo at openjdk.java.net Wed Apr 7 11:59:30 2021
From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano)
Date: Wed, 7 Apr 2021 11:59:30 GMT
Subject: RFR: 8264795: IGV: Upgrade NetBeans platform [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 7 Apr 2021 07:55:58 GMT, Christian Hagedorn wrote:

>> Roberto Castañeda Lozano has updated the pull request incrementally with three additional commits since the last revision:
>>
>>  - Document how to build and run on a specific JDK
>>  - Use lambdas to define runnables
>>  - Document latest JDK version supported
>
> src/utils/IdealGraphVisualizer/Data/src/main/java/com/sun/hotspot/igv/data/serialization/Parser.java line 154:
>
>> 152: parent.addElement(group);
>> 153: }
>> 154: };
>
> You could directly use `Runnable addToParent = () -> parent.addElement(group);` and also for the other ones below.

Good suggestion, thanks, done!

> src/utils/IdealGraphVisualizer/README.md line 10:
>
>> 8: elements.
>> 9:
>> 10: The tool is built on top of the NetBeans Platform, and requires Java 8 or later.
>
> Maybe also mention here that the current NetBeans version supports up to JDK 15.

Thanks, done!
-------------

PR: https://git.openjdk.java.net/jdk/pull/3361

From rcastanedalo at openjdk.java.net Wed Apr 7 12:00:31 2021
From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano)
Date: Wed, 7 Apr 2021 12:00:31 GMT
Subject: RFR: 8264795: IGV: Upgrade NetBeans platform [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 7 Apr 2021 07:57:19 GMT, Christian Hagedorn wrote:

> Very nice work!

Thanks for reviewing, Christian!

-------------

PR: https://git.openjdk.java.net/jdk/pull/3361

From rcastanedalo at openjdk.java.net Wed Apr 7 12:00:31 2021
From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano)
Date: Wed, 7 Apr 2021 12:00:31 GMT
Subject: RFR: 8264795: IGV: Upgrade NetBeans platform [v2]
In-Reply-To:
References:
Message-ID: <_767rfKcRydco8vGfpcM6NNnzfpmFGrV8bQNzXgiR-E=.376c2613-17f3-4b0c-ace3-2d8572f392ee@github.com>

On Wed, 7 Apr 2021 08:40:02 GMT, Nils Eliasson wrote:

> I have tested IGV on a High-DPI screen (4K). The screens are attached to the bug-report.
>
> Compared to baseline the new version improved DPI scaling on both JDK8 and JDK15. Some elements still don't scale.
>
> Remaining issues identified:
>
> 1. The default zoom level is too small
>
> 2. The filter view has broken scaling - the font scales but not the line height.
>
> 3. The buttons above the graph view and file list don't scale - they are still tiny.
>
> 4. The splash screen doesn't scale
>
>
> Still this is a big improvement and the remaining issues can be solved in separate PRs.
>
> Looks good - Reviewed!

Thanks for reviewing, Nils! I will create a separate RFE for DPI scaling issues. I also added a clarification to the README file (https://github.com/openjdk/jdk/pull/3361/commits/e84d171e12a238cf0a1905a4bc8107ac6141a540) based on our offline discussion.
-------------

PR: https://git.openjdk.java.net/jdk/pull/3361

From hshi at openjdk.java.net Wed Apr 7 12:09:09 2021
From: hshi at openjdk.java.net (Hui Shi)
Date: Wed, 7 Apr 2021 12:09:09 GMT
Subject: RFR: 8264649: runtime/InternalApi/ThreadCpuTimesDeadlock.java crash in fastdebug C2 with -XX:-UseTLAB [v2]
In-Reply-To: <6BN6FtOuI_DuZS8Zpuy2k5xTSu10uXZjrfxMEt6e978=.b5e1ee6b-7dba-413c-9c04-bbf66098b588@github.com>
References: <6BN6FtOuI_DuZS8Zpuy2k5xTSu10uXZjrfxMEt6e978=.b5e1ee6b-7dba-413c-9c04-bbf66098b588@github.com>
Message-ID: <03pdWufgmVqBPCnhdEvHJnRHkblBlxSiAUsG4Zjpg30=.a17a5efa-8f65-4933-994a-f176f9c45cfb@github.com>

On Wed, 7 Apr 2021 11:29:06 GMT, Tobias Hartmann wrote:

>> Hui Shi has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   fix typo in test description
>
> src/hotspot/share/opto/phaseX.cpp line 1481:
>
>> 1479: // Smash all inputs to 'old', isolating him completely
>> 1480: Node *temp = new Node(1);
>> 1481: temp->init_req(0,nn); // Add a use to nn to prevent him from dying
>
> Just wondering, why do we even need this? Without that code, `remove_dead_node(old)` would kill `nn` as well if it becomes dead.

Thanks for your review!
Checking the code history, this code is quite old. Judging from the surrounding comments, it doesn't want nn removed directly in PhaseIterGVN::subsume_node and leaves that optimization to the next GVN iteration.
In my understanding, it might save callers from having to insert code checking whether "nn" has been removed or is dead at every subsume_node/replace_node call site, which simplifies the implementation.

8153779a hotspot/src/share/vm/opto/phaseX.cpp (J. Duke 2007-12-01 00:00:00 +0000 1479) // Smash all inputs to 'old', isolating him completely
2a0815a5 hotspot/src/share/vm/opto/phaseX.cpp (Tobias Hartmann 2014-06-02 08:07:29 +0200 1480) Node *temp = new Node(1);
8153779a hotspot/src/share/vm/opto/phaseX.cpp (J. Duke 2007-12-01 00:00:00 +0000 1481) temp->init_req(0,nn); // Add a use to nn to prevent him from dying
8153779a hotspot/src/share/vm/opto/phaseX.cpp (J. Duke 2007-12-01 00:00:00 +0000 1482) remove_dead_node( old );
8153779a hotspot/src/share/vm/opto/phaseX.cpp (J. Duke 2007-12-01 00:00:00 +0000 1483) temp->del_req(0); // Yank bogus edge

> test/hotspot/jtreg/runtime/InternalApi/ThreadCpuTimesDeadlock.java line 36:
>
>> 34: * @test
>> 35: * @bug 8264649
>> 36: * @summary OSR compiled metthod crash when UseTLAB is off
>
> Typo `metthod`.

Fixed.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3336

From hshi at openjdk.java.net Wed Apr 7 12:09:05 2021
From: hshi at openjdk.java.net (Hui Shi)
Date: Wed, 7 Apr 2021 12:09:05 GMT
Subject: RFR: 8264649: runtime/InternalApi/ThreadCpuTimesDeadlock.java crash in fastdebug C2 with -XX:-UseTLAB [v2]
In-Reply-To:
References:
Message-ID:

> Please help review this fix. Detailed analysis in https://bugs.openjdk.java.net/browse/JDK-8264649
>
> Tier1/2 release /fastdebug passed

Hui Shi has updated the pull request incrementally with one additional commit since the last revision:

  fix typo in test description

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3336/files
  - new: https://git.openjdk.java.net/jdk/pull/3336/files/08292fe1..2f6b68df

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3336&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3336&range=00-01

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3336.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3336/head:pull/3336

PR: https://git.openjdk.java.net/jdk/pull/3336

From chagedorn at openjdk.java.net Wed Apr 7 12:10:59 2021
From: chagedorn at openjdk.java.net (Christian Hagedorn)
Date: Wed, 7 Apr 2021 12:10:59 GMT
Subject: RFR: 8264795: IGV: Upgrade NetBeans platform [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 7 Apr 2021 11:59:28 GMT, Roberto Castañeda Lozano wrote:

>> This change upgrades the NetBeans platform on which IGV is based to its latest version ([12.3](https://netbeans.apache.org/download/nb123/index.html)) and switches IGV's build system from Ant to Maven. The upgrade introduces support for a wide range of JDK versions (from 8 to 15, the latest version supported by NetBeans 12.3), and the switch from Ant to Maven makes the IGV build simpler, faster (first-time build is approximately 5x faster), and more stable (all dependencies are fetched directly from the Maven central repository).
>>
>> The change also fixes broken unit tests in the Data module and runs them by default when building.
>>
>> #### Testing
>>
>> Regression-tested the following use cases manually on all combinations of (Linux, Windows, macOS) and (JDK 8, JDK 11, JDK 15):
>>
>> - build
>> - open graph file (small.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip))
>> - import graphs via network (localhost)
>> - expand groups in outline
>> - open a graph
>> - open a clone
>> - zoom in and out
>> - search a node
>> - apply filters
>> - extract a node
>> - show all nodes
>> - select nodes corresponding to a bytecode
>> - view control flow
>> - select nodes corresponding to a basic block
>> - cluster nodes
>> - show satellite view
>> - compute the difference of two graphs
>> - change node text
>> - remove a group
>> - save groups into a file
>> - open a large graph file (large.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip))
>> - open a large graph ("After Escape Analysis" in large.xml)
>>
>> Thanks to Vladimir Ivanov for helping with testing on macOS.
>
> Roberto Castañeda Lozano has updated the pull request incrementally with three additional commits since the last revision:
>
>  - Document how to build and run on a specific JDK
>  - Use lambdas to define runnables
>  - Document latest JDK version supported

Marked as reviewed by chagedorn (Reviewer).
-------------

PR: https://git.openjdk.java.net/jdk/pull/3361

From kvn at openjdk.java.net Wed Apr 7 16:05:01 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Wed, 7 Apr 2021 16:05:01 GMT
Subject: RFR: 8264063: Outer Safepoint poll load should not reference the head of inner strip mined loop. [v2]
In-Reply-To:
References:
Message-ID:

> When loop is "strip mined" polling address load ([parse1.cpp#L2280](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/parse1.cpp#L2280)) should be cloned together with safepoint node and pinned outside inner loop. Otherwise we have issues like [8263352](https://bugs.openjdk.java.net/browse/JDK-8263352)
>
> I also remove leftover (unused needs_polling_address_input() method) from [8220051](https://bugs.openjdk.java.net/browse/JDK-8220051) changes.
>
> Tested hs-tier1-4

Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision:

  Removed unneeded anymore code in match_fill_loop()

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3365/files
  - new: https://git.openjdk.java.net/jdk/pull/3365/files/1f645337..58a4ce1d

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3365&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3365&range=00-01

  Stats: 10 lines in 1 file changed: 0 ins; 10 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3365.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3365/head:pull/3365

PR: https://git.openjdk.java.net/jdk/pull/3365

From kvn at openjdk.java.net Wed Apr 7 16:05:02 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Wed, 7 Apr 2021 16:05:02 GMT
Subject: RFR: 8264063: Outer Safepoint poll load should not reference the head of inner strip mined loop.
In-Reply-To: <7YL-VL_1xzDmZH5jqlr5ZKnJK2tZBFtD_lyM5QUSfTo=.b65bfa0c-37b7-4885-bce7-7517ee6fa0fc@github.com> References: <7YL-VL_1xzDmZH5jqlr5ZKnJK2tZBFtD_lyM5QUSfTo=.b65bfa0c-37b7-4885-bce7-7517ee6fa0fc@github.com> Message-ID: <4sfEfc52fns5swIw2OozuKaqB8Cy82fS4vQ-5890904=.0a2222ff-707a-4741-95f8-819d4db33f86@github.com> On Wed, 7 Apr 2021 04:53:44 GMT, Vladimir Kozlov wrote: >> Should we also remove below part of code in `loopTransform.cpp` in this patch? >> >> 3704 if (n->is_CountedLoop() && n->as_CountedLoop()->is_strip_mined()) { >> 3705 // In strip-mined counted loops, the CountedLoopNode may be >> 3706 // used by the address polling node of the outer safepoint. >> 3707 // Skip this use because it's safe. >> 3708 Node* sfpt = n->as_CountedLoop()->outer_safepoint(); >> 3709 Node* polladr = sfpt->in(TypeFunc::Parms+0); >> 3710 if (use == polladr) { >> 3711 continue; >> 3712 } >> 3713 } > >> Should we also remove below part of code in `loopTransform.cpp` in this patch? >> >> ``` >> 3704 if (n->is_CountedLoop() && n->as_CountedLoop()->is_strip_mined()) { >> 3705 // In strip-mined counted loops, the CountedLoopNode may be >> 3706 // used by the address polling node of the outer safepoint. >> 3707 // Skip this use because it's safe. >> 3708 Node* sfpt = n->as_CountedLoop()->outer_safepoint(); >> 3709 Node* polladr = sfpt->in(TypeFunc::Parms+0); >> 3710 if (use == polladr) { >> 3711 continue; >> 3712 } >> 3713 } >> ``` > > I thought that it does no harm to keep this code, but on the other hand it will not be executed anymore. > I will remove it and run ArrayFill.java to make sure the optimization works with strip mined loops. I removed the code that was pointed out and verified that the arrayfill optimization still works with strip mined loops.
------------- PR: https://git.openjdk.java.net/jdk/pull/3365 From vlivanov at openjdk.java.net Wed Apr 7 16:09:41 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Wed, 7 Apr 2021 16:09:41 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v2] In-Reply-To: References: Message-ID: On Tue, 6 Apr 2021 15:54:33 GMT, Vladimir Kozlov wrote: >> Xiaohong Gong has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask > > @iwanowww should confirm correctness of such optimization. > Regarding changes - they seem fine to me. I notice that VectorNode and its subclasses do not check for TOP inputs. Since Vector API introduce vectors in graph before SuperWord transformation their input could become dead. How such cases handled? And why we did not hit them yet? is_vect() should hit assert. I'm not fond of the proposed approach. It hard-codes some implicit assumptions about vector mask representation. Also, vector type mismatch can trigger asserts since some vector-related code expects that types of vector inputs match precisely). I suggest to introduce artificial cast nodes (`VectorLoadMask (VectorStoreMask vmask) ==> VectorMaskCast (vmask)`) which are then lowered into no-ops on backend side. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From goetz at openjdk.java.net Wed Apr 7 16:20:56 2021 From: goetz at openjdk.java.net (Goetz Lindenmaier) Date: Wed, 7 Apr 2021 16:20:56 GMT Subject: RFR: 8264173: [s390] Improve Hardware Feature Detection And Reporting [v2] In-Reply-To: <5OhEhVbnzUEotih1ykgz3Omnt3jQVEYG4B2uMFbCROY=.cbe91475-a6a5-41c3-a11c-4e23d9df9937@github.com> References: <-OjFHEcBr4ajS6JQWPsPHXm2w8MNQ5b028UlabrDv84=.174c9149-ca67-4929-a3b5-0bc6f561df5e@github.com> <5OhEhVbnzUEotih1ykgz3Omnt3jQVEYG4B2uMFbCROY=.cbe91475-a6a5-41c3-a11c-4e23d9df9937@github.com> Message-ID: On Mon, 29 Mar 2021 10:32:40 GMT, Lutz Schmidt wrote: >> I didn't expect the change to become that large. But it looks good to me. The lengthy output only gets generated with -XX:+Verbose. That's fine. > > Thank you for the review, Martin! LGTM ------------- PR: https://git.openjdk.java.net/jdk/pull/3196 From goetz at openjdk.java.net Wed Apr 7 16:20:54 2021 From: goetz at openjdk.java.net (Goetz Lindenmaier) Date: Wed, 7 Apr 2021 16:20:54 GMT Subject: RFR: 8264173: [s390] Improve Hardware Feature Detection And Reporting [v3] In-Reply-To: <-dA2uhXwH7eME201-lzohAOpXigVwsUjtGACZZWMRXc=.0a4e611f-a3c0-470f-a310-5d17715492e1@github.com> References: <-OjFHEcBr4ajS6JQWPsPHXm2w8MNQ5b028UlabrDv84=.174c9149-ca67-4929-a3b5-0bc6f561df5e@github.com> <-dA2uhXwH7eME201-lzohAOpXigVwsUjtGACZZWMRXc=.0a4e611f-a3c0-470f-a310-5d17715492e1@github.com> Message-ID: <7Urlz3zuxm7aJgwjId9mUtSG_2qeeGrX4mNGqDtJ5aI=.d881dc3b-b102-4427-b43c-9d7b73abeb60@github.com> On Mon, 29 Mar 2021 10:36:47 GMT, Lutz Schmidt wrote: >> This enhancement is intended to improve the hardware feature detection and reporting, in particular for more recently introduced hardware. The enhancement is a prerequisite for possible future feature exploitation. >> >> Reviews are highly welcome and appreciated. 
> > Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: > > cleaned up some spaces Marked as reviewed by goetz (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3196 From lucy at openjdk.java.net Wed Apr 7 16:20:57 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Wed, 7 Apr 2021 16:20:57 GMT Subject: RFR: 8264173: [s390] Improve Hardware Feature Detection And Reporting [v3] In-Reply-To: <7Urlz3zuxm7aJgwjId9mUtSG_2qeeGrX4mNGqDtJ5aI=.d881dc3b-b102-4427-b43c-9d7b73abeb60@github.com> References: <-OjFHEcBr4ajS6JQWPsPHXm2w8MNQ5b028UlabrDv84=.174c9149-ca67-4929-a3b5-0bc6f561df5e@github.com> <-dA2uhXwH7eME201-lzohAOpXigVwsUjtGACZZWMRXc=.0a4e611f-a3c0-470f-a310-5d17715492e1@github.com> <7Urlz3zuxm7aJgwjId9mUtSG_2qeeGrX4mNGqDtJ5aI=.d881dc3b-b102-4427-b43c-9d7b73abeb60@github.com> Message-ID: On Wed, 7 Apr 2021 16:17:07 GMT, Goetz Lindenmaier wrote: >> Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: >> >> cleaned up some spaces > > Marked as reviewed by goetz (Reviewer). Thank you Goetz! I will now proceed and integrate the PR. ------------- PR: https://git.openjdk.java.net/jdk/pull/3196 From lucy at openjdk.java.net Wed Apr 7 16:23:44 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Wed, 7 Apr 2021 16:23:44 GMT Subject: Integrated: 8264173: [s390] Improve Hardware Feature Detection And Reporting In-Reply-To: <-OjFHEcBr4ajS6JQWPsPHXm2w8MNQ5b028UlabrDv84=.174c9149-ca67-4929-a3b5-0bc6f561df5e@github.com> References: <-OjFHEcBr4ajS6JQWPsPHXm2w8MNQ5b028UlabrDv84=.174c9149-ca67-4929-a3b5-0bc6f561df5e@github.com> Message-ID: On Thu, 25 Mar 2021 15:28:21 GMT, Lutz Schmidt wrote: > This enhancement is intended to improve the hardware feature detection and reporting, in particular for more recently introduced hardware. The enhancement is a prerequisite for possible future feature exploitation. 
> > Reviews are highly welcome and appreciated. This pull request has now been integrated. Changeset: d3fdd739 Author: Lutz Schmidt URL: https://git.openjdk.java.net/jdk/commit/d3fdd739 Stats: 557 lines in 4 files changed: 340 ins; 74 del; 143 mod 8264173: [s390] Improve Hardware Feature Detection And Reporting Reviewed-by: mdoerr, goetz ------------- PR: https://git.openjdk.java.net/jdk/pull/3196 From mark.reinhold at oracle.com Wed Apr 7 21:47:19 2021 From: mark.reinhold at oracle.com (mark.reinhold at oracle.com) Date: Wed, 7 Apr 2021 14:47:19 -0700 (PDT) Subject: New candidate JEP: 410: Remove the Experimental AOT and JIT Compiler Message-ID: <20210407214719.2E6553DF0DA@eggemoggin.niobe.net> https://openjdk.java.net/jeps/410 Summary: Remove the experimental Java-based ahead-of-time (AOT) and just-in-time (JIT) compiler. This compiler has seen little use since its introduction and the effort required to maintain it is significant. Retain the experimental Java-level JVM compiler interface (JVMCI) so that developers can continue to use externally-built versions of the compiler for JIT compilation. - Mark From vlivanov at openjdk.java.net Wed Apr 7 22:07:21 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Wed, 7 Apr 2021 22:07:21 GMT Subject: RFR: 8264871: Dependencies: Miscellaneous cleanups in dependencies.cpp Message-ID: Miscellaneous cleanups in dependencies.cpp. 
Testing: * [x] hs-tier1 - hs-tier4 ------------- Commit messages: - KlassDepChange::involves_context - KlassDepChange::_new_type - Dependencies::is_concrete_method - Dependencies::verify_method_context - int -> uint Changes: https://git.openjdk.java.net/jdk/pull/3385/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3385&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264871 Stats: 156 lines in 2 files changed: 70 ins; 58 del; 28 mod Patch: https://git.openjdk.java.net/jdk/pull/3385.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3385/head:pull/3385 PR: https://git.openjdk.java.net/jdk/pull/3385 From vlivanov at openjdk.java.net Wed Apr 7 22:17:12 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Wed, 7 Apr 2021 22:17:12 GMT Subject: RFR: 8264872: Dependencies: Migrate to PerfData counters Message-ID: <6G6POLSvOArZJ0iXvZW2PDx1ps6xJLTnNe20nwcOyVA=.a254144b-8fc9-494d-9655-ba68638469da@github.com> Migrate performance counters on `Dependencies` (`deps_find_witness_*`) to `PerfData`. 
Testing: - [x] hs-tier1 - hs-tier2 ------------- Depends on: https://git.openjdk.java.net/jdk/pull/3385 Commit messages: - CountingClassHierarchyIterator - Migrate to PerfData - Depenencies perf counters Changes: https://git.openjdk.java.net/jdk/pull/3386/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3386&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264872 Stats: 121 lines in 4 files changed: 61 ins; 37 del; 23 mod Patch: https://git.openjdk.java.net/jdk/pull/3386.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3386/head:pull/3386 PR: https://git.openjdk.java.net/jdk/pull/3386 From kvn at openjdk.java.net Wed Apr 7 22:25:20 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 7 Apr 2021 22:25:20 GMT Subject: RFR: 8264872: Dependencies: Migrate to PerfData counters In-Reply-To: <6G6POLSvOArZJ0iXvZW2PDx1ps6xJLTnNe20nwcOyVA=.a254144b-8fc9-494d-9655-ba68638469da@github.com> References: <6G6POLSvOArZJ0iXvZW2PDx1ps6xJLTnNe20nwcOyVA=.a254144b-8fc9-494d-9655-ba68638469da@github.com> Message-ID: <4yiaYDeeRWpCxqMVjbOAWmukMD1yq1jx3BJjrRAVtDE=.d8299195-b67e-47dc-998d-5fdc41836d2b@github.com> On Wed, 7 Apr 2021 22:07:18 GMT, Vladimir Ivanov wrote: > Migrate performance counters on `Dependencies` (`deps_find_witness_*`) to `PerfData`. > > Testing: > - [x] hs-tier1 - hs-tier2 Good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3386 From vlivanov at openjdk.java.net Wed Apr 7 22:27:07 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Wed, 7 Apr 2021 22:27:07 GMT Subject: RFR: 8264873: Dependencies: Split ClassHierarchyWalker Message-ID: Split ClassHierarchyWalker into ConcreteMethodFinder and ConcreteSubtypeFinder. 
Testing: - [x] hs-tier1 - hs-tier4 ------------- Depends on: https://git.openjdk.java.net/jdk/pull/3386 Commit messages: - Split ClassHierarchyWalker into ConcreteMethodFinder and ConcreteSubtypeFinder Changes: https://git.openjdk.java.net/jdk/pull/3387/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3387&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264873 Stats: 438 lines in 1 file changed: 134 ins; 144 del; 160 mod Patch: https://git.openjdk.java.net/jdk/pull/3387.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3387/head:pull/3387 PR: https://git.openjdk.java.net/jdk/pull/3387 From vlivanov at openjdk.java.net Wed Apr 7 22:38:58 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Wed, 7 Apr 2021 22:38:58 GMT Subject: RFR: 8264873: Dependencies: Split ClassHierarchyWalker [v2] In-Reply-To: References: Message-ID: > Split ClassHierarchyWalker into ConcreteMethodFinder and ConcreteSubtypeFinder. > > Testing: > - [x] hs-tier1 - hs-tier4 Vladimir Ivanov has updated the pull request incrementally with one additional commit since the last revision: Fix formatting ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3387/files - new: https://git.openjdk.java.net/jdk/pull/3387/files/6b905d78..071de4cd Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3387&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3387&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3387.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3387/head:pull/3387 PR: https://git.openjdk.java.net/jdk/pull/3387 From njian at openjdk.java.net Thu Apr 8 01:48:42 2021 From: njian at openjdk.java.net (Ningsheng Jian) Date: Thu, 8 Apr 2021 01:48:42 GMT Subject: RFR: 8264352: AArch64: Optimize vector "not/andNot" for NEON and SVE In-Reply-To: <2RRMDFhq5Eo8gVfh1Mrn343KFLKPJh08oHx4TMGgbcw=.d8ea6de7-8efa-424b-abb2-8b7135e4675d@github.com> References: 
<2RRMDFhq5Eo8gVfh1Mrn343KFLKPJh08oHx4TMGgbcw=.d8ea6de7-8efa-424b-abb2-8b7135e4675d@github.com> Message-ID: On Wed, 7 Apr 2021 05:53:46 GMT, Xiaohong Gong wrote: > Since the vector bitwise `"andNot"` is implemented with `"v1.and(v2.xor(-1))"`, the generated codes with SVE look like: > mov z16.b, #-1 > eor z17.d, z20.d, z16.d > and z18.d, z18.d, z17.d > This could be improved with a single instruction: > bic z16.d, z16.d, z18.d > Similarly, the following optimization for NEON is also needed: > not v21.16b, v21.16b > and v21.16b, v21.16b, v18.16b ==> bic v21.16b, v18.16b, v21.16b > This patch also adds the following optimization to vector` "not"` for SVE which has already been added for NEON: > mov z16.b, #-1 > eor z17.d, z20.d, z16.d ==> not z17.d, p7/m, z20.d > The performance can improve about `16% ~ 36%` with NEON for the `"AND_NOT"` benchmark [1]. > > [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/ByteMaxVector.java#L343 > > Tested tier1 and jdk:tier3. Looks good. ------------- Marked as reviewed by njian (Committer). 
PR: https://git.openjdk.java.net/jdk/pull/3370 From xgong at openjdk.java.net Thu Apr 8 01:52:08 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Thu, 8 Apr 2021 01:52:08 GMT Subject: RFR: 8264352: AArch64: Optimize vector "not/andNot" for NEON and SVE In-Reply-To: References: <2RRMDFhq5Eo8gVfh1Mrn343KFLKPJh08oHx4TMGgbcw=.d8ea6de7-8efa-424b-abb2-8b7135e4675d@github.com> Message-ID: <3x-7Bl7IUNLh_RV3l6f8DT0lmLrbPtWznqZgV9UFVDE=.5bc933c5-255a-4725-9ea8-f96c00d09002@github.com> On Wed, 7 Apr 2021 09:03:55 GMT, Andrew Haley wrote: >> Since the vector bitwise `"andNot"` is implemented with `"v1.and(v2.xor(-1))"`, the generated codes with SVE look like: >> mov z16.b, #-1 >> eor z17.d, z20.d, z16.d >> and z18.d, z18.d, z17.d >> This could be improved with a single instruction: >> bic z16.d, z16.d, z18.d >> Similarly, the following optimization for NEON is also needed: >> not v21.16b, v21.16b >> and v21.16b, v21.16b, v18.16b ==> bic v21.16b, v18.16b, v21.16b >> This patch also adds the following optimization to vector` "not"` for SVE which has already been added for NEON: >> mov z16.b, #-1 >> eor z17.d, z20.d, z16.d ==> not z17.d, p7/m, z20.d >> The performance can improve about `16% ~ 36%` with NEON for the `"AND_NOT"` benchmark [1]. >> >> [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/ByteMaxVector.java#L343 >> >> Tested tier1 and jdk:tier3. > > Marked as reviewed by aph (Reviewer). Thanks for the review @theRealAph @nsjian ! 
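As a side note for readers of the archive, the bitwise identity behind this change can be checked on scalars. The following is a minimal Java sketch, not code from the patch: the class and method names are made up for illustration, and plain `long` values stand in for vector lanes. It checks that `a & (b ^ -1)`, the shape that `andNot` expands to according to the description quoted above, equals `a & ~b`, which is the AND-with-complement that a single `bic` computes.

```java
final class AndNotIdentity {
    // Two-step form described in the PR: XOR with all-ones, then AND.
    public static long andNotViaXor(long a, long b) {
        return a & (b ^ -1L);
    }

    // Fused form: AND with the bitwise complement of the second operand.
    public static long andNotFused(long a, long b) {
        return a & ~b;
    }

    public static void main(String[] args) {
        // Hypothetical sample values; any pair of longs would do.
        long[] samples = {0L, -1L, 0b1100L, 0b1010L, 0x00FF00FF00FF00FFL, Long.MIN_VALUE};
        for (long a : samples) {
            for (long b : samples) {
                if (andNotViaXor(a, b) != andNotFused(a, b)) {
                    throw new AssertionError("mismatch for a=" + a + ", b=" + b);
                }
            }
        }
        System.out.println("a & (b ^ -1) == a & ~b for all sampled pairs");
    }
}
```

The sketch says nothing about how C2 matches the pattern; it only shows why replacing the XOR/AND pair with one instruction preserves the result.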
------------- PR: https://git.openjdk.java.net/jdk/pull/3370 From kvn at openjdk.java.net Thu Apr 8 03:19:32 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 8 Apr 2021 03:19:32 GMT Subject: RFR: 8264873: Dependencies: Split ClassHierarchyWalker [v2] In-Reply-To: References: Message-ID: On Wed, 7 Apr 2021 22:38:58 GMT, Vladimir Ivanov wrote: >> Split ClassHierarchyWalker into ConcreteMethodFinder and ConcreteSubtypeFinder. >> >> Testing: >> - [x] hs-tier1 - hs-tier4 > > Vladimir Ivanov has updated the pull request incrementally with one additional commit since the last revision: > > Fix formatting Marked as reviewed by kvn (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3387 From xgong at openjdk.java.net Thu Apr 8 03:45:42 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Thu, 8 Apr 2021 03:45:42 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v2] In-Reply-To: References: Message-ID: <5dkX8tZwS9ZWQ7t3d130spNBF54dICgn-XLKW-OuLCs=.7d319f40-1c32-4ebd-8a6b-f11780292d2a@github.com> On Wed, 7 Apr 2021 16:06:44 GMT, Vladimir Ivanov wrote: >> @iwanowww should confirm correctness of such optimization. >> Regarding changes - they seem fine to me. I notice that VectorNode and its subclasses do not check for TOP inputs. Since Vector API introduce vectors in graph before SuperWord transformation their input could become dead. How such cases handled? And why we did not hit them yet? is_vect() should hit assert. > > I'm not fond of the proposed approach. > > It hard-codes some implicit assumptions about vector mask representation. > Also, vector type mismatch can trigger asserts since some vector-related code expects that types of vector inputs match precisely). > > I suggest to introduce artificial cast nodes (`VectorLoadMask (VectorStoreMask vmask) ==> VectorMaskCast (vmask)`) which are then lowered into no-ops on backend side. 
Hi @iwanowww, thanks for looking at this PR. > It hard-codes some implicit assumptions about vector mask representation. > Also, vector type mismatch can trigger asserts since some vector-related code expects that types of vector inputs match precisely). Agreed. I'm also worried about the potential assertion, although I haven't hit the issue so far. > I suggest to introduce artificial cast nodes (`VectorLoadMask (VectorStoreMask vmask) ==> VectorMaskCast (vmask)`) which are then lowered into no-ops on backend side. It's a good idea to add a cast node for masks. I tried it, and it works well for casts with the same element size and vector length. However, for casts between different element sizes (e.g. short->int), I think the original `VectorLoadMask (VectorStoreMask vmask)` is better, since some optimizations already exist for it. Also, using `VectorMaskCast` for all cases would need more match rules in the backend, which I think would duplicate those for `VectorLoadMask+VectorStoreMask`. So would it be acceptable to you if `VectorMaskCast` is limited to mask casts with the same element size?
The changes might look like: diff --git a/src/hotspot/share/opto/vectornode.cpp b/src/hotspot/share/opto/vectornode.cpp index 326b9c4a5a0..d7b53c247fb 100644 --- a/src/hotspot/share/opto/vectornode.cpp +++ b/src/hotspot/share/opto/vectornode.cpp @@ -1232,6 +1232,10 @@ Node* VectorUnboxNode::Ideal(PhaseGVN* phase, bool can_reshape) { bool is_vector_mask = vbox_klass->is_subclass_of(ciEnv::current()->vector_VectorMask_klass()); bool is_vector_shuffle = vbox_klass->is_subclass_of(ciEnv::current()->vector_VectorShuffle_klass()); if (is_vector_mask) { + if (in_vt->length_in_bytes() == out_vt->length_in_bytes() && + Matcher::match_rule_supported(Op_VectorMaskCast)) { + return new VectorMaskCastNode(value, out_vt); + } // VectorUnbox (VectorBox vmask) ==> VectorLoadMask (VectorStoreMask vmask) value = phase->transform(VectorStoreMaskNode::make(*phase, value, in_vt->element_basic_type(), in_vt->length())); return new VectorLoadMaskNode(value, out_vt); ------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From xgong at openjdk.java.net Thu Apr 8 03:51:28 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Thu, 8 Apr 2021 03:51:28 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v2] In-Reply-To: References: Message-ID: On Wed, 7 Apr 2021 16:06:44 GMT, Vladimir Ivanov wrote: >> @iwanowww should confirm correctness of such optimization. >> Regarding changes - they seem fine to me. I notice that VectorNode and its subclasses do not check for TOP inputs. Since Vector API introduce vectors in graph before SuperWord transformation their input could become dead. How such cases handled? And why we did not hit them yet? is_vect() should hit assert. > > I'm not fond of the proposed approach. > > It hard-codes some implicit assumptions about vector mask representation. > Also, vector type mismatch can trigger asserts since some vector-related code expects that types of vector inputs match precisely). 
> > I suggest to introduce artificial cast nodes (`VectorLoadMask (VectorStoreMask vmask) ==> VectorMaskCast (vmask)`) which are then lowered into no-ops on backend side. > @iwanowww should confirm correctness of such optimization. > Regarding changes - they seem fine to me. I notice that VectorNode and its subclasses do not check for TOP inputs. Since Vector API introduce vectors in graph before SuperWord transformation their input could become dead. How such cases handled? And why we did not hit them yet? is_vect() should hit assert. Thanks for looking at this PR @vnkozlov . To be honest, I've no idea about the TOP checking issue for the inputs of the VectorNode. Hope @iwanowww could explain more. Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From dlong at openjdk.java.net Thu Apr 8 05:21:37 2021 From: dlong at openjdk.java.net (Dean Long) Date: Thu, 8 Apr 2021 05:21:37 GMT Subject: RFR: 8262916: Merge LShiftCntV and RShiftCntV into a single node In-Reply-To: References: Message-ID: On Wed, 7 Apr 2021 07:28:08 GMT, Eric Liu wrote: > The vector shift count was defined by two separate nodes (LShiftCntV and > RShiftCntV), which would prevent them from being shared when the shift > counts are the same. > > public static void test_shiftv(int sh) { > for (int i = 0; i < N; i+=1) { > a0[i] = a1[i] << sh; > b0[i] = b1[i] >> sh; > } > } > > Given the example above, by merging the same shift counts into one > node, they could be shared by shift nodes (RShiftV or LShiftV) like > below: > > Before: > 1184 LShiftCntV === _ 1189 [[ 1185 ... ]] > 1190 RShiftCntV === _ 1189 [[ 1191 ... ]] > 1185 LShiftVI === _ 1181 1184 [[ 1186 ]] > 1191 RShiftVI === _ 1187 1190 [[ 1192 ]] > > After: > 1190 ShiftCntV === _ 1189 [[ 1191 1204 ... ]] > 1204 LShiftVI === _ 1211 1190 [[ 1203 ]] > 1191 RShiftVI === _ 1187 1190 [[ 1192 ]] > > The final code could remove one redundant "dup" (scalar->vector), with one register saved.
> > Before: > dup v16.16b, w12 > dup v17.16b, w12 > ... > ldr q18, [x13, #16] > sshl v18.4s, v18.4s, v16.4s > add x18, x16, x12 ; iaload > > add x4, x15, x12 > str q18, [x4, #16] ; iastore > > ldr q18, [x18, #16] > add x12, x14, x12 > neg v19.16b, v17.16b > sshl v18.4s, v18.4s, v19.4s > str q18, [x12, #16] ; iastore > > After: > dup v16.16b, w11 > ... > ldr q17, [x13, #16] > sshl v17.4s, v17.4s, v16.4s > add x2, x22, x11 ; iaload > > add x4, x16, x11 > str q17, [x4, #16] ; iastore > > ldr q17, [x2, #16] > add x11, x21, x11 > neg v18.16b, v16.16b > sshl v17.4s, v17.4s, v18.4s > str q17, [x11, #16] ; iastore You should be able to do this without introducing a new node type. You could change the shift rules to match a vector register like x86.ad and aarch64_sve.ad already do. ------------- PR: https://git.openjdk.java.net/jdk/pull/3371 From xgong at openjdk.java.net Thu Apr 8 06:27:17 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Thu, 8 Apr 2021 06:27:17 GMT Subject: Integrated: 8264352: AArch64: Optimize vector "not/andNot" for NEON and SVE In-Reply-To: <2RRMDFhq5Eo8gVfh1Mrn343KFLKPJh08oHx4TMGgbcw=.d8ea6de7-8efa-424b-abb2-8b7135e4675d@github.com> References: <2RRMDFhq5Eo8gVfh1Mrn343KFLKPJh08oHx4TMGgbcw=.d8ea6de7-8efa-424b-abb2-8b7135e4675d@github.com> Message-ID: On Wed, 7 Apr 2021 05:53:46 GMT, Xiaohong Gong wrote: > Since the vector bitwise `"andNot"` is implemented with `"v1.and(v2.xor(-1))"`, the generated codes with SVE look like: > mov z16.b, #-1 > eor z17.d, z20.d, z16.d > and z18.d, z18.d, z17.d > This could be improved with a single instruction: > bic z16.d, z16.d, z18.d > Similarly, the following optimization for NEON is also needed: > not v21.16b, v21.16b > and v21.16b, v21.16b, v18.16b ==> bic v21.16b, v18.16b, v21.16b > This patch also adds the following optimization to vector` "not"` for SVE which has already been added for NEON: > mov z16.b, #-1 > eor z17.d, z20.d, z16.d ==> not z17.d, p7/m, z20.d > The performance can improve about `16% ~ 
36%` with NEON for the `"AND_NOT"` benchmark [1]. > > [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/ByteMaxVector.java#L343 > > Tested tier1 and jdk:tier3. This pull request has now been integrated. Changeset: e89542fb Author: Xiaohong Gong Committer: Ningsheng Jian URL: https://git.openjdk.java.net/jdk/commit/e89542fb Stats: 219 lines in 7 files changed: 185 ins; 0 del; 34 mod 8264352: AArch64: Optimize vector "not/andNot" for NEON and SVE Reviewed-by: aph, njian ------------- PR: https://git.openjdk.java.net/jdk/pull/3370 From dongbo at openjdk.java.net Thu Apr 8 06:33:43 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Thu, 8 Apr 2021 06:33:43 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v7] In-Reply-To: References: Message-ID: > In JDK-8248188, an IntrinsicCandidate and API were added for Base64 decoding. > Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3; the basic idea is described at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords. > > Patch passed jtreg tier1-3 tests with linux-aarch64-server-fastdebug build. > Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` were run specifically to check the correctness of the implementation. > > There can be illegal characters at the start of the input if the data is MIME encoded. > There would be no benefit in using SIMD for this case, so the stub now uses non-SIMD instructions for MIME encoded data. > > A JMH micro, Base64Decode.java, is added for performance testing. > With different input lengths (upper-bounded by parameter `maxNumBytes` in the JMH micro), > we see ~2.5x improvements with long inputs and no regression with short inputs for raw base64 decoding, and minor improvements (~10.95%) for MIME on Kunpeng916.
>
> The Base64Decode.java JMH micro-benchmark results:
>
> Benchmark                          (lineSize)  (maxNumBytes)  Mode  Cnt       Score      Error  Units
>
> # Kunpeng916, intrinsic
> Base64Decode.testBase64Decode               4              1  avgt    5      48.614 ±    0.609  ns/op
> Base64Decode.testBase64Decode               4              3  avgt    5      58.199 ±    1.650  ns/op
> Base64Decode.testBase64Decode               4              7  avgt    5      69.400 ±    0.931  ns/op
> Base64Decode.testBase64Decode               4             32  avgt    5      96.818 ±    1.687  ns/op
> Base64Decode.testBase64Decode               4             64  avgt    5     122.856 ±    9.217  ns/op
> Base64Decode.testBase64Decode               4             80  avgt    5     130.935 ±    1.667  ns/op
> Base64Decode.testBase64Decode               4             96  avgt    5     143.627 ±    1.751  ns/op
> Base64Decode.testBase64Decode               4            112  avgt    5     152.311 ±    1.178  ns/op
> Base64Decode.testBase64Decode               4            512  avgt    5     342.631 ±    0.584  ns/op
> Base64Decode.testBase64Decode               4           1000  avgt    5     573.635 ±    1.050  ns/op
> Base64Decode.testBase64Decode               4          20000  avgt    5    9534.136 ±   45.172  ns/op
> Base64Decode.testBase64Decode               4          50000  avgt    5   22718.726 ±  192.070  ns/op
> Base64Decode.testBase64MIMEDecode           4              1  avgt   10      63.558 ±    0.336  ns/op
> Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.504 ±    0.848  ns/op
> Base64Decode.testBase64MIMEDecode           4              7  avgt   10     120.591 ±    0.608  ns/op
> Base64Decode.testBase64MIMEDecode           4             32  avgt   10     324.314 ±    6.236  ns/op
> Base64Decode.testBase64MIMEDecode           4             64  avgt   10     532.678 ±    4.670  ns/op
> Base64Decode.testBase64MIMEDecode           4             80  avgt   10     678.126 ±    4.324  ns/op
> Base64Decode.testBase64MIMEDecode           4             96  avgt   10     771.603 ±    6.393  ns/op
> Base64Decode.testBase64MIMEDecode           4            112  avgt   10     889.608 ±    0.759  ns/op
> Base64Decode.testBase64MIMEDecode           4            512  avgt   10    3663.557 ±    3.422  ns/op
> Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7017.784 ±    9.128  ns/op
> Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  128670.660 ± 7951.521  ns/op
> Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  317113.667 ±  161.758  ns/op
>
> # Kunpeng916, default
> Base64Decode.testBase64Decode               4              1  avgt    5      48.455 ±    0.571  ns/op
> Base64Decode.testBase64Decode               4              3  avgt    5      57.937 ±    0.505  ns/op
> Base64Decode.testBase64Decode               4              7  avgt    5      73.823 ±    1.452  ns/op
> Base64Decode.testBase64Decode               4             32  avgt    5     106.484 ±    1.243  ns/op
> Base64Decode.testBase64Decode               4             64  avgt    5     141.004 ±    1.188  ns/op
> Base64Decode.testBase64Decode               4             80  avgt    5     156.284 ±    0.572  ns/op
> Base64Decode.testBase64Decode               4             96  avgt    5     174.137 ±    0.177  ns/op
> Base64Decode.testBase64Decode               4            112  avgt    5     188.445 ±    0.572  ns/op
> Base64Decode.testBase64Decode               4            512  avgt    5     610.847 ±    1.559  ns/op
> Base64Decode.testBase64Decode               4           1000  avgt    5    1155.368 ±    0.813  ns/op
> Base64Decode.testBase64Decode               4          20000  avgt    5   19751.477 ±   24.669  ns/op
> Base64Decode.testBase64Decode               4          50000  avgt    5   50046.586 ±  523.155  ns/op
> Base64Decode.testBase64MIMEDecode           4              1  avgt   10      64.130 ±    0.238  ns/op
> Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.096 ±    0.205  ns/op
> Base64Decode.testBase64MIMEDecode           4              7  avgt   10     118.849 ±    0.610  ns/op
> Base64Decode.testBase64MIMEDecode           4             32  avgt   10     331.177 ±    4.732  ns/op
> Base64Decode.testBase64MIMEDecode           4             64  avgt   10     549.117 ±    0.177  ns/op
> Base64Decode.testBase64MIMEDecode           4             80  avgt   10     702.951 ±    4.572  ns/op
> Base64Decode.testBase64MIMEDecode           4             96  avgt   10     799.566 ±    0.301  ns/op
> Base64Decode.testBase64MIMEDecode           4            112  avgt   10     923.749 ±    0.389  ns/op
> Base64Decode.testBase64MIMEDecode           4            512  avgt   10    4000.725 ±    2.519  ns/op
> Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7674.994 ±    9.281  ns/op
> Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  142059.001 ±  157.920  ns/op
> Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  355698.369 ±  216.542  ns/op

Dong Bo has updated the pull request incrementally with one additional commit since the last revision: reduce unnecessary memory write traffic in non-SIMD code ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3228/files - new: https://git.openjdk.java.net/jdk/pull/3228/files/a342ad1e..ab0e405b Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3228&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3228&range=05-06 Stats: 7 lines in 1 file changed: 0 ins; 1 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/3228.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3228/head:pull/3228 PR: https://git.openjdk.java.net/jdk/pull/3228 From dongbo at openjdk.java.net Thu Apr 8 06:39:11 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Thu, 8 Apr 2021 06:39:11 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v6] In-Reply-To: References: Message-ID: On Wed, 7 Apr 2021 09:53:36 GMT, Andrew Haley wrote: >> src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5829: >> >>> 5827: __ strb(r14, __ post(dst, 1)); >>> 5828: __ strb(r15, __ post(dst, 1)); >>> 5829: __ strb(r13, __ post(dst, 1)); >> >> I think this sequence should be 4 BFMs, STRW, BFM, STRW. That's the best we can do, I think. > > Sorry, that's not quite right, but you get the idea: let's not generate unnecessary memory traffic.
Okay, implemented as:

    __ lslw(r14, r10, 10);
    __ bfiw(r14, r11, 4, 6);
    __ bfmw(r14, r12, 2, 5);
    __ rev16w(r14, r14);
    __ bfiw(r13, r12, 6, 2);
    __ strh(r14, __ post(dst, 2));
    __ strb(r13, __ post(dst, 1));

-------------

PR: https://git.openjdk.java.net/jdk/pull/3228

From aph at openjdk.java.net Thu Apr 8 08:32:18 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Thu, 8 Apr 2021 08:32:18 GMT
Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v7]
In-Reply-To:
References:
Message-ID:

On Thu, 8 Apr 2021 06:33:43 GMT, Dong Bo wrote:

>> In JDK-8248188, IntrinsicCandidate and API were added for Base64 decoding.
>> Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3; the basic idea can be found at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords.
>>
>> Patch passed jtreg tier1-3 tests with linux-aarch64-server-fastdebug build.
>> Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` were run specifically to check the correctness of the implementation.
>>
>> There can be illegal characters at the start of the input if the data is MIME encoded.
>> There would be no benefit in using SIMD for this case, so the stub uses non-SIMD instructions for MIME encoded data now.
>>
>> A JMH micro, Base64Decode.java, is added for performance testing.
>> With different input lengths (upper-bounded by parameter `maxNumBytes` in the JMH micro),
>> we witness ~2.5x improvements with long inputs and no regression with short inputs for raw base64 decoding, minor improvements (~10.95%) for MIME on Kunpeng916.
>>
>> The Base64Decode.java JMH micro-benchmark results:
>>
>> Benchmark                          (lineSize)  (maxNumBytes)  Mode  Cnt       Score      Error  Units
>>
>> # Kunpeng916, intrinsic
>> Base64Decode.testBase64Decode               4              1  avgt    5      48.614 ±    0.609  ns/op
>> Base64Decode.testBase64Decode               4              3  avgt    5      58.199 ±    1.650  ns/op
>> Base64Decode.testBase64Decode               4              7  avgt    5      69.400 ±    0.931  ns/op
>> Base64Decode.testBase64Decode               4             32  avgt    5      96.818 ±    1.687  ns/op
>> Base64Decode.testBase64Decode               4             64  avgt    5     122.856 ±    9.217  ns/op
>> Base64Decode.testBase64Decode               4             80  avgt    5     130.935 ±    1.667  ns/op
>> Base64Decode.testBase64Decode               4             96  avgt    5     143.627 ±    1.751  ns/op
>> Base64Decode.testBase64Decode               4            112  avgt    5     152.311 ±    1.178  ns/op
>> Base64Decode.testBase64Decode               4            512  avgt    5     342.631 ±    0.584  ns/op
>> Base64Decode.testBase64Decode               4           1000  avgt    5     573.635 ±    1.050  ns/op
>> Base64Decode.testBase64Decode               4          20000  avgt    5    9534.136 ±   45.172  ns/op
>> Base64Decode.testBase64Decode               4          50000  avgt    5   22718.726 ±  192.070  ns/op
>> Base64Decode.testBase64MIMEDecode           4              1  avgt   10      63.558 ±    0.336  ns/op
>> Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.504 ±    0.848  ns/op
>> Base64Decode.testBase64MIMEDecode           4              7  avgt   10     120.591 ±    0.608  ns/op
>> Base64Decode.testBase64MIMEDecode           4             32  avgt   10     324.314 ±    6.236  ns/op
>> Base64Decode.testBase64MIMEDecode           4             64  avgt   10     532.678 ±    4.670  ns/op
>> Base64Decode.testBase64MIMEDecode           4             80  avgt   10     678.126 ±    4.324  ns/op
>> Base64Decode.testBase64MIMEDecode           4             96  avgt   10     771.603 ±    6.393  ns/op
>> Base64Decode.testBase64MIMEDecode           4            112  avgt   10     889.608 ±    0.759  ns/op
>> Base64Decode.testBase64MIMEDecode           4            512  avgt   10    3663.557 ±    3.422  ns/op
>> Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7017.784 ±    9.128  ns/op
>> Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  128670.660 ± 7951.521  ns/op
>> Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  317113.667 ±  161.758  ns/op
>>
>> # Kunpeng916, default
>> Base64Decode.testBase64Decode               4              1  avgt    5      48.455 ±    0.571  ns/op
>> Base64Decode.testBase64Decode               4              3  avgt    5      57.937 ±    0.505  ns/op
>> Base64Decode.testBase64Decode               4              7  avgt    5      73.823 ±    1.452  ns/op
>> Base64Decode.testBase64Decode               4             32  avgt    5     106.484 ±    1.243  ns/op
>> Base64Decode.testBase64Decode               4             64  avgt    5     141.004 ±    1.188  ns/op
>> Base64Decode.testBase64Decode               4             80  avgt    5     156.284 ±    0.572  ns/op
>> Base64Decode.testBase64Decode               4             96  avgt    5     174.137 ±    0.177  ns/op
>> Base64Decode.testBase64Decode               4            112  avgt    5     188.445 ±    0.572  ns/op
>> Base64Decode.testBase64Decode               4            512  avgt    5     610.847 ±    1.559  ns/op
>> Base64Decode.testBase64Decode               4           1000  avgt    5    1155.368 ±    0.813  ns/op
>> Base64Decode.testBase64Decode               4          20000  avgt    5   19751.477 ±   24.669  ns/op
>> Base64Decode.testBase64Decode               4          50000  avgt    5   50046.586 ±  523.155  ns/op
>> Base64Decode.testBase64MIMEDecode           4              1  avgt   10      64.130 ±    0.238  ns/op
>> Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.096 ±    0.205  ns/op
>> Base64Decode.testBase64MIMEDecode           4              7  avgt   10     118.849 ±    0.610  ns/op
>> Base64Decode.testBase64MIMEDecode           4             32  avgt   10     331.177 ±    4.732  ns/op
>> Base64Decode.testBase64MIMEDecode           4             64  avgt   10     549.117 ±    0.177  ns/op
>> Base64Decode.testBase64MIMEDecode           4             80  avgt   10     702.951 ±    4.572  ns/op
>> Base64Decode.testBase64MIMEDecode           4             96  avgt   10     799.566 ±    0.301  ns/op
>> Base64Decode.testBase64MIMEDecode           4            112  avgt   10     923.749 ±    0.389  ns/op
>> Base64Decode.testBase64MIMEDecode           4            512  avgt   10    4000.725 ±    2.519  ns/op
>> Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7674.994 ±    9.281  ns/op
>> Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  142059.001 ±  157.920  ns/op
>> Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  355698.369 ±  216.542  ns/op
>
> Dong Bo has updated the pull request incrementally with one additional commit since the last revision:
>
>   reduce unnecessary memory write traffic in non-SIMD code

Marked as reviewed by aph (Reviewer).
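
As a rough illustration of the idea behind this commit (merge the decoded fields in a register so that fewer store instructions are needed), here is a minimal portable C++ sketch. It is not the stub's code: `pack_sextets` and `store_group` are hypothetical names, and the sketch simply assumes four already-decoded 6-bit values per three output bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: merge four decoded 6-bit values into one 24-bit group.
inline std::uint32_t pack_sextets(std::uint8_t a, std::uint8_t b,
                                  std::uint8_t c, std::uint8_t d) {
    return (std::uint32_t(a & 0x3f) << 18) | (std::uint32_t(b & 0x3f) << 12) |
           (std::uint32_t(c & 0x3f) << 6)  |  std::uint32_t(d & 0x3f);
}

// Hypothetical helper: emit the three output bytes with one 2-byte copy and
// one 1-byte store, instead of three separate byte stores.
inline std::uint8_t* store_group(std::uint8_t* dst, std::uint32_t group) {
    // The first two bytes are laid out in memory order, so the result does
    // not depend on the host's endianness.
    const std::uint8_t first_two[2] = { std::uint8_t(group >> 16),
                                        std::uint8_t(group >> 8) };
    std::memcpy(dst, first_two, sizeof first_two);  // usually one 16-bit store
    dst[2] = std::uint8_t(group);                   // one 8-bit store
    return dst + 3;
}
```

For example, the Base64 text `TWFu` decodes to the 6-bit values 19, 22, 5, 46, which this sketch packs and stores as the bytes `Man`.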
------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From whuang at openjdk.java.net Thu Apr 8 08:35:41 2021 From: whuang at openjdk.java.net (Wang Huang) Date: Thu, 8 Apr 2021 08:35:41 GMT Subject: RFR: 8264885: Fix the code style of macro in aarch64_neon_ad.m4 Message-ID: * trivial fix * align the comment of macros ------------- Commit messages: - 8264885: Fix the code style of macro in aarch64_neon_ad.m4 Changes: https://git.openjdk.java.net/jdk/pull/3395/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3395&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264885 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/3395.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3395/head:pull/3395 PR: https://git.openjdk.java.net/jdk/pull/3395 From aph at openjdk.java.net Thu Apr 8 08:49:10 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 8 Apr 2021 08:49:10 GMT Subject: RFR: 8264885: Fix the code style of macro in aarch64_neon_ad.m4 In-Reply-To: References: Message-ID: On Thu, 8 Apr 2021 08:28:52 GMT, Wang Huang wrote: > * trivial fix > * align the comment of macros Marked as reviewed by aph (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3395 From dongbo at openjdk.java.net Thu Apr 8 09:08:17 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Thu, 8 Apr 2021 09:08:17 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v7] In-Reply-To: References: Message-ID: On Thu, 8 Apr 2021 08:28:53 GMT, Andrew Haley wrote: >> Dong Bo has updated the pull request incrementally with one additional commit since the last revision: >> >> reduce unnecessary memory write traffic in non-SIMD code > > Marked as reviewed by aph (Reviewer). @theRealAph Thanks for the review. Hi @nick-arm, are you also ok with the newest commit? 
-------------

PR: https://git.openjdk.java.net/jdk/pull/3228

From ngasson at openjdk.java.net Thu Apr 8 09:13:09 2021
From: ngasson at openjdk.java.net (Nick Gasson)
Date: Thu, 8 Apr 2021 09:13:09 GMT
Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v7]
In-Reply-To:
References:
Message-ID:

On Thu, 8 Apr 2021 06:33:43 GMT, Dong Bo wrote:

> Dong Bo has updated the pull request incrementally with one additional commit since the last revision:
>
>   reduce unnecessary memory write traffic in non-SIMD code

Marked as reviewed by ngasson (Committer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/3228

From ngasson at openjdk.java.net Thu Apr 8 09:13:09 2021
From: ngasson at openjdk.java.net (Nick Gasson)
Date: Thu, 8 Apr 2021 09:13:09 GMT
Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v7]
In-Reply-To:
References:
Message-ID:

On Thu, 8 Apr 2021 09:05:43 GMT, Dong Bo wrote:

> Hi @nick-arm, are you also ok with the newest commit?

It looks ok to me but I'm not a Reviewer.
------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From dholmes at openjdk.java.net Thu Apr 8 09:25:48 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 8 Apr 2021 09:25:48 GMT Subject: RFR: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines [v2] In-Reply-To: References: Message-ID: > The existing JRT_ENTRY (and related) macros require the function to which they are applied to declare a parameter "JavaThread* thread" which represents the current thread. These functions are all implicitly "traps" functions as they can result in exceptions, but they are not declared with TRAPS because the only caller of these functions is the runtime itself (via call_VM) and no callers need to be aware to use CHECK; further they need a JavaThread. So the macro declares the THREAD variable for use with other exception-producing functions and assigns it from "thread". > > The majority of this change replaces the parameter name "thread" with "current" so that it is clear that we are always dealing with the current thread. This affects the entry functions as well as the functions called therefrom. > > We can then also replace the use of "THREAD" with "current", in contexts that are not related to exception processing. > > Some methods called by entry functions were declared to have both a "thread" parameter and a "TRAPS" parameter - with nothing to tell you these are always the same, current, thread. So the "thread" parameter is removed and replaced with a local variable "current" obtained from THREAD->as_Java_thread(). > > Some missing CHECK_ uses were added. > > Testing: > - tiers 1-3 > > Thanks, > David David Holmes has updated the pull request incrementally with one additional commit since the last revision: Fixed CHECK on return statement. 
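
For readers less familiar with the conventions named in this description, the following self-contained C++ sketch mimics the shape of the TRAPS/CHECK idiom using simplified stand-ins. The `Thread` struct, the macro bodies, and all function names are assumptions made for illustration, not HotSpot's definitions. The last function shows why a CHECK-style macro placed on a `return` statement has no effect; whether that is exactly what the commit above addresses is not stated in the message.

```cpp
#include <cassert>

// Simplified stand-in for a thread that can carry a pending exception.
struct Thread {
    bool pending = false;
    bool has_pending_exception() const { return pending; }
};

// Stand-in macros in the style of the TRAPS/CHECK_ convention: the macro
// closes the call, tests for a pending exception, and returns early.
#define TRAPS Thread* THREAD
#define CHECK_(result) \
    THREAD); if (THREAD->has_pending_exception()) return result; (void)(0

// A "traps" function: it may leave an exception pending on the thread.
int may_throw(bool fail, TRAPS) {
    if (fail) { THREAD->pending = true; return -1; }
    return 42;
}

// Effective check: the expanded `if` runs right after the call returns,
// so the caller bails out with 0 when an exception is pending.
int checked_call(bool fail, TRAPS) {
    int v = may_throw(fail, CHECK_(0));
    return v + 1;
}

// Pitfall: here the expanded `if` sits after the `return` statement,
// so it is unreachable and the raw result (-1) escapes unchecked.
int check_on_return(bool fail, TRAPS) {
    return may_throw(fail, CHECK_(0));
}
```

In this toy model `checked_call(true, &t)` returns 0 with the exception still pending on `t`, while `check_on_return(true, &t)` returns the callee's raw value.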
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3394/files - new: https://git.openjdk.java.net/jdk/pull/3394/files/78298e5f..1adf0fd0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3394&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3394&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3394.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3394/head:pull/3394 PR: https://git.openjdk.java.net/jdk/pull/3394 From vlivanov at openjdk.java.net Thu Apr 8 10:10:28 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 8 Apr 2021 10:10:28 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v2] In-Reply-To: <5dkX8tZwS9ZWQ7t3d130spNBF54dICgn-XLKW-OuLCs=.7d319f40-1c32-4ebd-8a6b-f11780292d2a@github.com> References: <5dkX8tZwS9ZWQ7t3d130spNBF54dICgn-XLKW-OuLCs=.7d319f40-1c32-4ebd-8a6b-f11780292d2a@github.com> Message-ID: On Thu, 8 Apr 2021 03:41:40 GMT, Xiaohong Gong wrote: > So does it look good for you if the VectorMaskCast is limited to the masking cast with the same element size? Yes, I'm fine with focusing on no-op case for now. Moreover, both vector length and element size should match to reuse the very same bit representation of the mask. ------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From vlivanov at openjdk.java.net Thu Apr 8 10:17:28 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 8 Apr 2021 10:17:28 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v2] In-Reply-To: References: Message-ID: On Thu, 8 Apr 2021 03:44:41 GMT, Xiaohong Gong wrote: > I notice that VectorNode and its subclasses do not check for TOP inputs. > Since Vector API introduce vectors in graph before SuperWord transformation their input could become dead. > How such cases handled? 
And why we did not hit them yet? is_vect() should hit assert. `VectorLoadMaskNode::Identity()` can't observe TOP types because it uses the type cached at construction (`type()` and `VectorNode` extends `TypeNode`). Still, a TOP input is possible and should be filtered out by opcode check (`in(1)->Opcode() == Op_VectorStoreMask`). ------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From eliu at openjdk.java.net Thu Apr 8 10:39:07 2021 From: eliu at openjdk.java.net (Eric Liu) Date: Thu, 8 Apr 2021 10:39:07 GMT Subject: RFR: 8262916: Merge LShiftCntV and RShiftCntV into a single node In-Reply-To: References: Message-ID: On Thu, 8 Apr 2021 05:14:52 GMT, Dean Long wrote: > You should be able to do this without introducing a new node type. Do you mean just matching them with ReplicateBNode? Indeed they do generate the same instruction. I was wondering about the same but I'm not sure if ReplicateBNode has the same semantic with ShiftCntV in middle-end. ------------- PR: https://git.openjdk.java.net/jdk/pull/3371 From rcastanedalo at openjdk.java.net Thu Apr 8 10:49:22 2021 From: rcastanedalo at openjdk.java.net (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Thu, 8 Apr 2021 10:49:22 GMT Subject: RFR: 8263227: C2: inconsistent spilling due to dead nodes in exception block In-Reply-To: References: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com> Message-ID: On Tue, 6 Apr 2021 17:54:20 GMT, Vladimir Kozlov wrote: >> This change eliminates dead multi-nodes created by call-catch cleanup after GCM. Eliminating all dead code created by call-catch cleanup avoids potential issues when splitting the live range of call result values, see the analysis in the [bug report](https://bugs.openjdk.java.net/browse/JDK-8263227) for details. This solution is the least invasive of the three alternatives proposed in the bug report (the other two are constraining global code motion and extending live-range splitting). 
>> >> The change also extends the control-flow graph verification pass to catch similar live-range splitting issues earlier (with `+VerifyRegisterAllocator`). >> >> Tested on: >> - original bug reproducer >> - hs-tier1-5 (windows-x64, linux-x64, linux-aarch64, and macosx-x64) with `+VerifyRegisterAllocator` >> - hs-tier1-3 (windows-x64, linux-x64, linux-aarch64, and macosx-x64) with `+VerifyRegisterAllocator` and `+StressGCM` (5 repetitions) > > src/hotspot/share/opto/lcm.cpp line 1415: > >> 1413: if (dead) { >> 1414: // Remove projections if n is a dead multi-node. >> 1415: for (uint k = j + n->outcnt(); sb->get_node(k)->is_Proj(); k--) { > > I don't get this logic. The loop is not executed if sb->get_node(j + n->outcnt()) is not Proj node. Thanks for reviewing, Vladimir! The intention of this loop is to remove all projections of a multi-node `n` before removing `n` itself (if it has been found to be dead). I indeed have to rethink this code as e.g. the loop can be executed if `n` is not a multi-node but `k` ends up pointing to a projection of another node. I will investigate and come back with a new revision. ------------- PR: https://git.openjdk.java.net/jdk/pull/3303 From rcastanedalo at openjdk.java.net Thu Apr 8 10:57:07 2021 From: rcastanedalo at openjdk.java.net (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Thu, 8 Apr 2021 10:57:07 GMT Subject: RFR: 8263227: C2: inconsistent spilling due to dead nodes in exception block In-Reply-To: References: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com> Message-ID: On Tue, 6 Apr 2021 17:55:35 GMT, Vladimir Kozlov wrote: >> This change eliminates dead multi-nodes created by call-catch cleanup after GCM. Eliminating all dead code created by call-catch cleanup avoids potential issues when splitting the live range of call result values, see the analysis in the [bug report](https://bugs.openjdk.java.net/browse/JDK-8263227) for details. 
This solution is the least invasive of the three alternatives proposed in the bug report (the other two are constraining global code motion and extending live-range splitting). >> >> The change also extends the control-flow graph verification pass to catch similar live-range splitting issues earlier (with `+VerifyRegisterAllocator`). >> >> Tested on: >> - original bug reproducer >> - hs-tier1-5 (windows-x64, linux-x64, linux-aarch64, and macosx-x64) with `+VerifyRegisterAllocator` >> - hs-tier1-3 (windows-x64, linux-x64, linux-aarch64, and macosx-x64) with `+VerifyRegisterAllocator` and `+StressGCM` (5 repetitions) > > src/hotspot/share/opto/lcm.cpp line 1419: > >> 1417: "dead projection should correspond to current node"); >> 1418: sb->get_node(k)->disconnect_inputs(C); >> 1419: sb->remove_node(k); > > If you remove node here then `j` could be incorrect in `sb->remove_node(j)` at line #1424 `k` should always be greater than `j`, as `sb->get_node(k)` is a projection of `sb->get_node(j)`. Shouldn't this make the removal safe? ------------- PR: https://git.openjdk.java.net/jdk/pull/3303 From rcastanedalo at openjdk.java.net Thu Apr 8 11:07:07 2021 From: rcastanedalo at openjdk.java.net (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Thu, 8 Apr 2021 11:07:07 GMT Subject: RFR: 8263227: C2: inconsistent spilling due to dead nodes in exception block In-Reply-To: References: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com> Message-ID: <-vl-o8PemGJNrb-7AtTQoNB6YGnhxoA3CVWot_iihlU=.90713355-bd98-4c1c-b253-08d660c1590e@github.com> On Tue, 6 Apr 2021 17:58:36 GMT, Vladimir Kozlov wrote: > Also where is guarantee that all Proj users are in the same block. Isn't this guaranteed by `PhaseCFG::schedule_node_into_block()`? https://github.com/openjdk/jdk/blob/ec599da68cac6349e7f28b508d8038c18a4e7347/src/hotspot/share/opto/gcm.cpp#L48-L72 I will run some tests with extra assertions in `PhaseCFG::verify()` to double-check. 
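
The index argument made earlier in this thread (projections sit at higher indices than their multi-node, so removing them first leaves the multi-node's index valid) can be illustrated with a toy model. `ToyNode` and `remove_with_projections` are hypothetical and far simpler than C2's block and node classes. Unlike the patch under review, this sketch scans every later slot and checks ownership explicitly rather than relying on `is_Proj()` at a computed offset.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a scheduled node.
struct ToyNode {
    int id;
    int proj_of;  // id of the multi-node this projects from, or -1 if none
};

// Remove the node at index j together with all of its projections.
// Slots are erased from the highest index downwards, so erasing a
// projection at k > j never shifts the element stored at j.
void remove_with_projections(std::vector<ToyNode>& block, std::size_t j) {
    const int id = block[j].id;
    for (std::size_t k = block.size(); k-- > j + 1; ) {
        if (block[k].proj_of == id) {
            block.erase(block.begin() + static_cast<std::ptrdiff_t>(k));
        }
    }
    assert(block[j].id == id);  // index j still names the same node
    block.erase(block.begin() + static_cast<std::ptrdiff_t>(j));
}
```

The explicit ownership test plays the role of the assertion quoted above ("dead projection should correspond to current node"), so a projection belonging to some other node is left in place.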
------------- PR: https://git.openjdk.java.net/jdk/pull/3303 From vlivanov at openjdk.java.net Thu Apr 8 11:08:25 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 8 Apr 2021 11:08:25 GMT Subject: RFR: 8262916: Merge LShiftCntV and RShiftCntV into a single node In-Reply-To: References: Message-ID: <9T0fatLfUStsY_1SA94rlhKokfqT7TeySYeFzIuO7Nk=.7394e20e-a4a3-4786-a2db-f5d8acc2b74a@github.com> On Thu, 8 Apr 2021 05:14:52 GMT, Dean Long wrote: > You should be able to do this without introducing a new node type. You could change the shift rules to match a vector register like x86.ad and aarch64_sve.ad already do. Not sure what you refer to in x86.ad: vector shifts with variable scalar count require the scalar to be placed in XMM register. `ShiftCntV` handles register-to-register move between different register classes (`RegI` and `Vec*`). Do you suggest reusing some existing vector node (like `Replicate`) to covert the scalar index to vector first? It would have slightly different behavior on x86. ------------- PR: https://git.openjdk.java.net/jdk/pull/3371 From vlivanov at openjdk.java.net Thu Apr 8 11:22:28 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 8 Apr 2021 11:22:28 GMT Subject: RFR: 8262916: Merge LShiftCntV and RShiftCntV into a single node In-Reply-To: <9T0fatLfUStsY_1SA94rlhKokfqT7TeySYeFzIuO7Nk=.7394e20e-a4a3-4786-a2db-f5d8acc2b74a@github.com> References: <9T0fatLfUStsY_1SA94rlhKokfqT7TeySYeFzIuO7Nk=.7394e20e-a4a3-4786-a2db-f5d8acc2b74a@github.com> Message-ID: On Thu, 8 Apr 2021 11:04:49 GMT, Vladimir Ivanov wrote: >> You should be able to do this without introducing a new node type. You could change the shift rules to match a vector register like x86.ad and aarch64_sve.ad already do. > >> You should be able to do this without introducing a new node type. You could change the shift rules to match a vector register like x86.ad and aarch64_sve.ad already do. 
> > Not sure what you refer to in x86.ad: vector shifts with variable scalar count require the scalar to be placed in XMM register. `ShiftCntV` handles register-to-register move between different register classes (`RegI` and `Vec*`). > > Do you suggest reusing some existing vector node (like `Replicate`) to covert the scalar index to vector first? It would have slightly different behavior on x86. Regarding the proposed change itself (`LShiftCntV/RShiftCntV => ShiftCntV`). Not sure how important it is, but it has an unfortunate change in generated code for right vector shifts on AArch32: instead of sharing the result of index negation at all use sites, negation is performed at every use site now. As a consequence, in an auto-vectorized loop it will lead to: - 1 instruction per loop iteration (multiplied by unrolling factor); - no way to hoist the negation of loop invariant index. ------------- PR: https://git.openjdk.java.net/jdk/pull/3371 From eliu at openjdk.java.net Thu Apr 8 11:56:18 2021 From: eliu at openjdk.java.net (Eric Liu) Date: Thu, 8 Apr 2021 11:56:18 GMT Subject: RFR: 8262916: Merge LShiftCntV and RShiftCntV into a single node In-Reply-To: References: <9T0fatLfUStsY_1SA94rlhKokfqT7TeySYeFzIuO7Nk=.7394e20e-a4a3-4786-a2db-f5d8acc2b74a@github.com> Message-ID: On Thu, 8 Apr 2021 11:19:27 GMT, Vladimir Ivanov wrote: > Regarding the proposed change itself (`LShiftCntV/RShiftCntV => ShiftCntV`). > > Not sure how important it is, but it has an unfortunate change in generated code for right vector shifts on AArch32: instead of sharing the result of index negation at all use sites, negation is performed at every use site now. > > As a consequence, in an auto-vectorized loop it will lead to: > > * 1 instruction per loop iteration (multiplied by unrolling factor); > * no way to hoist the negation of loop invariant index. Thanks for your feedback! I have checked the generated code on AArch32 and it's a shame that 'vneg' is at every use point. 
Before:

    0xf46c8338:   add       fp, r7, fp
    0xf46c833c:   vshl.u16  d9, d9, d8
    0xf46c8340:   vstr      d9, [fp, #12]
    0xf46c8344:   vshl.u16  d9, d10, d8
    0xf46c8348:   vstr      d9, [fp, #20]
    0xf46c834c:   vshl.u16  d9, d11, d8
    0xf46c8350:   vstr      d9, [fp, #28]

After:

    0xf4aa1328:   add       fp, r7, fp
    0xf4aa132c:   vneg.s8   d13, d8
    0xf4aa1330:   vshl.u16  d9, d9, d13
    0xf4aa1334:   vstr      d9, [fp, #12]
    0xf4aa1338:   vneg.s8   d9, d8
    0xf4aa133c:   vshl.u16  d9, d10, d9
    0xf4aa1340:   vstr      d9, [fp, #20]
    0xf4aa1344:   vneg.s8   d9, d8
    0xf4aa1348:   vshl.u16  d9, d11, d9
    0xf4aa134c:   vstr      d9, [fp, #28]

I suppose it's more of a trade-off: either keep those two R/LShiftCntV nodes and change AArch64 and X86 to do what AArch32 does, or merge them and accept this defect on AArch32.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3371

From roland at openjdk.java.net Thu Apr 8 12:03:18 2021
From: roland at openjdk.java.net (Roland Westrelin)
Date: Thu, 8 Apr 2021 12:03:18 GMT
Subject: RFR: 8264063: Outer Safepoint poll load should not reference the head of inner strip mined loop. [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 7 Apr 2021 16:05:01 GMT, Vladimir Kozlov wrote:

>> When loop is "strip mined" polling address load ([parse1.cpp#L2280](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/parse1.cpp#L2280)) should be cloned together with safepoint node and pinned outside inner loop. Otherwise we have issues like [8263352](https://bugs.openjdk.java.net/browse/JDK-8263352)
>>
>> I also remove leftover (unused needs_polling_address_input() method) from [8220051](https://bugs.openjdk.java.net/browse/JDK-8220051) changes.
>>
>> Tested hs-tier1-4
>
> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>
>   Removed unneeded anymore code in match_fill_loop()

Looks good to me.

-------------

Marked as reviewed by roland (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/3365 From vlivanov at openjdk.java.net Thu Apr 8 12:41:17 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 8 Apr 2021 12:41:17 GMT Subject: RFR: 8264063: Outer Safepoint poll load should not reference the head of inner strip mined loop. [v2] In-Reply-To: References: Message-ID: On Wed, 7 Apr 2021 16:05:01 GMT, Vladimir Kozlov wrote: >> When loop is "strip mined" polling address load ([parse1.cpp#L2280](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/parse1.cpp#L2280)) should be cloned together with safepoint node and pinned outside inner loop. Otherwise we have issues like [8263352](https://bugs.openjdk.java.net/browse/JDK-8263352) >> >> I also remove leftover (unused needs_polling_address_input() method) from [8220051](https://bugs.openjdk.java.net/browse/JDK-8220051) changes. >> >> Tested hs-tier1-4 > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Removed unneeded anymore code in match_fill_loop() Looks good. ------------- Marked as reviewed by vlivanov (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3365 From vlivanov at openjdk.java.net Thu Apr 8 13:02:36 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 8 Apr 2021 13:02:36 GMT Subject: RFR: 8264918: [JVMCI] getVtableIndexForInterfaceMethod doesn't check that type and method are related Message-ID: getVtableIndexForInterfaceMethod should reject the case when a resolved class doesn't implement the holder interface. 
Testing: - [x] hs-tier1 - hs-tier4 ------------- Commit messages: - getVtableIndexForInterfaceMethod Changes: https://git.openjdk.java.net/jdk/pull/3396/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3396&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264918 Stats: 11 lines in 2 files changed: 6 ins; 0 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/3396.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3396/head:pull/3396 PR: https://git.openjdk.java.net/jdk/pull/3396 From coleenp at openjdk.java.net Thu Apr 8 13:03:20 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 8 Apr 2021 13:03:20 GMT Subject: RFR: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines [v2] In-Reply-To: References: Message-ID: On Thu, 8 Apr 2021 09:25:48 GMT, David Holmes wrote: >> The existing JRT_ENTRY (and related) macros require the function to which they are applied to declare a parameter "JavaThread* thread" which represents the current thread. These functions are all implicitly "traps" functions as they can result in exceptions, but they are not declared with TRAPS because the only caller of these functions is the runtime itself (via call_VM) and no callers need to be aware to use CHECK; further they need a JavaThread. So the macro declares the THREAD variable for use with other exception-producing functions and assigns it from "thread". >> >> The majority of this change replaces the parameter name "thread" with "current" so that it is clear that we are always dealing with the current thread. This affects the entry functions as well as the functions called therefrom. >> >> We can then also replace the use of "THREAD" with "current", in contexts that are not related to exception processing. >> >> Some methods called by entry functions were declared to have both a "thread" parameter and a "TRAPS" parameter - with nothing to tell you these are always the same, current, thread. 
So the "thread" parameter is removed and replaced with a local variable "current" obtained from THREAD->as_Java_thread(). >> >> Some missing CHECK_ uses were added. >> >> Testing: >> - tiers 1-3 >> >> Thanks, >> David > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Fixed CHECK on return statement. I think substituting "JavaThread* thread" for "JavaThread* current" is a good change and convention that makes the code more clear, so worth the dull code review and diffs. src/hotspot/share/c1/c1_Runtime1.cpp line 180: > 178: static void deopt_caller() { > 179: if ( !caller_is_deopted()) { > 180: JavaThread* current = JavaThread::current(); It looks like both of these functions could be passed JavaThread from the callers. You could leave this as a cleanup though. It would eliminate two explicit JavaThread::current calls. src/hotspot/share/c1/c1_Runtime1.cpp line 414: > 412: assert(klass->is_klass(), "not a class"); > 413: assert(rank >= 1, "rank must be nonzero"); > 414: Handle holder(current, klass->klass_holder()); // keep the klass alive I think this is ok now, since current is obviously the current thread. There doesn't seem to be a need to use THREAD here. I don't know about changing all the uses of THREAD to current for Handles/methodHandles/ResourceMark/HandleMark but this seems ok in this context since JavaThread* current is present. src/hotspot/share/c1/c1_Runtime1.cpp line 695: > 693: NOT_PRODUCT(_throw_class_cast_exception_count++;) > 694: ResourceMark rm(current); > 695: char* message = SharedRuntime::generate_class_cast_message(current, object->klass()); Is this indentation off? 
src/hotspot/share/interpreter/interpreterRuntime.cpp line 1157: > 1155: JRT_END > 1156: > 1157: JRT_ENTRY(void, InterpreterRuntime::post_field_access(JavaThread *current, oopDesc* obj, nit: JavaThread* current src/hotspot/share/interpreter/interpreterRuntime.cpp line 1237: > 1235: JRT_END > 1236: > 1237: JRT_ENTRY(void, InterpreterRuntime::post_method_entry(JavaThread *current)) Also move the star here and one below. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3394 From hseigel at openjdk.java.net Thu Apr 8 13:16:26 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Thu, 8 Apr 2021 13:16:26 GMT Subject: RFR: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines [v2] In-Reply-To: References: Message-ID: <4IppJ2wYSTxcAKq1z7iSB8fxmmtDSlFimFFSJs9no5g=.29e6b187-a81b-4f9f-afd1-0ce0e9b956f7@github.com> On Thu, 8 Apr 2021 09:25:48 GMT, David Holmes wrote: >> The existing JRT_ENTRY (and related) macros require the function to which they are applied to declare a parameter "JavaThread* thread" which represents the current thread. These functions are all implicitly "traps" functions as they can result in exceptions, but they are not declared with TRAPS because the only caller of these functions is the runtime itself (via call_VM) and no callers need to be aware to use CHECK; further they need a JavaThread. So the macro declares the THREAD variable for use with other exception-producing functions and assigns it from "thread". >> >> The majority of this change replaces the parameter name "thread" with "current" so that it is clear that we are always dealing with the current thread. This affects the entry functions as well as the functions called therefrom. >> >> We can then also replace the use of "THREAD" with "current", in contexts that are not related to exception processing. 
>> >> Some methods called by entry functions were declared to have both a "thread" parameter and a "TRAPS" parameter - with nothing to tell you these are always the same, current, thread. So the "thread" parameter is removed and replaced with a local variable "current" obtained from THREAD->as_Java_thread(). >> >> Some missing CHECK_ uses were added. >> >> Testing: >> - tiers 1-3 >> >> Thanks, >> David > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Fixed CHECK on return statement. src/hotspot/share/interpreter/interpreterRuntime.cpp line 303: > 301: // We'd expect to assert that we're only here to quicken bytecodes, but in a multithreaded > 302: // program we might have seen an unquick'd bytecode in the interpreter but have another > 303: // current quicken the bytecode before we get here. This should still say 'thread', not 'current' ------------- PR: https://git.openjdk.java.net/jdk/pull/3394 From dholmes at openjdk.java.net Thu Apr 8 13:16:25 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 8 Apr 2021 13:16:25 GMT Subject: RFR: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines [v2] In-Reply-To: References: Message-ID: <2fThFbhrFCVWjj4KuHJyYtHl47kKuyrBTwkiZFqDQ9c=.2abb180f-479d-4776-aada-6235359d23b9@github.com> On Thu, 8 Apr 2021 13:00:03 GMT, Coleen Phillimore wrote: >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed CHECK on return statement. > > I think substituting "JavaThread* thread" for "JavaThread* current" is a good change and convention that makes the code more clear, so worth the dull code review and diffs. Looks like I missed some definitions that aren't included in our test builds but have been found via GHA builds. I will rectify those and update. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3394 From vlivanov at openjdk.java.net Thu Apr 8 13:41:20 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 8 Apr 2021 13:41:20 GMT Subject: RFR: 8262916: Merge LShiftCntV and RShiftCntV into a single node In-Reply-To: References: <9T0fatLfUStsY_1SA94rlhKokfqT7TeySYeFzIuO7Nk=.7394e20e-a4a3-4786-a2db-f5d8acc2b74a@github.com> Message-ID: On Thu, 8 Apr 2021 11:52:02 GMT, Eric Liu wrote: >> Regarding the proposed change itself (`LShiftCntV/RShiftCntV => ShiftCntV`). >> >> Not sure how important it is, but it has an unfortunate change in generated code for right vector shifts on AArch32: instead of sharing the result of index negation at all use sites, negation is performed at every use site now. >> >> As a consequence, in an auto-vectorized loop it will lead to: >> - 1 instruction per loop iteration (multiplied by unrolling factor); >> - no way to hoist the negation of loop invariant index. > >> Regarding the proposed change itself (`LShiftCntV/RShiftCntV => ShiftCntV`). >> >> Not sure how important it is, but it has an unfortunate change in generated code for right vector shifts on AArch32: instead of sharing the result of index negation at all use sites, negation is performed at every use site now. >> >> As a consequence, in an auto-vectorized loop it will lead to: >> >> * 1 instruction per loop iteration (multiplied by unrolling factor); >> * no way to hoist the negation of loop invariant index. > > Thanks for your feedback! > > I have checked the generated code on AArch32 and it's a shame that 'vneg' is at every use point. 
>
> Before:
> 0xf46c8338: add fp, r7, fp
> 0xf46c833c: vshl.u16 d9, d9, d8
> 0xf46c8340: vstr d9, [fp, #12]
> 0xf46c8344: vshl.u16 d9, d10, d8
> 0xf46c8348: vstr d9, [fp, #20]
> 0xf46c834c: vshl.u16 d9, d11, d8
> 0xf46c8350: vstr d9, [fp, #28]
>
> After:
> 0xf4aa1328: add fp, r7, fp
> 0xf4aa132c: vneg.s8 d13, d8
> 0xf4aa1330: vshl.u16 d9, d9, d13
> 0xf4aa1334: vstr d9, [fp, #12]
> 0xf4aa1338: vneg.s8 d9, d8
> 0xf4aa133c: vshl.u16 d9, d10, d9
> 0xf4aa1340: vstr d9, [fp, #20]
> 0xf4aa1344: vneg.s8 d9, d8
> 0xf4aa1348: vshl.u16 d9, d11, d9
> 0xf4aa134c: vstr d9, [fp, #28]
>
> I suppose it's more of a trade-off: either keep the two R/LShiftCntV nodes and change AArch64 and X86 to do what AArch32 does, or merge them and accept this defect on AArch32.

`LShiftCntV`/`RShiftCntV` were added specifically for AArch32 and other platforms don't need/benefit from such separation.

Ultimately, I'd like to hear what AArch32 port maintainers think about it, but if there are concerns about performance impact expressed, as an alternative a platform-specific logic can be introduced that adjusts Ideal IR shape for `URShiftV`/`RShiftV` nodes accordingly. (Not sure whether it is a good trade-off w.r.t. code complexity though.)

-------------

PR: https://git.openjdk.java.net/jdk/pull/3371

From rcastanedalo at openjdk.java.net Thu Apr 8 13:53:15 2021
From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano)
Date: Thu, 8 Apr 2021 13:53:15 GMT
Subject: RFR: 8263227: C2: inconsistent spilling due to dead nodes in exception block
In-Reply-To:
References: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com>
Message-ID: <-OXlf8TYEEY3rOr6V8ae6zR2QtDnjfnso4R_CKA9alU=.5c7de539-587e-47dd-b25d-ef0443e024d6@github.com>

On Thu, 8 Apr 2021 10:46:10 GMT, Roberto Castañeda Lozano wrote:

>> src/hotspot/share/opto/lcm.cpp line 1415:
>>
>>> 1413: if (dead) {
>>> 1414: // Remove projections if n is a dead multi-node.
>>> 1415: for (uint k = j + n->outcnt(); sb->get_node(k)->is_Proj(); k--) { >> >> I don't get this logic. The loop is not executed if sb->get_node(j + n->outcnt()) is not Proj node. > > Thanks for reviewing, Vladimir! The intention of this loop is to remove all projections of a multi-node `n` before removing `n` itself (if it has been found to be dead). I indeed have to rethink this code as e.g. the loop can be executed if `n` is not a multi-node but `k` ends up pointing to a projection of another node. I will investigate and come back with a new revision. On second thought, at the loop entry, the only users of `n` are (unused) projections or else `n` wouldn't be dead. Assuming these projections are scheduled right after `n`, the code should be OK. In any case, I will submit a (hopefully) clearer revision. ------------- PR: https://git.openjdk.java.net/jdk/pull/3303 From kvn at openjdk.java.net Thu Apr 8 15:03:22 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 8 Apr 2021 15:03:22 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v2] In-Reply-To: References: Message-ID: On Mon, 29 Mar 2021 06:28:42 GMT, Xiaohong Gong wrote: >> The Vector API defines different element types for floating point VectorMask. For example, the bitwise related APIs use "`long/int`", while data related APIs use "`double/float`". This makes some optimizations that based on the type checking cannot work well. >> >> For example, the VectorBox/Unbox elimination like `"VectorUnbox (VectorBox v) ==> v"` requires the types of output and >> input are equal. Normally this is necessary. However, due to the different element type for floating point VectorMask with the same species, the VectorBox/Unbox pattern is optimized to: >> VectorLoadMask (VectorStoreMask vmask) >> Actually the types can be treated as the same one for such cases. 
And considering the vector mask representation is the same for >> vectors with the same element size and vector length, it's safe to do the optimization: >> VectorLoadMask (VectorStoreMask vmask) ==> vmask >> The same issue exists for GVN which is based on the type of a node. Making the mask node's` hash()/cmp()` methods depends on the element size instead of the element type can fix it. > > Xiaohong Gong has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask Okay. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3238 From kvn at openjdk.java.net Thu Apr 8 15:09:14 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 8 Apr 2021 15:09:14 GMT Subject: RFR: 8264063: Outer Safepoint poll load should not reference the head of inner strip mined loop. [v2] In-Reply-To: References: Message-ID: On Thu, 8 Apr 2021 12:38:36 GMT, Vladimir Ivanov wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed unneeded anymore code in match_fill_loop() > > Looks good. Thank you Vladimir and Roland. ------------- PR: https://git.openjdk.java.net/jdk/pull/3365 From kvn at openjdk.java.net Thu Apr 8 15:09:15 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 8 Apr 2021 15:09:15 GMT Subject: Integrated: 8264063: Outer Safepoint poll load should not reference the head of inner strip mined loop. 
In-Reply-To: References: Message-ID: On Wed, 7 Apr 2021 02:29:20 GMT, Vladimir Kozlov wrote: > When loop is "strip mined" polling address load ([parse1.cpp#L2280](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/parse1.cpp#L2280)) should be cloned together with safepoint node and pinned outside inner loop. Otherwise we have issues like [8263352](https://bugs.openjdk.java.net/browse/JDK-8263352) > > I also remove leftover (unused needs_polling_address_input() method) from [8220051](https://bugs.openjdk.java.net/browse/JDK-8220051) changes. > > Tested hs-tier1-4 This pull request has now been integrated. Changeset: 81d35e43 Author: Vladimir Kozlov URL: https://git.openjdk.java.net/jdk/commit/81d35e43 Stats: 57 lines in 7 files changed: 10 ins; 46 del; 1 mod 8264063: Outer Safepoint poll load should not reference the head of inner strip mined loop. Reviewed-by: roland, vlivanov ------------- PR: https://git.openjdk.java.net/jdk/pull/3365 From kvn at openjdk.java.net Thu Apr 8 15:20:26 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 8 Apr 2021 15:20:26 GMT Subject: RFR: 8264918: [JVMCI] getVtableIndexForInterfaceMethod doesn't check that type and method are related In-Reply-To: References: Message-ID: On Thu, 8 Apr 2021 12:49:21 GMT, Vladimir Ivanov wrote: > getVtableIndexForInterfaceMethod should reject the case when a resolved class doesn't implement the holder interface. > > Testing: > - [x] hs-tier1 - hs-tier4 Good. It needs review from Graal team. ------------- Marked as reviewed by kvn (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3396 From kvn at openjdk.java.net Thu Apr 8 17:56:06 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 8 Apr 2021 17:56:06 GMT Subject: RFR: 8263227: C2: inconsistent spilling due to dead nodes in exception block In-Reply-To: <-vl-o8PemGJNrb-7AtTQoNB6YGnhxoA3CVWot_iihlU=.90713355-bd98-4c1c-b253-08d660c1590e@github.com> References: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com> <-vl-o8PemGJNrb-7AtTQoNB6YGnhxoA3CVWot_iihlU=.90713355-bd98-4c1c-b253-08d660c1590e@github.com> Message-ID: On Thu, 8 Apr 2021 11:04:00 GMT, Roberto Casta?eda Lozano wrote: >> Also where is guarantee that all Proj users are in the same block. > >> Also where is guarantee that all Proj users are in the same block. > > Isn't this guaranteed by `PhaseCFG::schedule_node_into_block()`? > > https://github.com/openjdk/jdk/blob/ec599da68cac6349e7f28b508d8038c18a4e7347/src/hotspot/share/opto/gcm.cpp#L48-L72 > > I will run some tests with extra assertions in `PhaseCFG::verify()` to double-check. > > Also where is guarantee that all Proj users are in the same block. > > Isn't this guaranteed by `PhaseCFG::schedule_node_into_block()`? > > > I will run some tests with extra assertions in `PhaseCFG::verify()` to double-check. Okay. ------------- PR: https://git.openjdk.java.net/jdk/pull/3303 From mdoerr at openjdk.java.net Thu Apr 8 17:58:14 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Thu, 8 Apr 2021 17:58:14 GMT Subject: RFR: 8259822: [PPC64] Support the prefixed instruction format added in POWER10 [v8] In-Reply-To: References: Message-ID: <3JbTLT-r8ksY7GL4YSdsLKAW7In2c1qCQ9iZvpwemDI=.f6652e1d-6c15-4022-8bbd-25b9e665da48@github.com> On Mon, 5 Apr 2021 03:10:30 GMT, Kazunori Ogata wrote: >> Looks good to me, now. Is the latest version substantially tested? We need to rely on IBMs testing because nobody else has Power10. > > @TheRealMDoerr @CoreyAshford Thank you for your review. 
I think this is now ready to be integrated.
>
> @TheRealMDoerr Could you sponsor this pull request?

Yes, I can sponsor it. Thanks for testing.
@CoreyAshford Are all your requests resolved? Do you need more time for additional tests? Please approve once you're fine with it.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2095

From kvn at openjdk.java.net Thu Apr 8 18:24:09 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Thu, 8 Apr 2021 18:24:09 GMT
Subject: RFR: 8263227: C2: inconsistent spilling due to dead nodes in exception block
In-Reply-To: <-OXlf8TYEEY3rOr6V8ae6zR2QtDnjfnso4R_CKA9alU=.5c7de539-587e-47dd-b25d-ef0443e024d6@github.com>
References: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com> <-OXlf8TYEEY3rOr6V8ae6zR2QtDnjfnso4R_CKA9alU=.5c7de539-587e-47dd-b25d-ef0443e024d6@github.com>
Message-ID:

On Thu, 8 Apr 2021 13:49:56 GMT, Roberto Castañeda Lozano wrote:

>> Thanks for reviewing, Vladimir! The intention of this loop is to remove all projections of a multi-node `n` before removing `n` itself (if it has been found to be dead). I indeed have to rethink this code as e.g. the loop can be executed if `n` is not a multi-node but `k` ends up pointing to a projection of another node. I will investigate and come back with a new revision.
>
> On second thought, at the loop entry, the only users of `n` are (unused) projections or else `n` wouldn't be dead. Assuming these projections are scheduled right after `n`, the code should be OK. In any case, I will submit a (hopefully) clearer revision.

You are right that users will follow `n`, but I am not sure that no other nodes were inserted in between the users. I would prefer to see loop code like this:

    for (uint k = new_cnt; k > j; k-- ) {
      Node* user = sb->get_node(k);
      if (user->is_Proj() && user->in(0) == n) {

Or maybe record the users' indexes in a local array in the previous loop at lines #1406-1412 and iterate over that array.
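For readers following the thread, the reverse scan suggested above can be sketched outside HotSpot roughly as follows. This is only an illustration under stated assumptions: `Node`, `is_proj`, `in0` and `remove_dead_with_projections` are hypothetical stand-ins, not the real C2 `Node`/`Block` API, and `new_cnt` is taken here to be the last block index to inspect (its actual meaning in lcm.cpp is not shown in this thread).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a C2 node (hypothetical, not the HotSpot class):
// it carries only the two properties the loop inspects.
struct Node {
  int id;
  bool is_proj;     // true if this node is a projection
  const Node* in0;  // input 0, i.e. the node this projection belongs to
};

// Removes every projection of the node at position j that is found in
// positions (j, new_cnt], then removes the node itself.
// Scanning from the back keeps the not-yet-visited indexes valid while
// erasing, and the in0 check skips unrelated nodes (including foreign
// projections) that may have been scheduled in between.
// Returns the total number of nodes removed.
inline std::size_t remove_dead_with_projections(std::vector<const Node*>& block,
                                                std::size_t j,
                                                std::size_t new_cnt) {
  const Node* n = block[j];
  std::size_t removed = 0;
  for (std::size_t k = new_cnt; k > j; k--) {
    const Node* user = block[k];
    if (user->is_proj && user->in0 == n) {
      block.erase(block.begin() + static_cast<std::ptrdiff_t>(k));
      removed++;
    }
  }
  // All removed projections sat behind j, so j still points at n.
  block.erase(block.begin() + static_cast<std::ptrdiff_t>(j));
  return removed + 1;
}
```

The point of the ownership test is that the loop no longer relies on the projections being contiguous: a non-projection or a projection of another node between two users of `n` is simply skipped rather than ending the scan or being removed by mistake.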
I also noticed wrong code style in `for()` statements near your change. Please, fix them too.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3303

From kvn at openjdk.java.net Thu Apr 8 18:24:10 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Thu, 8 Apr 2021 18:24:10 GMT
Subject: RFR: 8263227: C2: inconsistent spilling due to dead nodes in exception block
In-Reply-To:
References: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com>
Message-ID:

On Thu, 8 Apr 2021 10:54:31 GMT, Roberto Castañeda Lozano wrote:

>> src/hotspot/share/opto/lcm.cpp line 1419:
>>
>>> 1417: "dead projection should correspond to current node");
>>> 1418: sb->get_node(k)->disconnect_inputs(C);
>>> 1419: sb->remove_node(k);
>>
>> If you remove node here then `j` could be incorrect in `sb->remove_node(j)` at line #1424
>
> `k` should always be greater than `j`, as `sb->get_node(k)` is a projection of `sb->get_node(j)`. Shouldn't this make the removal safe?

Yes, you are right - users should follow.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3303

From dnsimon at openjdk.java.net Thu Apr 8 19:00:08 2021
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Thu, 8 Apr 2021 19:00:08 GMT
Subject: RFR: 8264918: [JVMCI] getVtableIndexForInterfaceMethod doesn't check that type and method are related
In-Reply-To:
References:
Message-ID:

On Thu, 8 Apr 2021 15:17:28 GMT, Vladimir Kozlov wrote:

>> getVtableIndexForInterfaceMethod should reject the case when a resolved class doesn't implement the holder interface.
>>
>> Testing:
>> - [x] hs-tier1 - hs-tier4
>
> Good.
> It needs review from Graal team.

The changes look good to me.
How did you discover this problem? If the detail is in https://bugs.openjdk.java.net/browse/JDK-8264918, I can wait until the JBS server comes back up.
------------- PR: https://git.openjdk.java.net/jdk/pull/3396 From vlivanov at openjdk.java.net Thu Apr 8 20:05:09 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 8 Apr 2021 20:05:09 GMT Subject: RFR: 8264918: [JVMCI] getVtableIndexForInterfaceMethod doesn't check that type and method are related In-Reply-To: References: Message-ID: On Thu, 8 Apr 2021 18:57:11 GMT, Doug Simon wrote: >> Good. >> It needs review from Graal team. > > The changes look good to me. > How did you discover this problem? If the detail is in https://bugs.openjdk.java.net/browse/JDK-8264918, I can wait until the JBS server comes back up. Thanks for the reviews, Vladimir and Doug. It was discovered by a new assert introduced in #3346. ------------- PR: https://git.openjdk.java.net/jdk/pull/3396 From cashford at openjdk.java.net Thu Apr 8 20:40:10 2021 From: cashford at openjdk.java.net (Corey Ashford) Date: Thu, 8 Apr 2021 20:40:10 GMT Subject: RFR: 8259822: [PPC64] Support the prefixed instruction format added in POWER10 [v8] In-Reply-To: References: Message-ID: <-yjWFxV8D5j_g3Ke6Eb5cCtNnshSGiqPSHXuhXrJTRg=.9339fd6d-0ec6-473b-aa2c-45c08450144e@github.com> On Thu, 25 Mar 2021 06:16:53 GMT, Kazunori Ogata wrote: >> The POWER10 processor, which implements Power ISA 3.1 [1], supports new instruction formats where an instruction takes two 32bit words. The first word is called prefix, and the instructions with prefix are called prefixed instructions. With more bits in opcode and operand fields, POWER10 supports larger immediate value in an operand, as well as many new instructions. >> >> This is the first changes to handle prefixed instructions, and this adds support of prefixed addi (= paddi) instruction as an example of prefix usage. paddi accepts 34bit immediate value, while original addi accepts 16bit value. 
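To make the immediate-width difference concrete, here is a minimal sketch of how a code generator might pick between the two forms. It is not taken from the patch: `fits_signed`, `AddImmForm` and `choose_add_imm_form` are hypothetical names, and the sketch assumes both immediates are signed two's-complement fields (16 bits for addi, 34 bits for paddi), which the description above does not spell out.

```cpp
#include <cassert>
#include <cstdint>

// True if `value` is representable as a signed two's-complement
// integer that is `bits` wide (valid for 1 <= bits <= 63).
inline bool fits_signed(std::int64_t value, unsigned bits) {
  const std::int64_t lo = -(std::int64_t{1} << (bits - 1));
  const std::int64_t hi = (std::int64_t{1} << (bits - 1)) - 1;
  return lo <= value && value <= hi;
}

// Hypothetical classification of an add-immediate operand.
enum class AddImmForm {
  Addi,          // fits the 16-bit immediate of the one-word instruction
  Paddi,         // needs the 34-bit immediate of the prefixed (two-word) form
  NeedsSequence  // too wide for either; must be materialized another way
};

inline AddImmForm choose_add_imm_form(std::int64_t imm) {
  if (fits_signed(imm, 16)) return AddImmForm::Addi;
  if (fits_signed(imm, 34)) return AddImmForm::Paddi;
  return AddImmForm::NeedsSequence;
}
```

Preferring the shorter encoding whenever the value fits keeps code size down, since a prefixed instruction occupies two 32-bit words.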
>> >> [1] https://ibm.ent.box.com/s/hhjfw0x0lrbtyzmiaffnbxh2fuo0fog0 > > Kazunori Ogata has updated the pull request incrementally with one additional commit since the last revision: > > Clean up compute_padding() and fix grammatical errors Marked as reviewed by cashford (Author). ------------- PR: https://git.openjdk.java.net/jdk/pull/2095 From cashford at openjdk.java.net Thu Apr 8 20:40:11 2021 From: cashford at openjdk.java.net (Corey Ashford) Date: Thu, 8 Apr 2021 20:40:11 GMT Subject: RFR: 8259822: [PPC64] Support the prefixed instruction format added in POWER10 [v8] In-Reply-To: <3JbTLT-r8ksY7GL4YSdsLKAW7In2c1qCQ9iZvpwemDI=.f6652e1d-6c15-4022-8bbd-25b9e665da48@github.com> References: <3JbTLT-r8ksY7GL4YSdsLKAW7In2c1qCQ9iZvpwemDI=.f6652e1d-6c15-4022-8bbd-25b9e665da48@github.com> Message-ID: On Thu, 8 Apr 2021 17:55:15 GMT, Martin Doerr wrote: >> @TheRealMDoerr @CoreyAshford Thank you for your review. I think this is now ready to be integrated. >> >> @TheRealMDoerr Could you sponsor this pull request? > > Yes, I can sponsor it. Thanks for testing. > @CoreyAshford Are all your requests resolved? Do you need more time for additional tests? Please approve once you're fine with it. The only other thing I see is that there are a number of static functions that are defined for accessing fields (e.g. `inv_st_x1`), but aren't actually used yet. However, I think these are probably ok, since they are not very complex. So I will approve. ------------- PR: https://git.openjdk.java.net/jdk/pull/2095 From dholmes at openjdk.java.net Thu Apr 8 23:40:18 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 8 Apr 2021 23:40:18 GMT Subject: RFR: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines [v2] In-Reply-To: References: Message-ID: On Thu, 8 Apr 2021 12:18:12 GMT, Coleen Phillimore wrote: >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed CHECK on return statement. 
> > src/hotspot/share/c1/c1_Runtime1.cpp line 180: > >> 178: static void deopt_caller() { >> 179: if ( !caller_is_deopted()) { >> 180: JavaThread* current = JavaThread::current(); > > It looks like both of these functions could be passed JavaThread from the callers. You could leave this as a cleanup though. It would eliminate two explicit JavaThread::current calls. That's a good additional cleanup - thanks. Fixed. > src/hotspot/share/c1/c1_Runtime1.cpp line 695: > >> 693: NOT_PRODUCT(_throw_class_cast_exception_count++;) >> 694: ResourceMark rm(current); >> 695: char* message = SharedRuntime::generate_class_cast_message(current, object->klass()); > > Is this indentation off? Fixed. (My emacs can't figure out how to indent when these macros are used. :( ) > src/hotspot/share/interpreter/interpreterRuntime.cpp line 1157: > >> 1155: JRT_END >> 1156: >> 1157: JRT_ENTRY(void, InterpreterRuntime::post_field_access(JavaThread *current, oopDesc* obj, > > nit: JavaThread* current Fixed > src/hotspot/share/interpreter/interpreterRuntime.cpp line 1237: > >> 1235: JRT_END >> 1236: >> 1237: JRT_ENTRY(void, InterpreterRuntime::post_method_entry(JavaThread *current)) > > Also move the star here and one below. Fixed all. ------------- PR: https://git.openjdk.java.net/jdk/pull/3394 From dholmes at openjdk.java.net Thu Apr 8 23:40:19 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 8 Apr 2021 23:40:19 GMT Subject: RFR: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines [v2] In-Reply-To: <4IppJ2wYSTxcAKq1z7iSB8fxmmtDSlFimFFSJs9no5g=.29e6b187-a81b-4f9f-afd1-0ce0e9b956f7@github.com> References: <4IppJ2wYSTxcAKq1z7iSB8fxmmtDSlFimFFSJs9no5g=.29e6b187-a81b-4f9f-afd1-0ce0e9b956f7@github.com> Message-ID: On Thu, 8 Apr 2021 13:11:36 GMT, Harold Seigel wrote: >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed CHECK on return statement. 
> > src/hotspot/share/interpreter/interpreterRuntime.cpp line 303: > >> 301: // We'd expect to assert that we're only here to quicken bytecodes, but in a multithreaded >> 302: // program we might have seen an unquick'd bytecode in the interpreter but have another >> 303: // current quicken the bytecode before we get here. > > This should still say 'thread', not 'current' Well spotted! Fixed. ------------- PR: https://git.openjdk.java.net/jdk/pull/3394 From dholmes at openjdk.java.net Thu Apr 8 23:45:40 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 8 Apr 2021 23:45:40 GMT Subject: RFR: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines [v3] In-Reply-To: References: Message-ID: > The existing JRT_ENTRY (and related) macros require the function to which they are applied to declare a parameter "JavaThread* thread" which represents the current thread. These functions are all implicitly "traps" functions as they can result in exceptions, but they are not declared with TRAPS because the only caller of these functions is the runtime itself (via call_VM) and no callers need to be aware to use CHECK; further they need a JavaThread. So the macro declares the THREAD variable for use with other exception-producing functions and assigns it from "thread". > > The majority of this change replaces the parameter name "thread" with "current" so that it is clear that we are always dealing with the current thread. This affects the entry functions as well as the functions called therefrom. > > We can then also replace the use of "THREAD" with "current", in contexts that are not related to exception processing. > > Some methods called by entry functions were declared to have both a "thread" parameter and a "TRAPS" parameter - with nothing to tell you these are always the same, current, thread. So the "thread" parameter is removed and replaced with a local variable "current" obtained from THREAD->as_Java_thread(). > > Some missing CHECK_ uses were added. 
> > Testing: > - tiers 1-3 > > Thanks, > David David Holmes has updated the pull request incrementally with three additional commits since the last revision: - Fix minor nits reported by @coleenp and @hseigel - @coleenp review comment - Avoid manifesting JavaThread::current() when it can be passed in - Added in missed JRT_BLOCK_ENTRY methods in AOT-related jvmci/compilerRuntime.cpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3394/files - new: https://git.openjdk.java.net/jdk/pull/3394/files/1adf0fd0..041d7ef5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3394&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3394&range=01-02 Stats: 71 lines in 5 files changed: 1 ins; 2 del; 68 mod Patch: https://git.openjdk.java.net/jdk/pull/3394.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3394/head:pull/3394 PR: https://git.openjdk.java.net/jdk/pull/3394 From dongbo at openjdk.java.net Fri Apr 9 01:31:21 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Fri, 9 Apr 2021 01:31:21 GMT Subject: Integrated: 8256245: AArch64: Implement Base64 decoding intrinsic In-Reply-To: References: Message-ID: On Sat, 27 Mar 2021 08:58:03 GMT, Dong Bo wrote: > In JDK-8248188, IntrinsicCandidate and API is added for Base64 decoding. > Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3, a basic idea can be found at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords. > > Patch passed jtreg tier1-3 tests with linux-aarch64-server-fastdebug build. > Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` runned specially for the correctness of the implementation. > > There can be illegal characters at the start of the input if the data is MIME encoded. > It would be no benefits to use SIMD for this case, so the stub use no-simd instructions for MIME encoded data now. > > A JMH micro, Base64Decode.java, is added for performance test. 
> With different input lengths (upper-bounded by parameter `maxNumBytes` in the JMH micro),
> we witness ~2.5x improvements with long inputs and no regression with short inputs for raw base64 decoding, and minor improvements (~10.95%) for MIME on Kunpeng916.
>
> The Base64Decode.java JMH micro-benchmark results:
>
> Benchmark                          (lineSize)  (maxNumBytes)  Mode  Cnt       Score      Error  Units
>
> # Kunpeng916, intrinsic
> Base64Decode.testBase64Decode               4              1  avgt    5      48.614 ±    0.609  ns/op
> Base64Decode.testBase64Decode               4              3  avgt    5      58.199 ±    1.650  ns/op
> Base64Decode.testBase64Decode               4              7  avgt    5      69.400 ±    0.931  ns/op
> Base64Decode.testBase64Decode               4             32  avgt    5      96.818 ±    1.687  ns/op
> Base64Decode.testBase64Decode               4             64  avgt    5     122.856 ±    9.217  ns/op
> Base64Decode.testBase64Decode               4             80  avgt    5     130.935 ±    1.667  ns/op
> Base64Decode.testBase64Decode               4             96  avgt    5     143.627 ±    1.751  ns/op
> Base64Decode.testBase64Decode               4            112  avgt    5     152.311 ±    1.178  ns/op
> Base64Decode.testBase64Decode               4            512  avgt    5     342.631 ±    0.584  ns/op
> Base64Decode.testBase64Decode               4           1000  avgt    5     573.635 ±    1.050  ns/op
> Base64Decode.testBase64Decode               4          20000  avgt    5    9534.136 ±   45.172  ns/op
> Base64Decode.testBase64Decode               4          50000  avgt    5   22718.726 ±  192.070  ns/op
> Base64Decode.testBase64MIMEDecode           4              1  avgt   10      63.558 ±    0.336  ns/op
> Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.504 ±    0.848  ns/op
> Base64Decode.testBase64MIMEDecode           4              7  avgt   10     120.591 ±    0.608  ns/op
> Base64Decode.testBase64MIMEDecode           4             32  avgt   10     324.314 ±    6.236  ns/op
> Base64Decode.testBase64MIMEDecode           4             64  avgt   10     532.678 ±    4.670  ns/op
> Base64Decode.testBase64MIMEDecode           4             80  avgt   10     678.126 ±    4.324  ns/op
> Base64Decode.testBase64MIMEDecode           4             96  avgt   10     771.603 ±    6.393  ns/op
> Base64Decode.testBase64MIMEDecode           4            112  avgt   10     889.608 ±    0.759  ns/op
> Base64Decode.testBase64MIMEDecode           4            512  avgt   10    3663.557 ±    3.422  ns/op
> Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7017.784 ±    9.128  ns/op
> Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  128670.660 ± 7951.521  ns/op
> Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  317113.667 ±  161.758  ns/op
>
> # Kunpeng916, default
> Base64Decode.testBase64Decode               4              1  avgt    5      48.455 ±    0.571  ns/op
> Base64Decode.testBase64Decode               4              3  avgt    5      57.937 ±    0.505  ns/op
> Base64Decode.testBase64Decode               4              7  avgt    5      73.823 ±    1.452  ns/op
> Base64Decode.testBase64Decode               4             32  avgt    5     106.484 ±    1.243  ns/op
> Base64Decode.testBase64Decode               4             64  avgt    5     141.004 ±    1.188  ns/op
> Base64Decode.testBase64Decode               4             80  avgt    5     156.284 ±    0.572  ns/op
> Base64Decode.testBase64Decode               4             96  avgt    5     174.137 ±    0.177  ns/op
> Base64Decode.testBase64Decode               4            112  avgt    5     188.445 ±    0.572  ns/op
> Base64Decode.testBase64Decode               4            512  avgt    5     610.847 ±    1.559  ns/op
> Base64Decode.testBase64Decode               4           1000  avgt    5    1155.368 ±    0.813  ns/op
> Base64Decode.testBase64Decode               4          20000  avgt    5   19751.477 ±   24.669  ns/op
> Base64Decode.testBase64Decode               4          50000  avgt    5   50046.586 ±  523.155  ns/op
> Base64Decode.testBase64MIMEDecode           4              1  avgt   10      64.130 ±    0.238  ns/op
> Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.096 ±    0.205  ns/op
> Base64Decode.testBase64MIMEDecode           4              7  avgt   10     118.849 ±    0.610  ns/op
> Base64Decode.testBase64MIMEDecode           4             32  avgt   10     331.177 ±    4.732  ns/op
> Base64Decode.testBase64MIMEDecode           4             64  avgt   10     549.117 ±    0.177  ns/op
> Base64Decode.testBase64MIMEDecode           4             80  avgt   10     702.951 ±    4.572  ns/op
> Base64Decode.testBase64MIMEDecode           4             96  avgt   10     799.566 ±    0.301  ns/op
> Base64Decode.testBase64MIMEDecode           4            112  avgt   10     923.749 ±    0.389  ns/op
> Base64Decode.testBase64MIMEDecode           4            512  avgt   10    4000.725 ±    2.519  ns/op
> Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7674.994 ±    9.281  ns/op
> Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  142059.001 ±  157.920  ns/op
> Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  355698.369 ±  216.542  ns/op

This pull request has now been integrated.
Changeset: 77b16739 Author: Dong Bo Committer: Fei Yang URL: https://git.openjdk.java.net/jdk/commit/77b16739 Stats: 436 lines in 3 files changed: 436 ins; 0 del; 0 mod 8256245: AArch64: Implement Base64 decoding intrinsic Reviewed-by: aph, ngasson ------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From jiefu at openjdk.java.net Fri Apr 9 02:26:46 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 9 Apr 2021 02:26:46 GMT Subject: RFR: 8264945: Optimize the code-gen for Math.pow(x, 0.5) Message-ID: Hi all, I'd like to optimize the code-gen for Math.pow(x, 0.5). And 7x ~ 14x performance improvement is observed by the jmh micro-benchmarks. While I was optimizing a machine learning program, I found both Math.pow(x, 2) and Math.pow(x, 0.5) are used. To my surprise, C2 just optimizes the case for Math.pow(x, 2) [1], but still not for Math.pow(x, 0.5) yet. The patch just replace Math.pow(x, 0.5) with Math.sqrt(x). Before: Benchmark (seed) Mode Cnt Score Error Units MathBench.powDouble0Dot5 0 thrpt 8 45525.117 ? 11.686 ops/ms MathBench.powDouble0Dot5Loop 0 thrpt 8 0.031 ? 0.001 ops/ms Benchmark (seed) Mode Cnt Score Error Units MathBench.powDouble0Dot5 0 thrpt 8 45509.317 ? 6.581 ops/ms MathBench.powDouble0Dot5Loop 0 thrpt 8 0.031 ? 0.001 ops/ms After: Benchmark (seed) Mode Cnt Score Error Units MathBench.powDouble0Dot5 0 thrpt 8 343354.892 ? 362.900 ops/ms MathBench.powDouble0Dot5Loop 0 thrpt 8 0.457 ? 0.001 ops/mso Benchmark (seed) Mode Cnt Score Error Units MathBench.powDouble0Dot5 0 thrpt 8 343421.559 ? 49.326 ops/ms MathBench.powDouble0Dot5Loop 0 thrpt 8 0.457 ? 
0.001 ops/ms Testing: - tier1~3 on Linux/x64 Thanks, Best regards, Jie [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/library_call.cpp#L1680 ------------- Commit messages: - 8264945: Optimize the code-gen for Math.pow(x, 0.5) Changes: https://git.openjdk.java.net/jdk/pull/3404/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3404&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264945 Stats: 28 lines in 2 files changed: 23 ins; 0 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/3404.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3404/head:pull/3404 PR: https://git.openjdk.java.net/jdk/pull/3404 From hshi at openjdk.java.net Fri Apr 9 03:41:34 2021 From: hshi at openjdk.java.net (Hui Shi) Date: Fri, 9 Apr 2021 03:41:34 GMT Subject: RFR: 8264945: Optimize the code-gen for Math.pow(x, 0.5) In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 02:19:10 GMT, Jie Fu wrote: > Hi all, > > I'd like to optimize the code-gen for Math.pow(x, 0.5). > And 7x ~ 14x performance improvement is observed by the jmh micro-benchmarks. > > While I was optimizing a machine learning program, I found both Math.pow(x, 2) and Math.pow(x, 0.5) are used. > To my surprise, C2 just optimizes the case for Math.pow(x, 2) [1], but still not for Math.pow(x, 0.5) yet. > > The patch just replace Math.pow(x, 0.5) with Math.sqrt(x). > > Before: > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble0Dot5 0 thrpt 8 45525.117 ? 11.686 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.031 ? 0.001 ops/ms > > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble0Dot5 0 thrpt 8 45509.317 ? 6.581 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.031 ? 0.001 ops/ms > > After: > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble0Dot5 0 thrpt 8 343354.892 ? 362.900 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.457 ? 
0.001 ops/ms > > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble0Dot5 0 thrpt 8 343421.559 ± 49.326 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.457 ± 0.001 ops/ms > > Testing: > - tier1~3 on Linux/x64 > > Thanks, > Best regards, > Jie > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/library_call.cpp#L1680 Does the float vector need the same optimization (converting pow to sqrt)? ------------- PR: https://git.openjdk.java.net/jdk/pull/3404 From jiefu at openjdk.java.net Fri Apr 9 03:59:15 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 9 Apr 2021 03:59:15 GMT Subject: RFR: 8264945: Optimize the code-gen for Math.pow(x, 0.5) In-Reply-To: References: Message-ID: <0LkbLa3f1wADOWAqkzIyDR3w-jlo1-AsUE5fQn_vsXM=.a1c544d6-b849-4a4a-b34b-2640e8f0fe27@github.com> On Fri, 9 Apr 2021 03:37:52 GMT, Hui Shi wrote: > Does the float vector need the same optimization (converting pow to sqrt)? Thanks @huishi-hs for your review. The same optimization for the Vector API is in progress. Let's see what the reviewers think of this opt. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/3404 From dlong at openjdk.java.net Fri Apr 9 05:05:15 2021 From: dlong at openjdk.java.net (Dean Long) Date: Fri, 9 Apr 2021 05:05:15 GMT Subject: RFR: 8262916: Merge LShiftCntV and RShiftCntV into a single node In-Reply-To: References: <9T0fatLfUStsY_1SA94rlhKokfqT7TeySYeFzIuO7Nk=.7394e20e-a4a3-4786-a2db-f5d8acc2b74a@github.com> Message-ID: On Thu, 8 Apr 2021 11:52:02 GMT, Eric Liu wrote: >> Regarding the proposed change itself (`LShiftCntV/RShiftCntV => ShiftCntV`). >> >> Not sure how important it is, but it has an unfortunate change in generated code for right vector shifts on AArch32: instead of sharing the result of index negation at all use sites, negation is performed at every use site now.
>> >> As a consequence, in an auto-vectorized loop it will lead to: >> - 1 instruction per loop iteration (multiplied by unrolling factor); >> - no way to hoist the negation of loop invariant index. > >> Regarding the proposed change itself (`LShiftCntV/RShiftCntV => ShiftCntV`). >> >> Not sure how important it is, but it has an unfortunate change in generated code for right vector shifts on AArch32: instead of sharing the result of index negation at all use sites, negation is performed at every use site now. >> >> As a consequence, in an auto-vectorized loop it will lead to: >> >> * 1 instruction per loop iteration (multiplied by unrolling factor); >> * no way to hoist the negation of loop invariant index. > > Thanks for your feedback! > > I have checked the generated code on AArch32 and it's a shame that 'vneg' is at every use point. > > Before: > 0xf46c8338: add fp, r7, fp > 0xf46c833c: vshl.u16 d9, d9, d8 > 0xf46c8340: vstr d9, [fp, #12] > 0xf46c8344: vshl.u16 d9, d10, d8 > 0xf46c8348: vstr d9, [fp, #20] > 0xf46c834c: vshl.u16 d9, d11, d8 > 0xf46c8350: vstr d9, [fp, #28] > > After: > 0xf4aa1328: add fp, r7, fp > 0xf4aa132c: vneg.s8 d13, d8 > 0xf4aa1330: vshl.u16 d9, d9, d13 > 0xf4aa1334: vstr d9, [fp, #12] > 0xf4aa1338: vneg.s8 d9, d8 > 0xf4aa133c: vshl.u16 d9, d10, d9 > 0xf4aa1340: vstr d9, [fp, #20] > 0xf4aa1344: vneg.s8 d9, d8 > 0xf4aa1348: vshl.u16 d9, d11, d9 > 0xf4aa134c: vstr d9, [fp, #28] > > I suppose it's more of a trade-off: either keep those two R/LShiftCntV nodes and change AArch64 and X86 to do what AArch32 does, or merge them and accept this defect on AArch32. @theRealELiu @iwanowww Yes, but I was thinking of the `vshiftcnt` rules and didn't realize the `replicate` rules are virtually identical. Now that I look again, it appears that aarch64 is already doing the same as x86 (converting the scalar shift count to a vector register, and matching the vector register in the rule).
So why is it that x86 can share the shift register, but aarch64 cannot? Is it because x86 has combined left- and right-shift into the same rule? ------------- PR: https://git.openjdk.java.net/jdk/pull/3371 From dholmes at openjdk.java.net Fri Apr 9 05:08:37 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 9 Apr 2021 05:08:37 GMT Subject: RFR: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines [v4] In-Reply-To: References: Message-ID: <_q9nVMOqIePMWqifBZo_Cj9Vsal63l2BLN8X7x5S8iw=.abdcc58f-877d-4e36-b26e-3699cb05eeda@github.com> > The existing JRT_ENTRY (and related) macros require the function to which they are applied to declare a parameter "JavaThread* thread" which represents the current thread. These functions are all implicitly "traps" functions as they can result in exceptions, but they are not declared with TRAPS because the only caller of these functions is the runtime itself (via call_VM) and no callers need to be aware to use CHECK; further they need a JavaThread. So the macro declares the THREAD variable for use with other exception-producing functions and assigns it from "thread". > > The majority of this change replaces the parameter name "thread" with "current" so that it is clear that we are always dealing with the current thread. This affects the entry functions as well as the functions called therefrom. > > We can then also replace the use of "THREAD" with "current", in contexts that are not related to exception processing. > > Some methods called by entry functions were declared to have both a "thread" parameter and a "TRAPS" parameter - with nothing to tell you these are always the same, current, thread. So the "thread" parameter is removed and replaced with a local variable "current" obtained from THREAD->as_Java_thread(). > > Some missing CHECK_ uses were added. 
> > Testing: > - tiers 1-3 > > Thanks, > David David Holmes has updated the pull request incrementally with one additional commit since the last revision: Fix search&replace mistake ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3394/files - new: https://git.openjdk.java.net/jdk/pull/3394/files/041d7ef5..9cfc43c2 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3394&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3394&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3394.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3394/head:pull/3394 PR: https://git.openjdk.java.net/jdk/pull/3394 From shade at openjdk.java.net Fri Apr 9 06:32:19 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 9 Apr 2021 06:32:19 GMT Subject: RFR: 8264885: Fix the code style of macro in aarch64_neon_ad.m4 In-Reply-To: References: Message-ID: On Thu, 8 Apr 2021 08:28:52 GMT, Wang Huang wrote: > * trivial fix > * align the comment of macros Marked as reviewed by shade (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3395 From whuang at openjdk.java.net Fri Apr 9 06:32:20 2021 From: whuang at openjdk.java.net (Wang Huang) Date: Fri, 9 Apr 2021 06:32:20 GMT Subject: Integrated: 8264885: Fix the code style of macro in aarch64_neon_ad.m4 In-Reply-To: References: Message-ID: <2BpB-3gRVHHOD-FrAM_jnXDzC5jv0MqpSN8UjyKIzxc=.62071fc2-2ec0-4945-ae04-1f0f9815ac5e@github.com> On Thu, 8 Apr 2021 08:28:52 GMT, Wang Huang wrote: > * trivial fix > * align the comment of macros This pull request has now been integrated. 
Changeset: 3e57924a Author: Wang Huang Committer: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/3e57924a Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod 8264885: Fix the code style of macro in aarch64_neon_ad.m4 Reviewed-by: aph, shade ------------- PR: https://git.openjdk.java.net/jdk/pull/3395 From thartmann at openjdk.java.net Fri Apr 9 07:04:24 2021 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Fri, 9 Apr 2021 07:04:24 GMT Subject: RFR: 8264649: runtime/InternalApi/ThreadCpuTimesDeadlock.java crash in fastdebug C2 with -XX:-UseTLAB [v2] In-Reply-To: References: Message-ID: <-63abPXoiWhcD8FhNKDptvxLVKYCLjKW5WtAGKUVg4U=.9b9367a3-202f-418c-b049-dad477aa6f2f@github.com> On Wed, 7 Apr 2021 12:09:05 GMT, Hui Shi wrote: >> Please help review this fix. Detailed analysis in https://bugs.openjdk.java.net/browse/JDK-8264649 >> >> Tier1/2 release /fastdebug passed > > Hui Shi has updated the pull request incrementally with one additional commit since the last revision: > > fix typo in test description Thanks for the details. Your fix looks reasonable to me. ------------- Marked as reviewed by thartmann (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3336 From stefank at openjdk.java.net Fri Apr 9 07:09:08 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Fri, 9 Apr 2021 07:09:08 GMT Subject: RFR: 8264358: Don't create invalid oop in method handle tracing In-Reply-To: References: Message-ID: <_Nynq1SJulaT-SE9YZZ06M5q9Neg137TssgQL3p1-ag=.3a8d545a-b45e-4282-848b-7340b8145317@github.com> On Mon, 29 Mar 2021 11:49:56 GMT, Stefan Karlsson wrote: > The `mh` field in: > > struct MethodHandleStubArguments { > const char* adaptername; > oopDesc* mh; > intptr_t* saved_regs; > intptr_t* entry_sp; > }; > > doesn't always point to a valid object. 
The `oopDesc*` is then implicitly converted to an `oop` here: > > void trace_method_handle_stub_wrapper(MethodHandleStubArguments* args) { > trace_method_handle_stub(args->adaptername, > args->mh, > args->saved_regs, > args->entry_sp); > } > > This gets caught by my ad-hoc verification code that verifies oops when they are created/used. > > I propose that we don't create an oop until `mh` is actually used, and it has been checked that the argument should contain a valid oop. I started with a more elaborate fix that changed the type of `mh` to be `void*`, but then reverted to a more targeted fix to remove the early oopDesc* > oop conversion. > > One thing that I am curious about is this code inside trace_method_handle_stub: > if (has_mh && oopDesc::is_oop(mh)) { > mh->print_on(&ls); > > Delaying the oopDesc* > oop conversion to after the `has_mh` check solves my verification failure, but I wonder if the `oopDesc::is_oop(mh)` call is really needed when we have the `has_mh` check? Could I get a review for this? It is a super small change. ------------- PR: https://git.openjdk.java.net/jdk/pull/3242 From jbhateja at openjdk.java.net Fri Apr 9 07:30:28 2021 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Fri, 9 Apr 2021 07:30:28 GMT Subject: RFR: 8264954: unified handling for VectorMask object re-materialization during de-optimization Message-ID: The following flow describes object reconstruction for de-optimization: 1) PhaseVector::scalarize_vbox_node() creates a SafePointScalarObjectNode to capture the box type information; it also connects to the node holding the boxed value. 2) During the code emit phase (PhaseOutput), C2 processes the above information to dump an ObjectValue holding the box information and a LocationValue holding the value information into the ScopeDescriptor corresponding to the Safepoint PC.
3) De-optimization blobs dump the values held in registers to stack locations using RegisterSave::save_live_registers(), and a mapping between each register and its stack location is added to the RegisterMap. 4) During de-optimization, compiled frame objects are re-allocated using identity information held in ObjectValue and their fields are initialized using values held in the stack locations accessed through register-stack mappings. By inserting a VectorStoreMaskNode before stitching the mask-holding node to the Safepoint, we make sure that the value held in an opmask/vector register is transferred to a byte vector. Thus the rest of the flow works as is: the stack location will hold the value in the form of a byte array irrespective of the box shape. tier1-tier3 regressions are clean with UseAVX=2/3. ------------- Commit messages: - 8264954: unified handling for VectorMask object re-materialization during de-optimization Changes: https://git.openjdk.java.net/jdk/pull/3408/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3408&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264954 Stats: 51 lines in 2 files changed: 29 ins; 15 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/3408.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3408/head:pull/3408 PR: https://git.openjdk.java.net/jdk/pull/3408 From eliu at openjdk.java.net Fri Apr 9 07:35:15 2021 From: eliu at openjdk.java.net (Eric Liu) Date: Fri, 9 Apr 2021 07:35:15 GMT Subject: RFR: 8262916: Merge LShiftCntV and RShiftCntV into a single node In-Reply-To: References: <9T0fatLfUStsY_1SA94rlhKokfqT7TeySYeFzIuO7Nk=.7394e20e-a4a3-4786-a2db-f5d8acc2b74a@github.com> Message-ID: On Thu, 8 Apr 2021 11:52:02 GMT, Eric Liu wrote: >> Regarding the proposed change itself (`LShiftCntV/RShiftCntV => ShiftCntV`).
>> >> Not sure how important it is, but it has an unfortunate change in generated code for right vector shifts on AArch32: instead of sharing the result of index negation at all use sites, negation is performed at every use site now. >> >> As a consequence, in an auto-vectorized loop it will lead to: >> - 1 instruction per loop iteration (multiplied by unrolling factor); >> - no way to hoist the negation of loop invariant index. > >> Regarding the proposed change itself (`LShiftCntV/RShiftCntV => ShiftCntV`). >> >> Not sure how important it is, but it has an unfortunate change in generated code for right vector shifts on AArch32: instead of sharing the result of index negation at all use sites, negation is performed at every use site now. >> >> As a consequence, in an auto-vectorized loop it will lead to: >> >> * 1 instruction per loop iteration (multiplied by unrolling factor); >> * no way to hoist the negation of loop invariant index. > > Thanks for your feedback! > > I have checked the generated code on AArch32 and it's a shame that 'vneg' is at every use point. > > Before: > 0xf46c8338: add fp, r7, fp > 0xf46c833c: vshl.u16 d9, d9, d8 > 0xf46c8340: vstr d9, [fp, #12] > 0xf46c8344: vshl.u16 d9, d10, d8 > 0xf46c8348: vstr d9, [fp, #20] > 0xf46c834c: vshl.u16 d9, d11, d8 > 0xf46c8350: vstr d9, [fp, #28] > > After: > 0xf4aa1328: add fp, r7, fp > 0xf4aa132c: vneg.s8 d13, d8 > 0xf4aa1330: vshl.u16 d9, d9, d13 > 0xf4aa1334: vstr d9, [fp, #12] > 0xf4aa1338: vneg.s8 d9, d8 > 0xf4aa133c: vshl.u16 d9, d10, d9 > 0xf4aa1340: vstr d9, [fp, #20] > 0xf4aa1344: vneg.s8 d9, d8 > 0xf4aa1348: vshl.u16 d9, d11, d9 > 0xf4aa134c: vstr d9, [fp, #28] > > I suppose it's more of a trade-off: either keep those two R/LShiftCntV nodes and change AArch64 and X86 to do what AArch32 does, or merge them and accept this defect on AArch32. > @theRealELiu @iwanowww Yes, but I was thinking of the `vshiftcnt` rules and didn't realize the `replicate` rules are virtually identical.
Now that I look again, it appears that aarch64 is already doing the same as x86 (converting the scalar shift count to a vector register, and matching the vector register in the rule). So why is it that x86 can share the shift register, but aarch64 cannot? Is it because x86 has combined left- and right-shift into the same rule? I suppose x86 has different kinds of vector shift instructions [1], but AArch64 and AArch32 only have the left shift. For this reason, Arm has to use a "neg-leftshift" pair to implement a right shift. [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp#L1234 ------------- PR: https://git.openjdk.java.net/jdk/pull/3371 From xgong at openjdk.java.net Fri Apr 9 07:43:50 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Fri, 9 Apr 2021 07:43:50 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v3] In-Reply-To: References: Message-ID: > The Vector API defines different element types for floating point VectorMask. For example, the bitwise related APIs use "`long/int`", while data related APIs use "`double/float`". This prevents some optimizations that are based on type checking from working well. > > For example, the VectorBox/Unbox elimination like `"VectorUnbox (VectorBox v) ==> v"` requires the output and > input types to be equal. Normally this is necessary. However, due to the different element type for floating point VectorMask with the same species, the VectorBox/Unbox pattern is optimized to: > VectorLoadMask (VectorStoreMask vmask) > Actually the types can be treated as the same for such cases. And considering the vector mask representation is the same for > vectors with the same element size and vector length, it's safe to do the optimization: > VectorLoadMask (VectorStoreMask vmask) ==> vmask > The same issue exists for GVN which is based on the type of a node.
Making the mask node's `hash()/cmp()` methods depend on the element size instead of the element type can fix it. Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: Revert changes to VectorLoadMask and add a VectorMaskCast for the same element size mask casting ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3238/files - new: https://git.openjdk.java.net/jdk/pull/3238/files/bf5b2028..ce3577ae Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3238&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3238&range=01-02 Stats: 121 lines in 10 files changed: 92 ins; 27 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3238.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3238/head:pull/3238 PR: https://git.openjdk.java.net/jdk/pull/3238 From xgong at openjdk.java.net Fri Apr 9 07:46:18 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Fri, 9 Apr 2021 07:46:18 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v2] In-Reply-To: References: Message-ID: <1rTu7ezhH3o26yoU1RZa9HrJBqshdC6QSPdZwpSlLR4=.4486d157-6ebf-476c-9ffc-64ccd7d8e5b7@github.com> On Thu, 8 Apr 2021 15:00:10 GMT, Vladimir Kozlov wrote: >> Xiaohong Gong has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask > > Okay. Hi @iwanowww, I've updated the code to use `VectorMaskCast` for vector mask casting with the same element size and vector length. Would you mind having a look again? Thanks!
------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From eliu at openjdk.java.net Fri Apr 9 08:54:22 2021 From: eliu at openjdk.java.net (Eric Liu) Date: Fri, 9 Apr 2021 08:54:22 GMT Subject: RFR: 8262916: Merge LShiftCntV and RShiftCntV into a single node In-Reply-To: References: <9T0fatLfUStsY_1SA94rlhKokfqT7TeySYeFzIuO7Nk=.7394e20e-a4a3-4786-a2db-f5d8acc2b74a@github.com> Message-ID: On Thu, 8 Apr 2021 13:37:31 GMT, Vladimir Ivanov wrote: > `LShiftCntV`/`RShiftCntV` were added specifically for AArch32 and other platforms don't need/benefit from such separation. I think AArch64 shares the same problem as AArch32, since it only has a left shift instruction as well. I checked the generated code on AArch64 with the latest master. The test case is TestCharVect2::test_srav[1]. 0x0000ffffa9106c68: ldr q17, [x15, #16] 0x0000ffffa9106c6c: add x14, x10, x14 0x0000ffffa9106c70: neg v18.16b, v16.16b 0x0000ffffa9106c74: ushl v17.8h, v17.8h, v18.8h 0x0000ffffa9106c78: str q17, [x14, #16] 0x0000ffffa9106c7c: ldr q17, [x15, #32] 0x0000ffffa9106c80: neg v18.16b, v16.16b 0x0000ffffa9106c84: ushl v17.8h, v17.8h, v18.8h 0x0000ffffa9106c88: str q17, [x14, #32] 0x0000ffffa9106c8c: ldr q17, [x15, #48] 0x0000ffffa9106c90: neg v18.16b, v16.16b 0x0000ffffa9106c94: ushl v17.8h, v17.8h, v18.8h 0x0000ffffa9106c98: str q17, [x14, #48] 0x0000ffffa9106c9c: ldr q17, [x15, #64] 0x0000ffffa9106ca0: neg v18.16b, v16.16b 0x0000ffffa9106ca4: ushl v17.8h, v17.8h, v18.8h 0x0000ffffa9106ca8: str q17, [x14, #64] 0x0000ffffa9106cac: ldr q17, [x15, #80] 0x0000ffffa9106cb0: neg v18.16b, v16.16b 0x0000ffffa9106cb4: ushl v17.8h, v17.8h, v18.8h It seems that keeping both RShiftCntV and LShiftCntV is friendly to AArch32/64 in this case, but AArch64 should be changed to do what AArch32 does.
@theRealAph [1] https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/codegen/TestCharVect2.java#L1215 ------------- PR: https://git.openjdk.java.net/jdk/pull/3371 From ogatak at openjdk.java.net Fri Apr 9 09:10:32 2021 From: ogatak at openjdk.java.net (Kazunori Ogata) Date: Fri, 9 Apr 2021 09:10:32 GMT Subject: Integrated: 8259822: [PPC64] Support the prefixed instruction format added in POWER10 In-Reply-To: References: Message-ID: On Fri, 15 Jan 2021 08:09:56 GMT, Kazunori Ogata wrote: > The POWER10 processor, which implements Power ISA 3.1 [1], supports new instruction formats where an instruction takes two 32bit words. The first word is called prefix, and the instructions with prefix are called prefixed instructions. With more bits in opcode and operand fields, POWER10 supports larger immediate value in an operand, as well as many new instructions. > > This is the first changes to handle prefixed instructions, and this adds support of prefixed addi (= paddi) instruction as an example of prefix usage. paddi accepts 34bit immediate value, while original addi accepts 16bit value. > > [1] https://ibm.ent.box.com/s/hhjfw0x0lrbtyzmiaffnbxh2fuo0fog0 This pull request has now been integrated. 
Changeset: f7a6c63a Author: Kazunori Ogata Committer: Martin Doerr URL: https://git.openjdk.java.net/jdk/commit/f7a6c63a Stats: 244 lines in 3 files changed: 240 ins; 0 del; 4 mod 8259822: [PPC64] Support the prefixed instruction format added in POWER10 Reviewed-by: cashford, mdoerr ------------- PR: https://git.openjdk.java.net/jdk/pull/2095 From aph at openjdk.java.net Fri Apr 9 09:29:20 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 9 Apr 2021 09:29:20 GMT Subject: RFR: 8262916: Merge LShiftCntV and RShiftCntV into a single node In-Reply-To: References: <9T0fatLfUStsY_1SA94rlhKokfqT7TeySYeFzIuO7Nk=.7394e20e-a4a3-4786-a2db-f5d8acc2b74a@github.com> Message-ID: On Fri, 9 Apr 2021 08:50:49 GMT, Eric Liu wrote: > > It seems that keeping both RShiftCntV and LShiftCntV is friendly to AArch32/64 in this case, but AArch64 should be changed to do what AArch32 does. @theRealAph Thanks, but it's been a while since I looked at the vector code. Can you point me to the AArch32 patterns in question, to show me the AArch64 changes needed? Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/3371 From xgong at openjdk.java.net Fri Apr 9 10:11:41 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Fri, 9 Apr 2021 10:11:41 GMT Subject: RFR: 8264957: Type::dual_type array is not aligned with enum TYPES Message-ID: This is a bug fix for [1] which adds a new vector mask type. The newly added TYPE `"VectorMask"` is inserted into `enum TYPES`, while the array `"Type::dual_type"` is not updated. As a result, the array elements are not aligned with TYPES. I hit the following crash due to this issue when I was working on the masking feature support on panama-vector: Internal Error (/home/xiagon01/code/panama-vector/src/hotspot/share/opto/type.hpp:1727), pid=104432, tid=104449 # assert(_base >= AnyPtr && _base <= KlassPtr) failed: Not a pointer Adding an entry for `"VectorMask"`, like the other vector types, to the array `"dual_type"` can fix it.
[1] https://bugs.openjdk.java.net/browse/JDK-8262355 Tested with tier1 and jdk:tier3 ------------- Commit messages: - 8264957: Type::dual_type array is not aligned with enum TYPES Changes: https://git.openjdk.java.net/jdk/pull/3410/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3410&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264957 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3410.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3410/head:pull/3410 PR: https://git.openjdk.java.net/jdk/pull/3410 From vlivanov at openjdk.java.net Fri Apr 9 10:49:21 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Fri, 9 Apr 2021 10:49:21 GMT Subject: Integrated: 8264918: [JVMCI] getVtableIndexForInterfaceMethod doesn't check that type and method are related In-Reply-To: References: Message-ID: On Thu, 8 Apr 2021 12:49:21 GMT, Vladimir Ivanov wrote: > getVtableIndexForInterfaceMethod should reject the case when a resolved class doesn't implement the holder interface. > > Testing: > - [x] hs-tier1 - hs-tier4 This pull request has now been integrated. 
Changeset: b3782ead Author: Vladimir Ivanov URL: https://git.openjdk.java.net/jdk/commit/b3782ead Stats: 11 lines in 2 files changed: 6 ins; 0 del; 5 mod 8264918: [JVMCI] getVtableIndexForInterfaceMethod doesn't check that type and method are related Reviewed-by: kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/3396 From eliu at openjdk.java.net Fri Apr 9 10:51:20 2021 From: eliu at openjdk.java.net (Eric Liu) Date: Fri, 9 Apr 2021 10:51:20 GMT Subject: RFR: 8262916: Merge LShiftCntV and RShiftCntV into a single node In-Reply-To: References: <9T0fatLfUStsY_1SA94rlhKokfqT7TeySYeFzIuO7Nk=.7394e20e-a4a3-4786-a2db-f5d8acc2b74a@github.com> Message-ID: On Fri, 9 Apr 2021 09:26:19 GMT, Andrew Haley wrote: > > It seems that keeping both RShiftCntV and LShiftCntV is friendly to AArch32/64 in this case, but AArch64 should be changed to do what AArch32 does. @theRealAph > > Thanks, but it's been a while since I looked at the vector code. Can you point me to the AArch32 patterns in question, to show me the AArch64 changes needed? Thanks. AArch32 combines 'neg' with 'dup' in RShiftCntV [1], while AArch64 has only a single 'dup' [2] and generates 'neg' before every use site, e.g. RShiftVB [3]. [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/arm/arm.ad#L10451 [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/aarch64_neon.ad#L5117 [3] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/aarch64_neon.ad#L5188 ------------- PR: https://git.openjdk.java.net/jdk/pull/3371 From jiefu at openjdk.java.net Fri Apr 9 11:05:25 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 9 Apr 2021 11:05:25 GMT Subject: RFR: 8264957: Type::dual_type array is not aligned with enum TYPES In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 10:04:10 GMT, Xiaohong Gong wrote: > This is a bug fix for [1] which adds a new vector mask type.
The newly added TYPE `"VectorMask"` is inserted into `enum TYPES`, while the array `"Type::dual_type"` is not updated. As a result, the array elements are not aligned with TYPES. > > I hit the following crash due to this issue when I was working on the masking feature support on panama-vector: > Internal Error (/home/xiagon01/code/panama-vector/src/hotspot/share/opto/type.hpp:1727), pid=104432, tid=104449 > # assert(_base >= AnyPtr && _base <= KlassPtr) failed: Not a pointer > Adding an entry for `"VectorMask"`, like the other vector types, to the array `"dual_type"` can fix it. > > [1] https://bugs.openjdk.java.net/browse/JDK-8262355 > > Tested with tier1 and jdk:tier3 This sounds reasonable to me. Thanks. ------------- Marked as reviewed by jiefu (Committer). PR: https://git.openjdk.java.net/jdk/pull/3410 From neliasso at openjdk.java.net Fri Apr 9 11:08:19 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Fri, 9 Apr 2021 11:08:19 GMT Subject: RFR: 8264358: Don't create invalid oop in method handle tracing In-Reply-To: References: Message-ID: On Mon, 29 Mar 2021 11:49:56 GMT, Stefan Karlsson wrote: > The `mh` field in: > > struct MethodHandleStubArguments { > const char* adaptername; > oopDesc* mh; > intptr_t* saved_regs; > intptr_t* entry_sp; > }; > > doesn't always point to a valid object. The `oopDesc*` is then implicitly converted to an `oop` here: > > void trace_method_handle_stub_wrapper(MethodHandleStubArguments* args) { > trace_method_handle_stub(args->adaptername, > args->mh, > args->saved_regs, > args->entry_sp); > } > > This gets caught by my ad-hoc verification code that verifies oops when they are created/used. > > I propose that we don't create an oop until `mh` is actually used, and it has been checked that the argument should contain a valid oop. I started with a more elaborate fix that changed the type of `mh` to be `void*`, but then reverted to a more targeted fix to remove the early oopDesc* > oop conversion.
> > One thing that I am curious about is this code inside trace_method_handle_stub: > if (has_mh && oopDesc::is_oop(mh)) { > mh->print_on(&ls); > > Delaying the oopDesc* > oop conversion to after the `has_mh` check solves my verification failure, but I wonder if the `oopDesc::is_oop(mh)` call is really needed when we have the `has_mh` check? Looks good. ------------- Marked as reviewed by neliasso (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3242 From neliasso at openjdk.java.net Fri Apr 9 11:13:14 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Fri, 9 Apr 2021 11:13:14 GMT Subject: RFR: 8264871: Dependencies: Miscellaneous cleanups in dependencies.cpp In-Reply-To: References: Message-ID: On Wed, 7 Apr 2021 21:51:30 GMT, Vladimir Ivanov wrote: > Miscellaneous cleanups in dependencies.cpp. > > Testing: > * [x] hs-tier1 - hs-tier4 Looks good. ------------- Marked as reviewed by neliasso (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3385 From vlivanov at openjdk.java.net Fri Apr 9 11:18:10 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Fri, 9 Apr 2021 11:18:10 GMT Subject: RFR: 8264954: unified handling for VectorMask object re-materialization during de-optimization In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 07:25:11 GMT, Jatin Bhateja wrote: > Following flow describes object reconstruction for de-optimization:- > 1) PhaseVector::scalarize_vbox_node() creates SafePointScalarObjectNode to captures the box type information, also it connects to node holding the boxed value. > 2) During code emit phase (PhaseOutput) C2 process above information to dumps ObjectValue holding the box information and LocationValue to holding the value information into ScopeDescriptor corresponding to Safepoint PC. > 3) De-optimization blobs dump the value held in registers to the stack locations using RegisterSave::save_live_registers() and a mapping b/w register and its stack location is added to RegisterMap. 
> 4) During de-optimization, compiled frame objects are re-allocated using identity information held in ObjectValue and their fields are initialized using values held in the stack locations accessed through register-stack mappings. > > By inserting a VectorStoreMaskNode before stitching the mask holding node to Safepoint we make sure that value held in opmask/vector register is transferred to a byte vector. Thus rest of the flow works as it is, stack location will hold the value in the form of a byte array irrespective of the box shape. > > tier1-tier3 regressions are clean with UseAVX=2/3. Very nice! I like your idea to unify mask representation. It simplifies implementation a lot. But it also means there'll be 2 representations kept alive for masks with debug usages, which burns 2 registers (speaking of x86, either 1 predicate + 1 vector register on AVX512 or 2 vector registers on pre-AVX512). Do you see any problems with that? (It can be improved later if it turns out to be a problem in practice.) src/hotspot/share/opto/vector.cpp line 277: > 275: const TypeVect* vt = vec_value->bottom_type()->is_vect(); > 276: BasicType bt = vt->element_basic_type(); > 277: vec_value = gvn.transform(VectorStoreMaskNode::make(gvn, vec_value, bt, vt->length())); `VectorStoreMaskNode::Identity()` already handles `VectorStoreMask (VectorLoadMask bv) elem_size ==> bv` transformation, doesn't it? So, you can just unconditionally instantiate `VectorStoreMaskNode` and let IGVN handle it. Also, an observation on naming: `VectorLoadMask` and `VectorStoreMask` names are misleading. On the surface, they look like memory nodes, but in fact, are cast nodes (`Vector <-> Mask`). Please, file an RFE to address that eventually.
src/hotspot/share/prims/vectorSupport.cpp line 91: > 89: void VectorSupport::init_payload_element(typeArrayOop arr, bool is_mask, BasicType elem_bt, int index, address addr) { > 90: if (is_mask) { > 91: // Masks require special handling: when boxed they are packed and stored in boolean The comment is still valid. Please, keep it and update with latest details about mask support. src/hotspot/share/prims/vectorSupport.cpp line 97: > 95: case T_LONG: > 96: case T_FLOAT: > 97: case T_DOUBLE: arr->bool_at_put(index, (*(jbyte*)addr) != 0); break; No switch needed anymore. Just leave the `arr->bool_at_put(index, (*(jbyte*)addr) != 0)`. As an option, consider adding an assert (`assert(is_java_type(bt) && bt != T_BOOLEAN)`. src/hotspot/share/prims/vectorSupport.cpp line 124: > 122: // byte array for masks present in both predicated register > 123: // or vector registers. > 124: int elem_size = is_mask ? 1 : type2aelembytes(elem_bt); If `elem_bt == T_BOOLEAN` for masks, you can get rid of `is_mask` usages in `VectorSupport::allocate_vector_payload_helper()` and just introduce a special case in `VectorSupport::klass2bt` (as already done for `VectorShuffle`s). ------------- PR: https://git.openjdk.java.net/jdk/pull/3408 From vlivanov at openjdk.java.net Fri Apr 9 11:18:11 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Fri, 9 Apr 2021 11:18:11 GMT Subject: RFR: 8264954: unified handling for VectorMask object re-materialization during de-optimization In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 11:04:18 GMT, Vladimir Ivanov wrote: >> Following flow describes object reconstruction for de-optimization:- >> 1) PhaseVector::scalarize_vbox_node() creates SafePointScalarObjectNode to captures the box type information, also it connects to node holding the boxed value. 
>> 2) During code emit phase (PhaseOutput) C2 process above information to dumps ObjectValue holding the box information and LocationValue to holding the value information into ScopeDescriptor corresponding to Safepoint PC. >> 3) De-optimization blobs dump the value held in registers to the stack locations using RegisterSave::save_live_registers() and a mapping b/w register and its stack location is added to RegisterMap. >> 4) During de-optimization, compiled frame objects are re-allocated using identity information held in ObjectValue and their fields are initialized using values held in the stack locations accessed through register-stack mappings. >> >> By inserting a VectorStoreMaskNode before stitching the mask holding node to Safepoint we make sure that value held in opmask/vector register is transferred to a byte vector. Thus rest of the flow works as it is, stack location will hold the value in the form of a byte array irrespective of the box shape. >> >> tier1-tier3 regressions are clean with UseAVX=2/3. > > src/hotspot/share/prims/vectorSupport.cpp line 97: > >> 95: case T_LONG: >> 96: case T_FLOAT: >> 97: case T_DOUBLE: arr->bool_at_put(index, (*(jbyte*)addr) != 0); break; > > No switch needed anymore. Just leave the `arr->bool_at_put(index, (*(jbyte*)addr) != 0)`. > As an option, consider adding an assert (`assert(is_java_type(bt) && bt != T_BOOLEAN)`. Actually, I think you can get rid of `is_mask` and just use `elem_bt = T_BOOLEAN` / `bool_at_put(index, (*(jboolean*)addr))`. 
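
To make the simplification concrete: once a mask always reaches the stack as a byte vector, rebuilding the boxed payload no longer depends on the lane type. The sketch below is a standalone illustration, not HotSpot code; `rebuild_mask_payload` is a hypothetical helper, and normalizing with `!= 0` mirrors the earlier `bool_at_put(index, (*(jbyte*)addr) != 0)` form rather than the direct `jboolean` read suggested above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: rebuild a boolean payload from the bytes a mask
// was spilled as. Each lane occupies exactly one byte, whatever the
// element type of the vector the mask belongs to.
std::vector<bool> rebuild_mask_payload(const std::uint8_t* spilled,
                                       std::size_t lanes) {
  std::vector<bool> payload(lanes);
  for (std::size_t i = 0; i < lanes; ++i) {
    payload[i] = spilled[i] != 0;  // any non-zero byte counts as "set"
  }
  return payload;
}
```

Because the per-lane size is fixed at one byte, no switch over the element type is needed in this model.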
------------- PR: https://git.openjdk.java.net/jdk/pull/3408 From neliasso at openjdk.java.net Fri Apr 9 11:45:20 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Fri, 9 Apr 2021 11:45:20 GMT Subject: RFR: 8264872: Dependencies: Migrate to PerfData counters In-Reply-To: <6G6POLSvOArZJ0iXvZW2PDx1ps6xJLTnNe20nwcOyVA=.a254144b-8fc9-494d-9655-ba68638469da@github.com> References: <6G6POLSvOArZJ0iXvZW2PDx1ps6xJLTnNe20nwcOyVA=.a254144b-8fc9-494d-9655-ba68638469da@github.com> Message-ID: On Wed, 7 Apr 2021 22:07:18 GMT, Vladimir Ivanov wrote: > Migrate performance counters on `Dependencies` (`deps_find_witness_*`) to `PerfData`. > > Testing: > - [x] hs-tier1 - hs-tier2 Looks good. src/hotspot/share/code/dependencies.cpp line 1212: > 1210: }; > 1211: > 1212: PerfCounter* ClassHierarchyWalker::_perf_find_witness_anywhere_calls_count = NULL; Align '=' on all three or skip extra space ------------- Marked as reviewed by neliasso (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3386 From neliasso at openjdk.java.net Fri Apr 9 11:49:18 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Fri, 9 Apr 2021 11:49:18 GMT Subject: RFR: 8264957: Type::dual_type array is not aligned with enum TYPES In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 10:04:10 GMT, Xiaohong Gong wrote: > This is a bug fix for [1] which adds a new vector mask type. The new added TYPE `"VectorMask"` is inserted into `enum TYPES`, while the array `"Type::dual_type"` is not updated. This makes the array elements are not aligned with TYPES. > > I met the following crash due to this issue when I was working on the masking feature support on panama-vector: > Internal Error (/home/xiagon01/code/panama-vector/src/hotspot/share/opto/type.hpp:1727), pid=104432, tid=104449 > # assert(_base >= AnyPtr && _base <= KlassPtr) failed: Not a pointer > Adding a value like other vector types for the `"VectorMask"` in the array `"dual_type"` can fix it. 
> > [1] https://bugs.openjdk.java.net/browse/JDK-8262355 > > Tested with tier1 and jdk:tier3 Could you please add an assert so that we don't make this mistake again. ------------- Changes requested by neliasso (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3410 From neliasso at openjdk.java.net Fri Apr 9 12:00:14 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Fri, 9 Apr 2021 12:00:14 GMT Subject: RFR: 8264945: Optimize the code-gen for Math.pow(x, 0.5) In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 02:19:10 GMT, Jie Fu wrote: > Hi all, > > I'd like to optimize the code-gen for Math.pow(x, 0.5). > And 7x ~ 14x performance improvement is observed by the jmh micro-benchmarks. > > While I was optimizing a machine learning program, I found both Math.pow(x, 2) and Math.pow(x, 0.5) are used. > To my surprise, C2 just optimizes the case for Math.pow(x, 2) [1], but still not for Math.pow(x, 0.5) yet. > > The patch just replaces Math.pow(x, 0.5) with Math.sqrt(x). > > Before: > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble0Dot5 0 thrpt 8 45525.117 ± 11.686 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.031 ± 0.001 ops/ms > > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble0Dot5 0 thrpt 8 45509.317 ± 6.581 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.031 ± 0.001 ops/ms > > After: > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble0Dot5 0 thrpt 8 343354.892 ± 362.900 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.457 ± 0.001 ops/ms > > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble0Dot5 0 thrpt 8 343421.559 ± 49.326 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.457 ± 0.001 ops/ms > > Testing: > - tier1~3 on Linux/x64 > > Thanks, > Best regards, > Jie > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/library_call.cpp#L1680 Looks good. ------------- Marked as reviewed by neliasso (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/3404 From neliasso at openjdk.java.net Fri Apr 9 12:03:17 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Fri, 9 Apr 2021 12:03:17 GMT Subject: RFR: 8264872: Dependencies: Migrate to PerfData counters In-Reply-To: References: <6G6POLSvOArZJ0iXvZW2PDx1ps6xJLTnNe20nwcOyVA=.a254144b-8fc9-494d-9655-ba68638469da@github.com> Message-ID: On Fri, 9 Apr 2021 11:15:54 GMT, Nils Eliasson wrote: >> Migrate performance counters on `Dependencies` (`deps_find_witness_*`) to `PerfData`. >> >> Testing: >> - [x] hs-tier1 - hs-tier2 > > src/hotspot/share/code/dependencies.cpp line 1212: > >> 1210: }; >> 1211: >> 1212: PerfCounter* ClassHierarchyWalker::_perf_find_witness_anywhere_calls_count = NULL; > > Align '=' on all three or skip extra space I'm referring to the whitespace before '= NULL'. ------------- PR: https://git.openjdk.java.net/jdk/pull/3386 From jiefu at openjdk.java.net Fri Apr 9 12:41:21 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 9 Apr 2021 12:41:21 GMT Subject: RFR: 8264945: Optimize the code-gen for Math.pow(x, 0.5) In-Reply-To: References: Message-ID: <8zL5le-W7Hdb86TOlgFEhr_7vb08pEdvQElS0wP1DoE=.b39b7f49-722e-4dfe-a39e-6480f20bc0cf@github.com> On Fri, 9 Apr 2021 11:57:06 GMT, Nils Eliasson wrote: > Looks good. Thanks @neliasso .
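
A point that may be worth checking for a pow(x, 0.5) to sqrt(x) rewrite is the handling of special inputs. Under the IEEE 754 rules that both C++ `<cmath>` and Java's `Math` follow for these cases, pow(-0.0, 0.5) is +0.0 while sqrt(-0.0) is -0.0, and pow(-Infinity, 0.5) is +Infinity while sqrt(-Infinity) is NaN. The sketch below uses the C++ functions as stand-ins for the Java ones; `sqrt_matches_pow_half` is a hypothetical helper, and whether the patch under review guards these inputs is not stated in this thread.

```cpp
#include <cmath>

// Returns true when sqrt(x) and pow(x, 0.5) agree, treating two NaNs as
// equal and distinguishing +0.0 from -0.0.
bool sqrt_matches_pow_half(double x) {
  const double p = std::pow(x, 0.5);
  const double s = std::sqrt(x);
  if (std::isnan(p) || std::isnan(s)) {
    return std::isnan(p) && std::isnan(s);
  }
  return p == s && std::signbit(p) == std::signbit(s);
}
```

In this model the two functions agree for +0.0, 1.0, +Infinity and negative finite values (both NaN), and disagree only for -0.0 and -Infinity, so a guard on those inputs would be enough to keep the rewrite exact.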
------------- PR: https://git.openjdk.java.net/jdk/pull/3404 From hseigel at openjdk.java.net Fri Apr 9 13:43:18 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Fri, 9 Apr 2021 13:43:18 GMT Subject: RFR: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines [v4] In-Reply-To: <_q9nVMOqIePMWqifBZo_Cj9Vsal63l2BLN8X7x5S8iw=.abdcc58f-877d-4e36-b26e-3699cb05eeda@github.com> References: <_q9nVMOqIePMWqifBZo_Cj9Vsal63l2BLN8X7x5S8iw=.abdcc58f-877d-4e36-b26e-3699cb05eeda@github.com> Message-ID: On Fri, 9 Apr 2021 05:08:37 GMT, David Holmes wrote: >> The existing JRT_ENTRY (and related) macros require the function to which they are applied to declare a parameter "JavaThread* thread" which represents the current thread. These functions are all implicitly "traps" functions as they can result in exceptions, but they are not declared with TRAPS because the only caller of these functions is the runtime itself (via call_VM) and no callers need to be aware to use CHECK; further they need a JavaThread. So the macro declares the THREAD variable for use with other exception-producing functions and assigns it from "thread". >> >> The majority of this change replaces the parameter name "thread" with "current" so that it is clear that we are always dealing with the current thread. This affects the entry functions as well as the functions called therefrom. >> >> We can then also replace the use of "THREAD" with "current", in contexts that are not related to exception processing. >> >> Some methods called by entry functions were declared to have both a "thread" parameter and a "TRAPS" parameter - with nothing to tell you these are always the same, current, thread. So the "thread" parameter is removed and replaced with a local variable "current" obtained from THREAD->as_Java_thread(). >> >> Some missing CHECK_ uses were added. 
>> >> Testing: >> - tiers 1-3 >> >> Thanks, >> David > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Fix search&replace mistake Nice big cleanup! LGTM Harold ------------- Marked as reviewed by hseigel (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3394 From kvn at openjdk.java.net Fri Apr 9 17:13:17 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 9 Apr 2021 17:13:17 GMT Subject: RFR: 8264805: Remove the experimental Ahead-of-Time Compiler [v4] In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 04:32:14 GMT, David Holmes wrote: > Hi Vladimir, > > This looks good to me - I went through all the files. > > It was nice to see how nicely compartmentalized the AOT feature was - that made checking the changes quite simple. The one exception was the fingerprinting code, which for some reason was AOT-only but not marked that way, so I'm not sure what the story is there. ?? > > I was also surprised to see some of the flags were not marked experimental, so there will need to a CSR request to remove those without going through the normal deprecation process. > > Thanks, > David Thank you, David. We thought that we could use fingerprint code for other cases that is why we did not put it under `#if INCLUDE_AOT`. I filed CSR for AOT product flags removal: https://bugs.openjdk.java.net/browse/JDK-8265000 ------------- PR: https://git.openjdk.java.net/jdk/pull/3398 From kvn at openjdk.java.net Fri Apr 9 17:42:27 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 9 Apr 2021 17:42:27 GMT Subject: RFR: 8264805: Remove the experimental Ahead-of-Time Compiler [v4] In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 17:09:58 GMT, Vladimir Kozlov wrote: >> Hi Vladimir, >> >> This looks good to me - I went through all the files. >> >> It was nice to see how nicely compartmentalized the AOT feature was - that made checking the changes quite simple. 
The one exception was the fingerprinting code, which for some reason was AOT-only but not marked that way, so I'm not sure what the story is there. ?? >> >> I was also surprised to see some of the flags were not marked experimental, so there will need to a CSR request to remove those without going through the normal deprecation process. >> >> Thanks, >> David > >> Hi Vladimir, >> >> This looks good to me - I went through all the files. >> >> It was nice to see how nicely compartmentalized the AOT feature was - that made checking the changes quite simple. The one exception was the fingerprinting code, which for some reason was AOT-only but not marked that way, so I'm not sure what the story is there. ?? >> >> I was also surprised to see some of the flags were not marked experimental, so there will need to a CSR request to remove those without going through the normal deprecation process. >> >> Thanks, >> David > > Thank you, David. > We thought that we could use fingerprint code for other cases that is why we did not put it under `#if INCLUDE_AOT`. > I filed CSR for AOT product flags removal: https://bugs.openjdk.java.net/browse/JDK-8265000 Thank you, Igor Ignatyev, Igor Veresov, Ioi, Aleksey and Andrew for reviews. I will update changes based on your comments. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3398 From kvn at openjdk.java.net Fri Apr 9 17:45:20 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 9 Apr 2021 17:45:20 GMT Subject: RFR: 8264805: Remove the experimental Ahead-of-Time Compiler [v4] In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 02:44:23 GMT, David Holmes wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove exports from Graal module to jdk.aot > > src/hotspot/cpu/x86/compiledIC_x86.cpp line 134: > >> 132: #ifdef ASSERT >> 133: CodeBlob *cb = CodeCache::find_blob_unsafe((address) _call); >> 134: assert(cb, "sanity"); > > Nit: implied boolean - use "cb != NULL" done ------------- PR: https://git.openjdk.java.net/jdk/pull/3398 From kvn at openjdk.java.net Fri Apr 9 18:23:26 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 9 Apr 2021 18:23:26 GMT Subject: RFR: 8264805: Remove the experimental Ahead-of-Time Compiler [v4] In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 03:03:33 GMT, David Holmes wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove exports from Graal module to jdk.aot > > src/hotspot/share/memory/heap.hpp line 174: > >> 172: bool contains(const void* p) const { return low() <= p && p < high(); } >> 173: bool contains_blob(const CodeBlob* blob) const { >> 174: const void* start = (void*)blob; > > start seems redundant now Done: bool contains_blob(const CodeBlob* blob) const { - const void* start = (void*)blob; - return contains(start); + return contains((void*)blob); } ------------- PR: https://git.openjdk.java.net/jdk/pull/3398 From kvn at openjdk.java.net Fri Apr 9 18:23:25 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 9 Apr 2021 18:23:25 GMT Subject: RFR: 8264805: Remove the experimental Ahead-of-Time Compiler [v4] In-Reply-To: References: Message-ID: On 
Fri, 9 Apr 2021 08:29:21 GMT, Aleksey Shipilev wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove exports from Graal module to jdk.aot > > src/hotspot/cpu/x86/globalDefinitions_x86.hpp line 73: > >> 71: >> 72: #if INCLUDE_JVMCI >> 73: #define COMPRESSED_CLASS_POINTERS_DEPENDS_ON_COMPRESSED_OOPS (EnableJVMCI) > > Minor nit: can probably drop parentheses here. done ------------- PR: https://git.openjdk.java.net/jdk/pull/3398 From kvn at openjdk.java.net Fri Apr 9 18:32:26 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 9 Apr 2021 18:32:26 GMT Subject: RFR: 8264805: Remove the experimental Ahead-of-Time Compiler [v4] In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 08:32:59 GMT, Aleksey Shipilev wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove exports from Graal module to jdk.aot > > src/hotspot/share/code/compiledIC.cpp line 715: > >> 713: tty->print("interpreted"); >> 714: } else { >> 715: tty->print("unknown"); > > We can probably split this cleanup into the minor issue, your call. The benefit of separate issue: backportability. 
Reverted and filed https://bugs.openjdk.java.net/browse/JDK-8265013 ------------- PR: https://git.openjdk.java.net/jdk/pull/3398 From kvn at openjdk.java.net Fri Apr 9 18:36:24 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 9 Apr 2021 18:36:24 GMT Subject: RFR: 8264805: Remove the experimental Ahead-of-Time Compiler [v4] In-Reply-To: <32eWo34nJ7czxicNNgoww6GpOpg0jKq8NZY_pPeQPpI=.e698cb8a-78e7-40a2-b54e-9b29ce1bedb1@github.com> References: <32eWo34nJ7czxicNNgoww6GpOpg0jKq8NZY_pPeQPpI=.e698cb8a-78e7-40a2-b54e-9b29ce1bedb1@github.com> Message-ID: <8rFTTlGuCqN9XzBEbaAAf9R-YhTTqe45jv9Gh7F506w=.67e92697-68c4-4657-abb4-803231a9d1a9@github.com> On Fri, 9 Apr 2021 16:30:41 GMT, Igor Veresov wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove exports from Graal module to jdk.aot > > src/hotspot/share/oops/instanceKlass.hpp line 257: > >> 255: _misc_declares_nonstatic_concrete_methods = 1 << 6, // directly declares non-static, concrete methods >> 256: _misc_has_been_redefined = 1 << 7, // class has been redefined >> 257: _unused = 1 << 8, // > > Any particular reason we don't want to remove this gap? Less changes. We don't get any benefits from removing it. It is opposite - if we need a new value we will use it without changing following values again. ------------- PR: https://git.openjdk.java.net/jdk/pull/3398 From jbhateja at openjdk.java.net Fri Apr 9 18:56:43 2021 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Fri, 9 Apr 2021 18:56:43 GMT Subject: RFR: 8264954: unified handling for VectorMask object re-materialization during de-optimization [v2] In-Reply-To: References: Message-ID: > Following flow describes object reconstruction for de-optimization:- > 1) PhaseVector::scalarize_vbox_node() creates SafePointScalarObjectNode to captures the box type information, also it connects to node holding the boxed value. 
> 2) During code emit phase (PhaseOutput) C2 process above information to dumps ObjectValue holding the box information and LocationValue to holding the value information into ScopeDescriptor corresponding to Safepoint PC. > 3) De-optimization blobs dump the value held in registers to the stack locations using RegisterSave::save_live_registers() and a mapping b/w register and its stack location is added to RegisterMap. > 4) During de-optimization, compiled frame objects are re-allocated using identity information held in ObjectValue and their fields are initialized using values held in the stack locations accessed through register-stack mappings. > > By inserting a VectorStoreMaskNode before stitching the mask holding node to Safepoint we make sure that value held in opmask/vector register is transferred to a byte vector. Thus rest of the flow works as it is, stack location will hold the value in the form of a byte array irrespective of the box shape. > > tier1-tier3 regressions are clean with UseAVX=2/3. Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: 8264954: Review comments resolution. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3408/files - new: https://git.openjdk.java.net/jdk/pull/3408/files/39312ade..239923fd Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3408&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3408&range=00-01 Stats: 41 lines in 3 files changed: 2 ins; 13 del; 26 mod Patch: https://git.openjdk.java.net/jdk/pull/3408.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3408/head:pull/3408 PR: https://git.openjdk.java.net/jdk/pull/3408 From jbhateja at openjdk.java.net Fri Apr 9 18:56:44 2021 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Fri, 9 Apr 2021 18:56:44 GMT Subject: RFR: 8264954: unified handling for VectorMask object re-materialization during de-optimization [v2] In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 11:15:14 GMT, Vladimir Ivanov wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> 8264954: Review comments resolution. > > Very nice! I like your idea to unify mask representation. It simplifies implementation a lot. > But it also means there'll be 2 representations kept alive for masks with debug usages and burns 2 registers (speaking of x86, either 1 predicate + 1 vector register on AVX512 or 2 vector registers on pre-AVX512). Do you see any problems with that? (It can be improved later if it turns out to be a problem in practice.) Thanks @iwanowww , your comments are resolved.
------------- PR: https://git.openjdk.java.net/jdk/pull/3408 From kvn at openjdk.java.net Fri Apr 9 19:09:25 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 9 Apr 2021 19:09:25 GMT Subject: RFR: 8264805: Remove the experimental Ahead-of-Time Compiler [v4] In-Reply-To: <32eWo34nJ7czxicNNgoww6GpOpg0jKq8NZY_pPeQPpI=.e698cb8a-78e7-40a2-b54e-9b29ce1bedb1@github.com> References: <32eWo34nJ7czxicNNgoww6GpOpg0jKq8NZY_pPeQPpI=.e698cb8a-78e7-40a2-b54e-9b29ce1bedb1@github.com> Message-ID: On Fri, 9 Apr 2021 16:34:58 GMT, Igor Veresov wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove exports from Graal module to jdk.aot > > src/hotspot/share/jvmci/jvmciCodeInstaller.cpp line 1184: > >> 1182: } >> 1183: } else if (jvmci_env()->isa_HotSpotMetaspaceConstantImpl(constant)) { >> 1184: if (!_immutable_pic_compilation) { > > All _immutable_pic_compilation mentions can be removed as well. It is true only for AOT compiles produced by Graal. It's never going to be used without AOT. Done. I removed _immutable_pic_compilation in JVMCI in Hotspot. ------------- PR: https://git.openjdk.java.net/jdk/pull/3398 From vlivanov at openjdk.java.net Fri Apr 9 19:20:28 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Fri, 9 Apr 2021 19:20:28 GMT Subject: RFR: 8264954: unified handling for VectorMask object re-materialization during de-optimization [v2] In-Reply-To: References: Message-ID: <4jMKSSVNKx5Fh7hek0pT5MMIYAnTjB7y7faOh_L60Pw=.2ce391ed-b29c-4a05-a412-bb3935126b2f@github.com> On Fri, 9 Apr 2021 18:56:43 GMT, Jatin Bhateja wrote: >> Following flow describes object reconstruction for de-optimization:- >> 1) PhaseVector::scalarize_vbox_node() creates SafePointScalarObjectNode to captures the box type information, also it connects to node holding the boxed value. 
>> 2) During code emit phase (PhaseOutput) C2 process above information to dumps ObjectValue holding the box information and LocationValue to holding the value information into ScopeDescriptor corresponding to Safepoint PC. >> 3) De-optimization blobs dump the value held in registers to the stack locations using RegisterSave::save_live_registers() and a mapping b/w register and its stack location is added to RegisterMap. >> 4) During de-optimization, compiled frame objects are re-allocated using identity information held in ObjectValue and their fields are initialized using values held in the stack locations accessed through register-stack mappings. >> >> By inserting a VectorStoreMaskNode before stitching the mask holding node to Safepoint we make sure that value held in opmask/vector register is transferred to a byte vector. Thus rest of the flow works as it is, stack location will hold the value in the form of a byte array irrespective of the box shape. >> >> tier1-tier3 regressions are clean with UseAVX=2/3. > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > 8264954: Review comments resolution. Looks good. Minor comments follow. src/hotspot/share/opto/vector.cpp line 270: > 268: bool is_mask = is_vector_mask(iklass); > 269: if (is_mask) { > 270: if (vec_value->Opcode() != Op_VectorStoreMask) { It's better to do the check against the type of `vec_value`: `VectorStoreMask` produces a vector of booleans and you can guard `gvn.transform(VectorStoreMaskNode::make(...))` call with `bt != T_BOOLEAN` check instead. src/hotspot/share/prims/vectorSupport.cpp line 98: > 96: // and safepoint node always ensures the existence of masks in a boolean array. > 97: // > 98: // TODO: revisit when predicate registers are fully supported. TODO line can be removed now. 
src/hotspot/share/prims/vectorSupport.cpp line 102: > 100: void VectorSupport::init_payload_element(typeArrayOop arr, BasicType elem_bt, int index, address addr) { > 101: switch (elem_bt) { > 102: case T_BOOLEAN: Please, use `bool_at_put()` accessor here (just for clarity): case T_BOOLEAN: arr->bool_at_put(index, *(jboolean*)addr); break; ------------- Marked as reviewed by vlivanov (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3408 From kvn at openjdk.java.net Fri Apr 9 19:26:22 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 9 Apr 2021 19:26:22 GMT Subject: RFR: 8264805: Remove the experimental Ahead-of-Time Compiler [v4] In-Reply-To: <3zISek5YyT0zkmPX3UZtXKy_r63eOZoW6emV3SuRjPg=.4b805fb5-80c9-4b76-92f8-43f2335c525b@github.com> References: <3zISek5YyT0zkmPX3UZtXKy_r63eOZoW6emV3SuRjPg=.4b805fb5-80c9-4b76-92f8-43f2335c525b@github.com> Message-ID: <9RiwxlBLMiREzNvRHU14RQBW33nieUT8-Pmkod_GvtA=.ad51b2f9-f244-4346-8844-9fee39c9aa5b@github.com> On Fri, 9 Apr 2021 16:54:35 GMT, Ioi Lam wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove exports from Graal module to jdk.aot > > src/hotspot/share/oops/methodCounters.cpp line 77: > >> 75: } >> 76: >> 77: void MethodCounters::metaspace_pointers_do(MetaspaceClosure* it) { > > MethodCounters::metaspace_pointers_do can be removed altogether (also need to remove the declaration in methodCounter.hpp). Done. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3398 From kvn at openjdk.java.net Fri Apr 9 20:07:16 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 9 Apr 2021 20:07:16 GMT Subject: RFR: 8264649: runtime/InternalApi/ThreadCpuTimesDeadlock.java crash in fastdebug C2 with -XX:-UseTLAB [v2] In-Reply-To: <03pdWufgmVqBPCnhdEvHJnRHkblBlxSiAUsG4Zjpg30=.a17a5efa-8f65-4933-994a-f176f9c45cfb@github.com> References: <6BN6FtOuI_DuZS8Zpuy2k5xTSu10uXZjrfxMEt6e978=.b5e1ee6b-7dba-413c-9c04-bbf66098b588@github.com> <03pdWufgmVqBPCnhdEvHJnRHkblBlxSiAUsG4Zjpg30=.a17a5efa-8f65-4933-994a-f176f9c45cfb@github.com> Message-ID: On Wed, 7 Apr 2021 12:00:10 GMT, Hui Shi wrote: >> src/hotspot/share/opto/phaseX.cpp line 1481: >> >>> 1479: // Smash all inputs to 'old', isolating him completely >>> 1480: Node *temp = new Node(1); >>> 1481: temp->init_req(0,nn); // Add a use to nn to prevent him from dying >> >> Just wondering, why do we even need this? Without that code, `remove_dead_node(old)` would kill `nn` as well if it becomes dead. > > Thanks for your review! > > Checking code history, this code is quite old. From comments around, it doesn't want nn removed directly in PhaseIterGVN::subsume_node and leaves optimization to next GVN iteration. > > In my understanding, it might save callers to insert codes checking if "nn" is removed/dead at every subsume_node/replace_node callsite, simplifies implementation. > > 8153779a hotspot/src/share/vm/opto/phaseX.cpp (J. Duke 2007-12-01 00:00:00 +0000 1479) // Smash all inputs to 'old', isolating him completely > 2a0815a5 hotspot/src/share/vm/opto/phaseX.cpp (Tobias Hartmann 2014-06-02 08:07:29 +0200 1480) Node *temp = new Node(1); > 8153779a hotspot/src/share/vm/opto/phaseX.cpp (J. Duke 2007-12-01 00:00:00 +0000 1481) temp->init_req(0,nn); // Add a use to nn to prevent him from dying > 8153779a hotspot/src/share/vm/opto/phaseX.cpp (J. 
Duke 2007-12-01 00:00:00 +0000 1482) remove_dead_node( old ); > 8153779a hotspot/src/share/vm/opto/phaseX.cpp (J. Duke 2007-12-01 00:00:00 +0000 1483) temp->del_req(0); // Yank bogus edge Note that `nn` could be a new node created during parsing which does not have users yet. That is why we need to keep it. ------------- PR: https://git.openjdk.java.net/jdk/pull/3336 From kvn at openjdk.java.net Fri Apr 9 20:32:20 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 9 Apr 2021 20:32:20 GMT Subject: RFR: 8264649: runtime/InternalApi/ThreadCpuTimesDeadlock.java crash in fastdebug C2 with -XX:-UseTLAB [v2] In-Reply-To: References: Message-ID: On Wed, 7 Apr 2021 12:09:05 GMT, Hui Shi wrote: >> Please help review this fix. Detailed analysis in https://bugs.openjdk.java.net/browse/JDK-8264649 >> >> Tier1/2 release /fastdebug passed > > Hui Shi has updated the pull request incrementally with one additional commit since the last revision: > > fix typo in test description I agree with the change. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3336 From mchung at openjdk.java.net Fri Apr 9 20:53:33 2021 From: mchung at openjdk.java.net (Mandy Chung) Date: Fri, 9 Apr 2021 20:53:33 GMT Subject: RFR: 8264805: Remove the experimental Ahead-of-Time Compiler [v4] In-Reply-To: References: Message-ID: On Thu, 8 Apr 2021 17:24:38 GMT, Vladimir Kozlov wrote: >> As part of [JEP 410](http://openjdk.java.net/jeps/410) remove code related to Ahead-of-Time Compiler from JDK: >> >> - `jdk.aot` module - the `jaotc` tool >> - `src/hotspot/share/aot` - loads AoT compiled code into VM for execution >> - related HotSpot code guarded by `#if INCLUDE_AOT` >> >> Additionally, remove tests as well as code in makefiles related to AoT compilation.
>> >> Tested hs-tier1-4 > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Remove exports from Graal module to jdk.aot I reviewed the module-loader-map and non-hotspot non-AOT tests. ------------- Marked as reviewed by mchung (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3398 From kvn at openjdk.java.net Fri Apr 9 21:59:04 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 9 Apr 2021 21:59:04 GMT Subject: RFR: 8264805: Remove the experimental Ahead-of-Time Compiler [v5] In-Reply-To: References: Message-ID: > As part of [JEP 410](http://openjdk.java.net/jeps/410) remove code related to Ahead-of-Time Compiler from JDK: > > - `jdk.aot` module - the `jaotc` tool > - `src/hotspot/share/aot` - loads AoT compiled code into VM for execution > - related HotSpot code guarded by `#if INCLUDE_AOT` > > Additionally, remove tests as well as code in makefiles related to AoT compilation. > > Tested hs-tier1-4 Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Address review comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3398/files - new: https://git.openjdk.java.net/jdk/pull/3398/files/6cce0f6c..71a166c1 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3398&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3398&range=03-04 Stats: 36 lines in 9 files changed: 0 ins; 27 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/3398.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3398/head:pull/3398 PR: https://git.openjdk.java.net/jdk/pull/3398 From iveresov at openjdk.java.net Fri Apr 9 22:02:33 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Fri, 9 Apr 2021 22:02:33 GMT Subject: RFR: 8264805: Remove the experimental Ahead-of-Time Compiler [v5] In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 21:59:04 GMT, Vladimir Kozlov wrote: >> As part of [JEP
410](http://openjdk.java.net/jeps/410) remove code related to Ahead-of-Time Compiler from JDK: >> >> - `jdk.aot` module ? the `jaotc` tool >> - `src/hotspot/share/aot` ? loads AoT compiled code into VM for execution >> - related HotSpot code guarded by `#if INCLUDE_AOT` >> >> Additionally, remove tests as well as code in makefiles related to AoT compilation. >> >> Tested hs-tier1-4 > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments Marked as reviewed by iveresov (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3398 From vlivanov at openjdk.java.net Fri Apr 9 22:07:21 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Fri, 9 Apr 2021 22:07:21 GMT Subject: RFR: 8264871: Dependencies: Miscellaneous cleanups in dependencies.cpp In-Reply-To: References: Message-ID: <5oO0lwHfxXsX5SQJmylkqnTpPDsYILhwyBP4-BhXWe4=.d87704eb-2551-4a67-867a-22acbfc725fb@github.com> On Fri, 9 Apr 2021 11:10:33 GMT, Nils Eliasson wrote: >> Miscellaneous cleanups in dependencies.cpp. >> >> Testing: >> * [x] hs-tier1 - hs-tier4 > > Looks good. Thanks for the review, Nils. ------------- PR: https://git.openjdk.java.net/jdk/pull/3385 From vlivanov at openjdk.java.net Fri Apr 9 22:07:22 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Fri, 9 Apr 2021 22:07:22 GMT Subject: Integrated: 8264871: Dependencies: Miscellaneous cleanups in dependencies.cpp In-Reply-To: References: Message-ID: On Wed, 7 Apr 2021 21:51:30 GMT, Vladimir Ivanov wrote: > Miscellaneous cleanups in dependencies.cpp. > > Testing: > * [x] hs-tier1 - hs-tier4 This pull request has now been integrated. 
Changeset: 07c8ff47 Author: Vladimir Ivanov URL: https://git.openjdk.java.net/jdk/commit/07c8ff47 Stats: 156 lines in 2 files changed: 70 ins; 58 del; 28 mod 8264871: Dependencies: Miscellaneous cleanups in dependencies.cpp Reviewed-by: neliasso ------------- PR: https://git.openjdk.java.net/jdk/pull/3385 From vlivanov at openjdk.java.net Fri Apr 9 22:14:05 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Fri, 9 Apr 2021 22:14:05 GMT Subject: RFR: 8264872: Dependencies: Migrate to PerfData counters [v2] In-Reply-To: References: <6G6POLSvOArZJ0iXvZW2PDx1ps6xJLTnNe20nwcOyVA=.a254144b-8fc9-494d-9655-ba68638469da@github.com> Message-ID: On Fri, 9 Apr 2021 11:42:19 GMT, Nils Eliasson wrote: >> Vladimir Ivanov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: >> >> - Merge branch 'master' into 8264872.perf_counters >> - Formatting >> - CountingClassHierarchyIterator >> - Migrate to PerfData >> - Depenencies perf counters >> - KlassDepChange::involves_context >> - KlassDepChange::_new_type >> - Dependencies::is_concrete_method >> - Dependencies::verify_method_context >> - int -> uint > > Looks good. Thanks for the reviews, Vladimir and Nils. ------------- PR: https://git.openjdk.java.net/jdk/pull/3386 From vlivanov at openjdk.java.net Fri Apr 9 22:14:00 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Fri, 9 Apr 2021 22:14:00 GMT Subject: RFR: 8264872: Dependencies: Migrate to PerfData counters [v2] In-Reply-To: <6G6POLSvOArZJ0iXvZW2PDx1ps6xJLTnNe20nwcOyVA=.a254144b-8fc9-494d-9655-ba68638469da@github.com> References: <6G6POLSvOArZJ0iXvZW2PDx1ps6xJLTnNe20nwcOyVA=.a254144b-8fc9-494d-9655-ba68638469da@github.com> Message-ID: > Migrate performance counters on `Dependencies` (`deps_find_witness_*`) to `PerfData`. 
> > Testing: > - [x] hs-tier1 - hs-tier2 Vladimir Ivanov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: - Merge branch 'master' into 8264872.perf_counters - Formatting - CountingClassHierarchyIterator - Migrate to PerfData - Depenencies perf counters - KlassDepChange::involves_context - KlassDepChange::_new_type - Dependencies::is_concrete_method - Dependencies::verify_method_context - int -> uint ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3386/files - new: https://git.openjdk.java.net/jdk/pull/3386/files/3567e0f0..a810a10e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3386&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3386&range=00-01 Stats: 5774 lines in 162 files changed: 4269 ins; 1012 del; 493 mod Patch: https://git.openjdk.java.net/jdk/pull/3386.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3386/head:pull/3386 PR: https://git.openjdk.java.net/jdk/pull/3386 From vlivanov at openjdk.java.net Fri Apr 9 22:19:26 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Fri, 9 Apr 2021 22:19:26 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v3] In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 07:43:50 GMT, Xiaohong Gong wrote: >> The Vector API defines different element types for floating point VectorMask. For example, the bitwise related APIs use "`long/int`", while data related APIs use "`double/float`". This makes some optimizations that based on the type checking cannot work well. >> >> For example, the VectorBox/Unbox elimination like `"VectorUnbox (VectorBox v) ==> v"` requires the types of output and >> input are equal. Normally this is necessary. 
However, due to the different element type for floating point VectorMask with the same species, the VectorBox/Unbox pattern is optimized to: >> VectorLoadMask (VectorStoreMask vmask) >> Actually the types can be treated as the same one for such cases. And considering the vector mask representation is the same for >> vectors with the same element size and vector length, it's safe to do the optimization: >> VectorLoadMask (VectorStoreMask vmask) ==> vmask >> The same issue exists for GVN which is based on the type of a node. Making the mask node's` hash()/cmp()` methods depends on the element size instead of the element type can fix it. > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Revert changes to VectorLoadMask and add a VectorMaskCast for the same element size mask casting Overall, looks good. src/hotspot/share/opto/vectornode.cpp line 1236: > 1234: if (is_vector_mask) { > 1235: if (in_vt->length_in_bytes() == out_vt->length_in_bytes() && > 1236: Matcher::match_rule_supported(Op_VectorMaskCast)) { There's `Matcher::match_rule_supported_vector()` specifically for vector nodes. 
src/hotspot/share/opto/vectornode.cpp line 1237: > 1235: if (in_vt->length_in_bytes() == out_vt->length_in_bytes() && > 1236: Matcher::match_rule_supported(Op_VectorMaskCast)) { > 1237: // VectorUnbox (VectorBox vmask) ==> VectorMaskCast (vmask) It's better to implement it as a 2-step transformation and place it in `VectorLoadMaskNode::Ideal()`: VectorUnbox (VectorBox vmask) ==> VectorLoadMask (VectorStoreMask vmask) => VectorMaskCast (vmask) src/hotspot/share/opto/vectornode.hpp line 1247: > 1245: assert(type2aelembytes(in_vt->element_basic_type()) == type2aelembytes(vt->element_basic_type()), "element size must match"); > 1246: } > 1247: FTR there's a way to avoid platform-specific changes (in AD files) if we don't envision any non-trivial conversions to be supported: just disable matching of the node by overriding `Node::ideal_reg()`: virtual uint ideal_reg() const { return 0; } // not matched in the AD file ------------- Marked as reviewed by vlivanov (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3238 From vlivanov at openjdk.java.net Fri Apr 9 22:22:30 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Fri, 9 Apr 2021 22:22:30 GMT Subject: Integrated: 8264872: Dependencies: Migrate to PerfData counters In-Reply-To: <6G6POLSvOArZJ0iXvZW2PDx1ps6xJLTnNe20nwcOyVA=.a254144b-8fc9-494d-9655-ba68638469da@github.com> References: <6G6POLSvOArZJ0iXvZW2PDx1ps6xJLTnNe20nwcOyVA=.a254144b-8fc9-494d-9655-ba68638469da@github.com> Message-ID: On Wed, 7 Apr 2021 22:07:18 GMT, Vladimir Ivanov wrote: > Migrate performance counters on `Dependencies` (`deps_find_witness_*`) to `PerfData`. > > Testing: > - [x] hs-tier1 - hs-tier2 This pull request has now been integrated. 
Changeset: 76bd313d Author: Vladimir Ivanov URL: https://git.openjdk.java.net/jdk/commit/76bd313d Stats: 121 lines in 4 files changed: 61 ins; 37 del; 23 mod 8264872: Dependencies: Migrate to PerfData counters Reviewed-by: kvn, neliasso ------------- PR: https://git.openjdk.java.net/jdk/pull/3386 From vlivanov at openjdk.java.net Fri Apr 9 22:41:42 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Fri, 9 Apr 2021 22:41:42 GMT Subject: RFR: 8264873: Dependencies: Split ClassHierarchyWalker [v3] In-Reply-To: References: Message-ID: > Split ClassHierarchyWalker into ConcreteMethodFinder and ConcreteSubtypeFinder. > > Testing: > - [x] hs-tier1 - hs-tier4 Vladimir Ivanov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: - Merge branch 'master' into 8264873.cha.split - Fix formatting - Split ClassHierarchyWalker into ConcreteMethodFinder and ConcreteSubtypeFinder - CountingClassHierarchyIterator - Migrate to PerfData - Depenencies perf counters - KlassDepChange::involves_context - KlassDepChange::_new_type - Dependencies::is_concrete_method - Dependencies::verify_method_context - ... and 1 more: https://git.openjdk.java.net/jdk/compare/76bd313d...e38b995c ------------- Changes: https://git.openjdk.java.net/jdk/pull/3387/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3387&range=02 Stats: 437 lines in 1 file changed: 134 ins; 144 del; 159 mod Patch: https://git.openjdk.java.net/jdk/pull/3387.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3387/head:pull/3387 PR: https://git.openjdk.java.net/jdk/pull/3387 From kvn at openjdk.java.net Fri Apr 9 22:42:20 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 9 Apr 2021 22:42:20 GMT Subject: RFR: 8264945: Optimize the code-gen for Math.pow(x, 0.5) In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 02:19:10 GMT, Jie Fu wrote: > Hi all, > > I'd like to optimize the code-gen for Math.pow(x, 0.5). 
> And 7x ~ 14x performance improvement is observed by the jmh micro-benchmarks.
>
> While I was optimizing a machine learning program, I found both Math.pow(x, 2) and Math.pow(x, 0.5) are used.
> To my surprise, C2 just optimizes the case for Math.pow(x, 2) [1], but still not for Math.pow(x, 0.5) yet.
>
> The patch just replaces Math.pow(x, 0.5) with Math.sqrt(x).
>
> Before:
> Benchmark                     (seed)   Mode  Cnt       Score     Error   Units
> MathBench.powDouble0Dot5           0  thrpt    8   45525.117 ±  11.686  ops/ms
> MathBench.powDouble0Dot5Loop       0  thrpt    8       0.031 ±   0.001  ops/ms
>
> Benchmark                     (seed)   Mode  Cnt       Score     Error   Units
> MathBench.powDouble0Dot5           0  thrpt    8   45509.317 ±   6.581  ops/ms
> MathBench.powDouble0Dot5Loop       0  thrpt    8       0.031 ±   0.001  ops/ms
>
> After:
> Benchmark                     (seed)   Mode  Cnt       Score     Error   Units
> MathBench.powDouble0Dot5           0  thrpt    8  343354.892 ± 362.900  ops/ms
> MathBench.powDouble0Dot5Loop       0  thrpt    8       0.457 ±   0.001  ops/ms
>
> Benchmark                     (seed)   Mode  Cnt       Score     Error   Units
> MathBench.powDouble0Dot5           0  thrpt    8  343421.559 ±  49.326  ops/ms
> MathBench.powDouble0Dot5Loop       0  thrpt    8       0.457 ±   0.001  ops/ms
>
> Testing:
> - tier1~3 on Linux/x64
>
> Thanks,
> Best regards,
> Jie
>
> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/library_call.cpp#L1680

Please verify that the result is the same when run with -Xint (interpreter only) and with -XX:TieredStopAtLevel=1 (C1 only). Maybe they need the same optimization.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3404

From xliu at openjdk.java.net Fri Apr 9 23:10:26 2021
From: xliu at openjdk.java.net (Xin Liu)
Date: Fri, 9 Apr 2021 23:10:26 GMT
Subject: RFR: 8264795: IGV: Upgrade NetBeans platform [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 7 Apr 2021 11:59:28 GMT, Roberto Castañeda Lozano wrote:

>> This change upgrades the NetBeans platform on which IGV is based to its latest version ([12.3](https://netbeans.apache.org/download/nb123/index.html)) and switches IGV's build system from Ant to Maven.
The upgrade introduces support for a wide range of JDK versions (from 8 to 15, the latest version supported by NetBeans 12.3), and the switch from Ant to Maven makes the IGV build simpler, faster (first-time build is approximately 5x faster), and more stable (all dependencies are fetched directly from the Maven central repository).
>>
>> The change also fixes broken unit tests in the Data module and runs them by default when building.
>>
>> #### Testing
>>
>> Regression-tested the following use cases manually on all combinations of (Linux, Windows, macOS) and (JDK 8, JDK 11, JDK 15):
>>
>> - build
>> - open graph file (small.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip))
>> - import graphs via network (localhost)
>> - expand groups in outline
>> - open a graph
>> - open a clone
>> - zoom in and out
>> - search a node
>> - apply filters
>> - extract a node
>> - show all nodes
>> - select nodes corresponding to a bytecode
>> - view control flow
>> - select nodes corresponding to a basic block
>> - cluster nodes
>> - show satellite view
>> - compute the difference of two graphs
>> - change node text
>> - remove a group
>> - save groups into a file
>> - open a large graph file (large.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip))
>> - open a large graph ("After Escape Analysis" in large.xml)
>>
>> Thanks to Vladimir Ivanov for helping with testing on macOS.
>
> Roberto Castañeda Lozano has updated the pull request incrementally with three additional commits since the last revision:
>
> - Document how to build and run on a specific JDK
> - Use lambdas to define runnables
> - Document latest JDK version supported

Thank you for modernizing IGV. I managed to import this project into IntelliJ. The process is hassle-free!

I also ran it using jdk8/jdk11 on macOS. I haven't identified any problem so far.
src/utils/IdealGraphVisualizer/Filter/pom.xml line 93:

> 91:
> 92:
> 93: [15,)

Does this mean that module `Filter` will depend on external nashorn after jdk15? I know that the jdk doesn't remove nashorn until jdk15, but would it make sense for IGV to always pick up external nashorn? My concern is that "javascript" in jdk8/11 may not be well maintained and could leave vulnerabilities behind.

-------------

Marked as reviewed by xliu (no project role).

PR: https://git.openjdk.java.net/jdk/pull/3361

From hshi at openjdk.java.net Fri Apr 9 23:59:29 2021
From: hshi at openjdk.java.net (Hui Shi)
Date: Fri, 9 Apr 2021 23:59:29 GMT
Subject: RFR: 8264649: runtime/InternalApi/ThreadCpuTimesDeadlock.java crash in fastdebug C2 with -XX:-UseTLAB [v2]
In-Reply-To: <-63abPXoiWhcD8FhNKDptvxLVKYCLjKW5WtAGKUVg4U=.9b9367a3-202f-418c-b049-dad477aa6f2f@github.com>
References: <-63abPXoiWhcD8FhNKDptvxLVKYCLjKW5WtAGKUVg4U=.9b9367a3-202f-418c-b049-dad477aa6f2f@github.com>
Message-ID:

On Fri, 9 Apr 2021 07:01:01 GMT, Tobias Hartmann wrote:

>> Hui Shi has updated the pull request incrementally with one additional commit since the last revision:
>>
>> fix typo in test description
>
> Thanks for the details. Your fix looks reasonable to me.

Thanks! @TobiHartmann @vnkozlov

-------------

PR: https://git.openjdk.java.net/jdk/pull/3336

From hshi at openjdk.java.net Sat Apr 10 00:07:34 2021
From: hshi at openjdk.java.net (Hui Shi)
Date: Sat, 10 Apr 2021 00:07:34 GMT
Subject: Integrated: 8264649: runtime/InternalApi/ThreadCpuTimesDeadlock.java crash in fastdebug C2 with -XX:-UseTLAB
In-Reply-To:
References:
Message-ID:

On Sat, 3 Apr 2021 00:53:54 GMT, Hui Shi wrote:

> Please help review this fix. Detailed analysis in https://bugs.openjdk.java.net/browse/JDK-8264649
>
> Tier1/2 release /fastdebug passed

This pull request has now been integrated.
Changeset: 42f4d706 Author: Hui Shi Committer: Jie Fu URL: https://git.openjdk.java.net/jdk/commit/42f4d706 Stats: 13 lines in 2 files changed: 12 ins; 0 del; 1 mod 8264649: runtime/InternalApi/ThreadCpuTimesDeadlock.java crash in fastdebug C2 with -XX:-UseTLAB Reviewed-by: thartmann, kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/3336 From dlong at openjdk.java.net Sat Apr 10 04:05:27 2021 From: dlong at openjdk.java.net (Dean Long) Date: Sat, 10 Apr 2021 04:05:27 GMT Subject: RFR: 8264805: Remove the experimental Ahead-of-Time Compiler [v4] In-Reply-To: <3zISek5YyT0zkmPX3UZtXKy_r63eOZoW6emV3SuRjPg=.4b805fb5-80c9-4b76-92f8-43f2335c525b@github.com> References: <3zISek5YyT0zkmPX3UZtXKy_r63eOZoW6emV3SuRjPg=.4b805fb5-80c9-4b76-92f8-43f2335c525b@github.com> Message-ID: On Fri, 9 Apr 2021 16:54:51 GMT, Ioi Lam wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove exports from Graal module to jdk.aot > > LGTM. Just a small nit. @iklam I thought the fingerprint code was also used by CDS. ------------- PR: https://git.openjdk.java.net/jdk/pull/3398 From iklam at openjdk.java.net Sat Apr 10 06:08:33 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Sat, 10 Apr 2021 06:08:33 GMT Subject: RFR: 8264805: Remove the experimental Ahead-of-Time Compiler [v4] In-Reply-To: <3zISek5YyT0zkmPX3UZtXKy_r63eOZoW6emV3SuRjPg=.4b805fb5-80c9-4b76-92f8-43f2335c525b@github.com> References: <3zISek5YyT0zkmPX3UZtXKy_r63eOZoW6emV3SuRjPg=.4b805fb5-80c9-4b76-92f8-43f2335c525b@github.com> Message-ID: On Fri, 9 Apr 2021 16:54:51 GMT, Ioi Lam wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove exports from Graal module to jdk.aot > > LGTM. Just a small nit. > @iklam > I thought the fingerprint code was also used by CDS. 
CDS actually uses a different mechanism (CRC of the classfile bytes) to validate that a class has not changed between archive dumping time and runtime. See https://github.com/openjdk/jdk/blob/5784f6b7f74d0b8081ac107fea172539d57d6020/src/hotspot/share/classfile/systemDictionaryShared.cpp#L1126-L1130 ------------- PR: https://git.openjdk.java.net/jdk/pull/3398 From ogatak at openjdk.java.net Sat Apr 10 12:30:33 2021 From: ogatak at openjdk.java.net (Kazunori Ogata) Date: Sat, 10 Apr 2021 12:30:33 GMT Subject: RFR: 8259822: [PPC64] Support the prefixed instruction format added in POWER10 [v8] In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 10:50:45 GMT, Martin Doerr wrote: >> Kazunori Ogata has updated the pull request incrementally with one additional commit since the last revision: >> >> Clean up compute_padding() and fix grammatical errors > > Looks good to me, now. Is the latest version substantially tested? We need to rely on IBMs testing because nobody else has Power10. @TheRealMDoerr Thank you for sponsoring this patch. ------------- PR: https://git.openjdk.java.net/jdk/pull/2095 From iignatyev at openjdk.java.net Sat Apr 10 15:28:35 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Sat, 10 Apr 2021 15:28:35 GMT Subject: RFR: 8264806: Remove the experimental JIT compiler In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 22:30:32 GMT, Vladimir Kozlov wrote: >> As part of [JEP 410](http://openjdk.java.net/jeps/410) remove code related to Java-based JIT compiler (Graal) from JDK: >> >> - `jdk.internal.vm.compiler` ? the Graal compiler >> - `jdk.internal.vm.compiler.management` ? Graal's `MBean` >> - `test/hotspot/jtreg/compiler/graalunit` ? Graal's unit tests >> >> Remove Graal related code in makefiles. 
>> >> Note, next two `module-info.java` files are preserved so that the JVMCI module `jdk.internal.vm.ci` continues to build: >> src/jdk.internal.vm.compiler/share/classes/module-info.java >> src/jdk.internal.vm.compiler.management/share/classes/module-info.java >> >> @AlanBateman suggested that we can avoid it by using Module API to export packages at runtime . It requires changes in GraalVM's JVMCI too so I will file followup RFE to implement it. >> >> Tested hs-tier1-4 > > Thankyou, @erikj79, for review. I restored code as you asked. > For some reasons incremental webrev shows update only in Cdiffs. none of the full webrevs seem to render even the list of changed files? is it just me? ------------- PR: https://git.openjdk.java.net/jdk/pull/3421 From iignatyev at openjdk.java.net Sat Apr 10 15:41:44 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Sat, 10 Apr 2021 15:41:44 GMT Subject: RFR: 8264806: Remove the experimental JIT compiler [v2] In-Reply-To: <_PS0KHvkB_l9YrKjZ7wLAiKb-SK761YFub4XU5mrBRc=.c32572f6-063f-4503-a20f-aa6b9115f808@github.com> References: <_PS0KHvkB_l9YrKjZ7wLAiKb-SK761YFub4XU5mrBRc=.c32572f6-063f-4503-a20f-aa6b9115f808@github.com> Message-ID: On Fri, 9 Apr 2021 22:26:40 GMT, Vladimir Kozlov wrote: >> As part of [JEP 410](http://openjdk.java.net/jeps/410) remove code related to Java-based JIT compiler (Graal) from JDK: >> >> - `jdk.internal.vm.compiler` ? the Graal compiler >> - `jdk.internal.vm.compiler.management` ? Graal's `MBean` >> - `test/hotspot/jtreg/compiler/graalunit` ? Graal's unit tests >> >> Remove Graal related code in makefiles. >> >> Note, next two `module-info.java` files are preserved so that the JVMCI module `jdk.internal.vm.ci` continues to build: >> src/jdk.internal.vm.compiler/share/classes/module-info.java >> src/jdk.internal.vm.compiler.management/share/classes/module-info.java >> >> @AlanBateman suggested that we can avoid it by using Module API to export packages at runtime . 
It requires changes in GraalVM's JVMCI too so I will file followup RFE to implement it. >> >> Tested hs-tier1-4 > > Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Restore Graal Builder image makefile > - Merge latest changes from branch 'JDK-8264805' into JDK-8264806 > - 8264806: Remove the experimental JIT compiler should we remove `sun.hotspot.code.Compiler::isGraalEnabled` method and update a few of its users accordingly? what about `vm.graal.enabled` `@requires` property? ------------- Changes requested by iignatyev (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3421 From kvn at openjdk.java.net Sat Apr 10 16:35:40 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Sat, 10 Apr 2021 16:35:40 GMT Subject: RFR: 8264806: Remove the experimental JIT compiler [v2] In-Reply-To: References: <_PS0KHvkB_l9YrKjZ7wLAiKb-SK761YFub4XU5mrBRc=.c32572f6-063f-4503-a20f-aa6b9115f808@github.com> Message-ID: On Sat, 10 Apr 2021 15:38:11 GMT, Igor Ignatyev wrote: > should we remove `sun.hotspot.code.Compiler::isGraalEnabled` method and update a few of its users accordingly? > what about `vm.graal.enabled` `@requires` property? Thank you, @iignatev for looking on changes. I forgot to mention that `Compiler::isGraalEnabled()` returns always false now. Because 94 tests uses `@requires !vm.graal.enabled` I don't want to include them in these changes which are already very big. I am not sure if I should modify tests if GraalVM group wants to run all these tests. Unfortunately changes in `Compiler.java` are listed the last on `Files changed` tab and GitHub has trouble to load these big changes - it takes time to see them. 
Here are the `Compiler.java` changes for review:

```
diff --git a/test/lib/sun/hotspot/code/Compiler.java b/test/lib/sun/hotspot/code/Compiler.java
index 99122bd93b8..71288ae4482 100644
--- a/test/lib/sun/hotspot/code/Compiler.java
+++ b/test/lib/sun/hotspot/code/Compiler.java
@@ -60,33 +60,10 @@ public class Compiler {
     /**
      * Check if Graal is used as JIT compiler.
      *
-     * Graal is enabled if following conditions are true:
-     * - we are not in Interpreter mode
-     * - UseJVMCICompiler flag is true
-     * - jvmci.Compiler variable is equal to 'graal'
-     * - TieredCompilation is not used or TieredStopAtLevel is greater than 3
-     * No need to check client mode because it set UseJVMCICompiler to false.
-     *
-     * @return true if Graal is used as JIT compiler.
+     * @return false because Graal is removed from JDK.
      */
     public static boolean isGraalEnabled() {
-        Boolean useCompiler = WB.getBooleanVMFlag("UseCompiler");
-        if (useCompiler == null || !useCompiler) {
-            return false;
-        }
-        Boolean useJvmciComp = WB.getBooleanVMFlag("UseJVMCICompiler");
-        if (useJvmciComp == null || !useJvmciComp) {
-            return false;
-        }
-
-        Boolean tieredCompilation = WB.getBooleanVMFlag("TieredCompilation");
-        Long compLevel = WB.getIntxVMFlag("TieredStopAtLevel");
-        // if TieredCompilation is enabled and compilation level is <= 3 then no Graal is used
-        if (tieredCompilation != null && tieredCompilation &&
-            compLevel != null && compLevel <= 3) {
-            return false;
-        }
-        return true;
+        return false;
     }
```

-------------

PR: https://git.openjdk.java.net/jdk/pull/3421

From kvn at openjdk.java.net Sat Apr 10 16:40:48 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Sat, 10 Apr 2021 16:40:48 GMT
Subject: RFR: 8264806: Remove the experimental JIT compiler [v2]
In-Reply-To:
References: <_PS0KHvkB_l9YrKjZ7wLAiKb-SK761YFub4XU5mrBRc=.c32572f6-063f-4503-a20f-aa6b9115f808@github.com>
Message-ID:

On Sat, 10 Apr 2021 15:38:11 GMT, Igor Ignatyev wrote:

>> Vladimir Kozlov has updated the pull request with a new target base due to
a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Restore Graal Builder image makefile >> - Merge latest changes from branch 'JDK-8264805' into JDK-8264806 >> - 8264806: Remove the experimental JIT compiler > > should we remove `sun.hotspot.code.Compiler::isGraalEnabled` method and update a few of its users accordingly? > what about `vm.graal.enabled` `@requires` property? @iignatev If you think that I should clean tests anyway I will file follow up RFE to do that. ------------- PR: https://git.openjdk.java.net/jdk/pull/3421 From iignatyev at openjdk.java.net Sat Apr 10 16:50:46 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Sat, 10 Apr 2021 16:50:46 GMT Subject: RFR: 8264806: Remove the experimental JIT compiler [v2] In-Reply-To: References: <_PS0KHvkB_l9YrKjZ7wLAiKb-SK761YFub4XU5mrBRc=.c32572f6-063f-4503-a20f-aa6b9115f808@github.com> Message-ID: On Sat, 10 Apr 2021 16:36:54 GMT, Vladimir Kozlov wrote: >> should we remove `sun.hotspot.code.Compiler::isGraalEnabled` method and update a few of its users accordingly? >> what about `vm.graal.enabled` `@requires` property? > > @iignatev If you think that I should clean tests anyway I will file follow up RFE to do that. > > should we remove `sun.hotspot.code.Compiler::isGraalEnabled` method and update a few of its users accordingly? > > what about `vm.graal.enabled` `@requires` property? > > Thank you, @iignatev for looking on changes. > > I forgot to mention that `Compiler::isGraalEnabled()` returns always false now. Because 94 tests uses `@requires !vm.graal.enabled` I don't want to include them in these changes which are already very big. I am not sure if I should modify tests if GraalVM group wants to run all these tests. > If you think that I should clean tests anyway I will file follow up RFE to do that. 
changing `Compiler::isGraalEnabled()` to always return false effectively makes these tests unrunnable for GraalVM group (unless they are keep the modified `sun/hotspot/code/Compiler` and/or `requires/VMProps` in their forks). on top of that, I foresee that there will be more tests incompatible w/ Graal yet given it won't be run/tested in OpenJDK, these tests won't be marked and hence will fail when run w/ Graal. so Graal people will have to either do marking themselves (I guess in both upstream and their fork) or maintain a list of inapplicable tests in a format that works best for their setup. that's to say, I think we should clean up the tests, yet I totally agree there is no need to do it as part of this PR. we can discuss how to do it better for both OpenJDK and GraalVM folks in the follow-up RFE. ------------- PR: https://git.openjdk.java.net/jdk/pull/3421 From iignatyev at openjdk.java.net Sat Apr 10 16:50:46 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Sat, 10 Apr 2021 16:50:46 GMT Subject: RFR: 8264806: Remove the experimental JIT compiler [v2] In-Reply-To: <_PS0KHvkB_l9YrKjZ7wLAiKb-SK761YFub4XU5mrBRc=.c32572f6-063f-4503-a20f-aa6b9115f808@github.com> References: <_PS0KHvkB_l9YrKjZ7wLAiKb-SK761YFub4XU5mrBRc=.c32572f6-063f-4503-a20f-aa6b9115f808@github.com> Message-ID: On Fri, 9 Apr 2021 22:26:40 GMT, Vladimir Kozlov wrote: >> As part of [JEP 410](http://openjdk.java.net/jeps/410) remove code related to Java-based JIT compiler (Graal) from JDK: >> >> - `jdk.internal.vm.compiler` ? the Graal compiler >> - `jdk.internal.vm.compiler.management` ? Graal's `MBean` >> - `test/hotspot/jtreg/compiler/graalunit` ? Graal's unit tests >> >> Remove Graal related code in makefiles. 
>> >> Note, next two `module-info.java` files are preserved so that the JVMCI module `jdk.internal.vm.ci` continues to build: >> src/jdk.internal.vm.compiler/share/classes/module-info.java >> src/jdk.internal.vm.compiler.management/share/classes/module-info.java >> >> @AlanBateman suggested that we can avoid it by using Module API to export packages at runtime . It requires changes in GraalVM's JVMCI too so I will file followup RFE to implement it. >> >> Tested hs-tier1-4 > > Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Restore Graal Builder image makefile > - Merge latest changes from branch 'JDK-8264805' into JDK-8264806 > - 8264806: Remove the experimental JIT compiler Marked as reviewed by iignatyev (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3421 From kvn at openjdk.java.net Sat Apr 10 17:44:34 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Sat, 10 Apr 2021 17:44:34 GMT Subject: RFR: 8264806: Remove the experimental JIT compiler [v2] In-Reply-To: References: <_PS0KHvkB_l9YrKjZ7wLAiKb-SK761YFub4XU5mrBRc=.c32572f6-063f-4503-a20f-aa6b9115f808@github.com> Message-ID: On Sat, 10 Apr 2021 16:47:45 GMT, Igor Ignatyev wrote: >> Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Restore Graal Builder image makefile >> - Merge latest changes from branch 'JDK-8264805' into JDK-8264806 >> - 8264806: Remove the experimental JIT compiler > > Marked as reviewed by iignatyev (Reviewer). Thank you, Igor. 
I filed https://bugs.openjdk.java.net/browse/JDK-8265032

-------------

PR: https://git.openjdk.java.net/jdk/pull/3421

From yyang at openjdk.java.net Sun Apr 11 05:17:08 2021
From: yyang at openjdk.java.net (Yi Yang)
Date: Sun, 11 Apr 2021 05:17:08 GMT
Subject: RFR: 8264795: IGV: Upgrade NetBeans platform [v2]
In-Reply-To:
References:
Message-ID:

On Fri, 9 Apr 2021 23:07:39 GMT, Xin Liu wrote:

>> Roberto Castañeda Lozano has updated the pull request incrementally with three additional commits since the last revision:
>>
>> - Document how to build and run on a specific JDK
>> - Use lambdas to define runnables
>> - Document latest JDK version supported
>
> Thank you for modernizing IGV. I managed to import this project into IntelliJ. The process is hassle-free!
>
> I also ran it using jdk8/jdk11 on macOS. I haven't identified any problem so far.

Nice work. It works on my Windows home PC.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3361

From yyang at openjdk.java.net Sun Apr 11 05:29:42 2021
From: yyang at openjdk.java.net (Yi Yang)
Date: Sun, 11 Apr 2021 05:29:42 GMT
Subject: RFR: 8263790: C2: new igv_print_immediately() for debugging purpose
In-Reply-To: <48S8jMzp6NlZui6ThjZRTI2Z4NigPrIiNLGkPxCLAds=.6f39378f-f127-4320-8799-f382aa25cc42@github.com>
References: <-tNb12ZzWakvHEo3QYkjv43on_YR7ZlgFOIXzkr71ao=.4cd993d6-cc1b-4a8b-8e4b-6479386942c9@github.com> <8qcB9THxEItMnobccojmNuJHb9c-9Q6aF0WWkWW1iGo=.b9b233e2-24dc-446a-88ee-8c8e797d2964@github.com> <48S8jMzp6NlZui6ThjZRTI2Z4NigPrIiNLGkPxCLAds=.6f39378f-f127-4320-8799-f382aa25cc42@github.com>
Message-ID:

On Fri, 19 Mar 2021 08:50:25 GMT, Roberto Castañeda Lozano wrote:

>> Sure, I have uploaded `custom_debug.xml` in JBS issue. I don't find an option like '-version' for idealgraphvisualizer. But from GUI->Preference->About, it shows `Version 1.0 (1.0)`.
>
> Hi @kelthuzadx, I cannot reproduce the problem running the latest IGV on JDK 8 and linux-x64 either.
>
> Note that IGV is designed to tolerate XML files without closing `` and `` tags. The parser does not use an error handler, which leads to ignoring all parsing errors [1], and on top of that certain parsing exceptions are ignored:
>
> https://github.com/openjdk/jdk/blob/d24e4cfef36026b781906a9e0c5cf519eb72696e/src/utils/IdealGraphVisualizer/Data/src/com/sun/hotspot/igv/data/serialization/Parser.java#L531-L539
>
> [1] https://docs.oracle.com/javase/8/docs/api/org/xml/sax/XMLReader.html#setErrorHandler-org.xml.sax.ErrorHandler-
>
> Perhaps the problem you report should be addressed by making sure IGV also accepts missing closing `` and `` tags in your case? As you mention, this is a common debugging scenario.

The latest IGV cannot open incomplete XML either. When I changed the system locale to en-US, it worked for incomplete XML as expected. I guess this is yet another i18n problem. As @robcasloz pointed out, the parser does not use an error handler, which leads to ignoring all parsing errors, and on top of that certain parsing exceptions are ignored:

    try {
        XMLReader reader = createReader();
        reader.setContentHandler(new XMLParser(xmlDocument, monitor));
        reader.parse(new InputSource(Channels.newInputStream(channel)));
    } catch (SAXException ex) {
        if (!(ex instanceof SAXParseException) || !"XML document structures must start and end within the same entity.".equals(ex.getMessage())) {
            throw new IOException(ex);
        }
    }

In the catch clause, the message is compared against a fixed English string, but ex.getMessage() returns text that is localized according to the system locale settings, so the comparison fails on non-English systems.
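The locale dependence described above can be illustrated with a small, self-contained sketch. It is not the IGV code: the class `TruncatedXmlCheck`, its helper names, the placeholder element names, and the "localized" message are all assumptions made for illustration. The sketch only shows that an exact match on the English text accepts the English message and rejects anything else, which is the failure mode reported here.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of IGV.
class TruncatedXmlCheck {
    // English text that the quoted catch clause compares against.
    static final String ENGLISH_MESSAGE =
        "XML document structures must start and end within the same entity.";

    /** Mirrors the quoted condition: tolerate only a parse exception carrying the exact English text. */
    static boolean toleratedByMessage(SAXException ex) {
        return ex instanceof SAXParseException && ENGLISH_MESSAGE.equals(ex.getMessage());
    }

    /**
     * Parses the given text and returns the SAXException that was raised,
     * or null if parsing succeeded. For simplicity, a SAXException raised
     * while creating the parser is returned the same way.
     */
    static SAXException parse(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            DefaultHandler handler = new DefaultHandler();
            reader.setContentHandler(handler);
            // DefaultHandler rethrows fatal errors, so a truncated document fails the parse.
            reader.setErrorHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXException ex) {
            return ex;
        } catch (ParserConfigurationException | IOException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Placeholder element names; the document is cut off before any closing tag.
        SAXException real = parse("<graphDocument><group>");
        System.out.println("parser message: " + real.getMessage());
        System.out.println("tolerated by exact match: " + toleratedByMessage(real));

        // Stand-in for a message produced under a non-English locale (made-up text).
        SAXException localized = new SAXParseException("(hypothetical localized message)", null);
        System.out.println("tolerated when localized: " + toleratedByMessage(localized));
    }
}
```

Whether the first `tolerated` line prints true depends on the locale the JVM runs under, which is exactly the fragility being discussed.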
To support non-English system locale settings (if needed), we could code something like this:

    if (!(ex instanceof SAXParseException) || !"XML document structures must start and end within the same entity.".equals(disable_i18n(ex.getMessage())))

-------

![fig1](https://user-images.githubusercontent.com/5010047/114293326-c0474e00-9ac7-11eb-983b-22e60bceea6e.png)
![FIG2](https://user-images.githubusercontent.com/5010047/114293327-c2111180-9ac7-11eb-94c6-4e9aac373f74.png)

-------------

PR: https://git.openjdk.java.net/jdk/pull/3071

From dnsimon at openjdk.java.net Sun Apr 11 10:28:36 2021
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Sun, 11 Apr 2021 10:28:36 GMT
Subject: RFR: 8264806: Remove the experimental JIT compiler [v2]
In-Reply-To:
References: <_PS0KHvkB_l9YrKjZ7wLAiKb-SK761YFub4XU5mrBRc=.c32572f6-063f-4503-a20f-aa6b9115f808@github.com>
Message-ID:

On Sat, 10 Apr 2021 17:41:05 GMT, Vladimir Kozlov wrote:

>> Marked as reviewed by iignatyev (Reviewer).
>
> Thank you, Igor. I filed https://bugs.openjdk.java.net/browse/JDK-8265032

We would definitely like to be able to continue testing GraalVM with the JDK set of jtreg tests. So keeping `Compiler::isGraalEnabled()` working as it does today is important.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3421

From yyang at openjdk.java.net Mon Apr 12 03:34:05 2021
From: yyang at openjdk.java.net (Yi Yang)
Date: Mon, 12 Apr 2021 03:34:05 GMT
Subject: RFR: 8264972: Unused TypeFunc declared in OptoRuntime
Message-ID:

When investigating some C2 related stuff, I noticed that some TypeFunc have been declared in OptoRuntime since JDK6:

    // leaf on stack replacement interpreter accessor types
    static const TypeFunc* fetch_int_Type();
    static const TypeFunc* fetch_long_Type();
    static const TypeFunc* fetch_float_Type();
    static const TypeFunc* fetch_double_Type();
    static const TypeFunc* fetch_oop_Type();
    static const TypeFunc* fetch_monitor_Type();

They are neither used nor implemented.
It looks like we can remove them(I'm curious about their stories/history.) ------------- Commit messages: - remove unused TypeFunc Changes: https://git.openjdk.java.net/jdk/pull/3429/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3429&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264972 Stats: 8 lines in 1 file changed: 0 ins; 8 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3429.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3429/head:pull/3429 PR: https://git.openjdk.java.net/jdk/pull/3429 From thartmann at openjdk.java.net Mon Apr 12 06:40:04 2021 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Mon, 12 Apr 2021 06:40:04 GMT Subject: RFR: 8264358: Don't create invalid oop in method handle tracing In-Reply-To: References: Message-ID: <2s-ZjC47FeWGiEGlBlI5TwcVtL1tLXsmJGps3vTYYLE=.a06c1d2e-adc1-4bd4-888f-81657074f723@github.com> On Mon, 29 Mar 2021 11:49:56 GMT, Stefan Karlsson wrote: > The `mh` field in: > > struct MethodHandleStubArguments { > const char* adaptername; > oopDesc* mh; > intptr_t* saved_regs; > intptr_t* entry_sp; > }; > > doesn't always point to a valid object. The `oopDesc*` is then implicitly converted to an `oop` here: > > void trace_method_handle_stub_wrapper(MethodHandleStubArguments* args) { > trace_method_handle_stub(args->adaptername, > args->mh, > args->saved_regs, > args->entry_sp); > } > > This gets caught by my ad-hoc verification code that verifies oops when they are created/used. > > I propose that we don't create an oop until it `mh` is actually used, and it has been checked that the argument should contain a valid oop. I started with a more elaborate fix that changed the type of `mh` to be `void*`, but then reverted to a more targetted fix to remove the early oopDesc* > oop conversion. 
> > One thing that I am curious about is this code inside trace_method_handle_stub: > if (has_mh && oopDesc::is_oop(mh)) { > mh->print_on(&ls); > > Delaying the oopDesc* > oop conversion to after the `has_mh` check solves my verification failure, but I wonder if the `oopDesc::is_oop(mh)` call is really needed when we have the `has_mh` check? Looks good to me too. ------------- Marked as reviewed by thartmann (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3242 From stefank at openjdk.java.net Mon Apr 12 06:40:05 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 12 Apr 2021 06:40:05 GMT Subject: RFR: 8264358: Don't create invalid oop in method handle tracing In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 11:05:34 GMT, Nils Eliasson wrote: >> The `mh` field in: >> >> struct MethodHandleStubArguments { >> const char* adaptername; >> oopDesc* mh; >> intptr_t* saved_regs; >> intptr_t* entry_sp; >> }; >> >> doesn't always point to a valid object. The `oopDesc*` is then implicitly converted to an `oop` here: >> >> void trace_method_handle_stub_wrapper(MethodHandleStubArguments* args) { >> trace_method_handle_stub(args->adaptername, >> args->mh, >> args->saved_regs, >> args->entry_sp); >> } >> >> This gets caught by my ad-hoc verification code that verifies oops when they are created/used. >> >> I propose that we don't create an oop until it `mh` is actually used, and it has been checked that the argument should contain a valid oop. I started with a more elaborate fix that changed the type of `mh` to be `void*`, but then reverted to a more targetted fix to remove the early oopDesc* > oop conversion. 
>> >> One thing that I am curious about is this code inside trace_method_handle_stub: >> if (has_mh && oopDesc::is_oop(mh)) { >> mh->print_on(&ls); >> >> Delaying the oopDesc* > oop conversion to after the `has_mh` check solves my verification failure, but I wonder if the `oopDesc::is_oop(mh)` call is really needed when we have the `has_mh` check? > > Looks good. Thanks, @neliasso. ------------- PR: https://git.openjdk.java.net/jdk/pull/3242 From stefank at openjdk.java.net Mon Apr 12 06:43:31 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 12 Apr 2021 06:43:31 GMT Subject: Integrated: 8264358: Don't create invalid oop in method handle tracing In-Reply-To: References: Message-ID: On Mon, 29 Mar 2021 11:49:56 GMT, Stefan Karlsson wrote: > The `mh` field in: > > struct MethodHandleStubArguments { > const char* adaptername; > oopDesc* mh; > intptr_t* saved_regs; > intptr_t* entry_sp; > }; > > doesn't always point to a valid object. The `oopDesc*` is then implicitly converted to an `oop` here: > > void trace_method_handle_stub_wrapper(MethodHandleStubArguments* args) { > trace_method_handle_stub(args->adaptername, > args->mh, > args->saved_regs, > args->entry_sp); > } > > This gets caught by my ad-hoc verification code that verifies oops when they are created/used. > > I propose that we don't create an oop until it `mh` is actually used, and it has been checked that the argument should contain a valid oop. I started with a more elaborate fix that changed the type of `mh` to be `void*`, but then reverted to a more targetted fix to remove the early oopDesc* > oop conversion. > > One thing that I am curious about is this code inside trace_method_handle_stub: > if (has_mh && oopDesc::is_oop(mh)) { > mh->print_on(&ls); > > Delaying the oopDesc* > oop conversion to after the `has_mh` check solves my verification failure, but I wonder if the `oopDesc::is_oop(mh)` call is really needed when we have the `has_mh` check? 
This pull request has now been integrated. Changeset: b1ebf822 Author: Stefan Karlsson URL: https://git.openjdk.java.net/jdk/commit/b1ebf822 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod 8264358: Don't create invalid oop in method handle tracing Reviewed-by: neliasso, thartmann ------------- PR: https://git.openjdk.java.net/jdk/pull/3242 From jiefu at tencent.com Mon Apr 12 06:51:27 2021 From: jiefu at tencent.com (=?utf-8?B?amllZnUo5YKF5p2wKQ==?=) Date: Mon, 12 Apr 2021 06:51:27 +0000 Subject: Conflict API definitions of Math.pow(x, 0.5) and Math.sqrt(x) for x={-0.0, Double.NEGATIVE_INFINITY} Message-ID: Hi all, I found Math.pow(x, 0.5) and Math.sqrt(x) would compute different values as the following: ``` Math.pow(-0.0, 0.5) = 0.0 Math.sqrt(-0.0) = -0.0 Math.pow(Double.NEGATIVE_INFINITY, 0.5) = Infinity Math.sqrt(Double.NEGATIVE_INFINITY) = NaN ``` The reason is that both of pow and sqrt have special rules for these computations. For example, this rule [1] specifies Math.pow(-0.0, 0.5) must be 0.0. And this one [2] specifies Math.sqrt(-0.0) must be -0.0. And we do have rules for Math.pow(Double.NEGATIVE_INFINITY, 0.5) = Infinity and Math.sqrt(Double.NEGATIVE_INFINITY) = NaN too. I think most people will be confused by these rules because from the view of mathematics, Math.pow(x, 0.5) should be equal to Math.sqrt(x). So why Java creates conflict special rules for them? Is it possible to let Math.pow(-0.0, 0.5) = -0.0 and Math.pow(Double.NEGATIVE_INFINITY, 0.5) = NaN also be allowed? I came across this problem when I was trying to optimize pow(x, 0.5) with sqrt(x). If pow(x, 0.5)'s two special rules can be aligned with sqrt(x), then pow(x, 0.5)'s performance can be improved by 7x~14x [3]. Thanks. 
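The four results listed above can be reproduced with a short sketch like the following. The class name `PowVsSqrtSpecialCases` and the helper `sameBits` are made up for illustration; the comparison goes through the raw bit pattern because `0.0 == -0.0` is true in Java, so a plain `==` would hide the sign difference.

```java
// Hypothetical demo class; only Math.pow, Math.sqrt and Double methods from the JDK are used.
class PowVsSqrtSpecialCases {
    /** True when both values have the same bit pattern, so 0.0 and -0.0 are told apart. */
    static boolean sameBits(double a, double b) {
        return Double.doubleToRawLongBits(a) == Double.doubleToRawLongBits(b);
    }

    /** True when replacing Math.pow(x, 0.5) with Math.sqrt(x) would change the result for x. */
    static boolean replacementChangesResult(double x) {
        double viaPow = Math.pow(x, 0.5);
        double viaSqrt = Math.sqrt(x);
        // NaN never equals anything, so treat "both NaN" as agreement explicitly.
        if (Double.isNaN(viaPow) && Double.isNaN(viaSqrt)) {
            return false;
        }
        return !sameBits(viaPow, viaSqrt);
    }

    public static void main(String[] args) {
        double[] inputs = { -0.0, Double.NEGATIVE_INFINITY };
        for (double x : inputs) {
            System.out.println("x = " + x
                + ": pow(x, 0.5) = " + Math.pow(x, 0.5)
                + ", sqrt(x) = " + Math.sqrt(x)
                + ", differs = " + replacementChangesResult(x));
        }
    }
}
```

Any compiler transformation from `pow(x, 0.5)` to `sqrt(x)` would have to handle these two inputs separately to stay within the current `Math.pow` specification.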
Best regards, Jie [1] https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/Math.java#L644 [2] https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/Math.java#L385 [3] https://github.com/openjdk/jdk/pull/3404/ From thartmann at openjdk.java.net Mon Apr 12 06:52:17 2021 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Mon, 12 Apr 2021 06:52:17 GMT Subject: RFR: 8264873: Dependencies: Split ClassHierarchyWalker [v3] In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 22:41:42 GMT, Vladimir Ivanov wrote: >> Split ClassHierarchyWalker into ConcreteMethodFinder and ConcreteSubtypeFinder. >> >> Testing: >> - [x] hs-tier1 - hs-tier4 > > Vladimir Ivanov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: > > - Merge branch 'master' into 8264873.cha.split > - Fix formatting > - Split ClassHierarchyWalker into ConcreteMethodFinder and ConcreteSubtypeFinder > - CountingClassHierarchyIterator > - Migrate to PerfData > - Depenencies perf counters > - KlassDepChange::involves_context > - KlassDepChange::_new_type > - Dependencies::is_concrete_method > - Dependencies::verify_method_context > - ... and 1 more: https://git.openjdk.java.net/jdk/compare/76bd313d...e38b995c Looks good to me. ------------- Marked as reviewed by thartmann (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3387

From thartmann at openjdk.java.net Mon Apr 12 07:03:39 2021
From: thartmann at openjdk.java.net (Tobias Hartmann)
Date: Mon, 12 Apr 2021 07:03:39 GMT
Subject: RFR: 8264972: Unused TypeFunc declared in OptoRuntime
In-Reply-To:
References:
Message-ID:

On Mon, 12 Apr 2021 03:18:33 GMT, Yi Yang wrote:

> When investigating some C2 related stuff, I noticed that some TypeFunc are declared in OptoRuntime since JDK6:
>
>     // leaf on stack replacement interpreter accessor types
>     static const TypeFunc* fetch_int_Type();
>     static const TypeFunc* fetch_long_Type();
>     static const TypeFunc* fetch_float_Type();
>     static const TypeFunc* fetch_double_Type();
>     static const TypeFunc* fetch_oop_Type();
>     static const TypeFunc* fetch_monitor_Type();
>
> They are neither used nor implemented. It looks like we can remove them (I'm curious about their stories/history.)

Good catch. Looks good and trivial to me.

-------------

Marked as reviewed by thartmann (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3429

From jiefu at openjdk.java.net Mon Apr 12 07:06:38 2021
From: jiefu at openjdk.java.net (Jie Fu)
Date: Mon, 12 Apr 2021 07:06:38 GMT
Subject: RFR: 8264945: Optimize the code-gen for Math.pow(x, 0.5)
In-Reply-To:
References:
Message-ID:

On Fri, 9 Apr 2021 22:39:44 GMT, Vladimir Kozlov wrote:

>> Hi all,
>>
>> I'd like to optimize the code-gen for Math.pow(x, 0.5).
>> A 7x ~ 14x performance improvement is observed with the jmh micro-benchmarks.
>>
>> While I was optimizing a machine learning program, I found both Math.pow(x, 2) and Math.pow(x, 0.5) are used.
>> To my surprise, C2 optimizes only the case for Math.pow(x, 2) [1], but not yet Math.pow(x, 0.5).
>>
>> The patch just replaces Math.pow(x, 0.5) with Math.sqrt(x).
>>
>> Before:
>> Benchmark                     (seed)   Mode  Cnt      Score    Error   Units
>> MathBench.powDouble0Dot5           0  thrpt    8  45525.117 ± 11.686  ops/ms
>> MathBench.powDouble0Dot5Loop       0  thrpt    8      0.031 ±  0.001  ops/ms
>>
>> Benchmark                     (seed)   Mode  Cnt      Score    Error   Units
>> MathBench.powDouble0Dot5           0  thrpt    8  45509.317 ±  6.581  ops/ms
>> MathBench.powDouble0Dot5Loop       0  thrpt    8      0.031 ±  0.001  ops/ms
>>
>> After:
>> Benchmark                     (seed)   Mode  Cnt       Score     Error   Units
>> MathBench.powDouble0Dot5           0  thrpt    8  343354.892 ± 362.900  ops/ms
>> MathBench.powDouble0Dot5Loop       0  thrpt    8       0.457 ±   0.001  ops/ms
>>
>> Benchmark                     (seed)   Mode  Cnt       Score     Error   Units
>> MathBench.powDouble0Dot5           0  thrpt    8  343421.559 ±  49.326  ops/ms
>> MathBench.powDouble0Dot5Loop       0  thrpt    8       0.457 ±   0.001  ops/ms
>>
>> Testing:
>> - tier1~3 on Linux/x64
>>
>> Thanks,
>> Best regards,
>> Jie
>>
>> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/library_call.cpp#L1680
>
> Please verify that the result is the same when run with -Xint (interpreter only) and with -XX:TieredStopAtLevel=1 (C1 only). Maybe they need the same optimization.

Hi @vnkozlov @neliasso and @huishi-hs,

Thanks for your review and comments.

While I was implementing the optimization for C1 and the interpreter, I found that Math.pow(x, 0.5) and Math.sqrt(x) compute different values for x={-0.0, Double.NEGATIVE_INFINITY} [1].

Let's put this issue on hold until we have a conclusion about that question.

Thanks.
[1] https://mail.openjdk.java.net/pipermail/core-libs-dev/2021-April/076195.html ------------- PR: https://git.openjdk.java.net/jdk/pull/3404 From xgong at openjdk.java.net Mon Apr 12 07:06:34 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Mon, 12 Apr 2021 07:06:34 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v3] In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 21:52:45 GMT, Vladimir Ivanov wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert changes to VectorLoadMask and add a VectorMaskCast for the same element size mask casting > > src/hotspot/share/opto/vectornode.cpp line 1236: > >> 1234: if (is_vector_mask) { >> 1235: if (in_vt->length_in_bytes() == out_vt->length_in_bytes() && >> 1236: Matcher::match_rule_supported(Op_VectorMaskCast)) { > > There's `Matcher::match_rule_supported_vector()` specifically for vector nodes. OK, I will use this instead. Thanks! > src/hotspot/share/opto/vectornode.cpp line 1237: > >> 1235: if (in_vt->length_in_bytes() == out_vt->length_in_bytes() && >> 1236: Matcher::match_rule_supported(Op_VectorMaskCast)) { >> 1237: // VectorUnbox (VectorBox vmask) ==> VectorMaskCast (vmask) > > It's better to implement it as a 2-step transformation and place it in `VectorLoadMaskNode::Ideal()`: > VectorUnbox (VectorBox vmask) ==> VectorLoadMask (VectorStoreMask vmask) => VectorMaskCast (vmask) Thanks for your comments. Yes, theoretically it's better to place it in `VectorLoadMaskNode::Ideal()`. Unfortunately, we met an issue that is related to optimization for `VectorStoreMask`. Considering the following case: LoadVector LoadVector | | VectorLoadMask (double) VectorLoadMask (double) | | VectorUnbox (long) ==> VectorStoreMask (double) | VectorLoadMask (long) This case loads the masking values for a double type, and does a bitwise `and` operation. 
Since the type is mismatched, the compiler will try to do `VectorUnbox (VectorBox vmask) ==> VectorLoadMask (VectorStoreMask vmask)`. However, since there is the transformation `VectorStoreMask (VectorLoadMask value) ==> value`, the above `VectorStoreMaskNode` will be optimized out. The final graph looks like: LoadVector LoadVector | / \ VectorLoadMask (double) / \ | ==> VectorLoadMask (double) \ VectorStoreMask (double) VectorLoadMask (long) | VectorLoadMask (long) Since the two `VectorLoadMaskNode` have different element type, the GVN cannot optimize out one. So finally there will be two similar `VectorLoadMaskNode`s. That's also why I override the `cmp/hash` for `VectorLoadMaskNode` in the first version. So I prefer to add `VectorUnbox (VectorBox vmask) ==> VectorMaskCast (vmask)` directly. > src/hotspot/share/opto/vectornode.hpp line 1247: > >> 1245: assert(type2aelembytes(in_vt->element_basic_type()) == type2aelembytes(vt->element_basic_type()), "element size must match"); >> 1246: } >> 1247: > > FTR there's a way to avoid platform-specific changes (in AD files) if we don't envision any non-trivial conversions to be supported: just disable matching of the node by overriding `Node::ideal_reg()`: > virtual uint ideal_reg() const { return 0; } // not matched in the AD file Good idea, I will try this way. Thanks! 
------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From yyang at openjdk.java.net Mon Apr 12 07:09:01 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Mon, 12 Apr 2021 07:09:01 GMT Subject: RFR: 8264972: Unused TypeFunc declared in OptoRuntime In-Reply-To: References: Message-ID: On Mon, 12 Apr 2021 06:58:24 GMT, Tobias Hartmann wrote: >> When investigating some C2 related stuff, I noticed that some TypeFunc are declared in OptoRuntime since JDK6: >> >> // leaf on stack replacement interpreter accessor types >> static const TypeFunc* fetch_int_Type(); >> static const TypeFunc* fetch_long_Type(); >> static const TypeFunc* fetch_float_Type(); >> static const TypeFunc* fetch_double_Type(); >> static const TypeFunc* fetch_oop_Type(); >> static const TypeFunc* fetch_monitor_Type(); >> >> They are neither used nor implemented. It looks like we can remove them(I'm curious about their stories/history.) > > Good catch. Looks good and trivial to me. Thank you Tobias. It's trivial indeed. ------------- PR: https://git.openjdk.java.net/jdk/pull/3429 From stefank at openjdk.java.net Mon Apr 12 07:49:10 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 12 Apr 2021 07:49:10 GMT Subject: RFR: 8264358: Don't create invalid oop in method handle tracing In-Reply-To: <2s-ZjC47FeWGiEGlBlI5TwcVtL1tLXsmJGps3vTYYLE=.a06c1d2e-adc1-4bd4-888f-81657074f723@github.com> References: <2s-ZjC47FeWGiEGlBlI5TwcVtL1tLXsmJGps3vTYYLE=.a06c1d2e-adc1-4bd4-888f-81657074f723@github.com> Message-ID: On Mon, 12 Apr 2021 06:37:45 GMT, Tobias Hartmann wrote: >> The `mh` field in: >> >> struct MethodHandleStubArguments { >> const char* adaptername; >> oopDesc* mh; >> intptr_t* saved_regs; >> intptr_t* entry_sp; >> }; >> >> doesn't always point to a valid object. 
The `oopDesc*` is then implicitly converted to an `oop` here: >> >> void trace_method_handle_stub_wrapper(MethodHandleStubArguments* args) { >> trace_method_handle_stub(args->adaptername, >> args->mh, >> args->saved_regs, >> args->entry_sp); >> } >> >> This gets caught by my ad-hoc verification code that verifies oops when they are created/used. >> >> I propose that we don't create an oop until it `mh` is actually used, and it has been checked that the argument should contain a valid oop. I started with a more elaborate fix that changed the type of `mh` to be `void*`, but then reverted to a more targetted fix to remove the early oopDesc* > oop conversion. >> >> One thing that I am curious about is this code inside trace_method_handle_stub: >> if (has_mh && oopDesc::is_oop(mh)) { >> mh->print_on(&ls); >> >> Delaying the oopDesc* > oop conversion to after the `has_mh` check solves my verification failure, but I wonder if the `oopDesc::is_oop(mh)` call is really needed when we have the `has_mh` check? > > Looks good to me too. 
Thanks, @TobiHartmann ------------- PR: https://git.openjdk.java.net/jdk/pull/3242 From xgong at openjdk.java.net Mon Apr 12 07:58:07 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Mon, 12 Apr 2021 07:58:07 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v3] In-Reply-To: References: Message-ID: On Mon, 12 Apr 2021 07:02:38 GMT, Xiaohong Gong wrote: >> src/hotspot/share/opto/vectornode.hpp line 1247: >> >>> 1245: assert(type2aelembytes(in_vt->element_basic_type()) == type2aelembytes(vt->element_basic_type()), "element size must match"); >>> 1246: } >>> 1247: >> >> FTR there's a way to avoid platform-specific changes (in AD files) if we don't envision any non-trivial conversions to be supported: just disable matching of the node by overriding `Node::ideal_reg()`: >> virtual uint ideal_reg() const { return 0; } // not matched in the AD file > > Good idea, I will try this way. Thanks! I met the bad AD file issue when I remove the changes in AD files and overriding `Node::ideal_reg()`. I guess if a node is not used as an input of other node, this can work well? For the `VectorMaskCastNode`, it must be an input for some other nodes. If just disable the matching of this node, how does its usage find the right input code? 
------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From jbhateja at openjdk.java.net Mon Apr 12 08:21:42 2021 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Mon, 12 Apr 2021 08:21:42 GMT Subject: RFR: 8264954: unified handling for VectorMask object re-materialization during de-optimization [v2] In-Reply-To: <4jMKSSVNKx5Fh7hek0pT5MMIYAnTjB7y7faOh_L60Pw=.2ce391ed-b29c-4a05-a412-bb3935126b2f@github.com> References: <4jMKSSVNKx5Fh7hek0pT5MMIYAnTjB7y7faOh_L60Pw=.2ce391ed-b29c-4a05-a412-bb3935126b2f@github.com> Message-ID: On Fri, 9 Apr 2021 19:12:27 GMT, Vladimir Ivanov wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> 8264954: Review comments resolution. > > src/hotspot/share/opto/vector.cpp line 270: > >> 268: bool is_mask = is_vector_mask(iklass); >> 269: if (is_mask) { >> 270: if (vec_value->Opcode() != Op_VectorStoreMask) { > > It's better to do the check against the type of `vec_value`: `VectorStoreMask` produces a vector of booleans and you can guard `gvn.transform(VectorStoreMaskNode::make(...))` call with `bt != T_BOOLEAN` check instead. In addition to bt != T_BOOLEAN check, we also need to check if type is a vectormask (explicit type added for mask generating nodes on targets supporting predicate registers). A similar check needs to be added at following location https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/vector.cpp#L329 Keeping this as it is i.e. an Opcode based check. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3408 From rcastanedalo at openjdk.java.net Mon Apr 12 08:37:18 2021 From: rcastanedalo at openjdk.java.net (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Mon, 12 Apr 2021 08:37:18 GMT Subject: RFR: 8263227: C2: inconsistent spilling due to dead nodes in exception block [v2] In-Reply-To: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com> References: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com> Message-ID: > This change eliminates dead multi-nodes created by call-catch cleanup after GCM. Eliminating all dead code created by call-catch cleanup avoids potential issues when splitting the live range of call result values, see the analysis in the [bug report](https://bugs.openjdk.java.net/browse/JDK-8263227) for details. This solution is the least invasive of the three alternatives proposed in the bug report (the other two are constraining global code motion and extending live-range splitting). > > The change also extends the control-flow graph verification pass to catch similar live-range splitting issues earlier (with `+VerifyRegisterAllocator`). 
>
> Tested on:
> - original bug reproducer
> - hs-tier1-5 (windows-x64, linux-x64, linux-aarch64, and macosx-x64) with `+VerifyRegisterAllocator`
> - hs-tier1-3 (windows-x64, linux-x64, linux-aarch64, and macosx-x64) with `+VerifyRegisterAllocator` and `+StressGCM` (5 repetitions)

Roberto Castañeda Lozano has updated the pull request incrementally with four additional commits since the last revision:

 - Complete projection assertions with basic checks
 - Enforce code style in surrounding 'for' statements
 - Simplify loop that removes dead projections, rename and comment for clarity
 - Assert that projections are stuck to their parents

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3303/files
  - new: https://git.openjdk.java.net/jdk/pull/3303/files/a4973ba0..808451de

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3303&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3303&range=00-01

  Stats: 26 lines in 2 files changed: 13 ins; 1 del; 12 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3303.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3303/head:pull/3303

PR: https://git.openjdk.java.net/jdk/pull/3303

From rcastanedalo at openjdk.java.net Mon Apr 12 08:37:23 2021
From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano)
Date: Mon, 12 Apr 2021 08:37:23 GMT
Subject: RFR: 8263227: C2: inconsistent spilling due to dead nodes in exception block [v2]
In-Reply-To:
References: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com> <-OXlf8TYEEY3rOr6V8ae6zR2QtDnjfnso4R_CKA9alU=.5c7de539-587e-47dd-b25d-ef0443e024d6@github.com>
Message-ID:

On Thu, 8 Apr 2021 18:20:13 GMT, Vladimir Kozlov wrote:

> I am not sure that no other nodes were inserted in between users.

Isn't this enforced by local scheduling?
From `PhaseCFG::select()`:

https://github.com/openjdk/jdk/blob/b1ebf82269fa85bed859ffafacd59ed000f22bd0/src/hotspot/share/opto/lcm.cpp#L477-L478

https://github.com/openjdk/jdk/blob/b1ebf82269fa85bed859ffafacd59ed000f22bd0/src/hotspot/share/opto/lcm.cpp#L516-L523

`PhaseCFG::call_catch_cleanup()` runs right after local scheduling, and it doesn't reorder the cloned instructions before reaching the dead code elimination loop, so I don't see any place where other nodes might be inserted in between projections of the same node. I checked this with an assertion in `PhaseCFG::verify()` (commits 8d96589 and 808451d), and the following tests run without assertion failures:

- hs-tier1-5 (windows-x64, linux-x64, linux-aarch64, and macosx-x64) with `+VerifyRegisterAllocator`
- hs-tier6-9 (linux-x64) with `+VerifyRegisterAllocator`
- hs-tier1-3 (windows-x64, linux-x64, linux-aarch64, and macosx-x64) with `+VerifyRegisterAllocator`, `+StressGCM`, and `+StressLCM` (5 repetitions)

Based on the above analysis, the code comments in `lcm.cpp`, and the test results, I suggest treating the fact that projections are scheduled next to their parent nodes as an invariant, and exploiting that invariant as in this (updated) change -- but I am fine with the more defensive alternative you propose if you still prefer that.

> I also noticed wrong code style in for() statements near your change. Please, fix them too.

Done.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3303

From aph at redhat.com Mon Apr 12 08:38:29 2021
From: aph at redhat.com (Andrew Haley)
Date: Mon, 12 Apr 2021 09:38:29 +0100
Subject: Conflict API definitions of Math.pow(x, 0.5) and Math.sqrt(x) for x={-0.0, Double.NEGATIVE_INFINITY}
In-Reply-To:
References:
Message-ID: <0a111afe-0ac5-0d2e-a4b6-a2c3aedeebee@redhat.com>

On 4/12/21 7:51 AM, jiefu(傅杰) wrote:

> So why Java creates conflict special rules for them?
> Is it possible to let Math.pow(-0.0, 0.5) = -0.0 and Math.pow(Double.NEGATIVE_INFINITY, 0.5) = NaN also be allowed?
These rules are in the C standard too, which is inherited by other programming languages. Also, please see https://bugs.openjdk.java.net/browse/JDK-8240632 -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From rcastanedalo at openjdk.java.net Mon Apr 12 08:40:52 2021 From: rcastanedalo at openjdk.java.net (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Mon, 12 Apr 2021 08:40:52 GMT Subject: RFR: 8263227: C2: inconsistent spilling due to dead nodes in exception block In-Reply-To: References: <9y8Bw6bniQX2kF659D_tyTfbZQc1W3ptA7xSr7fmsc8=.77b3fb6e-6e49-4861-9f1d-3ffbf0b075fd@github.com> <-vl-o8PemGJNrb-7AtTQoNB6YGnhxoA3CVWot_iihlU=.90713355-bd98-4c1c-b253-08d660c1590e@github.com> Message-ID: On Thu, 8 Apr 2021 17:53:08 GMT, Vladimir Kozlov wrote: > > > Also where is guarantee that all Proj users are in the same block. > > > > > > Isn't this guaranteed by `PhaseCFG::schedule_node_into_block()`? > > I will run some tests with extra assertions in `PhaseCFG::verify()` to double-check. > > Okay. Done, see results in the related discussion (https://github.com/openjdk/jdk/pull/3303#discussion_r611430316). 
------------- PR: https://git.openjdk.java.net/jdk/pull/3303 From shade at openjdk.java.net Mon Apr 12 08:40:54 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 12 Apr 2021 08:40:54 GMT Subject: RFR: 8264972: Unused TypeFunc declared in OptoRuntime In-Reply-To: References: Message-ID: <4usgDpGn_k432DLdk3RrPRFwFpUd4Epl6Hl8c2m44X4=.4c1e0ed4-7df8-4d27-baa4-ce120d4b543e@github.com> On Mon, 12 Apr 2021 03:18:33 GMT, Yi Yang wrote: > When investigating some C2 related stuff, I noticed that some TypeFunc are declared in OptoRuntime since JDK6: > > // leaf on stack replacement interpreter accessor types > static const TypeFunc* fetch_int_Type(); > static const TypeFunc* fetch_long_Type(); > static const TypeFunc* fetch_float_Type(); > static const TypeFunc* fetch_double_Type(); > static const TypeFunc* fetch_oop_Type(); > static const TypeFunc* fetch_monitor_Type(); > > They are neither used nor implemented. It looks like we can remove them(I'm curious about their stories/history.) This apparently predates OpenJDK history. Looks fine. ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3429 From yyang at openjdk.java.net Mon Apr 12 08:45:16 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Mon, 12 Apr 2021 08:45:16 GMT Subject: Integrated: 8264972: Unused TypeFunc declared in OptoRuntime In-Reply-To: References: Message-ID: On Mon, 12 Apr 2021 03:18:33 GMT, Yi Yang wrote: > When investigating some C2 related stuff, I noticed that some TypeFunc are declared in OptoRuntime since JDK6: > > // leaf on stack replacement interpreter accessor types > static const TypeFunc* fetch_int_Type(); > static const TypeFunc* fetch_long_Type(); > static const TypeFunc* fetch_float_Type(); > static const TypeFunc* fetch_double_Type(); > static const TypeFunc* fetch_oop_Type(); > static const TypeFunc* fetch_monitor_Type(); > > They are neither used nor implemented. 
It looks like we can remove them (I'm curious about their stories/history.) This pull request has now been integrated. Changeset: ecef1fc8 Author: Yi Yang Committer: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/ecef1fc8 Stats: 8 lines in 1 file changed: 0 ins; 8 del; 0 mod 8264972: Unused TypeFunc declared in OptoRuntime Reviewed-by: thartmann, shade ------------- PR: https://git.openjdk.java.net/jdk/pull/3429 From yyang at openjdk.java.net Mon Apr 12 08:49:32 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Mon, 12 Apr 2021 08:49:32 GMT Subject: RFR: 8264972: Unused TypeFunc declared in OptoRuntime In-Reply-To: <4usgDpGn_k432DLdk3RrPRFwFpUd4Epl6Hl8c2m44X4=.4c1e0ed4-7df8-4d27-baa4-ce120d4b543e@github.com> References: <4usgDpGn_k432DLdk3RrPRFwFpUd4Epl6Hl8c2m44X4=.4c1e0ed4-7df8-4d27-baa4-ce120d4b543e@github.com> Message-ID: On Mon, 12 Apr 2021 08:36:32 GMT, Aleksey Shipilev wrote: > This apparently predates OpenJDK history. Looks fine. Yes, it seems to be ancient code. I googled it but found nothing except source code in OpenJDK. ------------- PR: https://git.openjdk.java.net/jdk/pull/3429 From rcastanedalo at openjdk.java.net Mon Apr 12 09:02:51 2021 From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano) Date: Mon, 12 Apr 2021 09:02:51 GMT Subject: RFR: 8264795: IGV: Upgrade NetBeans platform [v2] In-Reply-To: References: Message-ID: <_Cu2871SERzpfwf5GrLz6SfQqhfOew58zQXGXYa4MBU=.bb49faf1-7b04-4d08-924c-49e3e132eb61@github.com> On Fri, 9 Apr 2021 23:07:39 GMT, Xin Liu wrote: > Thank you for modernizing IGV. I managed to import this project to IntelliJ. The process is hassle-free! > > I also ran using jdk8/jdk11 on macOS. I haven't identified any problem so far. Thanks for checking, Xin!
------------- PR: https://git.openjdk.java.net/jdk/pull/3361 From rcastanedalo at openjdk.java.net Mon Apr 12 09:09:47 2021 From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano) Date: Mon, 12 Apr 2021 09:09:47 GMT Subject: RFR: 8264795: IGV: Upgrade NetBeans platform [v2] In-Reply-To: References: Message-ID: On Wed, 7 Apr 2021 11:59:28 GMT, Roberto Castañeda Lozano wrote: >> This change upgrades the NetBeans platform on which IGV is based to its latest version ([12.3](https://netbeans.apache.org/download/nb123/index.html)) and switches IGV's build system from Ant to Maven. The upgrade introduces support for a wide range of JDK versions (from 8 to 15, the latest version supported by NetBeans 12.3), and the switch from Ant to Maven makes the IGV build simpler, faster (first-time build is approximately 5x faster), and more stable (all dependencies are fetched directly from the Maven central repository). >> >> The change also fixes broken unit tests in the Data module and runs them by default when building.
>> >> #### Testing >> >> Regression-tested the following use cases manually on all combinations of (Linux, Windows, macOS) and (JDK 8, JDK 11, JDK 15): >> >> - build >> - open graph file (small.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip)) >> - import graphs via network (localhost) >> - expand groups in outline >> - open a graph >> - open a clone >> - zoom in and out >> - search a node >> - apply filters >> - extract a node >> - show all nodes >> - select nodes corresponding to a bytecode >> - view control flow >> - select nodes corresponding to a basic block >> - cluster nodes >> - show satellite view >> - compute the difference of two graphs >> - change node text >> - remove a group >> - save groups into a file >> - open a large graph file (large.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip)) >> - open a large graph ("After Escape Analysis" in large.xml) >> >> Thanks to Vladimir Ivanov for helping with testing on macOS. > > Roberto Casta?eda Lozano has updated the pull request incrementally with three additional commits since the last revision: > > - Document how to build and run on a specific JDK > - Use lambdas to define runnables > - Document latest JDK version supported > Nice work. It works on Windows home pc. rocket Thanks for checking, Yang! ------------- PR: https://git.openjdk.java.net/jdk/pull/3361 From rcastanedalo at openjdk.java.net Mon Apr 12 09:09:47 2021 From: rcastanedalo at openjdk.java.net (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Mon, 12 Apr 2021 09:09:47 GMT Subject: RFR: 8264795: IGV: Upgrade NetBeans platform [v2] In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 22:39:09 GMT, Xin Liu wrote: > Does this mean that module Filter will depend on external nashorn after jdk15? Yes (including JDK 15). > I know that jdk doesn't remove nashorn util jdk15, but does it make sense that IGV always picks up external nashorn. 
My concern is "javascript" in jdk8/11 may be not well-maintained and leave vulnerability behind. I assume potential issues in "built-in" Nashorn are addressed in JDK 8-14 as for any other OpenJDK component. Besides, [standalone (external) Nashorn only works on JDK >= 11](https://github.com/szegedi/nashorn/wiki/Using-Nashorn-with-different-Java-versions), and I think it makes sense that IGV still supports JDK 8 for minimal disruption (this was the only JDK version supported before this upgrade). ------------- PR: https://git.openjdk.java.net/jdk/pull/3361 From adinn at redhat.com Mon Apr 12 09:23:24 2021 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 12 Apr 2021 10:23:24 +0100 Subject: Conflict API definitions of Math.pow(x, 0.5) and Math.sqrt(x) for x={-0.0, Double.NEGATIVE_INFINITY} In-Reply-To: References: Message-ID: On 12/04/2021 07:51, jiefu(傅杰) wrote: > I think most people will be confused by these rules because from the > view of mathematics, Math.pow(x, 0.5) should be equal to > Math.sqrt(x). This is already a confused situation from the point of view of mathematics. Consider these two expressions: Math.sqrt(-0.0) * Math.sqrt(-0.0) Math.pow(-0.0, 0.5) * Math.pow(-0.0, 0.5) It doesn't matter whether the functions sqrt and pow compute -0.0 or 0.0 as the value here. Either result will fail to satisfy the equality f(x) * f(x) == x The problem is that computation has already diverged from mathematical expectation by introducing the value -0.0. So, Java (and other languages) have to make up a rule here. > So why Java creates conflict special rules for them? Is it possible > to let Math.pow(-0.0, 0.5) = -0.0 and > Math.pow(Double.NEGATIVE_INFINITY, 0.5) = NaN also be allowed? It might well match expectations better if both functions were to generate the same value here.
However, expectations have already been set by libc:

$ cat > sqrt.c << END
#include <stdio.h>
#include <math.h>
int main(int argc, char **argv) {
  printf("sqrt(-0.F) = %f\n", sqrt(-0.F));
  printf("pow(-0.F, 0.5) = %f\n", pow(-0.F, 0.5));
}
END
$ make sqrt
cc sqrt.c -o sqrt
$ ./sqrt
sqrt(-0.F) = -0.000000
pow(-0.F, 0.5) = 0.000000

I have no idea why these specific results were made up for C but Java really ought to follow them. regards, Andrew Dinn ----------- Red Hat Distinguished Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill From xgong at openjdk.java.net Mon Apr 12 09:29:38 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Mon, 12 Apr 2021 09:29:38 GMT Subject: RFR: 8264957: Type::dual_type array is not aligned with enum TYPES In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 11:02:06 GMT, Jie Fu wrote: >> This is a bug fix for [1] which adds a new vector mask type. The new added TYPE `"VectorMask"` is inserted into `enum TYPES`, while the array `"Type::dual_type"` is not updated. This makes the array elements are not aligned with TYPES. >> >> I met the following crash due to this issue when I was working on the masking feature support on panama-vector: >> >> Internal Error (/home/xiagon01/code/panama-vector/src/hotspot/share/opto/type.hpp:1727), pid=104432, tid=104449 >> # assert(_base >= AnyPtr && _base <= KlassPtr) failed: Not a pointer >> >> Adding a value like other vector types for the `"VectorMask"` in the array `"dual_type"` can fix it. >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8262355 >> >> Tested with tier1 and jdk:tier3 > > This sounds reasonable to me. > Thanks. Thanks for your review @DamonFool @neliasso ! It seems the array `Type::dual_type` is not used anywhere currently.
The same definition for `dual_type` is defined in https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/type.hpp#L140 and used in type.cpp, which uses `Type::dual_type` before. So I think the whole definition can be totally removed. The issue I met might not be related to it. I will have a test and remove it if everything works well. Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/3410 From jbhateja at openjdk.java.net Mon Apr 12 09:33:14 2021 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Mon, 12 Apr 2021 09:33:14 GMT Subject: RFR: 8264954: unified handling for VectorMask object re-materialization during de-optimization [v3] In-Reply-To: References: Message-ID: > Following flow describes object reconstruction for de-optimization:- > 1) PhaseVector::scalarize_vbox_node() creates SafePointScalarObjectNode to captures the box type information, also it connects to node holding the boxed value. > 2) During code emit phase (PhaseOutput) C2 process above information to dumps ObjectValue holding the box information and LocationValue to holding the value information into ScopeDescriptor corresponding to Safepoint PC. > 3) De-optimization blobs dump the value held in registers to the stack locations using RegisterSave::save_live_registers() and a mapping b/w register and its stack location is added to RegisterMap. > 4) During de-optimization, compiled frame objects are re-allocated using identity information held in ObjectValue and their fields are initialized using values held in the stack locations accessed through register-stack mappings. > > By inserting a VectorStoreMaskNode before stitching the mask holding node to Safepoint we make sure that value held in opmask/vector register is transferred to a byte vector. Thus rest of the flow works as it is, stack location will hold the value in the form of a byte array irrespective of the box shape. > > tier1-tier3 regressions are clean with UseAVX=2/3. 
Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: 8264954: Incorporating closing comments. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3408/files - new: https://git.openjdk.java.net/jdk/pull/3408/files/239923fd..98b4778d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3408&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3408&range=01-02 Stats: 13 lines in 2 files changed: 1 ins; 4 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/3408.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3408/head:pull/3408 PR: https://git.openjdk.java.net/jdk/pull/3408 From rcastanedalo at openjdk.java.net Mon Apr 12 10:26:49 2021 From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano) Date: Mon, 12 Apr 2021 10:26:49 GMT Subject: RFR: 8263790: C2: new igv_print_immediately() for debugging purpose In-Reply-To: References: Message-ID: On Thu, 18 Mar 2021 10:05:12 GMT, Yi Yang wrote: > Add a new igv_print_immediately, it prints the current method immediately. This differs from other igv_print* methods, it creates a well-formed and complete ideal graph xml immediately, while others accomplish their ideal graph xml when the VM exits (i.e. destructor of `IdealGraphPrinter::_xx_printer`). If VM crashes before VM exit, the ideal graph xml will be ill-formed, this is a fairly common case when debugging another crash. > > Test manually! > > Cheers, > Yang > In the catch clause, ex.getMessage() compares with ASCII characters, but ex.getMessage() gets characters that correspond to their system locale settings. To support non-English system locale settings (if needed), we could code something like this: > > ```java > if (!(ex instanceof SAXParseException) || !"XML document structures must start and end within the same entity.".equals(disable_i18n(ex.getMessage()))) > ``` That makes sense, can you please try if this fix works for you?
https://github.com/robcasloz/jdk/commit/702e763 If it does, please feel free to re-purpose this issue with the proposed fix (once https://github.com/openjdk/jdk/pull/3361 is integrated), and with a test case that parses an incomplete XML document in `src/utils/IdealGraphVisualizer/Data/src/test/java/com/sun/hotspot/igv/data/serialization/ParserTest.java`. Please make sure to test in on different JDK versions (I suggest JDK 8, 11, and 15). ------------- PR: https://git.openjdk.java.net/jdk/pull/3071 From jbhateja at openjdk.java.net Mon Apr 12 12:17:48 2021 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Mon, 12 Apr 2021 12:17:48 GMT Subject: Integrated: 8264954: unified handling for VectorMask object re-materialization during de-optimization In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 07:25:11 GMT, Jatin Bhateja wrote: > Following flow describes object reconstruction for de-optimization:- > 1) PhaseVector::scalarize_vbox_node() creates SafePointScalarObjectNode to captures the box type information, also it connects to node holding the boxed value. > 2) During code emit phase (PhaseOutput) C2 process above information to dumps ObjectValue holding the box information and LocationValue to holding the value information into ScopeDescriptor corresponding to Safepoint PC. > 3) De-optimization blobs dump the value held in registers to the stack locations using RegisterSave::save_live_registers() and a mapping b/w register and its stack location is added to RegisterMap. > 4) During de-optimization, compiled frame objects are re-allocated using identity information held in ObjectValue and their fields are initialized using values held in the stack locations accessed through register-stack mappings. > > By inserting a VectorStoreMaskNode before stitching the mask holding node to Safepoint we make sure that value held in opmask/vector register is transferred to a byte vector. 
Thus rest of the flow works as it is, stack location will hold the value in the form of a byte array irrespective of the box shape. > > tier1-tier3 regressions are clean with UseAVX=2/3. This pull request has now been integrated. Changeset: f71be8b5 Author: Jatin Bhateja URL: https://git.openjdk.java.net/jdk/commit/f71be8b5 Stats: 66 lines in 3 files changed: 22 ins; 22 del; 22 mod 8264954: unified handling for VectorMask object re-materialization during de-optimization Reviewed-by: vlivanov ------------- PR: https://git.openjdk.java.net/jdk/pull/3408 From jiefu at tencent.com Mon Apr 12 13:39:16 2021 From: jiefu at tencent.com (jiefu(傅杰)) Date: Mon, 12 Apr 2021 13:39:16 +0000 Subject: Conflict API definitions of Math.pow(x, 0.5) and Math.sqrt(x) for x={-0.0, Double.NEGATIVE_INFINITY}(Internet mail) In-Reply-To: <05560ff3-7ea3-6e4d-ff0f-b25358b7f7b7@gmail.com> References: <72184DF4-C63B-488F-8490-074247292022@tencent.com> <05560ff3-7ea3-6e4d-ff0f-b25358b7f7b7@gmail.com> Message-ID: Hi Raffaello, Thanks for your excellent analysis. I agree with you now. And I'll close the optimization PR [1] tomorrow if there are no objections. Thanks. Best regards, Jie [1] https://github.com/openjdk/jdk/pull/3404/ On 2021/4/12, 9:06 PM, "Raffaello Giulietti" wrote: Hi Jie, I don't think that changing the spec of Math.pow() to be misaligned with IEEE 754 would be a wise option. IEEE is much more pervasive than Java. There are many aspects in IEEE that might be seen as questionable, but at least it is a widely adopted standard. AFAIU, the only reason you would like to "optimize" the special case of y = 0.5 in pow(x, y) to return sqrt(x) is for performance, more accuracy and some kind of consistency. But then, why not a special case for y = 0.25 as sqrt(sqrt(x))? And what about y = 0.75? Should this be translated to sqrt(sqrt(pow(x, 3)))? What about y = 1.0 / 3.0? Should this become cbrt(x)? And why not consider y = 2.0 / 3.0 in a special rule: cbrt(x * x)?
You see, the special cases can quickly become unmanageable. Also, special rules would produce results which are "discontinuous" with nearby exponents, like y = 0.5000000000000001. That's probably why IEEE doesn't propose translation rules for finite numerical exponents that are not integers, except when x is a special value. Greetings Raffaello On 2021-04-12 13:44, jiefu(傅杰) wrote: > Hi Andrew H, Andrew D, and Raffaello, > > Thank you all for your kind reply and helpful comments. > > Now I got where the rules come from. > But I don't think the IEEE standards are reasonable to specify conflicting rules. > Maybe, these computations should be open to be implementation dependent. > > (If it's possible) I really hope the special cases of Math.pow(x, 0.5) can be aligned with Math.sqrt(x) in Java. > We already allow some plausible behaviors to be different with the IEEE recommendations for some special cases, right? > And in that case, we can replace pow(x, 0.5) with sqrt(x) safely. > > Thanks. > Best regards, > Jie > > > On 2021/4/12, 6:40 PM, "Raffaello Giulietti" wrote: > > Hi Jie, > > the behavior you report is the one specified by the standard IEEE 754. > Java follows this standard as closely as it can. > > The standard says that > * squareRoot(-0) = -0 > * squareRoot(-∞) = NaN > > Also, the standard has a long list of special cases for pow(x, y), > among them: > * pow(±0, y) is +0 for finite y > 0 and not an odd integer > * pow(-∞, y) is +∞ for finite y > 0 and not an odd integer > > Thus, the conflicts you observe originate in following the standard, not > by special Java rules. > > Unfortunately, the IEEE standard does not explain the reasons for the > special rules. Some are obvious, some are not.
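Editorial note: the special cases quoted above can be checked directly. The following is a minimal sketch, not part of the original thread; the class name `PowSqrtSpecialCases` and the helper `sameResult` are hypothetical names chosen for illustration, while `Math.pow`, `Math.sqrt` and `Double.doubleToLongBits` are the standard `java.lang` methods. It compares results by bit pattern because `==` does not distinguish `0.0` from `-0.0`.

```java
// Illustrative sketch: the class and method names are hypothetical.
// Math.pow, Math.sqrt and Double.doubleToLongBits are standard java.lang APIs.
final class PowSqrtSpecialCases {

    // Returns true when pow(x, 0.5) and sqrt(x) yield the same double,
    // compared by bit pattern so that 0.0 and -0.0 count as different
    // (the == operator treats them as equal) and NaN compares equal to NaN.
    static boolean sameResult(double x) {
        return Double.doubleToLongBits(Math.pow(x, 0.5))
                == Double.doubleToLongBits(Math.sqrt(x));
    }

    public static void main(String[] args) {
        double[] inputs = {
            -0.0, Double.NEGATIVE_INFINITY, 0.0, Double.POSITIVE_INFINITY
        };
        for (double x : inputs) {
            System.out.println("x = " + x
                    + ": pow(x, 0.5) = " + Math.pow(x, 0.5)
                    + ", sqrt(x) = " + Math.sqrt(x)
                    + ", same = " + sameResult(x));
        }
    }
}
```

Under the rules quoted in this thread, only the inputs `-0.0` and `Double.NEGATIVE_INFINITY` should report a mismatch; for `0.0` and `Double.POSITIVE_INFINITY` both methods agree.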
> > > HTH > Raffaello > > > > Hi all, > > > > I found Math.pow(x, 0.5) and Math.sqrt(x) would compute different values as the following: > > ``` > > Math.pow(-0.0, 0.5) = 0.0 > > Math.sqrt(-0.0) = -0.0 > > > > Math.pow(Double.NEGATIVE_INFINITY, 0.5) = Infinity > > Math.sqrt(Double.NEGATIVE_INFINITY) = NaN > > ``` > > > > The reason is that both of pow and sqrt have special rules for these computations. > > For example, this rule [1] specifies Math.pow(-0.0, 0.5) must be 0.0. > > And this one [2] specifies Math.sqrt(-0.0) must be -0.0. > > And we do have rules for Math.pow(Double.NEGATIVE_INFINITY, 0.5) = Infinity and Math.sqrt(Double.NEGATIVE_INFINITY) = NaN too. > > > > I think most people will be confused by these rules because from the view of mathematics, Math.pow(x, 0.5) should be equal to Math.sqrt(x). > > > > So why Java creates conflict special rules for them? > > Is it possible to let Math.pow(-0.0, 0.5) = -0.0 and Math.pow(Double.NEGATIVE_INFINITY, 0.5) = NaN also be allowed? > > > > I came across this problem when I was trying to optimize pow(x, 0.5) with sqrt(x). > > If pow(x, 0.5)'s two special rules can be aligned with sqrt(x), then pow(x, 0.5)'s performance can be improved by 7x~14x [3]. > > > > Thanks. > > Best regards, > > Jie > > > From vlivanov at openjdk.java.net Mon Apr 12 15:04:39 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Mon, 12 Apr 2021 15:04:39 GMT Subject: RFR: 8264954: unified handling for VectorMask object re-materialization during de-optimization [v2] In-Reply-To: References: <4jMKSSVNKx5Fh7hek0pT5MMIYAnTjB7y7faOh_L60Pw=.2ce391ed-b29c-4a05-a412-bb3935126b2f@github.com> Message-ID: On Mon, 12 Apr 2021 08:16:13 GMT, Jatin Bhateja wrote: > In addition to bt != T_BOOLEAN check, we also need to check if type is a vectormask (explicit type added for mask generating nodes on targets supporting predicate registers). 
You check here that the mask is in canonical representation (which is a vector of booleans) before applying the canonicalizing conversion. Alternatively, you could check that the mask is in "native"/platform-dependent representation (`vec_value` has `TypeVectMask` type). So, my reading of your response is that `bt != T_BOOLEAN` check is not enough to reliably distinguish between the representations: a mask for vector of booleans (in native representation) will have the same type as the vector of booleans produced by the canonicalizing cast of a mask value. I agree with that. But until `TypeVectMask` is used uniformly and independently of predicate register support, `TypeVect` vs `TypeVectMask` doesn't cut the problem as well. And adding `bt != T_BOOLEAN` doesn't help. So, I'm fine with leaving the Opcode check for now. In the longer term, we need to come up with a uniform representation in type system for masks in native format to be able to reliably distinguish between masks and vectors in cross-platform manner. ------------- PR: https://git.openjdk.java.net/jdk/pull/3408 From vlivanov at openjdk.java.net Mon Apr 12 15:12:30 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Mon, 12 Apr 2021 15:12:30 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v3] In-Reply-To: References: Message-ID: On Mon, 12 Apr 2021 07:53:36 GMT, Xiaohong Gong wrote: >> Good idea, I will try this way. Thanks! > > I met the bad AD file issue when I remove the changes in AD files and overriding `Node::ideal_reg()`. I guess if a node is not used as an input of other node, this can work well? For the `VectorMaskCastNode`, it must be an input for some other nodes. If just disable the matching of this node, how does its usage find the right input code? Yeah, `ideal_reg() { return 0; }` won't work here. Sorry for the misleading suggestion. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From erikj at openjdk.java.net Mon Apr 12 16:21:50 2021 From: erikj at openjdk.java.net (Erik Joelsson) Date: Mon, 12 Apr 2021 16:21:50 GMT Subject: RFR: 8264806: Remove the experimental JIT compiler [v2] In-Reply-To: <_PS0KHvkB_l9YrKjZ7wLAiKb-SK761YFub4XU5mrBRc=.c32572f6-063f-4503-a20f-aa6b9115f808@github.com> References: <_PS0KHvkB_l9YrKjZ7wLAiKb-SK761YFub4XU5mrBRc=.c32572f6-063f-4503-a20f-aa6b9115f808@github.com> Message-ID: On Fri, 9 Apr 2021 22:26:40 GMT, Vladimir Kozlov wrote: >> As part of [JEP 410](http://openjdk.java.net/jeps/410) remove code related to Java-based JIT compiler (Graal) from JDK: >> >> - `jdk.internal.vm.compiler` - the Graal compiler >> - `jdk.internal.vm.compiler.management` - Graal's `MBean` >> - `test/hotspot/jtreg/compiler/graalunit` - Graal's unit tests >> >> Remove Graal related code in makefiles. >> >> Note, next two `module-info.java` files are preserved so that the JVMCI module `jdk.internal.vm.ci` continues to build: >> >> src/jdk.internal.vm.compiler/share/classes/module-info.java >> src/jdk.internal.vm.compiler.management/share/classes/module-info.java >> >> >> @AlanBateman suggested that we can avoid it by using Module API to export packages at runtime. It requires changes in GraalVM's JVMCI too so I will file followup RFE to implement it. >> >> Tested hs-tier1-4 > > Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains three additional commits since the last revision: > > - Restore Graal Builder image makefile > - Merge latest changes from branch 'JDK-8264805' into JDK-8264806 > - 8264806: Remove the experimental JIT compiler make/common/Modules.gmk line 68: > 66: > 67: # Filter out Graal specific modules > 68: MODULES_FILTER += jdk.internal.vm.compiler If we are unconditionally filtering out these modules, then why leave the module-info.java files in at all? ------------- PR: https://git.openjdk.java.net/jdk/pull/3421 From iklam at openjdk.java.net Mon Apr 12 16:52:33 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 12 Apr 2021 16:52:33 GMT Subject: RFR: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines [v4] In-Reply-To: <_q9nVMOqIePMWqifBZo_Cj9Vsal63l2BLN8X7x5S8iw=.abdcc58f-877d-4e36-b26e-3699cb05eeda@github.com> References: <_q9nVMOqIePMWqifBZo_Cj9Vsal63l2BLN8X7x5S8iw=.abdcc58f-877d-4e36-b26e-3699cb05eeda@github.com> Message-ID: <5dhg50Ktvcm23PgJR3WejtOxkozTDxNLjAn07pMzPAY=.7d29d3d1-8a14-403f-9cbe-acd1bfe05e11@github.com> On Fri, 9 Apr 2021 05:08:37 GMT, David Holmes wrote: >> The existing JRT_ENTRY (and related) macros require the function to which they are applied to declare a parameter "JavaThread* thread" which represents the current thread. These functions are all implicitly "traps" functions as they can result in exceptions, but they are not declared with TRAPS because the only caller of these functions is the runtime itself (via call_VM) and no callers need to be aware to use CHECK; further they need a JavaThread. So the macro declares the THREAD variable for use with other exception-producing functions and assigns it from "thread". >> >> The majority of this change replaces the parameter name "thread" with "current" so that it is clear that we are always dealing with the current thread. This affects the entry functions as well as the functions called therefrom. 
>> >> We can then also replace the use of "THREAD" with "current", in contexts that are not related to exception processing. >> >> Some methods called by entry functions were declared to have both a "thread" parameter and a "TRAPS" parameter - with nothing to tell you these are always the same, current, thread. So the "thread" parameter is removed and replaced with a local variable "current" obtained from THREAD->as_Java_thread(). >> >> Some missing CHECK_ uses were added. >> >> Testing: >> - tiers 1-3 >> >> Thanks, >> David > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Fix search&replace mistake LGTM ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3394 From vlivanov at openjdk.java.net Mon Apr 12 16:52:47 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Mon, 12 Apr 2021 16:52:47 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v3] In-Reply-To: References: Message-ID: On Mon, 12 Apr 2021 07:00:58 GMT, Xiaohong Gong wrote: >> src/hotspot/share/opto/vectornode.cpp line 1237: >> >>> 1235: if (in_vt->length_in_bytes() == out_vt->length_in_bytes() && >>> 1236: Matcher::match_rule_supported(Op_VectorMaskCast)) { >>> 1237: // VectorUnbox (VectorBox vmask) ==> VectorMaskCast (vmask) >> >> It's better to implement it as a 2-step transformation and place it in `VectorLoadMaskNode::Ideal()`: >> >> VectorUnbox (VectorBox vmask) ==> VectorLoadMask (VectorStoreMask vmask) => VectorMaskCast (vmask) > > Thanks for your comments. Yes, theoretically it's better to place it in `VectorLoadMaskNode::Ideal()`. Unfortunately, we met an issue that is related to optimization for `VectorStoreMask`. 
Considering the following case:
>
>      LoadVector                      LoadVector
>          |                               |
>   VectorLoadMask (double)         VectorLoadMask (double)
>          |                               |
>     VectorUnbox (long)      ==>   VectorStoreMask (double)
>                                          |
>                                   VectorLoadMask (long)
>
> This case loads the masking values for a double type, and does a bitwise `and` operation. Since the type is mismatched, the compiler will try to do `VectorUnbox (VectorBox vmask) ==> VectorLoadMask (VectorStoreMask vmask)`. However, since there is the transformation `VectorStoreMask (VectorLoadMask value) ==> value`, the above `VectorStoreMaskNode` will be optimized out. The final graph looks like:
>
>      LoadVector                            LoadVector
>          |                                  /      \
>   VectorLoadMask (double)                  /        \
>          |                  ==>  VectorLoadMask (double)  \
>   VectorStoreMask (double)                        VectorLoadMask (long)
>          |
>   VectorLoadMask (long)
>
> Since the two `VectorLoadMaskNode` have different element type, the GVN cannot optimize out one. So finally there will be two similar `VectorLoadMaskNode`s. That's also why I override the `cmp/hash` for `VectorLoadMaskNode` in the first version. > > So I prefer to add `VectorUnbox (VectorBox vmask) ==> VectorMaskCast (vmask)` directly.

Ok, so you face a transformation ordering problem here. By working on `VectorUnbox (VectorBox vmask)` you effectively delay `VectorStoreMask (VectorLoadMask vmask) => vmask` transformation. As an alternative you could: (1) check for `VectorLoadMask` users before applying `VectorStoreMask (VectorLoadMask vmask) => vmask`; (2) nest adjacent casts: VectorLoadMask #double (1 LoadVector) VectorLoadMask #long (1 LoadVector) ==> VectorMaskCast #long (VectorLoadMask #double (1 LoadVector) The latter looks more powerful (and hence preferable), but I'm fine with what you have right now. (It can be enhanced later.) Please, leave a comment describing the motivation for doing the transformation directly on `VectorUnbox (VectorBox ...)`.
------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From kvn at openjdk.java.net Mon Apr 12 17:22:00 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 12 Apr 2021 17:22:00 GMT Subject: RFR: 8264806: Remove the experimental JIT compiler [v2] In-Reply-To: References: <_PS0KHvkB_l9YrKjZ7wLAiKb-SK761YFub4XU5mrBRc=.c32572f6-063f-4503-a20f-aa6b9115f808@github.com> Message-ID: On Mon, 12 Apr 2021 16:18:32 GMT, Erik Joelsson wrote: >> Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Restore Graal Builder image makefile >> - Merge latest changes from branch 'JDK-8264805' into JDK-8264806 >> - 8264806: Remove the experimental JIT compiler > > make/common/Modules.gmk line 68: > >> 66: >> 67: # Filter out Graal specific modules >> 68: MODULES_FILTER += jdk.internal.vm.compiler > > If we are unconditionally filtering out these modules, then why leave the module-info.java files in at all? We filter out because we can't build Graal anymore. But we need these module-info.java files because JVMCI's module-info.java references them: https://github.com/openjdk/jdk/blob/master/src/jdk.internal.vm.ci/share/classes/module-info.java#L26 Otherwise we can't build JVMCI which we continue to support. 
I filed followup RFE to implement Alan's suggestion to use Module API which will allow to remove these files later: https://bugs.openjdk.java.net/browse/JDK-8265091 ------------- PR: https://git.openjdk.java.net/jdk/pull/3421 From joe.darcy at oracle.com Mon Apr 12 18:20:28 2021 From: joe.darcy at oracle.com (Joe Darcy) Date: Mon, 12 Apr 2021 11:20:28 -0700 Subject: Conflict API definitions of Math.pow(x, 0.5) and Math.sqrt(x) for x={-0.0, Double.NEGATIVE_INFINITY}(Internet mail) In-Reply-To: References: <72184DF4-C63B-488F-8490-074247292022@tencent.com> <05560ff3-7ea3-6e4d-ff0f-b25358b7f7b7@gmail.com> Message-ID: Hello, Adding some additional context, more recent versions of the IEEE 754 standard have given explicit recommendations for math library function definitions, including pow. The Java definitions of those methods long predate the IEEE 754 coverage and there are a small number of differences, including in the pow special cases. These differences are now described more explicitly in the JDK 17 specs as of JDK-8240632: Note differences between IEEE 754-2019 math lib special cases and java.lang.Math The difference in question for pow is: > ???? * @apiNote > ???? * The special cases definitions of this method differ from the > ???? * special case definitions of the IEEE 754 recommended {@code > ???? * pow} operation for ±{@code 1.0} raised to an infinite > ???? * power. This method treats such cases as indeterminate and > ???? * specifies a NaN is returned. The IEEE 754 specification treats > ???? * the infinite power as a large integer (large-magnitude > ???? * floating-point numbers are numerically integers, specifically > ???? * even integers) and therefore specifies {@code 1.0} be returned. There are no plans to align the Java definition of pow with the IEEE 754 definition in these few cases. Note that the Java library implementation of pow does delegate to sqrt while respecting the relevant special cases: > ??????????? 
} else if (y == 0.5) { > if (x >= -Double.MAX_VALUE) // Handle x == -infinity later > return Math.sqrt(x + 0.0); // Add 0.0 to properly handle x == -0.0 https://github.com/openjdk/jdk/blob/d84a7e55be40eae57b6c322694d55661a5053a55/src/java.base/share/classes/java/lang/FdLibm.java#L366 Mathematically, the sqrt function is about as pleasant a function as you can be asked to approximate in floating-point arithmetic. The function is smooth, doesn't overflow or underflow, and has a simple Newton iteration that can be expressed in basic arithmetic operations. The pow function doesn't enjoy these properties and has a larger design space for what a reasonable floating-point approximation could be. HTH, -Joe On 4/12/2021 6:39 AM, jiefu(傅杰) wrote: > Hi Raffaello, > > Thanks for your excellent analysis. > I agree with you now. > And I'll close the optimization PR [1] tomorrow if there are no objections. > > Thanks. > Best regards, > Jie > > [1] https://github.com/openjdk/jdk/pull/3404/ > > > On 2021/4/12, 9:06 PM, "Raffaello Giulietti" wrote: > > Hi Jie, > > I don't think that changing the spec of Math.pow() to be misaligned with > IEEE 754 would be a wise option. IEEE is much more pervasive than Java. > There are many aspects in IEEE that might be seen as questionable, but > at least it is a widely adopted standard. > > AFAIU, the only reason you would like to "optimize" the special case of > y = 0.5 in pow(x, y) to return sqrt(x) is for performance, more accuracy > and some kind of consistency. > > But then, why not a special case for y = 0.25 as sqrt(sqrt(x))? > And what about y = 0.75? Should this be translated to sqrt(sqrt(pow(x, 3)))? > What about y = 1.0 / 3.0? Should this become cbrt(x)? > And why not consider y = 2.0 / 3.0 in a special rule: cbrt(x * x)? > > You see, the special cases can quickly become unmanageable.
Also, > special rules would produce results which are "discontinuous" with > nearby exponents, like y = 0.5000000000000001. > > That's probably why IEEE doesn't propose translation rules for finite > numerical exponents that are not integers, except when x is a special value. > > > Greetings > Raffaello > > > > On 2021-04-12 13:44, jiefu(傅杰) wrote: > > Hi Andrew H, Andrew D, and Raffaello, > > > > Thank you all for your kind replies and helpful comments. > > > > Now I see where the rules come from. > > But I don't think it is reasonable for the IEEE standard to specify conflicting rules. > > Maybe these computations should be left implementation-dependent. > > > > (If it's possible) I really hope the special cases of Math.pow(x, 0.5) can be aligned with Math.sqrt(x) in Java. > > We already allow some plausible behaviors to differ from the IEEE recommendations for some special cases, right? > > And in that case, we can replace pow(x, 0.5) with sqrt(x) safely. > > > > Thanks. > > Best regards, > > Jie > > > > > > On 2021/4/12, 6:40 PM, "Raffaello Giulietti" wrote: > > > > Hi Jie, > > > > the behavior you report is the one specified by the standard IEEE 754. > > Java follows this standard as closely as it can. > > > > The standard says that > > * squareRoot(-0) = -0 > > * squareRoot(-∞) = NaN > > > > Also, the standard has a long list of special cases for pow(x, y), > > among them: > > * pow(±0, y) is +0 for finite y > 0 and not an odd integer > > * pow(-∞, y) is +∞ for finite y > 0 and not an odd integer > > > > Thus, the conflicts you observe originate in following the standard, not > > in special Java rules. > > > > Unfortunately, the IEEE standard does not explain the reasons for the > > special rules. Some are obvious, some are not.
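The special cases listed above can be checked directly from Java. What follows is a minimal sketch rather than code from the JDK: the class name `PowSqrtSpecialCases` and the helpers `powHalf` and `sameBits` are made up for illustration. `powHalf` shows one possible way to compute pow(x, 0.5) through sqrt while keeping the two results that Math.pow specifies, borrowing the `x + 0.0` idea from the FdLibm snippet quoted earlier in this thread.

```java
public class PowSqrtSpecialCases {

    /**
     * Hypothetical helper: pow(x, 0.5) computed through sqrt while keeping
     * the results that Math.pow specifies for -0.0 and -Infinity.
     */
    public static double powHalf(double x) {
        if (x == Double.NEGATIVE_INFINITY) {
            // sqrt(-Infinity) is NaN, but pow(-Infinity, 0.5) is +Infinity.
            return Double.POSITIVE_INFINITY;
        }
        // -0.0 + 0.0 evaluates to +0.0, so sqrt returns +0.0 instead of -0.0.
        return Math.sqrt(x + 0.0);
    }

    /** True if a and b have the same bit pattern (tells -0.0 apart from 0.0). */
    public static boolean sameBits(double a, double b) {
        return Double.doubleToRawLongBits(a) == Double.doubleToRawLongBits(b);
    }

    public static void main(String[] args) {
        double[] inputs = { -0.0, Double.NEGATIVE_INFINITY };
        for (double x : inputs) {
            System.out.println("x = " + x
                    + "  pow(x, 0.5) = " + Math.pow(x, 0.5)
                    + "  sqrt(x) = " + Math.sqrt(x)
                    + "  powHalf(x) = " + powHalf(x));
        }
    }
}
```

The bit-pattern comparison is used because `-0.0 == 0.0` evaluates to true in Java, so `==` alone cannot show the sign difference between the two zero results.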
> > > > > > HTH > > Raffaello > > > > > > > Hi all, > > > > > > I found that Math.pow(x, 0.5) and Math.sqrt(x) compute different values, as follows: > > > ``` > > > Math.pow(-0.0, 0.5) = 0.0 > > > Math.sqrt(-0.0) = -0.0 > > > > > > Math.pow(Double.NEGATIVE_INFINITY, 0.5) = Infinity > > > Math.sqrt(Double.NEGATIVE_INFINITY) = NaN > > > ``` > > > > > > The reason is that both pow and sqrt have special rules for these computations. > > > For example, this rule [1] specifies Math.pow(-0.0, 0.5) must be 0.0. > > > And this one [2] specifies Math.sqrt(-0.0) must be -0.0. > > > And we do have rules for Math.pow(Double.NEGATIVE_INFINITY, 0.5) = Infinity and Math.sqrt(Double.NEGATIVE_INFINITY) = NaN too. > > > > > > I think most people will be confused by these rules because, from a mathematical point of view, Math.pow(x, 0.5) should be equal to Math.sqrt(x). > > > > > > So why does Java define conflicting special rules for them? > > > Is it possible to let Math.pow(-0.0, 0.5) = -0.0 and Math.pow(Double.NEGATIVE_INFINITY, 0.5) = NaN also be allowed? > > > > > > I came across this problem when I was trying to optimize pow(x, 0.5) with sqrt(x). > > > If pow(x, 0.5)'s two special rules can be aligned with sqrt(x), then pow(x, 0.5)'s performance can be improved by 7x~14x [3]. > > > > > > Thanks. > > > Best regards, > > > Jie > > > > > > > > > From dcubed at openjdk.java.net Mon Apr 12 20:57:50 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 12 Apr 2021 20:57:50 GMT Subject: RFR: 8264954: unified handling for VectorMask object re-materialization during de-optimization [v3] In-Reply-To: References: Message-ID: On Mon, 12 Apr 2021 09:33:14 GMT, Jatin Bhateja wrote: >> The following flow describes object reconstruction for de-optimization: >> 1) PhaseVector::scalarize_vbox_node() creates a SafePointScalarObjectNode to capture the box type information; it also connects it to the node holding the boxed value.
>> 2) During the code emit phase (PhaseOutput), C2 processes the above information to dump an ObjectValue holding the box information and a LocationValue holding the value information into the ScopeDescriptor corresponding to the Safepoint PC. >> 3) De-optimization blobs dump the values held in registers to the stack locations using RegisterSave::save_live_registers(), and a mapping between each register and its stack location is added to the RegisterMap. >> 4) During de-optimization, compiled frame objects are re-allocated using identity information held in ObjectValue and their fields are initialized using values held in the stack locations accessed through register-stack mappings. >> >> By inserting a VectorStoreMaskNode before stitching the mask-holding node to the Safepoint, we make sure that the value held in the opmask/vector register is transferred to a byte vector. Thus the rest of the flow works as is: the stack location will hold the value in the form of a byte array irrespective of the box shape. >> >> tier1-tier3 regressions are clean with UseAVX=2/3. > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > 8264954: Incorporating closing comments. ------------- PR: https://git.openjdk.java.net/jdk/pull/3408 From erikj at openjdk.java.net Mon Apr 12 20:58:35 2021 From: erikj at openjdk.java.net (Erik Joelsson) Date: Mon, 12 Apr 2021 20:58:35 GMT Subject: RFR: 8264806: Remove the experimental JIT compiler [v2] In-Reply-To: References: <_PS0KHvkB_l9YrKjZ7wLAiKb-SK761YFub4XU5mrBRc=.c32572f6-063f-4503-a20f-aa6b9115f808@github.com> Message-ID: <7HS0ES8bIxSFNXrIiGQRIgm5w30UQqGIHP7TmfWNDAg=.3a2af723-79ee-4ce8-9e1c-3873b09ed9c0@github.com> On Mon, 12 Apr 2021 17:18:36 GMT, Vladimir Kozlov wrote: >> make/common/Modules.gmk line 68: >> >>> 66: >>> 67: # Filter out Graal specific modules >>> 68: MODULES_FILTER += jdk.internal.vm.compiler >> >> If we are unconditionally filtering out these modules, then why leave the module-info.java files in at all?
> > We filter out because we can't build Graal anymore. But we need these module-info.java files because JVMCI's module-info.java references them: > https://github.com/openjdk/jdk/blob/master/src/jdk.internal.vm.ci/share/classes/module-info.java#L26 > > Otherwise we can't build JVMCI which we continue to support. > > I filed a follow-up RFE to implement Alan's suggestion to use the Module API, which will allow these files to be removed later: > https://bugs.openjdk.java.net/browse/JDK-8265091 Right, I thought I saw something about modules that Alan commented on, but couldn't find it. All good then. ------------- PR: https://git.openjdk.java.net/jdk/pull/3421 From erikj at openjdk.java.net Mon Apr 12 20:58:32 2021 From: erikj at openjdk.java.net (Erik Joelsson) Date: Mon, 12 Apr 2021 20:58:32 GMT Subject: RFR: 8264806: Remove the experimental JIT compiler [v2] In-Reply-To: <_PS0KHvkB_l9YrKjZ7wLAiKb-SK761YFub4XU5mrBRc=.c32572f6-063f-4503-a20f-aa6b9115f808@github.com> References: <_PS0KHvkB_l9YrKjZ7wLAiKb-SK761YFub4XU5mrBRc=.c32572f6-063f-4503-a20f-aa6b9115f808@github.com> Message-ID: On Fri, 9 Apr 2021 22:26:40 GMT, Vladimir Kozlov wrote: >> As part of [JEP 410](http://openjdk.java.net/jeps/410) remove code related to Java-based JIT compiler (Graal) from JDK: >> >> - `jdk.internal.vm.compiler` - the Graal compiler >> - `jdk.internal.vm.compiler.management` - Graal's `MBean` >> - `test/hotspot/jtreg/compiler/graalunit` - Graal's unit tests >> >> Remove Graal related code in makefiles. >> >> Note, next two `module-info.java` files are preserved so that the JVMCI module `jdk.internal.vm.ci` continues to build: >> >> src/jdk.internal.vm.compiler/share/classes/module-info.java >> src/jdk.internal.vm.compiler.management/share/classes/module-info.java >> >> >> @AlanBateman suggested that we can avoid it by using Module API to export packages at runtime . It requires changes in GraalVM's JVMCI too so I will file followup RFE to implement it.
>> >> Tested hs-tier1-4 > > Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Restore Graal Builder image makefile > - Merge latest changes from branch 'JDK-8264805' into JDK-8264806 > - 8264806: Remove the experimental JIT compiler Build changes look good. ------------- Marked as reviewed by erikj (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3421 From kvn at openjdk.java.net Mon Apr 12 21:09:09 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 12 Apr 2021 21:09:09 GMT Subject: Integrated: 8265084: [BACKOUT] 8264954: unified handling for VectorMask object re-materialization during de-optimization Message-ID: JDK-8264954 changes caused failures on Aarch64 in Tier3 testing and needs to be backed out. Tested tier1 and ran jdk/incubator/vector/ tests on linux-x64 and linux-aarch64 machines. 
------------- Commit messages: - 8265084: [BACKOUT] 8264954: unified handling for VectorMask object re-materialization during de-optimization Changes: https://git.openjdk.java.net/jdk/pull/3440/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3440&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8265084 Stats: 66 lines in 3 files changed: 22 ins; 22 del; 22 mod Patch: https://git.openjdk.java.net/jdk/pull/3440.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3440/head:pull/3440 PR: https://git.openjdk.java.net/jdk/pull/3440 From dcubed at openjdk.java.net Mon Apr 12 21:09:10 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 12 Apr 2021 21:09:10 GMT Subject: Integrated: 8265084: [BACKOUT] 8264954: unified handling for VectorMask object re-materialization during de-optimization In-Reply-To: References: Message-ID: <4IBHq8yAHwOPlx3eckA29DFOpxTREHTG8yNRh0VteOc=.5b09d422-8fb8-4348-ae9c-17af1be83c8d@github.com> On Mon, 12 Apr 2021 18:46:53 GMT, Vladimir Kozlov wrote: > JDK-8264954 changes caused failures on Aarch64 in Tier3 testing and needs to be backed out. > > Tested tier1 and ran jdk/incubator/vector/ tests on linux-x64 and linux-aarch64 machines. This appears to be an accurate [BACKOUT] of the original patch. ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3440 From kvn at openjdk.java.net Mon Apr 12 21:09:11 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 12 Apr 2021 21:09:11 GMT Subject: Integrated: 8265084: [BACKOUT] 8264954: unified handling for VectorMask object re-materialization during de-optimization In-Reply-To: References: Message-ID: On Mon, 12 Apr 2021 18:46:53 GMT, Vladimir Kozlov wrote: > JDK-8264954 changes caused failures on Aarch64 in Tier3 testing and needs to be backed out. > > Tested tier1 and ran jdk/incubator/vector/ tests on linux-x64 and linux-aarch64 machines. This pull request has now been integrated. 
Changeset: 18bec9cf Author: Vladimir Kozlov URL: https://git.openjdk.java.net/jdk/commit/18bec9cf Stats: 66 lines in 3 files changed: 22 ins; 22 del; 22 mod 8265084: [BACKOUT] 8264954: unified handling for VectorMask object re-materialization during de-optimization Reviewed-by: dcubed ------------- PR: https://git.openjdk.java.net/jdk/pull/3440 From dholmes at openjdk.java.net Mon Apr 12 21:55:58 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 12 Apr 2021 21:55:58 GMT Subject: RFR: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines [v2] In-Reply-To: References: Message-ID: On Thu, 8 Apr 2021 13:00:03 GMT, Coleen Phillimore wrote: >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed CHECK on return statement. > > I think substituting "JavaThread* thread" for "JavaThread* current" is a good change and convention that makes the code more clear, so worth the dull code review and diffs. Thanks for the reviews @coleenp , @hseigel and @iklam . Can someone from compiler team please take a look and give the okay? (I know things are a bit busy at the moment.) Thanks, David ------------- PR: https://git.openjdk.java.net/jdk/pull/3394 From kvn at openjdk.java.net Mon Apr 12 22:10:06 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 12 Apr 2021 22:10:06 GMT Subject: RFR: 8264806: Remove the experimental JIT compiler [v3] In-Reply-To: References: Message-ID: > As part of [JEP 410](http://openjdk.java.net/jeps/410) remove code related to Java-based JIT compiler (Graal) from JDK: > > - `jdk.internal.vm.compiler` - the Graal compiler > - `jdk.internal.vm.compiler.management` - Graal's `MBean` > - `test/hotspot/jtreg/compiler/graalunit` - Graal's unit tests > > Remove Graal related code in makefiles.
> > Note, next two `module-info.java` files are preserved so that the JVMCI module `jdk.internal.vm.ci` continues to build: > > src/jdk.internal.vm.compiler/share/classes/module-info.java > src/jdk.internal.vm.compiler.management/share/classes/module-info.java > > > @AlanBateman suggested that we can avoid it by using Module API to export packages at runtime . It requires changes in GraalVM's JVMCI too so I will file followup RFE to implement it. > > Tested hs-tier1-4 Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Restore Compiler::isGraalEnabled() ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3421/files - new: https://git.openjdk.java.net/jdk/pull/3421/files/a246aaa6..9d6bd42c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3421&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3421&range=01-02 Stats: 25 lines in 1 file changed: 23 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3421.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3421/head:pull/3421 PR: https://git.openjdk.java.net/jdk/pull/3421 From kvn at openjdk.java.net Mon Apr 12 22:26:09 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 12 Apr 2021 22:26:09 GMT Subject: RFR: 8264806: Remove the experimental JIT compiler [v2] In-Reply-To: References: <_PS0KHvkB_l9YrKjZ7wLAiKb-SK761YFub4XU5mrBRc=.c32572f6-063f-4503-a20f-aa6b9115f808@github.com> Message-ID: On Sun, 11 Apr 2021 10:25:47 GMT, Doug Simon wrote: >> Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: >> >> - Restore Graal Builder image makefile >> - Merge latest changes from branch 'JDK-8264805' into JDK-8264806 >> - 8264806: Remove the experimental JIT compiler > > We would definitely like to be able to continue testing of GraalVM with the JDK set of jtreg tests. So keeping `Compiler::isGraalEnabled()` working like it does today is important. @dougxc I restored Compiler::isGraalEnabled(). ------------- PR: https://git.openjdk.java.net/jdk/pull/3421 From jiefu at tencent.com Tue Apr 13 01:26:02 2021 From: jiefu at tencent.com (jiefu(傅杰)) Date: Tue, 13 Apr 2021 01:26:02 +0000 Subject: Conflict API definitions of Math.pow(x, 0.5) and Math.sqrt(x) for x={-0.0, Double.NEGATIVE_INFINITY}(Internet mail) In-Reply-To: References: <72184DF4-C63B-488F-8490-074247292022@tencent.com> <05560ff3-7ea3-6e4d-ff0f-b25358b7f7b7@gmail.com> Message-ID: <0545E3C5-7216-4F39-AF95-D8854CA6B84E@tencent.com> Hi Joe, Thanks for sharing. Very glad to know that the Java library implementation of pow does delegate to sqrt while respecting the relevant special cases. This implementation [1] sets a good example for us, which means it's safe to replace pow(x, 0.5) with sqrt(x) for all x > 0.0. Thanks. Best regards, Jie [1] https://github.com/openjdk/jdk/blob/d84a7e55be40eae57b6c322694d55661a5053a55/src/java.base/share/classes/java/lang/FdLibm.java#L366 On 2021/4/13, 2:21 AM, "Joe Darcy" wrote: Hello, Adding some additional context, more recent versions of the IEEE 754 standard have given explicit recommendations for math library function definitions, including pow. The Java definitions of those methods long predate the IEEE 754 coverage and there are a small number of differences, including in the pow special cases.
These differences are now described more explicitly in the JDK 17 specs as of JDK-8240632: Note differences between IEEE 754-2019 math lib special cases and java.lang.Math The difference in question for pow is: > * @apiNote > * The special cases definitions of this method differ from the > * special case definitions of the IEEE 754 recommended {@code > * pow} operation for ±{@code 1.0} raised to an infinite > * power. This method treats such cases as indeterminate and > * specifies a NaN is returned. The IEEE 754 specification treats > * the infinite power as a large integer (large-magnitude > * floating-point numbers are numerically integers, specifically > * even integers) and therefore specifies {@code 1.0} be returned. There are no plans to align the Java definition of pow with the IEEE 754 definition in these few cases. Note that the Java library implementation of pow does delegate to sqrt while respecting the relevant special cases: > } else if (y == 0.5) { > if (x >= -Double.MAX_VALUE) // Handle x == -infinity later > return Math.sqrt(x + 0.0); // Add 0.0 to properly handle x == -0.0 https://github.com/openjdk/jdk/blob/d84a7e55be40eae57b6c322694d55661a5053a55/src/java.base/share/classes/java/lang/FdLibm.java#L366 Mathematically, the sqrt function is about as pleasant a function as you can be asked to approximate in floating-point arithmetic. The function is smooth, doesn't overflow or underflow, and has a simple Newton iteration that can be expressed in basic arithmetic operations. The pow function doesn't enjoy these properties and has a larger design space for what a reasonable floating-point approximation could be. HTH, -Joe On 4/12/2021 6:39 AM, jiefu(傅杰) wrote: > Hi Raffaello, > > Thanks for your excellent analysis. > I agree with you now. > And I'll close the optimization PR [1] tomorrow if there are no objections. > > Thanks.
> Best regards, > Jie > > [1] https://github.com/openjdk/jdk/pull/3404/ > > > On 2021/4/12, 9:06 PM, "Raffaello Giulietti" wrote: > > Hi Jie, > > I don't think that changing the spec of Math.pow() to be misaligned with > IEEE 754 would be a wise option. IEEE is much more pervasive than Java. > There are many aspects in IEEE that might be seen as questionable, but > at least it is a widely adopted standard. > > AFAIU, the only reason you would like to "optimize" the special case of > y = 0.5 in pow(x, y) to return sqrt(x) is for performance, more accuracy > and some kind of consistency. > > But then, why not a special case for y = 0.25 as sqrt(sqrt(x))? > And what about y = 0.75? Should this be translated to sqrt(sqrt(pow(x, 3)))? > What about y = 1.0 / 3.0? Should this become cbrt(x)? > And why not consider y = 2.0 / 3.0 in a special rule: cbrt(x * x)? > > You see, the special cases can quickly become unmanageable. Also, > special rules would produce results which are "discontinuous" with > nearby exponents, like y = 0.5000000000000001. > > That's probably why IEEE doesn't propose translation rules for finite > numerical exponents that are not integers, except when x is a special value. > > > Greetings > Raffaello > > > > On 2021-04-12 13:44, jiefu(傅杰) wrote: > > Hi Andrew H, Andrew D, and Raffaello, > > > > Thank you all for your kind replies and helpful comments. > > > > Now I see where the rules come from. > > But I don't think it is reasonable for the IEEE standard to specify conflicting rules. > > Maybe these computations should be left implementation-dependent. > > > > (If it's possible) I really hope the special cases of Math.pow(x, 0.5) can be aligned with Math.sqrt(x) in Java. > > We already allow some plausible behaviors to differ from the IEEE recommendations for some special cases, right? > > And in that case, we can replace pow(x, 0.5) with sqrt(x) safely. > > > > Thanks.
> > Best regards, > > Jie > > > > > > On 2021/4/12, 6:40 PM, "Raffaello Giulietti" wrote: > > > > Hi Jie, > > > > the behavior you report is the one specified by the standard IEEE 754. > > Java follows this standard as closely as it can. > > > > The standard says that > > * squareRoot(-0) = -0 > > * squareRoot(-∞) = NaN > > > > Also, the standard has a long list of special cases for pow(x, y), > > among them: > > * pow(±0, y) is +0 for finite y > 0 and not an odd integer > > * pow(-∞, y) is +∞ for finite y > 0 and not an odd integer > > > > Thus, the conflicts you observe originate in following the standard, not > > in special Java rules. > > > > Unfortunately, the IEEE standard does not explain the reasons for the > > special rules. Some are obvious, some are not. > > > > > > HTH > > Raffaello > > > > > > > Hi all, > > > > > > I found that Math.pow(x, 0.5) and Math.sqrt(x) compute different values, as follows: > > > ``` > > > Math.pow(-0.0, 0.5) = 0.0 > > > Math.sqrt(-0.0) = -0.0 > > > > > > Math.pow(Double.NEGATIVE_INFINITY, 0.5) = Infinity > > > Math.sqrt(Double.NEGATIVE_INFINITY) = NaN > > > ``` > > > > > > The reason is that both pow and sqrt have special rules for these computations. > > > For example, this rule [1] specifies Math.pow(-0.0, 0.5) must be 0.0. > > > And this one [2] specifies Math.sqrt(-0.0) must be -0.0. > > > And we do have rules for Math.pow(Double.NEGATIVE_INFINITY, 0.5) = Infinity and Math.sqrt(Double.NEGATIVE_INFINITY) = NaN too. > > > > > > I think most people will be confused by these rules because, from a mathematical point of view, Math.pow(x, 0.5) should be equal to Math.sqrt(x). > > > > > > So why does Java define conflicting special rules for them? > > > Is it possible to let Math.pow(-0.0, 0.5) = -0.0 and Math.pow(Double.NEGATIVE_INFINITY, 0.5) = NaN also be allowed? > > > > > > I came across this problem when I was trying to optimize pow(x, 0.5) with sqrt(x).
> > > If pow(x, 0.5)'s two special rules can be aligned with sqrt(x), then pow(x, 0.5)'s performance can be improved by 7x~14x [3]. > > > > > > Thanks. > > > Best regards, > > > Jie > > > > > > > > > From yyang at openjdk.java.net Tue Apr 13 03:16:58 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Tue, 13 Apr 2021 03:16:58 GMT Subject: RFR: 8263790: C2: new igv_print_immediately() for debugging purpose In-Reply-To: References: Message-ID: On Thu, 18 Mar 2021 10:05:12 GMT, Yi Yang wrote: > Add a new igv_print_immediately(), which prints the current method immediately. Unlike the other igv_print* methods, it creates a well-formed and complete ideal graph XML immediately, while the others complete their ideal graph XML only when the VM exits (i.e. in the destructor of `IdealGraphPrinter::_xx_printer`). If the VM crashes before VM exit, the ideal graph XML will be ill-formed; this is a fairly common case when debugging another crash. > > Test manually! > > Cheers, > Yang > > In the catch clause, ex.getMessage() is compared with ASCII characters, but ex.getMessage() returns characters that correspond to the system locale settings. To support non-English system locale settings (if needed), we could code something like this: > > ```java > > if (!(ex instanceof SAXParseException) || !"XML document structures must start and end within the same entity.".equals(disable_i18n(ex.getMessage()))) > > ``` > > That makes sense, can you please try if this fix works for you? [robcasloz at 702e763](https://github.com/robcasloz/jdk/commit/702e763) > > If it does, please feel free to re-purpose this issue with the proposed fix (once #3361 is integrated), and with a test case that parses an incomplete XML document in `src/utils/IdealGraphVisualizer/Data/src/test/java/com/sun/hotspot/igv/data/serialization/ParserTest.java`. Please make sure to test it on different JDK versions (I suggest JDK 8, 11, and 15). Thank you @robcasloz, I can verify it on the weekend.
In order to not disturb others, I'd like to close this PR and I have filed this as https://bugs.openjdk.java.net/browse/JDK-8265106. ------------- PR: https://git.openjdk.java.net/jdk/pull/3071 From yyang at openjdk.java.net Tue Apr 13 03:16:59 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Tue, 13 Apr 2021 03:16:59 GMT Subject: Withdrawn: 8263790: C2: new igv_print_immediately() for debugging purpose In-Reply-To: References: Message-ID: <8Xr30HdfeM42JQOKfeX6Kh0WPEEI3APNGE58mVnrC38=.8d31d6fb-7740-4392-bd28-68ed1d8ce042@github.com> On Thu, 18 Mar 2021 10:05:12 GMT, Yi Yang wrote: > Add a new igv_print_immediately(), which prints the current method immediately. Unlike the other igv_print* methods, it creates a well-formed and complete ideal graph XML immediately, while the others complete their ideal graph XML only when the VM exits (i.e. in the destructor of `IdealGraphPrinter::_xx_printer`). If the VM crashes before VM exit, the ideal graph XML will be ill-formed; this is a fairly common case when debugging another crash. > > Test manually! > > Cheers, > Yang This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/3071 From xgong at openjdk.java.net Tue Apr 13 03:21:21 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Tue, 13 Apr 2021 03:21:21 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v4] In-Reply-To: References: Message-ID: > The Vector API defines different element types for floating point VectorMask. For example, the bitwise related APIs use "`long/int`", while data related APIs use "`double/float`". This prevents some optimizations that are based on type checking from working well. > > For example, the VectorBox/Unbox elimination like `"VectorUnbox (VectorBox v) ==> v"` requires that the types of output and > input be equal. Normally this is necessary.
However, due to the different element types for floating point VectorMask with the same species, the VectorBox/Unbox pattern is optimized to: > > VectorLoadMask (VectorStoreMask vmask) > > Actually the types can be treated as the same one for such cases. And considering the vector mask representation is the same for > vectors with the same element size and vector length, it's safe to do the optimization: > > VectorLoadMask (VectorStoreMask vmask) ==> vmask > > The same issue exists for GVN which is based on the type of a node. Making the mask node's `hash()/cmp()` methods depend on the element size instead of the element type can fix it. Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: Use "Matcher::match_rule_supported_vector" for vector nodes checking ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3238/files - new: https://git.openjdk.java.net/jdk/pull/3238/files/ce3577ae..977787e4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3238&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3238&range=02-03 Stats: 4 lines in 1 file changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3238.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3238/head:pull/3238 PR: https://git.openjdk.java.net/jdk/pull/3238 From xgong at openjdk.java.net Tue Apr 13 04:16:58 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Tue, 13 Apr 2021 04:16:58 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v2] In-Reply-To: References: Message-ID: <2lJG4-6hFiftQkaVD0mthu9hLUPDSO1hwsJzlx4uPLk=.64989e5d-be84-418c-b284-d1ede3544663@github.com> On Thu, 8 Apr 2021 10:14:21 GMT, Vladimir Ivanov wrote: >>> @iwanowww should confirm correctness of such optimization. >>> Regarding changes - they seem fine to me. I notice that VectorNode and its subclasses do not check for TOP inputs.
Since the Vector API introduces vectors in the graph before the SuperWord transformation, their inputs could become dead. How are such cases handled? And why have we not hit them yet? is_vect() should hit assert. >> >> Thanks for looking at this PR @vnkozlov . To be honest, I've no idea about the TOP checking issue for the inputs of the VectorNode. Hope @iwanowww could explain more. Thanks! > >> I notice that VectorNode and its subclasses do not check for TOP inputs. >> Since the Vector API introduces vectors in the graph before the SuperWord transformation, their inputs could become dead. >> How are such cases handled? And why have we not hit them yet? is_vect() should hit assert. > > `VectorLoadMaskNode::Identity()` can't observe TOP types because it uses the type cached at construction (`type()` and `VectorNode` extends `TypeNode`). Still, a TOP input is possible and should be filtered out by opcode check (`in(1)->Opcode() == Op_VectorStoreMask`). Hi @iwanowww , all your review comments have been addressed. Would you mind having a look at it again? Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From xgong at openjdk.java.net Tue Apr 13 04:16:58 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Tue, 13 Apr 2021 04:16:58 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v3] In-Reply-To: References: Message-ID: On Mon, 12 Apr 2021 16:50:04 GMT, Vladimir Ivanov wrote: >> Thanks for your comments. Yes, theoretically it's better to place it in `VectorLoadMaskNode::Ideal()`. Unfortunately, we met an issue that is related to the optimization for `VectorStoreMask`. Consider the following case:
>>
>>        LoadVector                        LoadVector
>>             |                                 |
>>     VectorLoadMask (double)          VectorLoadMask (double)
>>             |                                 |
>>       VectorUnbox (long)      ==>    VectorStoreMask (double)
>>                                               |
>>                                      VectorLoadMask (long)
>>
>> This case loads the masking values for a double type, and does a bitwise `and` operation.
Since the type is mismatched, the compiler will try to do `VectorUnbox (VectorBox vmask) ==> VectorLoadMask (VectorStoreMask vmask)`. However, since there is the transformation `VectorStoreMask (VectorLoadMask value) ==> value`, the above `VectorStoreMaskNode` will be optimized out. The final graph looks like:
>>
>>        LoadVector                                    LoadVector
>>             |                                         /      \
>>     VectorLoadMask (double)                          /        \
>>             |                 ==>  VectorLoadMask (double)     \
>>     VectorStoreMask (double)                          VectorLoadMask (long)
>>             |
>>     VectorLoadMask (long)
>>
>> Since the two `VectorLoadMaskNode`s have different element types, GVN cannot eliminate either of them. So in the end there will be two similar `VectorLoadMaskNode`s. That's also why I overrode `cmp/hash` for `VectorLoadMaskNode` in the first version. >> >> So I prefer to add `VectorUnbox (VectorBox vmask) ==> VectorMaskCast (vmask)` directly. > > Ok, so you face a transformation ordering problem here. > > By working on `VectorUnbox (VectorBox vmask)` you effectively delay `VectorStoreMask (VectorLoadMask vmask) => vmask` transformation. > > As an alternative you could: > > (1) check for `VectorLoadMask` users before applying `VectorStoreMask (VectorLoadMask vmask) => vmask`; > > (2) nest adjacent casts:
>
> VectorLoadMask #double (1 LoadVector)
> VectorLoadMask #long (1 LoadVector)
> ==>
> VectorMaskCast #long (VectorLoadMask #double (1 LoadVector))
>
> The latter looks more powerful (and hence preferable), but I'm fine with what you have right now. (It can be enhanced later.) > Please, leave a comment describing the motivation for doing the transformation directly on `VectorUnbox (VectorBox ...)`. Thanks for your alternative advice! I prefer to keep the code as it is right now. Also the comments have been added. Thanks!
------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From xgong at openjdk.java.net Tue Apr 13 04:16:59 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Tue, 13 Apr 2021 04:16:59 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v3] In-Reply-To: References: Message-ID: <2hVkatGHq3Sdxeba9QrM3VVSEth1uaoYFDdUrojGyKQ=.004ae672-bdbc-45b3-bbfc-b16a55ad6390@github.com> On Mon, 12 Apr 2021 15:09:56 GMT, Vladimir Ivanov wrote: >> I met the bad AD file issue when I remove the changes in AD files and overriding `Node::ideal_reg()`. I guess if a node is not used as an input of other node, this can work well? For the `VectorMaskCastNode`, it must be an input for some other nodes. If just disable the matching of this node, how does its usage find the right input code? > > Yeah, `ideal_reg() { return 0; }` won't work here. Sorry for the misleading suggestion. That's ok. Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From xgong at openjdk.java.net Tue Apr 13 06:39:18 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Tue, 13 Apr 2021 06:39:18 GMT Subject: RFR: 8264957: Type::dual_type array is not aligned with enum TYPES [v2] In-Reply-To: References: Message-ID: > This is a bug fix for [1] which adds a new vector mask type. The new added TYPE `"VectorMask"` is inserted into `enum TYPES`, while the array `"Type::dual_type"` is not updated. This makes the array elements are not aligned with TYPES. > > I met the following crash due to this issue when I was working on the masking feature support on panama-vector: > > Internal Error (/home/xiagon01/code/panama-vector/src/hotspot/share/opto/type.hpp:1727), pid=104432, tid=104449 > # assert(_base >= AnyPtr && _base <= KlassPtr) failed: Not a pointer > > Adding a value like other vector types for the `"VectorMask"` in the array `"dual_type"` can fix it. 
> > [1] https://bugs.openjdk.java.net/browse/JDK-8262355 > > Tested with tier1 and jdk:tier3 Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: Remove array Type::dual_type ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3410/files - new: https://git.openjdk.java.net/jdk/pull/3410/files/b1c68018..28df8046 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3410&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3410&range=00-01 Stats: 44 lines in 2 files changed: 0 ins; 44 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3410.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3410/head:pull/3410 PR: https://git.openjdk.java.net/jdk/pull/3410 From xgong at openjdk.java.net Tue Apr 13 06:39:19 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Tue, 13 Apr 2021 06:39:19 GMT Subject: RFR: 8264957: Type::dual_type array is not aligned with enum TYPES [v2] In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 11:02:06 GMT, Jie Fu wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove array Type::dual_type > > This sounds reasonable to me. > Thanks. Hi, the array `Type::dual_type` is totally removed. Tested `hotspot:hotspot_all_no_apps, langtools:tier1, jdk:jdk_core and jdk:tier3` internally. Would you mind having a look at it again @DamonFool @neliasso ? Thanks so much! ------------- PR: https://git.openjdk.java.net/jdk/pull/3410 From jiefu at openjdk.java.net Tue Apr 13 07:03:04 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Tue, 13 Apr 2021 07:03:04 GMT Subject: RFR: 8264957: Type::dual_type array is not aligned with enum TYPES [v2] In-Reply-To: References: Message-ID: On Tue, 13 Apr 2021 06:39:18 GMT, Xiaohong Gong wrote: >> This is a bug fix for [1] which adds a new vector mask type.
The new added TYPE `"VectorMask"` is inserted into `enum TYPES`, while the array `"Type::dual_type"` is not updated. This makes the array elements are not aligned with TYPES. >> >> I met the following crash due to this issue when I was working on the masking feature support on panama-vector: >> >> Internal Error (/home/xiagon01/code/panama-vector/src/hotspot/share/opto/type.hpp:1727), pid=104432, tid=104449 >> # assert(_base >= AnyPtr && _base <= KlassPtr) failed: Not a pointer >> >> Adding a value like other vector types for the `"VectorMask"` in the array `"dual_type"` can fix it. >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8262355 >> >> Tested with tier1 and jdk:tier3 > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Remove array Type::dual_type I didn't find the usage of that array either. So the change looks fine to me. I'm afraid the JBS needs to be updated as a cleanup issue. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/3410 From xgong at openjdk.java.net Tue Apr 13 07:05:57 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Tue, 13 Apr 2021 07:05:57 GMT Subject: RFR: 8264957: Type::dual_type array is not aligned with enum TYPES [v2] In-Reply-To: References: Message-ID: On Tue, 13 Apr 2021 06:59:46 GMT, Jie Fu wrote: > I didn't find the usage of that array either. > So the change looks fine to me. > > I'm afraid the JBS needs to be updated as a cleanup issue. > Thanks. Seems reasonable. I will update the JBS and the PR title. Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/3410 From jiefu at openjdk.java.net Tue Apr 13 07:31:59 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Tue, 13 Apr 2021 07:31:59 GMT Subject: RFR: 8264957: Cleanup unused array Type::dual_type [v2] In-Reply-To: References: Message-ID: On Tue, 13 Apr 2021 06:39:18 GMT, Xiaohong Gong wrote: >> This is a bug fix for [1] which adds a new vector mask type. 
The new added TYPE `"VectorMask"` is inserted into `enum TYPES`, while the array `"Type::dual_type"` is not updated. This makes the array elements are not aligned with TYPES. >> >> I met the following crash due to this issue when I was working on the masking feature support on panama-vector: >> >> Internal Error (/home/xiagon01/code/panama-vector/src/hotspot/share/opto/type.hpp:1727), pid=104432, tid=104449 >> # assert(_base >= AnyPtr && _base <= KlassPtr) failed: Not a pointer >> >> Adding a value like other vector types for the `"VectorMask"` in the array `"dual_type"` can fix it. >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8262355 >> >> Tested with tier1 and jdk:tier3 > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Remove array Type::dual_type Thanks for your update. ------------- Marked as reviewed by jiefu (Committer). PR: https://git.openjdk.java.net/jdk/pull/3410 From thartmann at openjdk.java.net Tue Apr 13 07:39:59 2021 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Tue, 13 Apr 2021 07:39:59 GMT Subject: RFR: 8264957: Cleanup unused array Type::dual_type [v2] In-Reply-To: References: Message-ID: On Tue, 13 Apr 2021 06:39:18 GMT, Xiaohong Gong wrote: >> This is a bug fix for [1] which adds a new vector mask type. The new added TYPE `"VectorMask"` is inserted into `enum TYPES`, while the array `"Type::dual_type"` is not updated. This makes the array elements are not aligned with TYPES. >> >> I met the following crash due to this issue when I was working on the masking feature support on panama-vector: >> >> Internal Error (/home/xiagon01/code/panama-vector/src/hotspot/share/opto/type.hpp:1727), pid=104432, tid=104449 >> # assert(_base >= AnyPtr && _base <= KlassPtr) failed: Not a pointer >> >> Adding a value like other vector types for the `"VectorMask"` in the array `"dual_type"` can fix it. 
>> >> [1] https://bugs.openjdk.java.net/browse/JDK-8262355 >> >> Tested with tier1 and jdk:tier3 > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Remove array Type::dual_type Looks good to me. ------------- Marked as reviewed by thartmann (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3410 From xgong at openjdk.java.net Tue Apr 13 07:40:00 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Tue, 13 Apr 2021 07:40:00 GMT Subject: RFR: 8264957: Cleanup unused array Type::dual_type [v2] In-Reply-To: References: Message-ID: On Tue, 13 Apr 2021 06:59:46 GMT, Jie Fu wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove array Type::dual_type > > I didn't find the usage of that array either. > So the change looks fine to me. > > I'm afraid the JBS needs to be updated as a cleanup issue. > Thanks. Thanks for the review @DamonFool @TobiHartmann ! ------------- PR: https://git.openjdk.java.net/jdk/pull/3410 From neliasso at openjdk.java.net Tue Apr 13 07:46:05 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Tue, 13 Apr 2021 07:46:05 GMT Subject: RFR: 8264957: Cleanup unused array Type::dual_type [v2] In-Reply-To: References: Message-ID: On Tue, 13 Apr 2021 06:39:18 GMT, Xiaohong Gong wrote: >> This is a bug fix for [1] which adds a new vector mask type. The new added TYPE `"VectorMask"` is inserted into `enum TYPES`, while the array `"Type::dual_type"` is not updated. This makes the array elements are not aligned with TYPES. 
>> >> I met the following crash due to this issue when I was working on the masking feature support on panama-vector: >> >> Internal Error (/home/xiagon01/code/panama-vector/src/hotspot/share/opto/type.hpp:1727), pid=104432, tid=104449 >> # assert(_base >= AnyPtr && _base <= KlassPtr) failed: Not a pointer >> >> Adding a value like other vector types for the `"VectorMask"` in the array `"dual_type"` can fix it. >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8262355 >> >> Tested with tier1 and jdk:tier3 > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Remove array Type::dual_type Marked as reviewed by neliasso (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3410 From xgong at openjdk.java.net Tue Apr 13 07:46:07 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Tue, 13 Apr 2021 07:46:07 GMT Subject: RFR: 8264957: Cleanup unused array Type::dual_type [v2] In-Reply-To: References: Message-ID: On Tue, 13 Apr 2021 07:41:38 GMT, Nils Eliasson wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove array Type::dual_type > > Marked as reviewed by neliasso (Reviewer). Thanks for the review @neliasso ! ------------- PR: https://git.openjdk.java.net/jdk/pull/3410 From rcastanedalo at openjdk.java.net Tue Apr 13 09:27:58 2021 From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano) Date: Tue, 13 Apr 2021 09:27:58 GMT Subject: RFR: 8264795: IGV: Upgrade NetBeans platform [v2] In-Reply-To: References: Message-ID: On Wed, 7 Apr 2021 11:59:28 GMT, Roberto Castañeda Lozano wrote: >> This change upgrades the NetBeans platform on which IGV is based to its latest version ([12.3](https://netbeans.apache.org/download/nb123/index.html)) and switches IGV's build system from Ant to Maven.
The upgrade introduces support for a wide range of JDK versions (from 8 to 15, the latest version supported by NetBeans 12.3), and the switch from Ant to Maven makes the IGV build simpler, faster (first-time build is approximately 5x faster), and more stable (all dependencies are fetched directly from the Maven central repository). >> >> The change also fixes broken unit tests in the Data module and runs them by default when building. >> >> #### Testing >> >> Regression-tested the following use cases manually on all combinations of (Linux, Windows, macOS) and (JDK 8, JDK 11, JDK 15): >> >> - build >> - open graph file (small.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip)) >> - import graphs via network (localhost) >> - expand groups in outline >> - open a graph >> - open a clone >> - zoom in and out >> - search a node >> - apply filters >> - extract a node >> - show all nodes >> - select nodes corresponding to a bytecode >> - view control flow >> - select nodes corresponding to a basic block >> - cluster nodes >> - show satellite view >> - compute the difference of two graphs >> - change node text >> - remove a group >> - save groups into a file >> - open a large graph file (large.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip)) >> - open a large graph ("After Escape Analysis" in large.xml) >> >> Thanks to Vladimir Ivanov for helping with testing on macOS. 
> > Roberto Castañeda Lozano has updated the pull request incrementally with three additional commits since the last revision: > > - Document how to build and run on a specific JDK > - Use lambdas to define runnables > - Document latest JDK version supported ------------- PR: https://git.openjdk.java.net/jdk/pull/3361 From rcastanedalo at openjdk.java.net Tue Apr 13 09:30:58 2021 From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano) Date: Tue, 13 Apr 2021 09:30:58 GMT Subject: Integrated: 8264795: IGV: Upgrade NetBeans platform In-Reply-To: References: Message-ID: On Tue, 6 Apr 2021 18:34:54 GMT, Roberto Castañeda Lozano wrote: > This change upgrades the NetBeans platform on which IGV is based to its latest version ([12.3](https://netbeans.apache.org/download/nb123/index.html)) and switches IGV's build system from Ant to Maven. The upgrade introduces support for a wide range of JDK versions (from 8 to 15, the latest version supported by NetBeans 12.3), and the switch from Ant to Maven makes the IGV build simpler, faster (first-time build is approximately 5x faster), and more stable (all dependencies are fetched directly from the Maven central repository). > > The change also fixes broken unit tests in the Data module and runs them by default when building.
> > #### Testing > > Regression-tested the following use cases manually on all combinations of (Linux, Windows, macOS) and (JDK 8, JDK 11, JDK 15): > > - build > - open graph file (small.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip)) > - import graphs via network (localhost) > - expand groups in outline > - open a graph > - open a clone > - zoom in and out > - search a node > - apply filters > - extract a node > - show all nodes > - select nodes corresponding to a bytecode > - view control flow > - select nodes corresponding to a basic block > - cluster nodes > - show satellite view > - compute the difference of two graphs > - change node text > - remove a group > - save groups into a file > - open a large graph file (large.xml in [test-input.zip](https://bugs.openjdk.java.net/secure/attachment/93988/test-input.zip)) > - open a large graph ("After Escape Analysis" in large.xml) > > Thanks to Vladimir Ivanov for helping with testing on macOS. This pull request has now been integrated. Changeset: 954b9a1c Author: Roberto Castañeda Lozano URL: https://git.openjdk.java.net/jdk/commit/954b9a1c Stats: 5940 lines in 499 files changed: 2727 ins; 3110 del; 103 mod 8264795: IGV: Upgrade NetBeans platform Upgrade IGV's underlying NetBeans platform to version 12.3, switch build system from Ant to Maven, and fix broken unit tests in Data module.
Reviewed-by: kvn, chagedorn, neliasso, xliu ------------- PR: https://git.openjdk.java.net/jdk/pull/3361 From dnsimon at openjdk.java.net Tue Apr 13 09:33:11 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Tue, 13 Apr 2021 09:33:11 GMT Subject: RFR: 8264806: Remove the experimental JIT compiler [v2] In-Reply-To: References: <_PS0KHvkB_l9YrKjZ7wLAiKb-SK761YFub4XU5mrBRc=.c32572f6-063f-4503-a20f-aa6b9115f808@github.com> Message-ID: <_hHXNVNqB4NJAmS2mndxsKnFCg7fWWooaWMuWVL0bQA=.b8397a2a-0482-4851-9889-0432057070da@github.com> On Sun, 11 Apr 2021 10:25:47 GMT, Doug Simon wrote: >> Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Restore Graal Builder image makefile >> - Merge latest changes from branch 'JDK-8264805' into JDK-8264806 >> - 8264806: Remove the experimental JIT compiler > > We would definitely like to be able to continue testing of GraalVM with the JDK set of jtreg tests. So keeping `Compiler::isGraalEnabled()` working like it does today is important. > @dougxc I restored Compiler::isGraalEnabled(). Thanks. I guess this should really be named `isJVMCICompilerEnabled` now and the `vm.graal.enabled` predicate renamed to `vm.jvmcicompiler.enabled` but maybe that's too big a change (or can be done later). ------------- PR: https://git.openjdk.java.net/jdk/pull/3421 From xgong at openjdk.java.net Tue Apr 13 10:01:59 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Tue, 13 Apr 2021 10:01:59 GMT Subject: Integrated: 8264957: Cleanup unused array Type::dual_type In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 10:04:10 GMT, Xiaohong Gong wrote: > This is a bug fix for [1] which adds a new vector mask type. The new added TYPE `"VectorMask"` is inserted into `enum TYPES`, while the array `"Type::dual_type"` is not updated. 
This makes the array elements are not aligned with TYPES. > > I met the following crash due to this issue when I was working on the masking feature support on panama-vector: > > Internal Error (/home/xiagon01/code/panama-vector/src/hotspot/share/opto/type.hpp:1727), pid=104432, tid=104449 > # assert(_base >= AnyPtr && _base <= KlassPtr) failed: Not a pointer > > Adding a value like other vector types for the `"VectorMask"` in the array `"dual_type"` can fix it. > > [1] https://bugs.openjdk.java.net/browse/JDK-8262355 > > Tested with tier1 and jdk:tier3 This pull request has now been integrated. Changeset: 19356556 Author: Xiaohong Gong Committer: Ningsheng Jian URL: https://git.openjdk.java.net/jdk/commit/19356556 Stats: 43 lines in 2 files changed: 0 ins; 43 del; 0 mod 8264957: Cleanup unused array Type::dual_type Reviewed-by: jiefu, neliasso, thartmann ------------- PR: https://git.openjdk.java.net/jdk/pull/3410 From jbhateja at openjdk.java.net Tue Apr 13 10:54:02 2021 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Tue, 13 Apr 2021 10:54:02 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v4] In-Reply-To: References: Message-ID: On Tue, 13 Apr 2021 03:21:21 GMT, Xiaohong Gong wrote: >> The Vector API defines different element types for floating point VectorMask. For example, the bitwise related APIs use "`long/int`", while data related APIs use "`double/float`". This makes some optimizations that based on the type checking cannot work well. >> >> For example, the VectorBox/Unbox elimination like `"VectorUnbox (VectorBox v) ==> v"` requires the types of output and >> input are equal. Normally this is necessary. However, due to the different element type for floating point VectorMask with the same species, the VectorBox/Unbox pattern is optimized to: >> >> VectorLoadMask (VectorStoreMask vmask) >> >> Actually the types can be treated as the same one for such cases. 
And considering the vector mask representation is the same for >> vectors with the same element size and vector length, it's safe to do the optimization: >> >> VectorLoadMask (VectorStoreMask vmask) ==> vmask >> >> The same issue exists for GVN which is based on the type of a node. Making the mask node's `hash()/cmp()` methods depend on the element size instead of the element type can fix it. > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Use "Matcher::match_rule_supported_vector" for vector nodes checking src/hotspot/cpu/aarch64/aarch64_sve_ad.m4 line 861: > 859: instruct vmaskcast(vReg dst) %{ > 860: predicate(UseSVE > 0); > 861: match(Set dst (VectorMaskCast dst)); A strict check based on in-type and out-type in predicate could strengthen the pattern. src/hotspot/share/opto/vectornode.hpp line 1240: > 1238: }; > 1239: > 1240: class VectorMaskCastNode : public VectorNode { VectorMaskReinterpret seems a better choice, since it's a re-interpretation and not a casting (up/down).
------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From vlivanov at openjdk.java.net Tue Apr 13 11:21:05 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Tue, 13 Apr 2021 11:21:05 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v4] In-Reply-To: References: Message-ID: <1mjRZSrgWpn61bMsZ_hioYpmWFjGtySndhAmC0zkGu0=.9684321b-a412-4907-8367-c6071405655d@github.com> On Tue, 13 Apr 2021 10:49:51 GMT, Jatin Bhateja wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: >> >> Use "Matcher::match_rule_supported_vector" for vector nodes checking > > src/hotspot/cpu/aarch64/aarch64_sve_ad.m4 line 861: > >> 859: instruct vmaskcast(vReg dst) %{ >> 860: predicate(UseSVE > 0); >> 861: match(Set dst (VectorMaskCast dst)); > > A strict check based on in-type and out-type in predicate could strengthen the pattern. I agree. > src/hotspot/share/opto/vectornode.hpp line 1240: > >> 1238: }; >> 1239: >> 1240: class VectorMaskCastNode : public VectorNode { > > VectorMaskReinterpret seems better choice, since its a re-interpretation and not a casting (up/down). Considering masks have platform-specific representation, full-blown casts between different element types look more appropriate here. In this particular case, the focus is on the cheapest possible case when representations share the same bit pattern and the cast degenerates into a no-op. But in the longer term, it makes perfect sense to support the full matrix of conversions and don't rely on `VectorLoadMask <=> VectorStoreMask` and intermediate canonical vector representation. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From rcastanedalo at openjdk.java.net Tue Apr 13 11:35:03 2021 From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano) Date: Tue, 13 Apr 2021 11:35:03 GMT Subject: RFR: 8265125: IGV: cannot edit forms with NetBeans GUI builder Message-ID: This change moves all NetBeans-generated .form files to the same directory as their corresponding Java source files. This allows NetBeans to recognize the corresponding Java source files as Swing forms, and open them in "Design" mode: ![design](https://user-images.githubusercontent.com/8792647/114543087-07187d80-9c59-11eb-8d30-6d53db35db17.png) The .form files are only processed by NetBeans, so this change does not affect how IGV is built and run. Tested manually by reproducing the steps in the [JBS bug report](https://bugs.openjdk.java.net/browse/JDK-8265125) and by checking that IGV can be built and run following the steps in IGV's [README.md](https://github.com/openjdk/jdk/tree/master/src/utils/IdealGraphVisualizer#building-and-running) file.
------------- Commit messages: - Move NetBeans .form files to Java directories Changes: https://git.openjdk.java.net/jdk/pull/3461/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3461&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8265125 Stats: 0 lines in 7 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3461.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3461/head:pull/3461 PR: https://git.openjdk.java.net/jdk/pull/3461 From jbhateja at openjdk.java.net Tue Apr 13 15:53:06 2021 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Tue, 13 Apr 2021 15:53:06 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v4] In-Reply-To: <1mjRZSrgWpn61bMsZ_hioYpmWFjGtySndhAmC0zkGu0=.9684321b-a412-4907-8367-c6071405655d@github.com> References: <1mjRZSrgWpn61bMsZ_hioYpmWFjGtySndhAmC0zkGu0=.9684321b-a412-4907-8367-c6071405655d@github.com> Message-ID: On Tue, 13 Apr 2021 11:16:33 GMT, Vladimir Ivanov wrote: >> src/hotspot/share/opto/vectornode.hpp line 1240: >> >>> 1238: }; >>> 1239: >>> 1240: class VectorMaskCastNode : public VectorNode { >> >> VectorMaskReinterpret seems better choice, since its a re-interpretation and not a casting (up/down). > > Considering masks have platform-specific representation, full-blown casts between different element types look more appropriate here. > > In this particular case, the focus is on the cheapest possible case when representations share the same bit pattern and the cast degenerates into a no-op. But in the longer term, it makes perfect sense to support the full matrix of conversions and don't rely on `VectorLoadMask <=> VectorStoreMask` and intermediate canonical vector representation. Got it. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From kvn at openjdk.java.net Tue Apr 13 17:02:02 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 13 Apr 2021 17:02:02 GMT Subject: RFR: 8265125: IGV: cannot edit forms with NetBeans GUI builder In-Reply-To: References: Message-ID: On Tue, 13 Apr 2021 11:12:19 GMT, Roberto Castañeda Lozano wrote: > This change moves all NetBeans-generated .form files to the same directory as their corresponding Java source files. This allows NetBeans to recognize the corresponding Java source files as Swing forms, and open them in "Design" mode: > > ![design](https://user-images.githubusercontent.com/8792647/114543087-07187d80-9c59-11eb-8d30-6d53db35db17.png) > > The .form files are only processed by NetBeans, so this change does not affect how IGV is built and run. > > Tested manually by reproducing the steps in the [JBS bug report](https://bugs.openjdk.java.net/browse/JDK-8265125) and by checking that IGV can be built and run following the steps in IGV's [README.md](https://github.com/openjdk/jdk/tree/master/src/utils/IdealGraphVisualizer#building-and-running) file. Trivial ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3461 From rcastanedalo at openjdk.java.net Tue Apr 13 17:11:04 2021 From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano) Date: Tue, 13 Apr 2021 17:11:04 GMT Subject: RFR: 8265125: IGV: cannot edit forms with NetBeans GUI builder In-Reply-To: References: Message-ID: On Tue, 13 Apr 2021 16:59:34 GMT, Vladimir Kozlov wrote: > Trivial Thanks for reviewing, Vladimir!
------------- PR: https://git.openjdk.java.net/jdk/pull/3461 From rcastanedalo at openjdk.java.net Tue Apr 13 17:11:05 2021 From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano) Date: Tue, 13 Apr 2021 17:11:05 GMT Subject: Integrated: 8265125: IGV: cannot edit forms with NetBeans GUI builder In-Reply-To: References: Message-ID: On Tue, 13 Apr 2021 11:12:19 GMT, Roberto Castañeda Lozano wrote: > This change moves all NetBeans-generated .form files to the same directory as their corresponding Java source files. This allows NetBeans to recognize the corresponding Java source files as Swing forms, and open them in "Design" mode: > > ![design](https://user-images.githubusercontent.com/8792647/114543087-07187d80-9c59-11eb-8d30-6d53db35db17.png) > > The .form files are only processed by NetBeans, so this change does not affect how IGV is built and run. > > Tested manually by reproducing the steps in the [JBS bug report](https://bugs.openjdk.java.net/browse/JDK-8265125) and by checking that IGV can be built and run following the steps in IGV's [README.md](https://github.com/openjdk/jdk/tree/master/src/utils/IdealGraphVisualizer#building-and-running) file. This pull request has now been integrated.
Changeset: 8df8512b Author: Roberto Castañeda Lozano URL: https://git.openjdk.java.net/jdk/commit/8df8512b Stats: 0 lines in 7 files changed: 0 ins; 0 del; 0 mod 8265125: IGV: cannot edit forms with NetBeans GUI builder Reviewed-by: kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/3461 From henri.tremblay at gmail.com Tue Apr 13 11:26:21 2021 From: henri.tremblay at gmail.com (Henri Tremblay) Date: Tue, 13 Apr 2021 07:26:21 -0400 Subject: New candidate JEP: 410: Remove the Experimental AOT and JIT Compiler In-Reply-To: <20210407214719.2E6553DF0DA@eggemoggin.niobe.net> References: <20210407214719.2E6553DF0DA@eggemoggin.niobe.net> Message-ID: Hi, Does it mean that the chances to see GraalVM becoming the default JIT are now becoming thinner and thinner? Thanks, Henri On Wed, 7 Apr 2021 at 17:47, wrote: > https://openjdk.java.net/jeps/410 > > Summary: Remove the experimental Java-based ahead-of-time (AOT) and > just-in-time (JIT) compiler. This compiler has seen little use since > its introduction and the effort required to maintain it is significant. > Retain the experimental Java-level JVM compiler interface (JVMCI) so > that developers can continue to use externally-built versions of the > compiler for JIT compilation. > > - Mark > From kvn at openjdk.java.net Tue Apr 13 17:55:20 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 13 Apr 2021 17:55:20 GMT Subject: RFR: 8264806: Remove the experimental JIT compiler [v2] In-Reply-To: <_hHXNVNqB4NJAmS2mndxsKnFCg7fWWooaWMuWVL0bQA=.b8397a2a-0482-4851-9889-0432057070da@github.com> References: <_PS0KHvkB_l9YrKjZ7wLAiKb-SK761YFub4XU5mrBRc=.c32572f6-063f-4503-a20f-aa6b9115f808@github.com> <_hHXNVNqB4NJAmS2mndxsKnFCg7fWWooaWMuWVL0bQA=.b8397a2a-0482-4851-9889-0432057070da@github.com> Message-ID: On Tue, 13 Apr 2021 09:30:23 GMT, Doug Simon wrote: >> We would definitely like to be able to continue testing of GraalVM with the JDK set of jtreg tests.
So keeping `Compiler::isGraalEnabled()` working like it does today is important. > >> @dougxc I restored Compiler::isGraalEnabled(). > > Thanks. I guess this should really be named `isJVMCICompilerEnabled` now and the `vm.graal.enabled` predicate renamed to `vm.jvmcicompiler.enabled` but maybe that's too big a change (or can be done later). @dougxc Renaming should be done separately. Maybe use the RFE I filed: https://bugs.openjdk.java.net/browse/JDK-8265032 Do you approve these Graal removal changes? ------------- PR: https://git.openjdk.java.net/jdk/pull/3421 From lutz.schmidt at sap.com Tue Apr 13 19:43:28 2021 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Tue, 13 Apr 2021 19:43:28 +0000 Subject: [11u] RFR(S): 8250635 backport: MethodArityHistogram should use Compile_lock in favour of fancy checks Message-ID: <18C76401-DE61-4B8F-821B-9136B3B00E20@sap.com> Dear Community, I would appreciate receiving reviews for this downport change. It is a small change in one file only. Unfortunately, it did not apply clean. Original bug: https://bugs.openjdk.java.net/browse/JDK-8250635 Downport webrev: https://cr.openjdk.java.net/~lucy/webrevs/8250635.11u.01/ Tests: SAP's internal build and test farm. Results pending. Thank you! Lutz From lutz.schmidt at sap.com Tue Apr 13 20:02:27 2021 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Tue, 13 Apr 2021 20:02:27 +0000 Subject: [11u] RFR(S): 8250635 backport: MethodArityHistogram should use Compile_lock in favour of fancy checks In-Reply-To: <18C76401-DE61-4B8F-821B-9136B3B00E20@sap.com> References: <18C76401-DE61-4B8F-821B-9136B3B00E20@sap.com> Message-ID: <0D8921EA-F9CA-493F-AA7F-A0BFBF992B21@sap.com> Sorry for spamming, forgot jdk-updates-dev. Lutz On 13.04.21, 21:43, "Schmidt, Lutz" wrote: Dear Community, I would appreciate receiving reviews for this downport change. It is a small change in one file only. Unfortunately, it did not apply clean.
Original bug: https://bugs.openjdk.java.net/browse/JDK-8250635 Downport webrev: https://cr.openjdk.java.net/~lucy/webrevs/8250635.11u.01/ Tests: SAP's internal build and test farm. Results pending. Thank you! Lutz From dnsimon at openjdk.java.net Tue Apr 13 21:22:14 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Tue, 13 Apr 2021 21:22:14 GMT Subject: RFR: 8264806: Remove the experimental JIT compiler [v3] In-Reply-To: References: Message-ID: On Mon, 12 Apr 2021 22:10:06 GMT, Vladimir Kozlov wrote: >> As part of [JEP 410](http://openjdk.java.net/jeps/410) remove code related to Java-based JIT compiler (Graal) from JDK: >> >> - `jdk.internal.vm.compiler` - the Graal compiler >> - `jdk.internal.vm.compiler.management` - Graal's `MBean` >> - `test/hotspot/jtreg/compiler/graalunit` - Graal's unit tests >> >> Remove Graal related code in makefiles. >> >> Note, next two `module-info.java` files are preserved so that the JVMCI module `jdk.internal.vm.ci` continues to build: >> >> src/jdk.internal.vm.compiler/share/classes/module-info.java >> src/jdk.internal.vm.compiler.management/share/classes/module-info.java >> >> >> @AlanBateman suggested that we can avoid it by using Module API to export packages at runtime. It requires changes in GraalVM's JVMCI too so I will file followup RFE to implement it. >> >> Tested hs-tier1-4 > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Restore Compiler::isGraalEnabled() Approved. ------------- PR: https://git.openjdk.java.net/jdk/pull/3421 From hohensee at amazon.com Tue Apr 13 23:05:56 2021 From: hohensee at amazon.com (Hohensee, Paul) Date: Tue, 13 Apr 2021 23:05:56 +0000 Subject: [11u] RFR(S): 8250635 backport: MethodArityHistogram should use Compile_lock in favour of fancy checks Message-ID: <5C8796FB-7411-42B6-BE8B-96A9DC9E8107@amazon.com> Lgtm, assuming tests pass.
Thanks, Paul -----Original Message----- From: jdk-updates-dev on behalf of "Schmidt, Lutz" Date: Tuesday, April 13, 2021 at 1:03 PM To: "hotspot-compiler-dev at openjdk.java.net" , "jdk-updates-dev at openjdk.java.net" Subject: RE: [11u] RFR(S): 8250635 backport: MethodArityHistogram should use Compile_lock in favour of fancy checks Sorry for spamming, forgot jdk-updates-dev. Lutz On 13.04.21, 21:43, "Schmidt, Lutz" wrote: Dear Community, I would appreciate receiving reviews for this downport change. It is a small change in one file only. Unfortunately, it did not apply clean. Original bug: https://bugs.openjdk.java.net/browse/JDK-8250635 Downport webrev: https://cr.openjdk.java.net/~lucy/webrevs/8250635.11u.01/ Tests: SAP's internal build and test farm. Results pending. Thank you! Lutz From xgong at openjdk.java.net Wed Apr 14 01:55:57 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Wed, 14 Apr 2021 01:55:57 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v4] In-Reply-To: <1mjRZSrgWpn61bMsZ_hioYpmWFjGtySndhAmC0zkGu0=.9684321b-a412-4907-8367-c6071405655d@github.com> References: <1mjRZSrgWpn61bMsZ_hioYpmWFjGtySndhAmC0zkGu0=.9684321b-a412-4907-8367-c6071405655d@github.com> Message-ID: On Tue, 13 Apr 2021 11:18:25 GMT, Vladimir Ivanov wrote: >> src/hotspot/cpu/aarch64/aarch64_sve_ad.m4 line 861: >> >>> 859: instruct vmaskcast(vReg dst) %{ >>> 860: predicate(UseSVE > 0); >>> 861: match(Set dst (VectorMaskCast dst)); >> >> A strict check based on in-type and out-type in predicate could strengthen the pattern. > > I agree. I've added the type assertion in the constructor of `VectorMaskCastNode`. See: assert(in_vt->length() == vt->length(), "vector length must match"); assert(type2aelembytes(in_vt->element_basic_type()) == type2aelembytes(vt->element_basic_type()), "element size must match"); That's why I didn't add the check in the predicate. So do you think it's enough?
------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From dholmes at openjdk.java.net Wed Apr 14 02:31:58 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 14 Apr 2021 02:31:58 GMT Subject: RFR: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines [v4] In-Reply-To: <_q9nVMOqIePMWqifBZo_Cj9Vsal63l2BLN8X7x5S8iw=.abdcc58f-877d-4e36-b26e-3699cb05eeda@github.com> References: <_q9nVMOqIePMWqifBZo_Cj9Vsal63l2BLN8X7x5S8iw=.abdcc58f-877d-4e36-b26e-3699cb05eeda@github.com> Message-ID: On Fri, 9 Apr 2021 05:08:37 GMT, David Holmes wrote: >> The existing JRT_ENTRY (and related) macros require the function to which they are applied to declare a parameter "JavaThread* thread" which represents the current thread. These functions are all implicitly "traps" functions as they can result in exceptions, but they are not declared with TRAPS because the only caller of these functions is the runtime itself (via call_VM) and no callers need to be aware to use CHECK; further they need a JavaThread. So the macro declares the THREAD variable for use with other exception-producing functions and assigns it from "thread". >> >> The majority of this change replaces the parameter name "thread" with "current" so that it is clear that we are always dealing with the current thread. This affects the entry functions as well as the functions called therefrom. >> >> We can then also replace the use of "THREAD" with "current", in contexts that are not related to exception processing. >> >> Some methods called by entry functions were declared to have both a "thread" parameter and a "TRAPS" parameter - with nothing to tell you these are always the same, current, thread. So the "thread" parameter is removed and replaced with a local variable "current" obtained from THREAD->as_Java_thread(). >> >> Some missing CHECK_ uses were added. 
>> >> Testing: >> - tiers 1-3 >> >> Thanks, >> David > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Fix search&replace mistake Repeating this as the bots failed to send out an email. Thanks for the reviews @coleenp , @hseigel and @iklam . Can someone from compiler team please take a look and give the okay? (I know things are a bit busy at the moment.) Thanks, David ------------- PR: https://git.openjdk.java.net/jdk/pull/3394 From sviswanathan at openjdk.java.net Wed Apr 14 05:06:05 2021 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Wed, 14 Apr 2021 05:06:05 GMT Subject: RFR: 8265154: vinserti128 operand mix up for KNL platforms Message-ID: There is a bug in macro assembler in vinserti128 special handling for platforms like KNL that do not support AVX512VL. The following: void vinserti128(XMMRegister dst, XMMRegister nds, XMMRegister src, uint8_t imm8) { if (UseAVX > 2 && VM_Version::supports_avx512novl()) { Assembler::vinserti32x4(dst, dst, src, imm8); } ... } Should have been: void vinserti128(XMMRegister dst, XMMRegister nds, XMMRegister src, uint8_t imm8) { if (UseAVX > 2 && VM_Version::supports_avx512novl()) { Assembler::vinserti32x4(dst, nds, src, imm8); } ... 
} Best Regards, Sandhya ------------- Commit messages: - 8265154: vinserti128 operand mix up for KNL platforms Changes: https://git.openjdk.java.net/jdk/pull/3480/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3480&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8265154 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3480.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3480/head:pull/3480 PR: https://git.openjdk.java.net/jdk/pull/3480 From zhaixiang at loongson.cn Wed Apr 14 05:09:21 2021 From: zhaixiang at loongson.cn (Leslie Zhai) Date: Wed, 14 Apr 2021 13:09:21 +0800 Subject: 8230015: [instruction selector] generic vector operands support. In-Reply-To: References: Message-ID: Hi Jatin, Thanks for your great work! libjvm.so reduced +1MB will help SPECjvm2008 some benchmark +44% speed up. Then I tried to port 8234391: C2: Generic vector operands for `YOURARCH` about LoadVector and StoreVector at first. Only `Matcher::regmask_for_ideal_register` was able to call `Matcher::is_generic_vector`, but compared with X86, `Matcher::do_postselect_cleanup` was also able to call `Matcher::is_generic_vector` too. 
I debugged `Matcher::do_postselect_cleanup`: diff --git a/src/hotspot/share/opto/matcher.cpp b/src/hotspot/share/opto/matcher.cpp index 0846cad3c3f..8fd644d2d93 100644 --- a/src/hotspot/share/opto/matcher.cpp +++ b/src/hotspot/share/opto/matcher.cpp @@ -309,6 +309,9 @@ void Matcher::match( ) { C->record_method_not_compilable("must be able to represent all call arguments in reg mask"); } +#ifdef YOURARCH64 + do_postselect_cleanup(); +#endif if (C->failing()) return; // bailed out on incoming arg failure // --------------- @@ -2630,8 +2633,10 @@ void Matcher::specialize_generic_vector_operands() { int opnd_idx = m->operand_index(1); Node* def = m->in(opnd_idx); m->subsume_by(def, C); +#if !defined(YOURARCH64) } else if (m->is_MachTemp()) { // process MachTemp nodes at use site (see Matcher::specialize_vector_operand) +#endif } else { specialize_mach_node(m); } But `Matcher::do_postselect_cleanup` was still not called. > Current patch enables this support only for x86 target, to get a feedback from community. Then how can generic vector operands be ported to other targets? Thanks, Leslie Zhai On 2019-08-22 14:49, Bhateja, Jatin wrote: > Hi All, > > Please find below a patch for generic vector operands[1] support during instruction selection. > > Motivation behind the patch is to reduce the number of vector selection patterns whose operands meagerly differ in vector lengths. > This will not only result in less code being generated by ADLC which effectively translates to size reduction in libjvm.so but also > help in better maintenance of AD files. > > Using generic operands we were able to collapse multiple vector patterns over mainline > Initial number of vector instruction patterns (vec[XYZSD] + legVec[ZXYSD]) : 510 > Reduced vector instruction patterns (vecG + legVecG) : 222 > > With this we could see around 1MB size reduction in libjvm.so.
> > In order to have minimal impact over downstream compiler passes, a post-selection pass has been introduced (currently enabled only for X86 target) > which replaces these generic operands with their corresponding concrete vector-length variants. > > JBS : https://bugs.openjdk.java.net/browse/JDK-8230015 > Patch : http://cr.openjdk.java.net/~jbhateja/genericVectorOperands/webrev.00/ > > Kindly review and share your feedback. > > Best Regards, > Jatin Bhateja > > [1] http://cr.openjdk.java.net/~jbhateja/genericVectorOperands/generic_operands_support_v1.0.pdf > From thartmann at openjdk.java.net Wed Apr 14 06:40:03 2021 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Wed, 14 Apr 2021 06:40:03 GMT Subject: RFR: 8265154: vinserti128 operand mix up for KNL platforms In-Reply-To: References: Message-ID: On Wed, 14 Apr 2021 00:25:40 GMT, Sandhya Viswanathan wrote: > There is a bug in macro assembler in vinserti128 special handling for platforms like KNL that do not support AVX512VL. > > The following: > void vinserti128(XMMRegister dst, XMMRegister nds, XMMRegister src, uint8_t imm8) { > if (UseAVX > 2 && VM_Version::supports_avx512novl()) { > Assembler::vinserti32x4(dst, dst, src, imm8); > } > ... > } > > Should have been: > void vinserti128(XMMRegister dst, XMMRegister nds, XMMRegister src, uint8_t imm8) { > if (UseAVX > 2 && VM_Version::supports_avx512novl()) { > Assembler::vinserti32x4(dst, nds, src, imm8); > } > ... > } > > Best Regards, > Sandhya Looks good to me. Just wondering, why did this never show up? Are we missing tests exercising this code path? ------------- Marked as reviewed by thartmann (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/3480 From rbackman at openjdk.java.net Wed Apr 14 08:34:19 2021 From: rbackman at openjdk.java.net (Rickard Bäckman) Date: Wed, 14 Apr 2021 08:34:19 GMT Subject: RFR: 8260255: C1: LoopInvariantCodeMotion constructor can leave some fields uninitialized Message-ID: Initialize instance variables to default values to avoid uninitialized values for early return. ------------- Commit messages: - 8260255: C1: LoopInvariantCodeMotion constructor can leave some fields uninitialized Changes: https://git.openjdk.java.net/jdk/pull/3484/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3484&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8260255 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3484.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3484/head:pull/3484 PR: https://git.openjdk.java.net/jdk/pull/3484 From martin.doerr at sap.com Wed Apr 14 09:06:52 2021 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 14 Apr 2021 09:06:52 +0000 Subject: [11u] RFR(S): 8250635 backport: MethodArityHistogram should use Compile_lock in favour of fancy checks In-Reply-To: <0D8921EA-F9CA-493F-AA7F-A0BFBF992B21@sap.com> References: <18C76401-DE61-4B8F-821B-9136B3B00E20@sap.com> <0D8921EA-F9CA-493F-AA7F-A0BFBF992B21@sap.com> Message-ID: Hi Lutz, your backport looks good. Best regards, Martin > -----Original Message----- > From: jdk-updates-dev On > Behalf Of Schmidt, Lutz > Sent: Tuesday, 13 April 2021 22:02 > To: hotspot-compiler-dev at openjdk.java.net; jdk-updates- > dev at openjdk.java.net > Subject: Re: [11u] RFR(S): 8250635 backport: MethodArityHistogram should > use Compile_lock in favour of fancy checks > > Sorry for spamming, forgot jdk-updates-dev. > Lutz > > On 13.04.21, 21:43, "Schmidt, Lutz" wrote: > > Dear Community, > > I would appreciate receiving reviews for this downport change. It is a small > change in one file only.
Unfortunately, it did not apply clean. > > Original bug: https://bugs.openjdk.java.net/browse/JDK-8250635 > Downport webrev: > https://cr.openjdk.java.net/~lucy/webrevs/8250635.11u.01/ > > Tests: > SAP's internal build and test farm. Results pending. > > Thank you! > Lutz > > From lutz.schmidt at sap.com Wed Apr 14 09:18:18 2021 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Wed, 14 Apr 2021 09:18:18 +0000 Subject: [11u] RFR(S): 8250635 backport: MethodArityHistogram should use Compile_lock in favour of fancy checks In-Reply-To: References: <18C76401-DE61-4B8F-821B-9136B3B00E20@sap.com> <0D8921EA-F9CA-493F-AA7F-A0BFBF992B21@sap.com> Message-ID: <40507412-5F59-4623-BA39-728CA14F9C64@sap.com> Thank you, Martin and Paul, for your reviews. Submit will have to wait another day. I was too late for testing last night. I will need a sponsor anyway. Is it ok to contact you, Martin, once the tests are done successfully? Thanks, Lutz On 14.04.21, 11:06, "Doerr, Martin" wrote: Hi Lutz, your backport looks good. Best regards, Martin > -----Original Message----- > From: jdk-updates-dev On > Behalf Of Schmidt, Lutz > Sent: Tuesday, 13 April 2021 22:02 > To: hotspot-compiler-dev at openjdk.java.net; jdk-updates- > dev at openjdk.java.net > Subject: Re: [11u] RFR(S): 8250635 backport: MethodArityHistogram should > use Compile_lock in favour of fancy checks > > Sorry for spamming, forgot jdk-updates-dev. > Lutz > > On 13.04.21, 21:43, "Schmidt, Lutz" wrote: > > Dear Community, > > I would appreciate receiving reviews for this downport change. It is a small > change in one file only. Unfortunately, it did not apply clean. > > Original bug: https://bugs.openjdk.java.net/browse/JDK-8250635 > Downport webrev: > https://cr.openjdk.java.net/~lucy/webrevs/8250635.11u.01/ > > Tests: > SAP's internal build and test farm. Results pending. > > Thank you!
> Lutz > > From martin.doerr at sap.com Wed Apr 14 09:21:31 2021 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 14 Apr 2021 09:21:31 +0000 Subject: [11u] RFR(S): 8250635 backport: MethodArityHistogram should use Compile_lock in favour of fancy checks In-Reply-To: <40507412-5F59-4623-BA39-728CA14F9C64@sap.com> References: <18C76401-DE61-4B8F-821B-9136B3B00E20@sap.com> <0D8921EA-F9CA-493F-AA7F-A0BFBF992B21@sap.com> <40507412-5F59-4623-BA39-728CA14F9C64@sap.com> Message-ID: Yes. Just let me know when it's ready. Best regards, Martin > -----Original Message----- > From: Schmidt, Lutz > Sent: Wednesday, 14 April 2021 11:18 > To: Doerr, Martin ; hotspot-compiler- > dev at openjdk.java.net; jdk-updates-dev at openjdk.java.net; Hohensee, > Paul > Subject: Re: [11u] RFR(S): 8250635 backport: MethodArityHistogram should > use Compile_lock in favour of fancy checks > > Thank you, Martin and Paul, for your reviews. > > Submit will have to wait another day. I was too late for testing last night. > > I will need a sponsor anyway. Is it ok to contact you, Martin, once the tests > are done successfully? > > Thanks, > Lutz > > On 14.04.21, 11:06, "Doerr, Martin" wrote: > > Hi Lutz, > > your backport looks good. > > Best regards, > Martin > > > > -----Original Message----- > > From: jdk-updates-dev On > > Behalf Of Schmidt, Lutz > > Sent: Tuesday, 13 April 2021 22:02 > > To: hotspot-compiler-dev at openjdk.java.net; jdk-updates- > > dev at openjdk.java.net > > Subject: Re: [11u] RFR(S): 8250635 backport: MethodArityHistogram > should > > use Compile_lock in favour of fancy checks > > > > Sorry for spamming, forgot jdk-updates-dev. > > Lutz > > > > On 13.04.21, 21:43, "Schmidt, Lutz" wrote: > > > > Dear Community, > > > > I would appreciate receiving reviews for this downport change. It is a > small > > change in one file only. Unfortunately, it did not apply clean.
> > > > Original bug: https://bugs.openjdk.java.net/browse/JDK-8250635 > > Downport webrev: > > https://cr.openjdk.java.net/~lucy/webrevs/8250635.11u.01/ > > > > Tests: > > SAP's internal build and test farm. Results pending. > > > > Thank you! > > Lutz > > > > > From neliasso at openjdk.java.net Wed Apr 14 09:32:57 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Wed, 14 Apr 2021 09:32:57 GMT Subject: RFR: 8260255: C1: LoopInvariantCodeMotion constructor can leave some fields uninitialized In-Reply-To: References: Message-ID: On Wed, 14 Apr 2021 08:27:16 GMT, Rickard Bäckman wrote: > Initialize instance variables to default values to avoid uninitialized values for early return. Looks good. ------------- Marked as reviewed by neliasso (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3484 From thartmann at openjdk.java.net Wed Apr 14 09:39:59 2021 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Wed, 14 Apr 2021 09:39:59 GMT Subject: RFR: 8260255: C1: LoopInvariantCodeMotion constructor can leave some fields uninitialized In-Reply-To: References: Message-ID: On Wed, 14 Apr 2021 08:27:16 GMT, Rickard Bäckman wrote: > Initialize instance variables to default values to avoid uninitialized values for early return. Looks good to me too. ------------- Marked as reviewed by thartmann (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3484 From eliu at openjdk.java.net Wed Apr 14 10:05:56 2021 From: eliu at openjdk.java.net (Eric Liu) Date: Wed, 14 Apr 2021 10:05:56 GMT Subject: RFR: 8262916: Merge LShiftCntV and RShiftCntV into a single node In-Reply-To: References: <9T0fatLfUStsY_1SA94rlhKokfqT7TeySYeFzIuO7Nk=.7394e20e-a4a3-4786-a2db-f5d8acc2b74a@github.com> Message-ID: On Fri, 9 Apr 2021 09:26:19 GMT, Andrew Haley wrote: >> The vector shift count was defined by two separate nodes(LShiftCntV and >> RShiftCntV), which would prevent them from being shared when the shift >> counts are the same.
>> >> >> public static void test_shiftv(int sh) { >> for (int i = 0; i < N; i+=1) { >> a0[i] = a1[i] << sh; >> b0[i] = b1[i] >> sh; >> } >> } >> >> >> Given the example above, by merging the same shift counts into one >> node, they could be shared by shift nodes(RShiftV or LShiftV) like >> below: >> >> >> Before: >> 1184 LShiftCntV === _ 1189 [[ 1185 ... ]] >> 1190 RShiftCntV === _ 1189 [[ 1191 ... ]] >> 1185 LShiftVI === _ 1181 1184 [[ 1186 ]] >> 1191 RShiftVI === _ 1187 1190 [[ 1192 ]] >> >> After: >> 1190 ShiftCntV === _ 1189 [[ 1191 1204 ... ]] >> 1204 LShiftVI === _ 1211 1190 [[ 1203 ]] >> 1191 RShiftVI === _ 1187 1190 [[ 1192 ]] >> >> >> The final code could remove one redundant `dup`(scalar->vector), >> with one register saved. >> >> >> Before: >> dup v16.16b, w12 >> dup v17.16b, w12 >> ... >> ldr q18, [x13, #16] >> sshl v18.4s, v18.4s, v16.4s >> add x18, x16, x12 ; iaload >> >> add x4, x15, x12 >> str q18, [x4, #16] ; iastore >> >> ldr q18, [x18, #16] >> add x12, x14, x12 >> neg v19.16b, v17.16b >> sshl v18.4s, v18.4s, v19.4s >> str q18, [x12, #16] ; iastore >> >> After: >> dup v16.16b, w11 >> ... >> ldr q17, [x13, #16] >> sshl v17.4s, v17.4s, v16.4s >> add x2, x22, x11 ; iaload >> >> add x4, x16, x11 >> str q17, [x4, #16] ; iastore >> >> ldr q17, [x2, #16] >> add x11, x21, x11 >> neg v18.16b, v16.16b >> sshl v17.4s, v17.4s, v18.4s >> str q17, [x11, #16] ; iastore > >> >> It seems that keeping those two RShiftCntV and LShiftCntV is friendly to AArch32/64 in this case, but AArch64 should be changed to what AArch32 does. @theRealAph > > Thanks, but it's been a while since I looked at the vector code. Can you point me to the AArch32 patterns in question, to show me the AArch64 changes needed? Thanks. @theRealAph Could you please take a look at this?
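To illustrate why two opcodes block sharing in the example quoted above, here is a toy sketch of value numbering. It is not C2 code: `NodeTable`, `count_nodes_split` and `count_nodes_merged` are hypothetical names, and a node is reduced to an opcode string plus the id of one input. The assumption is only that nodes with the same opcode and the same input can be commoned, while nodes with different opcodes cannot.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Toy value-numbering table (hypothetical, not HotSpot's implementation):
// a node is identified by its opcode name and the id of its single input.
class NodeTable {
public:
    // Returns the id of an existing node with the same (opcode, input),
    // or creates a new node and returns its id.
    int make(const std::string& opcode, int input) {
        auto key = std::make_pair(opcode, input);
        auto it = nodes_.find(key);
        if (it != nodes_.end()) {
            return it->second;
        }
        int id = next_id_++;
        nodes_.emplace(key, id);
        return id;
    }

    std::size_t size() const { return nodes_.size(); }

private:
    std::map<std::pair<std::string, int>, int> nodes_;
    int next_id_ = 0;
};

// One left shift and one right shift by the same scalar, using two
// distinct shift-count opcodes: the counts cannot be commoned.
std::size_t count_nodes_split(int scalar) {
    NodeTable table;
    table.make("LShiftCntV", scalar);
    table.make("RShiftCntV", scalar);
    return table.size();
}

// Same shifts with a single shift-count opcode: the second request
// finds the node created by the first one.
std::size_t count_nodes_merged(int scalar) {
    NodeTable table;
    table.make("ShiftCntV", scalar);
    table.make("ShiftCntV", scalar);
    return table.size();
}
```

With input id 1189 as in the quoted IR dump, the split variant creates two count nodes and the merged variant creates one, which corresponds to the single `dup` in the "After" assembly.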
------------- PR: https://git.openjdk.java.net/jdk/pull/3371 From github.com+40024232+sunny868 at openjdk.java.net Wed Apr 14 14:52:51 2021 From: github.com+40024232+sunny868 at openjdk.java.net (SUN Guoyun) Date: Wed, 14 Apr 2021 14:52:51 GMT Subject: RFR: 8265105: gc/arguments/TestSelectDefaultGC.java fails when compiler1 is disabled Message-ID: The MIPS64 platform does not implement C1; it only has C2. So when tiered compilation is off, it is unnecessary to set the client emulation mode flags. Perhaps this bug is covered by https://bugs.openjdk.java.net/browse/JDK-8251462 ------------- Commit messages: - 8265105: gc/arguments/TestSelectDefaultGC.java fails when compiler1 is disabled Changes: https://git.openjdk.java.net/jdk/pull/3449/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3449&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8265105 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3449.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3449/head:pull/3449 PR: https://git.openjdk.java.net/jdk/pull/3449 From github.com+40024232+sunny868 at openjdk.java.net Wed Apr 14 14:52:51 2021 From: github.com+40024232+sunny868 at openjdk.java.net (SUN Guoyun) Date: Wed, 14 Apr 2021 14:52:51 GMT Subject: RFR: 8265105: gc/arguments/TestSelectDefaultGC.java fails when compiler1 is disabled In-Reply-To: References: Message-ID: <2ndqbMB_O7VQOeF6IrOcbZwoBt2HA05D3S4zMHtWDzw=.bc0b0e6e-d358-4687-a7a0-de15ea1bb342@github.com> On Tue, 13 Apr 2021 05:46:15 GMT, SUN Guoyun wrote: > The MIPS64 platform does not implement C1; it only has C2. > So when tiered compilation is off, it is unnecessary to set the client emulation mode flags.
> Perhaps this bug is covered by https://bugs.openjdk.java.net/browse/JDK-8251462 ------------- PR: https://git.openjdk.java.net/jdk/pull/3449 From thartmann at openjdk.java.net Wed Apr 14 14:52:51 2021 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Wed, 14 Apr 2021 14:52:51 GMT Subject: RFR: 8265105: gc/arguments/TestSelectDefaultGC.java fails when compiler1 is disabled In-Reply-To: References: Message-ID: On Tue, 13 Apr 2021 05:46:15 GMT, SUN Guoyun wrote: > The MIPS64 platform does not implement C1; it only has C2. > So when tiered compilation is off, it is unnecessary to set the client emulation mode flags. > Perhaps this bug is covered by https://bugs.openjdk.java.net/browse/JDK-8251462 Please add a description of your fix to the PR. ------------- PR: https://git.openjdk.java.net/jdk/pull/3449 From kvn at openjdk.java.net Wed Apr 14 15:26:38 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 14 Apr 2021 15:26:38 GMT Subject: RFR: 8265154: vinserti128 operand mix up for KNL platforms In-Reply-To: References: Message-ID: On Wed, 14 Apr 2021 00:25:40 GMT, Sandhya Viswanathan wrote: > There is a bug in macro assembler in vinserti128 special handling for platforms like KNL that do not support AVX512VL. > > The following: > void vinserti128(XMMRegister dst, XMMRegister nds, XMMRegister src, uint8_t imm8) { > if (UseAVX > 2 && VM_Version::supports_avx512novl()) { > Assembler::vinserti32x4(dst, dst, src, imm8); > } > ... > } > > Should have been: > void vinserti128(XMMRegister dst, XMMRegister nds, XMMRegister src, uint8_t imm8) { > if (UseAVX > 2 && VM_Version::supports_avx512novl()) { > Assembler::vinserti32x4(dst, nds, src, imm8); > } > ... > } > > Best Regards, > Sandhya What problems is this causing? Would be nice to have a test to show the issue (if possible).
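As a rough illustration of the operand mix-up described above, the following sketch models the three-operand insert on a toy register. It is not HotSpot code: `Reg256`, `insert_lane`, `vinserti128_buggy` and `vinserti128_fixed` are hypothetical names, and each 128-bit lane is shrunk to one 64-bit value. The modeled semantics are those of the insert instruction: the result is a copy of the second operand with the lane selected by `imm8` replaced by `src`.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy 256-bit register: two 128-bit lanes, each reduced to one 64-bit
// value for brevity (an assumption made only for this sketch).
using Reg256 = std::array<std::uint64_t, 2>;

// Models the three-operand insert: copy nds, then overwrite the lane
// selected by bit 0 of imm8 with src.
Reg256 insert_lane(const Reg256& nds, std::uint64_t src, std::uint8_t imm8) {
    Reg256 result = nds;
    result[imm8 & 1] = src;
    return result;
}

// Shape of the buggy wrapper: nds is ignored and the old contents of
// dst are used as the second operand.
Reg256 vinserti128_buggy(const Reg256& dst, const Reg256& /*nds*/,
                         std::uint64_t src, std::uint8_t imm8) {
    return insert_lane(dst, src, imm8);
}

// Shape of the fixed wrapper: nds is forwarded as the second operand.
Reg256 vinserti128_fixed(const Reg256& /*dst*/, const Reg256& nds,
                         std::uint64_t src, std::uint8_t imm8) {
    return insert_lane(nds, src, imm8);
}
```

In this model the two wrappers differ only when `dst` and `nds` hold different values; the untouched lane then comes from the stale `dst` instead of `nds`. When a caller passes the same register for both, the results coincide, which is one possible reason such a mix-up could go unnoticed. That last point is an assumption here, not something stated in the thread.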
------------- PR: https://git.openjdk.java.net/jdk/pull/3480 From vladimir.x.ivanov at oracle.com Wed Apr 14 17:11:10 2021 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 14 Apr 2021 20:11:10 +0300 Subject: 8230015: [instruction selector] generic vector operands support. In-Reply-To: References: Message-ID: <17d6fd4c-196d-ea5b-f2a9-a2175e6ee652@oracle.com> Hi Leslie, I'm not sure what you are trying to achieve with the port. Currently, there are 3 platform-specific methods which Matcher::do_postselect_cleanup() relies on: Matcher::pd_specialize_generic_vector_operand() Matcher::is_generic_reg2reg_move() Matcher::is_generic_vector() It should be enough to provide implementations for those methods on YOURARCH to make Matcher::do_postselect_cleanup() work. Can you elaborate on what exactly you are trying to accomplish? Best regards, Vladimir Ivanov On 14.04.2021 08:09, Leslie Zhai wrote: > Hi Jatin, > > Thanks for your great work! > > libjvm.so reduced +1MB will help SPECjvm2008 some benchmark +44% speed up. > > Then I tried to port 8234391: C2: Generic vector operands for `YOURARCH` > about LoadVector and StoreVector at first. > > Only `Matcher::regmask_for_ideal_register` was able to call > `Matcher::is_generic_vector`, but compared with X86, > `Matcher::do_postselect_cleanup` was also able to call > `Matcher::is_generic_vector` too. > > I debug the `Matcher::do_postselect_cleanup`: > > diff --git a/src/hotspot/share/opto/matcher.cpp > b/src/hotspot/share/opto/matcher.cpp > index 0846cad3c3f..8fd644d2d93 100644 > --- a/src/hotspot/share/opto/matcher.cpp > +++ b/src/hotspot/share/opto/matcher.cpp > @@ -309,6 +309,9 @@ void Matcher::match( ) { > C->record_method_not_compilable("must be able to represent all call > arguments in reg mask"); > } > > +#ifdef YOURARCH64 > + do_postselect_cleanup(); > +#endif > if (C->failing()) return; // bailed out on incoming arg failure > >
// --------------- > @@ -2630,8 +2633,10 @@ void Matcher::specialize_generic_vector_operands() { > int opnd_idx = m->operand_index(1); > Node* def = m->in(opnd_idx); > m->subsume_by(def, C); > +#if !defined(YOURARCH64) > } else if (m->is_MachTemp()) { > // process MachTemp nodes at use site (see > Matcher::specialize_vector_operand) > +#endif > } else { > specialize_mach_node(m); > } > > But `Matcher::do_postselect_cleanup` was still not able to be called. > >> Current patch enables this support only for x86 target, to get a > feedback from community. > > Then how to port Generic vector operands for other targets? > > Thanks, > > Leslie Zhai > > > On 2019-08-22 14:49, Bhateja, Jatin wrote: >> Hi All, >> >> Please find below a patch for generic vector operands[1] support >> during instruction selection. >> >> Motivation behind the patch is to reduce the number of vector >> selection patterns whose operands meagerly differ in vector lengths. >> This will not only result in lesser code being generated by ADLC which >> effectively translates to size reduction in libjvm.so but also >> help in better maintenance of AD files. >> >> Using generic operands we were able to collapse multiple vector >> patterns over mainline >> Initial number of vector instruction patterns >> (vec[XYZSD] + legVec[ZXYSD] : 510 >> Reduced vector instruction patterns (vecG + >> legVecG) : 222 >> >> With this we could see around 1MB size reduction in libjvm.so. >> >> In order to have minimal impact over downstream compiler passes, a >> post-selection pass has been introduced (currently enabled only for >> X86 target) >> which replaces these generic operands with their corresponding >> concreter vector length variants. >> >> JBS : https://bugs.openjdk.java.net/browse/JDK-8230015 >> Patch
: >> http://cr.openjdk.java.net/~jbhateja/genericVectorOperands/webrev.00/ >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin Bhateja >> >> [1] >> http://cr.openjdk.java.net/~jbhateja/genericVectorOperands/generic_operands_support_v1.0.pdf >> >> > From vlivanov at openjdk.java.net Wed Apr 14 17:17:40 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Wed, 14 Apr 2021 17:17:40 GMT Subject: RFR: 8264188: Improve handling of assembly files in the JDK [v2] In-Reply-To: References: Message-ID: On Mon, 29 Mar 2021 10:48:55 GMT, Magnus Ihse Bursie wrote: >> We have a handful of assembly files in the JDK. They have long been left aside, with a "if it ain't broken, don't fix it" attitude. >> >> In the current panama-vector, there are a lot more assembly files incoming, including for the Windows platform, which has not existed for a long time in the JDK. >> >> It is time to give assembly files some more love and care. This patch cleans up the handling in the build system, and it unifies between .s and .S files. >> >> For historical reasons, .s has been the suffix used in the posix world to signify assembly output as generated by a compiler, and .S to signify "hand-written" precious assembly. One effect of this is that gcc and clang will run the preprocessor on files named .S but not on files named .s. >> >> All our files are "hand-written" in this sense, and should have the .S suffix. But not all had. On mac, it was even worse, where the files were named .s but the option `-x assembler-with-cpp` was used to force clang to treat them as .S files instead... This change however made the preprocessor try to parse comments of the form >> >> # if (a) { >> >> as preprocessor directives, and balk at them. In one of the files, I had to wrap this in preprocessor comments (`/* ... */`). >> >> We also had inconsistent handling on dependencies.
For preprocessed assembly files, it really makes sense to have dependency tracking, exactly as for C/C++ files. Now the dependency tracking in NativeCompilation is simplified, and applies to all files. (The sole exception is Windows assembly, since masm is unable to output dependency information, even though it is able to include files :-(). >> >> This patch has been partly written by Sandhya Viswanathan for the panama-vector repo. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Use OPENJDK_BUILD_CPU_BITS instead Looks good! Thanks a lot for taking care of it. ------------- Marked as reviewed by vlivanov (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3198 From vlivanov at openjdk.java.net Wed Apr 14 17:25:37 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Wed, 14 Apr 2021 17:25:37 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v4] In-Reply-To: References: <1mjRZSrgWpn61bMsZ_hioYpmWFjGtySndhAmC0zkGu0=.9684321b-a412-4907-8367-c6071405655d@github.com> Message-ID: On Wed, 14 Apr 2021 01:53:15 GMT, Xiaohong Gong wrote: >> I agree. > > I'v added the type assertion in the constructor of the `"VectorMaskCastNode"`. See : > > assert(in_vt->length() == vt->length(), "vector length must match"); > assert(type2aelembytes(in_vt->element_basic_type()) == type2aelembytes(vt->element_basic_type()), "element size must match"); > > That's why I didn't add the check in predicate. So do you think it's enough? IMO AD instructions should also be accompanied either by predicates or asserts which stress the conditions for the no-op implementation. 
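The condition under discussion for the mask cast can be sketched as a small standalone check. This is only an illustration under stated assumptions: `VecShape` and `mask_cast_is_noop` are hypothetical names, not HotSpot types or functions, and the sketch mirrors just the two properties named in the quoted assertions (equal vector length and equal element size in bytes).

```cpp
#include <cassert>

// Hypothetical stand-in for a vector type: lane count plus lane size
// in bytes. Not a HotSpot type.
struct VecShape {
    int length;
    int elem_bytes;
};

// True when a mask cast between the two shapes can be treated as a
// no-op: same number of lanes and same lane size, so the mask bits
// already have the required layout.
bool mask_cast_is_noop(const VecShape& in, const VecShape& out) {
    return in.length == out.length && in.elem_bytes == out.elem_bytes;
}
```

For example, a 4-lane mask over 4-byte int lanes and a 4-lane mask over 4-byte float lanes satisfy the check, while a cast from 4-byte lanes to 8-byte lanes does not. Whether such a check belongs in the node constructor, in an AD predicate, or in both is the open question in the thread.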
------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From iignatyev at openjdk.java.net Wed Apr 14 17:26:38 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Wed, 14 Apr 2021 17:26:38 GMT Subject: RFR: 8262060: compiler/whitebox/BlockingCompilation.java timed out [v2] In-Reply-To: References: Message-ID: On Tue, 30 Mar 2021 08:24:09 GMT, Evgeny Nikitin wrote: >> Please review this small fix for a test. >> >> Test fails sometimes when run with UsageTracker enabled. For some reason, a loading of ThreadLocalRandom can happen during the test run, and this invalidates Random.nextInt method (because it's not the only implementation now). >> >> Fixed by pre-loading ThreadLocalRandom. Tested by multiple runs with UsageTracker enabled - approx. 1 out of 20-30 test runs fails without the fix and no failures spotted with the fix. > > Evgeny Nikitin has updated the pull request incrementally with one additional commit since the last revision: > > Get rid of dependency on the Random class LGTM ------------- Marked as reviewed by iignatyev (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3224 From kvn at openjdk.java.net Wed Apr 14 23:52:34 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 14 Apr 2021 23:52:34 GMT Subject: RFR: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines [v4] In-Reply-To: <_q9nVMOqIePMWqifBZo_Cj9Vsal63l2BLN8X7x5S8iw=.abdcc58f-877d-4e36-b26e-3699cb05eeda@github.com> References: <_q9nVMOqIePMWqifBZo_Cj9Vsal63l2BLN8X7x5S8iw=.abdcc58f-877d-4e36-b26e-3699cb05eeda@github.com> Message-ID: <7Z7WVmrAwRFOZLRr7WpgBhbfpuCXOwMHZQ7xPM50Vt0=.b5cee83a-1bd6-4838-92d5-e9e1057479a5@github.com> On Fri, 9 Apr 2021 05:08:37 GMT, David Holmes wrote: >> The existing JRT_ENTRY (and related) macros require the function to which they are applied to declare a parameter "JavaThread* thread" which represents the current thread. 
These functions are all implicitly "traps" functions as they can result in exceptions, but they are not declared with TRAPS because the only caller of these functions is the runtime itself (via call_VM) and no callers need to be aware to use CHECK; further they need a JavaThread. So the macro declares the THREAD variable for use with other exception-producing functions and assigns it from "thread". >> >> The majority of this change replaces the parameter name "thread" with "current" so that it is clear that we are always dealing with the current thread. This affects the entry functions as well as the functions called therefrom. >> >> We can then also replace the use of "THREAD" with "current", in contexts that are not related to exception processing. >> >> Some methods called by entry functions were declared to have both a "thread" parameter and a "TRAPS" parameter - with nothing to tell you these are always the same, current, thread. So the "thread" parameter is removed and replaced with a local variable "current" obtained from THREAD->as_Java_thread(). >> >> Some missing CHECK_ uses were added. >> >> Testing: >> - tiers 1-3 >> >> Thanks, >> David > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Fix search&replace mistake Why you kept `JavaThread *thread` parameter in some methods and renamed in others in deoptimization.* files. src/hotspot/share/c1/c1_Runtime1.cpp line 178: > 176: // Stress deoptimization > 177: static void deopt_caller(JavaThread* current) { > 178: if ( !caller_is_deopted(current)) { Can you remove space after `( `? 
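A toy sketch of the naming convention described in the quoted PR text, assuming a heavily simplified model. `TOY_JRT_ENTRY`, `TOY_JRT_END`, `may_throw` and `toy_entry` are hypothetical, and the real HotSpot macros do considerably more (thread state transitions, handle marks and so on); the sketch only shows the entry function naming its parameter `current` while the macro derives `THREAD` from it for exception-related code.

```cpp
#include <cassert>
#include <string>

// Hypothetical, heavily simplified stand-ins; not the real HotSpot types.
struct Thread { bool pending_exception = false; };
struct JavaThread : Thread { std::string name; };

// Toy entry macro: the function declares a parameter named `current`,
// and the macro derives THREAD from it for exception-related helpers.
#define TOY_JRT_ENTRY(result_type, header) \
  result_type header {                     \
    Thread* THREAD = current;

#define TOY_JRT_END }

// An exception-producing helper takes the generic THREAD (TRAPS-like) argument.
inline void may_throw(bool fail, Thread* THREAD) {
  if (fail) THREAD->pending_exception = true;
}

// Non-exception code uses `current`; exception plumbing uses THREAD.
TOY_JRT_ENTRY(int, toy_entry(JavaThread* current, bool fail))
  assert(current != nullptr);
  may_throw(fail, THREAD);
  if (THREAD->pending_exception) return -1;  // roughly the effect of a CHECK_ macro
  return static_cast<int>(current->name.size());
TOY_JRT_END
```

In this model a caller passes only the current thread, and nothing in the body needs a second `thread` parameter that is silently the same object as `THREAD`.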
------------- PR: https://git.openjdk.java.net/jdk/pull/3394 From kvn at openjdk.java.net Thu Apr 15 00:06:35 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 15 Apr 2021 00:06:35 GMT Subject: RFR: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines [v4] In-Reply-To: <_q9nVMOqIePMWqifBZo_Cj9Vsal63l2BLN8X7x5S8iw=.abdcc58f-877d-4e36-b26e-3699cb05eeda@github.com> References: <_q9nVMOqIePMWqifBZo_Cj9Vsal63l2BLN8X7x5S8iw=.abdcc58f-877d-4e36-b26e-3699cb05eeda@github.com> Message-ID: On Fri, 9 Apr 2021 05:08:37 GMT, David Holmes wrote: >> The existing JRT_ENTRY (and related) macros require the function to which they are applied to declare a parameter "JavaThread* thread" which represents the current thread. These functions are all implicitly "traps" functions as they can result in exceptions, but they are not declared with TRAPS because the only caller of these functions is the runtime itself (via call_VM) and no callers need to be aware to use CHECK; further they need a JavaThread. So the macro declares the THREAD variable for use with other exception-producing functions and assigns it from "thread". >> >> The majority of this change replaces the parameter name "thread" with "current" so that it is clear that we are always dealing with the current thread. This affects the entry functions as well as the functions called therefrom. >> >> We can then also replace the use of "THREAD" with "current", in contexts that are not related to exception processing. >> >> Some methods called by entry functions were declared to have both a "thread" parameter and a "TRAPS" parameter - with nothing to tell you these are always the same, current, thread. So the "thread" parameter is removed and replaced with a local variable "current" obtained from THREAD->as_Java_thread(). >> >> Some missing CHECK_ uses were added. 
>> >> Testing: >> - tiers 1-3 >> >> Thanks, >> David > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Fix search&replace mistake I also don't see changes in `src/hotspot/share/ci/`. Is code there okay? ------------- PR: https://git.openjdk.java.net/jdk/pull/3394 From david.holmes at oracle.com Thu Apr 15 00:16:26 2021 From: david.holmes at oracle.com (David Holmes) Date: Thu, 15 Apr 2021 10:16:26 +1000 Subject: RFR: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines [v4] In-Reply-To: <7Z7WVmrAwRFOZLRr7WpgBhbfpuCXOwMHZQ7xPM50Vt0=.b5cee83a-1bd6-4838-92d5-e9e1057479a5@github.com> References: <_q9nVMOqIePMWqifBZo_Cj9Vsal63l2BLN8X7x5S8iw=.abdcc58f-877d-4e36-b26e-3699cb05eeda@github.com> <7Z7WVmrAwRFOZLRr7WpgBhbfpuCXOwMHZQ7xPM50Vt0=.b5cee83a-1bd6-4838-92d5-e9e1057479a5@github.com> Message-ID: <3352b004-d6ba-6bdf-3ee6-663a12414f1c@oracle.com> Hi Vladimir, Thanks for looking at this. On 15/04/2021 9:52 am, Vladimir Kozlov wrote: > On Fri, 9 Apr 2021 05:08:37 GMT, David Holmes wrote: > >>> The existing JRT_ENTRY (and related) macros require the function to which they are applied to declare a parameter "JavaThread* thread" which represents the current thread. These functions are all implicitly "traps" functions as they can result in exceptions, but they are not declared with TRAPS because the only caller of these functions is the runtime itself (via call_VM) and no callers need to be aware to use CHECK; further they need a JavaThread. So the macro declares the THREAD variable for use with other exception-producing functions and assigns it from "thread". >>> >>> The majority of this change replaces the parameter name "thread" with "current" so that it is clear that we are always dealing with the current thread. This affects the entry functions as well as the functions called therefrom. 
>>> >>> We can then also replace the use of "THREAD" with "current", in contexts that are not related to exception processing. >>> >>> Some methods called by entry functions were declared to have both a "thread" parameter and a "TRAPS" parameter - with nothing to tell you these are always the same, current, thread. So the "thread" parameter is removed and replaced with a local variable "current" obtained from THREAD->as_Java_thread(). >>> >>> Some missing CHECK_ uses were added. >>> >>> Testing: >>> - tiers 1-3 >>> >>> Thanks, >>> David >> >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix search&replace mistake > > Why you kept `JavaThread *thread` parameter in some methods and renamed in others in deoptimization.* files. I only changed those parameters involved with JRT_* routines, as those can obviously be seen to be the current thread. For other code in that file it probably deals with the current thread, but that isn't necessarily obvious and would need more rigorous checking. That is beyond the scope of the current set of changes and would need a future RFE. Similarly for the lack of changes in the ci files - if they weren't affected by the JRT changes they weren't changed. > src/hotspot/share/c1/c1_Runtime1.cpp line 178: > >> 176: // Stress deoptimization >> 177: static void deopt_caller(JavaThread* current) { >> 178: if ( !caller_is_deopted(current)) { > > Can you remove space after `( `? Sure. 
:) Thanks, David ----- > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/3394 > From kvn at openjdk.java.net Thu Apr 15 01:08:36 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 15 Apr 2021 01:08:36 GMT Subject: RFR: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines [v4] In-Reply-To: <_q9nVMOqIePMWqifBZo_Cj9Vsal63l2BLN8X7x5S8iw=.abdcc58f-877d-4e36-b26e-3699cb05eeda@github.com> References: <_q9nVMOqIePMWqifBZo_Cj9Vsal63l2BLN8X7x5S8iw=.abdcc58f-877d-4e36-b26e-3699cb05eeda@github.com> Message-ID: On Fri, 9 Apr 2021 05:08:37 GMT, David Holmes wrote: >> The existing JRT_ENTRY (and related) macros require the function to which they are applied to declare a parameter "JavaThread* thread" which represents the current thread. These functions are all implicitly "traps" functions as they can result in exceptions, but they are not declared with TRAPS because the only caller of these functions is the runtime itself (via call_VM) and no callers need to be aware to use CHECK; further they need a JavaThread. So the macro declares the THREAD variable for use with other exception-producing functions and assigns it from "thread". >> >> The majority of this change replaces the parameter name "thread" with "current" so that it is clear that we are always dealing with the current thread. This affects the entry functions as well as the functions called therefrom. >> >> We can then also replace the use of "THREAD" with "current", in contexts that are not related to exception processing. >> >> Some methods called by entry functions were declared to have both a "thread" parameter and a "TRAPS" parameter - with nothing to tell you these are always the same, current, thread. So the "thread" parameter is removed and replaced with a local variable "current" obtained from THREAD->as_Java_thread(). >> >> Some missing CHECK_ uses were added. 
>> >> Testing: >> - tiers 1-3 >> >> Thanks, >> David > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Fix search&replace mistake Okay. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3394 From dholmes at openjdk.java.net Thu Apr 15 02:20:09 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 15 Apr 2021 02:20:09 GMT Subject: RFR: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines [v5] In-Reply-To: References: Message-ID: > The existing JRT_ENTRY (and related) macros require the function to which they are applied to declare a parameter "JavaThread* thread" which represents the current thread. These functions are all implicitly "traps" functions as they can result in exceptions, but they are not declared with TRAPS because the only caller of these functions is the runtime itself (via call_VM) and no callers need to be aware to use CHECK; further they need a JavaThread. So the macro declares the THREAD variable for use with other exception-producing functions and assigns it from "thread". > > The majority of this change replaces the parameter name "thread" with "current" so that it is clear that we are always dealing with the current thread. This affects the entry functions as well as the functions called therefrom. > > We can then also replace the use of "THREAD" with "current", in contexts that are not related to exception processing. > > Some methods called by entry functions were declared to have both a "thread" parameter and a "TRAPS" parameter - with nothing to tell you these are always the same, current, thread. So the "thread" parameter is removed and replaced with a local variable "current" obtained from THREAD->as_Java_thread(). > > Some missing CHECK_ uses were added. > > Testing: > - tiers 1-3 > > Thanks, > David David Holmes has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains ten additional commits since the last revision: - Merge branch 'master' into jrt_entry_v2 - Fix style nit for kvn - Merge branch 'master' into jrt_entry_v2 - Fix search&replace mistake - Fix minor nits reported by @coleenp and @hseigel - @coleenp review comment - Avoid manifesting JavaThread::current() when it can be passed in - Added in missed JRT_BLOCK_ENTRY methods in AOT-related jvmci/compilerRuntime.cpp - Fixed CHECK on return statement. - 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3394/files - new: https://git.openjdk.java.net/jdk/pull/3394/files/9cfc43c2..166c7fe5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3394&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3394&range=03-04 Stats: 17123 lines in 917 files changed: 10458 ins; 5286 del; 1379 mod Patch: https://git.openjdk.java.net/jdk/pull/3394.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3394/head:pull/3394 PR: https://git.openjdk.java.net/jdk/pull/3394 From dholmes at openjdk.java.net Thu Apr 15 02:25:35 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 15 Apr 2021 02:25:35 GMT Subject: Integrated: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines In-Reply-To: References: Message-ID: On Thu, 8 Apr 2021 06:52:15 GMT, David Holmes wrote: > The existing JRT_ENTRY (and related) macros require the function to which they are applied to declare a parameter "JavaThread* thread" which represents the current thread. These functions are all implicitly "traps" functions as they can result in exceptions, but they are not declared with TRAPS because the only caller of these functions is the runtime itself (via call_VM) and no callers need to be aware to use CHECK; further they need a JavaThread. 
So the macro declares the THREAD variable for use with other exception-producing functions and assigns it from "thread". > > The majority of this change replaces the parameter name "thread" with "current" so that it is clear that we are always dealing with the current thread. This affects the entry functions as well as the functions called therefrom. > > We can then also replace the use of "THREAD" with "current", in contexts that are not related to exception processing. > > Some methods called by entry functions were declared to have both a "thread" parameter and a "TRAPS" parameter - with nothing to tell you these are always the same, current, thread. So the "thread" parameter is removed and replaced with a local variable "current" obtained from THREAD->as_Java_thread(). > > Some missing CHECK_ uses were added. > > Testing: > - tiers 1-3 > > Thanks, > David This pull request has now been integrated. Changeset: 79bff21b Author: David Holmes URL: https://git.openjdk.java.net/jdk/commit/79bff21b Stats: 936 lines in 23 files changed: 16 ins; 29 del; 891 mod 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines Reviewed-by: coleenp, hseigel, iklam, kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/3394 From dholmes at openjdk.java.net Thu Apr 15 02:25:34 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 15 Apr 2021 02:25:34 GMT Subject: RFR: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines [v4] In-Reply-To: References: <_q9nVMOqIePMWqifBZo_Cj9Vsal63l2BLN8X7x5S8iw=.abdcc58f-877d-4e36-b26e-3699cb05eeda@github.com> Message-ID: On Thu, 15 Apr 2021 00:03:22 GMT, Vladimir Kozlov wrote: >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix search&replace mistake > > I also don't see changes in `src/hotspot/share/ci/`. Is code there okay? Thanks for the review @vnkozlov ! 
David

-------------

PR: https://git.openjdk.java.net/jdk/pull/3394

From whuang at openjdk.java.net  Thu Apr 15 03:04:44 2021
From: whuang at openjdk.java.net (Wang Huang)
Date: Thu, 15 Apr 2021 03:04:44 GMT
Subject: RFR: 8265244: assert(false) failed: bad AD file
Message-ID:

* aarch64 can only accept `VectorReinterpret` with 64/128 bits.
* I will fix this bug by adding a rule for `VectorReinterpret` in `match_rule_supported_vector`.
* After exchanging notes with @nsjian and @XiaohongGong, I think that the two checks in `inline_vector_convert` are useless now. However, these checks impact other CPUs, so I need more reviewers.

-------------

Commit messages:
 - 8265244: assert(false) failed: bad AD file

Changes: https://git.openjdk.java.net/jdk/pull/3507/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3507&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8265244
  Stats: 82 lines in 3 files changed: 80 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3507.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3507/head:pull/3507

PR: https://git.openjdk.java.net/jdk/pull/3507

From zhaixiang at loongson.cn  Thu Apr 15 03:57:05 2021
From: zhaixiang at loongson.cn (Leslie Zhai)
Date: Thu, 15 Apr 2021 11:57:05 +0800
Subject: 8230015: [instruction selector] generic vector operands support.
In-Reply-To: <17d6fd4c-196d-ea5b-f2a9-a2175e6ee652@oracle.com>
References: <17d6fd4c-196d-ea5b-f2a9-a2175e6ee652@oracle.com>
Message-ID: <15686c57-66ff-05a4-6f99-bd495ceb0a31@loongson.cn>

Hi Vladimir,

Thanks for your kind response!

On 2021-04-15 01:11, Vladimir Ivanov wrote:
> Hi Leslie,
>
> I'm not sure what you are trying to achieve with the port.

Some SPECjvm2008 benchmarks performed better after I ported the Vector API to YOURARCH, but libjvm.so became fatter. So I tried to port Generic Vector Operands to slim it down.

>
>
> Currently, there are 3 platform-specific methods which
> Matcher::do_postselect_cleanup() relies on:
>
> Matcher::pd_specialize_generic_vector_operand()
> Matcher::is_generic_reg2reg_move()
> Matcher::is_generic_vector()

There is no `vec` move to `legVec` or `legVec` to `vec`:

diff --git a/src/hotspot/cpu/yourarch/yourarch_64.ad b/src/hotspot/cpu/yourarch/yourarch_64.ad
index f3cdc560372..a2beb6bc776 100644
--- a/src/hotspot/cpu/yourarch/yourarch_64.ad
+++ b/src/hotspot/cpu/yourarch/yourarch_64.ad
@@ -1104,22 +1104,32 @@ const bool Matcher::require_postalloc_expand = false;
 // the cpu only look at the lower 5/6 bits anyway?
 const bool Matcher::need_masked_shift_count = false;

-// No support for generic vector operands.
-const bool Matcher::supports_generic_vector_operands = false;
+const bool Matcher::supports_generic_vector_operands = true;

 MachOper* Matcher::pd_specialize_generic_vector_operand(MachOper* generic_opnd, uint ideal_reg, bool is_temp) {
-  ShouldNotReachHere(); // generic vector operands not supported
+  assert(Matcher::is_generic_vector(generic_opnd), "not generic");
+  switch (ideal_reg) {
+    case Op_VecS: return new vecSOper();
+    case Op_VecD: return new vecDOper();
+    case Op_VecX: return new vecXOper();
+    case Op_VecY: return new vecYOper();
+  }
+  ShouldNotReachHere();
   return NULL;
 }

+// No vec move to legVec or legVec to vec
 bool Matcher::is_generic_reg2reg_move(MachNode* m) {
-  ShouldNotReachHere(); // generic vector operands not supported
   return false;
 }

 bool Matcher::is_generic_vector(MachOper* opnd) {
-  ShouldNotReachHere(); // generic vector operands not supported
-  return false;
+  switch (opnd->opcode()) {
+    case VEC:
+      return true;
+    default:
+      return false;
+  }
 }

 bool Matcher::supports_vector_variable_shifts(void) {
@@ -2745,6 +2755,19 @@ ins_attrib ins_alignment(4); // Required alignment attribute (must be a power

 // Vectors

+// Dummy generic vector class. Should be used for all vector operands.
+// Replaced with vec[SDXY] during post-selection pass.
+operand vec() %{
+  constraint(ALLOC_IN_RC(dynamic));
+  match(VecX);
+  match(VecY);
+  match(VecS);
+  match(VecD);
+
+  format %{ %}
+  interface(REG_INTER);
+%}
+
 operand vecS() %{
   constraint(ALLOC_IN_RC(vectors_reg));
   match(VecS);

Then, as a first step, I merged loadV4/8/16/32 into loadV and storeV4/8/16/32 into storeV, using the helper function `vector_length_in_bytes`, for LoadVector and StoreVector.

>
>
> It should be enough to provide implementations for those methods on
> YOURARCH to make Matcher::do_postselect_cleanup() working.

It is my fault: other vector nodes, for example `MulVD` and `ReplicateD`, also need to change `vecX/Y` to `vec` from the very beginning, not just LoadVector/StoreVector. Sorry, I was in too much of a hurry to see the optimization effect.

>
>
> Can you elaborate on what exactly you are trying to accomplish?

Better SPECjvm2008 benchmark performance while keeping libjvm.so slim :)

Thanks,
Leslie Zhai

>
>
> Best regards,
> Vladimir Ivanov
>
> On 14.04.2021 08:09, Leslie Zhai wrote:
>> Hi Jatin,
>>
>> Thanks for your great work!
>>
>> libjvm.so reduced +1MB will help SPECjvm2008 some benchmark +44%
>> speed up.
>>
>> Then I tried to port 8234391: C2: Generic vector operands for
>> `YOURARCH` about LoadVector and StoreVector at first.
>>
>> Only `Matcher::regmask_for_ideal_register` was able to call
>> `Matcher::is_generic_vector`, but compared with X86,
>> `Matcher::do_postselect_cleanup` was also able to call
>> `Matcher::is_generic_vector` too.
>>
>> I debug the `Matcher::do_postselect_cleanup`:
>>
>> diff --git a/src/hotspot/share/opto/matcher.cpp
>> b/src/hotspot/share/opto/matcher.cpp
>> index 0846cad3c3f..8fd644d2d93 100644
>> --- a/src/hotspot/share/opto/matcher.cpp
>> +++ b/src/hotspot/share/opto/matcher.cpp
>> @@ -309,6 +309,9 @@ void Matcher::match( ) {
>>      C->record_method_not_compilable("must be able to represent all
>> call arguments in reg mask");
>>    }
>>
>> +#ifdef YOURARCH64
>> +  do_postselect_cleanup();
>> +#endif
>>    if (C->failing())  return;  // bailed out on incoming arg failure
>>
>>    // ---------------
>> @@ -2630,8 +2633,10 @@ void
>> Matcher::specialize_generic_vector_operands() {
>>          int opnd_idx = m->operand_index(1);
>>          Node* def = m->in(opnd_idx);
>>          m->subsume_by(def, C);
>> +#if !defined(YOURARCH64)
>>        } else if (m->is_MachTemp()) {
>>          // process MachTemp nodes at use site (see
>> Matcher::specialize_vector_operand)
>> +#endif
>>        } else {
>>          specialize_mach_node(m);
>>        }
>>
>> But `Matcher::do_postselect_cleanup` was still not able to be called.
>>
>>> Current patch enables this support only for x86 target, to get a
>> feedback from community.
>>
>> Then how to port Generic vector operands for other targets?
>>
>> Thanks,
>>
>> Leslie Zhai
>>
>>
>> On 2019-08-22 14:49, Bhateja, Jatin wrote:
>>> Hi All,
>>>
>>> Please find below a patch for generic vector operands[1] support
>>> during instruction selection.
>>>
>>> Motivation behind the patch is to reduce the number of vector
>>> selection patterns whose operands meagerly differ in vector lengths.
>>> This will not only result in lesser code being generated by ADLC
>>> which effectively translates to size reduction in libjvm.so but also
>>> help in better maintenance of AD files.
>>> >>> Using generic operands we were able to collapse multiple vector >>> patterns over mainline >>> Initial number of vector instruction patterns >>> (vec[XYZSD] + legVec[ZXYSD] : 510 >>> Reduced vector instruction patterns (vecG + >>> legVecG) : 222 >>> >>> With this we could see around 1MB size reduction in libjvm.so. >>> >>> In order to have minimal impact over downstream compiler passes, a >>> post-selection pass has been introduced (currently enabled only for >>> X86 target) >>> which replaces these generic operands with their corresponding >>> concreter vector length variants. >>> >>> JBS : https://bugs.openjdk.java.net/browse/JDK-8230015 >>> Patch : >>> http://cr.openjdk.java.net/~jbhateja/genericVectorOperands/webrev.00/ >>> >>> Kindly review and share your feedback. >>> >>> Best Regards, >>> Jatin Bhateja >>> >>> [1] >>> http://cr.openjdk.java.net/~jbhateja/genericVectorOperands/generic_operands_support_v1.0.pdf >>> >>> >> From xgong at openjdk.java.net Thu Apr 15 04:02:03 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Thu, 15 Apr 2021 04:02:03 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v5] In-Reply-To: References: Message-ID: > The Vector API defines different element types for floating point VectorMask. For example, the bitwise related APIs use "`long/int`", while data related APIs use "`double/float`". This makes some optimizations that based on the type checking cannot work well. > > For example, the VectorBox/Unbox elimination like `"VectorUnbox (VectorBox v) ==> v"` requires the types of output and > input are equal. Normally this is necessary. However, due to the different element type for floating point VectorMask with the same species, the VectorBox/Unbox pattern is optimized to: > > VectorLoadMask (VectorStoreMask vmask) > > Actually the types can be treated as the same one for such cases. 
And considering the vector mask representation is the same for > vectors with the same element size and vector length, it's safe to do the optimization: > > VectorLoadMask (VectorStoreMask vmask) ==> vmask > > The same issue exists for GVN which is based on the type of a node. Making the mask node's` hash()/cmp()` methods depends on the element size instead of the element type can fix it. Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: Add type restrict in the match rule predicate ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3238/files - new: https://git.openjdk.java.net/jdk/pull/3238/files/977787e4..8232bd96 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3238&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3238&range=03-04 Stats: 13 lines in 4 files changed: 8 ins; 0 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/3238.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3238/head:pull/3238 PR: https://git.openjdk.java.net/jdk/pull/3238 From xgong at openjdk.java.net Thu Apr 15 04:02:04 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Thu, 15 Apr 2021 04:02:04 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v4] In-Reply-To: References: Message-ID: On Tue, 13 Apr 2021 03:21:21 GMT, Xiaohong Gong wrote: >> The Vector API defines different element types for floating point VectorMask. For example, the bitwise related APIs use "`long/int`", while data related APIs use "`double/float`". This makes some optimizations that based on the type checking cannot work well. >> >> For example, the VectorBox/Unbox elimination like `"VectorUnbox (VectorBox v) ==> v"` requires the types of output and >> input are equal. Normally this is necessary. 
However, due to the different element type for floating point VectorMask with the same species, the VectorBox/Unbox pattern is optimized to: >> >> VectorLoadMask (VectorStoreMask vmask) >> >> Actually the types can be treated as the same one for such cases. And considering the vector mask representation is the same for >> vectors with the same element size and vector length, it's safe to do the optimization: >> >> VectorLoadMask (VectorStoreMask vmask) ==> vmask >> >> The same issue exists for GVN which is based on the type of a node. Making the mask node's` hash()/cmp()` methods depends on the element size instead of the element type can fix it. > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Use "Matcher::match_rule_supported_vector" for vector nodes checking Hi @jatin-bhateja @iwanowww , all your comments have been addressed. Could you please take a look at it again? Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From xgong at openjdk.java.net Thu Apr 15 04:02:04 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Thu, 15 Apr 2021 04:02:04 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v4] In-Reply-To: References: <1mjRZSrgWpn61bMsZ_hioYpmWFjGtySndhAmC0zkGu0=.9684321b-a412-4907-8367-c6071405655d@github.com> Message-ID: On Wed, 14 Apr 2021 17:22:41 GMT, Vladimir Ivanov wrote: >> I'v added the type assertion in the constructor of the `"VectorMaskCastNode"`. See : >> >> assert(in_vt->length() == vt->length(), "vector length must match"); >> assert(type2aelembytes(in_vt->element_basic_type()) == type2aelembytes(vt->element_basic_type()), "element size must match"); >> >> That's why I didn't add the check in predicate. So do you think it's enough? > > IMO AD instructions should also be accompanied either by predicates or asserts which stress the conditions for the no-op implementation. OK, thanks! 
I will add the type restriction to the predicate.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3238

From jatin.bhateja at intel.com  Thu Apr 15 06:48:38 2021
From: jatin.bhateja at intel.com (Bhateja, Jatin)
Date: Thu, 15 Apr 2021 06:48:38 +0000
Subject: 8230015: [instruction selector] generic vector operands support.
In-Reply-To: <17d6fd4c-196d-ea5b-f2a9-a2175e6ee652@oracle.com>
References: <17d6fd4c-196d-ea5b-f2a9-a2175e6ee652@oracle.com>
Message-ID:

Hi Vladimir,

Thanks for responding on my behalf.

Best Regards,
Jatin

> -----Original Message-----
> From: Vladimir Ivanov
> Sent: Wednesday, April 14, 2021 10:41 PM
> To: Leslie Zhai
> Cc: Bhateja, Jatin ; hotspot-compiler-
> dev at openjdk.java.net
> Subject: Re: 8230015: [instruction selector] generic vector operands
> support.
>
> Hi Leslie,
>
> I'm not sure what you are trying to achieve with the port.
>
> Currently, there are 3 platform-specific methods which
> Matcher::do_postselect_cleanup() relies on:
>
> Matcher::pd_specialize_generic_vector_operand()
> Matcher::is_generic_reg2reg_move()
> Matcher::is_generic_vector()
>
> It should be enough to provide implementations for those methods on
> YOURARCH to make Matcher::do_postselect_cleanup() working.
>
> Can you elaborate on what exactly you are trying to accomplish?
>
> Best regards,
> Vladimir Ivanov
>
> On 14.04.2021 08:09, Leslie Zhai wrote:
> > Hi Jatin,
> >
> > Thanks for your great work!
> >
> > libjvm.so reduced +1MB will help SPECjvm2008 some benchmark +44% speed
> up.
> >
> > Then I tried to port 8234391: C2: Generic vector operands for
> > `YOURARCH` about LoadVector and StoreVector at first.
> >
> > Only `Matcher::regmask_for_ideal_register` was able to call
> > `Matcher::is_generic_vector`, but compared with X86,
> > `Matcher::do_postselect_cleanup` was also able to call
> > `Matcher::is_generic_vector` too.
> >
> > I debug the `Matcher::do_postselect_cleanup`:
> >
> > diff --git a/src/hotspot/share/opto/matcher.cpp
> > b/src/hotspot/share/opto/matcher.cpp
> > index 0846cad3c3f..8fd644d2d93 100644
> > --- a/src/hotspot/share/opto/matcher.cpp
> > +++ b/src/hotspot/share/opto/matcher.cpp
> > @@ -309,6 +309,9 @@ void Matcher::match( ) {
> >      C->record_method_not_compilable("must be able to represent all
> > call arguments in reg mask");
> >    }
> >
> > +#ifdef YOURARCH64
> > +  do_postselect_cleanup();
> > +#endif
> >    if (C->failing())  return;  // bailed out on incoming arg failure
> >
> >    // ---------------
> > @@ -2630,8 +2633,10 @@ void
> > Matcher::specialize_generic_vector_operands() {
> >          int opnd_idx = m->operand_index(1);
> >          Node* def = m->in(opnd_idx);
> >          m->subsume_by(def, C);
> > +#if !defined(YOURARCH64)
> >        } else if (m->is_MachTemp()) {
> >          // process MachTemp nodes at use site (see
> > Matcher::specialize_vector_operand)
> > +#endif
> >        } else {
> >          specialize_mach_node(m);
> >        }
> >
> > But `Matcher::do_postselect_cleanup` was still not able to be called.
> >
> >> Current patch enables this support only for x86 target, to get a
> > feedback from community.
> >
> > Then how to port Generic vector operands for other targets?
> >
> > Thanks,
> >
> > Leslie Zhai
> >
> >
> > On 2019-08-22 14:49, Bhateja, Jatin wrote:
> >> Hi All,
> >>
> >> Please find below a patch for generic vector operands[1] support
> >> during instruction selection.
> >>
> >> Motivation behind the patch is to reduce the number of vector
> >> selection patterns whose operands meagerly differ in vector lengths.
> >> This will not only result in lesser code being generated by ADLC
> >> which effectively translates to size reduction in libjvm.so but also
> >> help in better maintenance of AD files.
> >>
> >> Using generic operands we were able to collapse multiple vector
> >> patterns over mainline
> >>               Initial number of vector instruction patterns
> >> (vec[XYZSD] + legVec[ZXYSD]   :  510
> >>               Reduced vector instruction patterns  (vecG +
> >> legVecG)                      :  222
> >>
> >> With this we could see around 1MB size reduction in libjvm.so.
> >>
> >> In order to have minimal impact over downstream compiler passes, a
> >> post-selection pass has been introduced (currently enabled only for
> >> X86 target)
> >> which replaces these generic operands with their corresponding
> >> concreter vector length variants.
> >>
> >> JBS   : https://bugs.openjdk.java.net/browse/JDK-8230015
> >> Patch :
> >> http://cr.openjdk.java.net/~jbhateja/genericVectorOperands/webrev.00/
> >>
> >> Kindly review and share your feedback.
> >>
> >> Best Regards,
> >> Jatin Bhateja
> >>
> >> [1]
> >> http://cr.openjdk.java.net/~jbhateja/genericVectorOperands/generic_operands_support_v1.0.pdf
> >>
> >>
> >

From lutz.schmidt at sap.com  Thu Apr 15 07:46:08 2021
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Thu, 15 Apr 2021 07:46:08 +0000
Subject: [11u] RFR(S): 8250635 backport: MethodArityHistogram should use Compile_lock in favour of fancy checks
In-Reply-To:
References: <18C76401-DE61-4B8F-821B-9136B3B00E20@sap.com> <0D8921EA-F9CA-493F-AA7F-A0BFBF992B21@sap.com> <40507412-5F59-4623-BA39-728CA14F9C64@sap.com>
Message-ID: <9967BB60-EF86-4EC7-9F56-B8E3E7A5D02D@sap.com>

Tests did not reveal any issues. From this end, the downport is ready to be pushed. The bug was updated accordingly.

Thanks,
Lutz

On 14.04.21, 11:21, "Doerr, Martin" wrote:

Yes. Just let me know when it's ready.

Best regards,
Martin

> -----Original Message-----
> From: Schmidt, Lutz
> Sent: Wednesday, 14.
April 2021 11:18 > To: Doerr, Martin ; hotspot-compiler- > dev at openjdk.java.net; jdk-updates-dev at openjdk.java.net; Hohensee, > Paul > Subject: Re: [11u] RFR(S): 8250635 backport: MethodArityHistogram should > use Compile_lock in favour of fancy checks > > Thank you, Martin and Paul, for your reviews. > > Submit will have to wait another day. I was too late for testing last night. > > I will need a sponsor anyway. Is it ok to contact you, Martin, once the tests > are done successfully? > > Thanks, > Lutz > > On 14.04.21, 11:06, "Doerr, Martin" wrote: > > Hi Lutz, > > your backport looks good. > > Best regards, > Martin > > > > -----Original Message----- > > From: jdk-updates-dev On > > Behalf Of Schmidt, Lutz > > Sent: Dienstag, 13. April 2021 22:02 > > To: hotspot-compiler-dev at openjdk.java.net; jdk-updates- > > dev at openjdk.java.net > > Subject: Re: [11u] RFR(S): 8250635 backport: MethodArityHistogram > should > > use Compile_lock in favour of fancy checks > > > > Sorry for spamming, forgot jdk-updates-dev. > > Lutz > > > > On 13.04.21, 21:43, "Schmidt, Lutz" wrote: > > > > Dear Community, > > > > I would appreciate receiving reviews for this downport change. It is a > small > > change in one file only. Unfortunately, it did not apply clean. > > > > Original bug: https://bugs.openjdk.java.net/browse/JDK-8250635 > > Downport webrev: > > https://cr.openjdk.java.net/~lucy/webrevs/8250635.11u.01/ > > > > Tests: > > SAP's internal build and test farm. Results pending. > > > > Thank you! > > Lutz > > > > > From rbackman at openjdk.java.net Thu Apr 15 07:51:35 2021 From: rbackman at openjdk.java.net (Rickard =?UTF-8?B?QsOkY2ttYW4=?=) Date: Thu, 15 Apr 2021 07:51:35 GMT Subject: Integrated: 8260255: C1: LoopInvariantCodeMotion constructor can leave some fields uninitialized In-Reply-To: References: Message-ID: On Wed, 14 Apr 2021 08:27:16 GMT, Rickard B?ckman wrote: > Initialize instance variables to default values to avoid uninitialized values for early return. 
This pull request has now been integrated.

Changeset: 0793fcbb
Author:    Rickard Bäckman
URL:       https://git.openjdk.java.net/jdk/commit/0793fcbb
Stats:     1 line in 1 file changed: 0 ins; 0 del; 1 mod

8260255: C1: LoopInvariantCodeMotion constructor can leave some fields uninitialized

Reviewed-by: neliasso, thartmann

-------------

PR: https://git.openjdk.java.net/jdk/pull/3484

From jbhateja at openjdk.java.net Thu Apr 15 09:56:35 2021
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Thu, 15 Apr 2021 09:56:35 GMT
Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v4]
In-Reply-To:
References:
Message-ID:

On Thu, 15 Apr 2021 03:58:26 GMT, Xiaohong Gong wrote:

> Hi @jatin-bhateja @iwanowww, all your comments have been addressed. Could you please take a look at it again? Thanks!

Thanks for addressing it. As Vladimir suggested, in the long term we can just emit this node instead of the VectorStoreMask + VectorLoadMask combination when mask types are non-conformal, to emit an efficient instruction sequence using input mask packing/unpacking.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3238

From chagedorn at openjdk.java.net Thu Apr 15 10:15:51 2021
From: chagedorn at openjdk.java.net (Christian Hagedorn)
Date: Thu, 15 Apr 2021 10:15:51 GMT
Subject: RFR: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests
Message-ID: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com>

This RFE provides an IR test framework to perform regex-based checks on the C2 IR shape of test methods emitted by the VM flags `-XX:+PrintIdeal` and `-XX:+PrintOptoAssembly`. The framework can also be used for other non-IR matching (and non-compiler) tests by providing easy to use annotations for commonly used testing patterns and compiler control flags.

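As a rough illustration of what a regex-based check on IR output amounts to, here is a minimal Java sketch. It is not the framework's API: the class `IrRegexSketch`, the method `countNodes`, and the constant `SAMPLE_IR` are hypothetical names, and the sample lines are hand-written to resemble `-XX:+PrintIdeal` output rather than captured from a real VM run.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IrRegexSketch {
    // Hand-written stand-in for a fragment of -XX:+PrintIdeal output (assumption, not real VM output).
    static final String SAMPLE_IR = String.join("\n",
        " 21  ConI  === 0  [[ 23 ]]  #int:1",
        " 23  AddI  === _ 10 21  [[ 24 ]]",
        " 24  Return  === 5 6 7 8 9 returns 23  [[ 0 ]]");

    // Counts lines of the shape "<index>  <nodeName>  === ..." in the given dump text.
    static int countNodes(String irDump, String nodeName) {
        Pattern p = Pattern.compile("(?m)^\\s*\\d+\\s+" + Pattern.quote(nodeName) + "\\s+===");
        Matcher m = p.matcher(irDump);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // A test could then assert, for example, that an AddI is present and no MulI is.
        System.out.println("AddI nodes: " + countNodes(SAMPLE_IR, "AddI"));
        System.out.println("MulI nodes: " + countNodes(SAMPLE_IR, "MulI"));
    }
}
```

The node name is passed through `Pattern.quote` so that it is matched literally; a real check would more likely accept arbitrary regexes and run against the VM's actual output.
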
The framework is based on the ideas of the currently present IR test framework in [Valhalla](https://github.com/openjdk/valhalla/blob/e9c78ce4fcfd01361c35883e0d68f9ae5a80d079/test/hotspot/jtreg/compiler/valhalla/inlinetypes/InlineTypeTest.java) (mainly implemented by @TobiHartmann) which is being used with great success. This new framework aims to replace the old one in Valhalla at some point. A detailed description about how this new IR test framework works and how it is used is provided in the [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) file and in the [Javadocs](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc/jdk/test/lib/hotspot/ir_framework/package-summary.html) written for the framework classes. To finish a first version of this framework for JDK 17, I decided to leave some improvement possibilities and ideas to be followed up on in additional RFEs. Some ideas are mentioned in "Future Work" in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) and were also created as subtasks of this RFE. Testing (also described in "Internal Framework Tests in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md)): There are various tests to verify the correctness of the test framework which can be found as JTreg tests in the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) folder. Additional testing was performed by converting all compiler Inline Types test of project Valhalla (done by @katyapav in [JDK-8263024](https://bugs.openjdk.java.net/browse/JDK-8263024)) that used the old framework to the new framework. 
This provided additional testing for the framework itself. We ran the converted tests with all the flag settings used in hs-tier1-9 and hs-precheckin-comp. For sanity checking, this was also done with a sample IR test in mainline. Some stats about the framework code added to [ir_framework](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework): - without the [Javadocs files](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc) : 60 changed files, 13212 insertions, 0 deletions. - without the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) and [examples](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/examples) folder: 40 files changed, 6781 insertions - comments: 2399 insertions (calculated with `git diff --cached !(tests|examples) | grep -c -E "(^[+-]\s*(/)?*)|(^[+-]\s*//)"`) - which leaves 4382 lines of code inserted Big thanks to: - @TobiHartmann for all his help by discussing the new framework and for providing insights from his IR test framework in Valhalla. - @katyapav for converting the Valhalla tests to use the new framework which found some harder to catch bugs in the framework and also some actual C2 bugs. - @iignatev for helping to simplify the framework usage with JTreg and with the framework internal VM calling structure. - and others who provided valuable feedback. 
Thanks, Christian ------------- Commit messages: - Fix comment - Remove accidentally added print statement - Fix whitespaces + minor fix - Fix whitespaces - Add Javadoc files - 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests Changes: https://git.openjdk.java.net/jdk/pull/3508/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3508&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254129 Stats: 24497 lines in 144 files changed: 24497 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3508.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3508/head:pull/3508 PR: https://git.openjdk.java.net/jdk/pull/3508 From vlivanov at openjdk.java.net Thu Apr 15 11:06:36 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 15 Apr 2021 11:06:36 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v5] In-Reply-To: References: Message-ID: <65k0L3mzv75D0GxrUlzwB0e3_wts1XA3nzGxpPVbKRI=.2152c65f-6ece-4856-a1ef-d040c4d84d66@github.com> On Thu, 15 Apr 2021 04:02:03 GMT, Xiaohong Gong wrote: >> The Vector API defines different element types for floating point VectorMask. For example, the bitwise related APIs use "`long/int`", while data related APIs use "`double/float`". This makes some optimizations that based on the type checking cannot work well. >> >> For example, the VectorBox/Unbox elimination like `"VectorUnbox (VectorBox v) ==> v"` requires the types of output and >> input are equal. Normally this is necessary. However, due to the different element type for floating point VectorMask with the same species, the VectorBox/Unbox pattern is optimized to: >> >> VectorLoadMask (VectorStoreMask vmask) >> >> Actually the types can be treated as the same one for such cases. 
>> And considering the vector mask representation is the same for
>> vectors with the same element size and vector length, it's safe to do the optimization:
>>
>>   VectorLoadMask (VectorStoreMask vmask) ==> vmask
>>
>> The same issue exists for GVN, which is based on the type of a node. Making the mask node's `hash()/cmp()` methods depend on the element size instead of the element type can fix it.
>
> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision:
>
>   Add type restrict in the match rule predicate

Looks good.

-------------

Marked as reviewed by vlivanov (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3238

From zhaixiang at loongson.cn Thu Apr 15 11:06:42 2021
From: zhaixiang at loongson.cn (Leslie Zhai)
Date: Thu, 15 Apr 2021 19:06:42 +0800
Subject: 8230015: [instruction selector] generic vector operands support.
In-Reply-To:
References: <17d6fd4c-196d-ea5b-f2a9-a2175e6ee652@oracle.com>
Message-ID: <2ae8b6a9-beef-c0ee-8733-31c2cbd06606@loongson.cn>

Hi Jatin,

Thanks again for your great work!

Before vs After ported

16137992 vs 15803384

Cheers,

Leslie Zhai

On 2021-04-15 14:48, Bhateja, Jatin wrote:
> Hi Vladimir,
> Thanks for responding on my behalf.
>
> Best Regards,
> Jatin
>
>> -----Original Message-----
>> From: Vladimir Ivanov
>> Sent: Wednesday, April 14, 2021 10:41 PM
>> To: Leslie Zhai
>> Cc: Bhateja, Jatin; hotspot-compiler-dev at openjdk.java.net
>> Subject: Re: 8230015: [instruction selector] generic vector operands support.
>>
>> Hi Leslie,
>>
>> I'm not sure what you are trying to achieve with the port.
>>
>> Currently, there are 3 platform-specific methods which
>> Matcher::do_postselect_cleanup() relies on:
>>
>>    Matcher::pd_specialize_generic_vector_operand()
>>    Matcher::is_generic_reg2reg_move()
>>    Matcher::is_generic_vector()
>>
>> It should be enough to provide implementations for those methods on
>> YOURARCH to make Matcher::do_postselect_cleanup() work.
>>
>> Can you elaborate on what exactly you are trying to accomplish?
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> On 14.04.2021 08:09, Leslie Zhai wrote:
>>> Hi Jatin,
>>>
>>> Thanks for your great work!
>>>
>>> A libjvm.so reduced by +1MB will help some SPECjvm2008 benchmarks gain a +44% speedup.
>>>
>>> Then I tried to port 8234391 (C2: Generic vector operands) to
>>> `YOURARCH`, starting with LoadVector and StoreVector.
>>>
>>> Only `Matcher::regmask_for_ideal_register` was able to call
>>> `Matcher::is_generic_vector`, whereas on X86
>>> `Matcher::do_postselect_cleanup` calls
>>> `Matcher::is_generic_vector` too.
>>>
>>> I debugged `Matcher::do_postselect_cleanup`:
>>>
>>> diff --git a/src/hotspot/share/opto/matcher.cpp b/src/hotspot/share/opto/matcher.cpp
>>> index 0846cad3c3f..8fd644d2d93 100644
>>> --- a/src/hotspot/share/opto/matcher.cpp
>>> +++ b/src/hotspot/share/opto/matcher.cpp
>>> @@ -309,6 +309,9 @@ void Matcher::match( ) {
>>>      C->record_method_not_compilable("must be able to represent all call arguments in reg mask");
>>>    }
>>>
>>> +#ifdef YOURARCH64
>>> +  do_postselect_cleanup();
>>> +#endif
>>>    if (C->failing())  return;  // bailed out on incoming arg failure
>>>
>>>    // ---------------
>>> @@ -2630,8 +2633,10 @@ void Matcher::specialize_generic_vector_operands() {
>>>          int opnd_idx = m->operand_index(1);
>>>          Node* def = m->in(opnd_idx);
>>>          m->subsume_by(def, C);
>>> +#if !defined(YOURARCH64)
>>>        } else if (m->is_MachTemp()) {
>>>          // process MachTemp nodes at use site (see Matcher::specialize_vector_operand)
>>> +#endif
>>>        } else {
>>>          specialize_mach_node(m);
>>>        }
>>>
>>> But `Matcher::do_postselect_cleanup` was still not called.
>>>
>>>> Current patch enables this support only for x86 target, to get a feedback from community.
>>>
>>> Then how should generic vector operands be ported to other targets?
>>>
>>> Thanks,
>>>
>>> Leslie Zhai
>>>
>>>
>>> On 2019-08-22 14:49, Bhateja, Jatin wrote:
>>>> Hi All,
>>>>
>>>> Please find below a patch for generic vector operands[1] support
>>>> during instruction selection.
>>>>
>>>> Motivation behind the patch is to reduce the number of vector
>>>> selection patterns whose operands differ only in vector length.
>>>> This will not only result in less code being generated by ADLC,
>>>> which effectively translates to a size reduction in libjvm.so, but also
>>>> help in better maintenance of AD files.
>>>>
>>>> Using generic operands we were able to collapse multiple vector
>>>> patterns over mainline
>>>>     Initial number of vector instruction patterns (vec[XYZSD] + legVec[ZXYSD]) : 510
>>>>     Reduced vector instruction patterns (vecG + legVecG)                       : 222
>>>>
>>>> With this we could see around 1MB size reduction in libjvm.so.
>>>>
>>>> In order to have minimal impact over downstream compiler passes, a
>>>> post-selection pass has been introduced (currently enabled only for
>>>> X86 target)
>>>> which replaces these generic operands with their corresponding
>>>> concrete vector length variants.
>>>>
>>>> JBS   : https://bugs.openjdk.java.net/browse/JDK-8230015
>>>> Patch : http://cr.openjdk.java.net/~jbhateja/genericVectorOperands/webrev.00/
>>>>
>>>> Kindly review and share your feedback.
>>>>
>>>> Best Regards,
>>>> Jatin Bhateja
>>>>
>>>> [1] http://cr.openjdk.java.net/~jbhateja/genericVectorOperands/generic_operands_support_v1.0.pdf
>>>>
>>>>

From vladimir.x.ivanov at oracle.com Thu Apr 15 11:09:30 2021
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 15 Apr 2021 14:09:30 +0300
Subject: [External] : Re: 8230015: [instruction selector] generic vector operands support.
In-Reply-To: <15686c57-66ff-05a4-6f99-bd495ceb0a31@loongson.cn>
References: <17d6fd4c-196d-ea5b-f2a9-a2175e6ee652@oracle.com> <15686c57-66ff-05a4-6f99-bd495ceb0a31@loongson.cn>
Message-ID:

>> It should be enough to provide implementations for those methods on
>> YOURARCH to make Matcher::do_postselect_cleanup() work.
>
> It is my fault: other VectorNodes, for example `MulVD` and `ReplicateD`,
> also need to change `vecX/Y` to `vec` from the very beginning, not
> just LoadVector/StoreVector. Sorry, I was in too much of a hurry to see the
> optimization effect.

Yes, the downside is that there's no implicit operand conversion supported between concrete and generic vector operands. You have to migrate everything at once.

Best regards,
Vladimir Ivanov

>>
>> Can you elaborate on what exactly you are trying to accomplish?
>
> Better SPECjvm2008 benchmark performance while keeping libjvm.so
> slim :)
>
> Thanks,
> Leslie Zhai
>
>>
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> On 14.04.2021 08:09, Leslie Zhai wrote:
>>> Hi Jatin,
>>>
>>> Thanks for your great work!
>>>
>>> A libjvm.so reduced by +1MB will help some SPECjvm2008 benchmarks gain a +44% speedup.
>>>
>>> Then I tried to port 8234391 (C2: Generic vector operands) to
>>> `YOURARCH`, starting with LoadVector and StoreVector.
>>>
>>> Only `Matcher::regmask_for_ideal_register` was able to call
>>> `Matcher::is_generic_vector`, whereas on X86
>>> `Matcher::do_postselect_cleanup` calls
>>> `Matcher::is_generic_vector` too.
>>>
>>> I debugged `Matcher::do_postselect_cleanup`:
>>>
>>> diff --git a/src/hotspot/share/opto/matcher.cpp b/src/hotspot/share/opto/matcher.cpp
>>> index 0846cad3c3f..8fd644d2d93 100644
>>> --- a/src/hotspot/share/opto/matcher.cpp
>>> +++ b/src/hotspot/share/opto/matcher.cpp
>>> @@ -309,6 +309,9 @@ void Matcher::match( ) {
>>>      C->record_method_not_compilable("must be able to represent all call arguments in reg mask");
>>>    }
>>>
>>> +#ifdef YOURARCH64
>>> +  do_postselect_cleanup();
>>> +#endif
>>>    if (C->failing())  return;  // bailed out on incoming arg failure
>>>
>>>    // ---------------
>>> @@ -2630,8 +2633,10 @@ void Matcher::specialize_generic_vector_operands() {
>>>          int opnd_idx = m->operand_index(1);
>>>          Node* def = m->in(opnd_idx);
>>>          m->subsume_by(def, C);
>>> +#if !defined(YOURARCH64)
>>>        } else if (m->is_MachTemp()) {
>>>          // process MachTemp nodes at use site (see Matcher::specialize_vector_operand)
>>> +#endif
>>>        } else {
>>>          specialize_mach_node(m);
>>>        }
>>>
>>> But `Matcher::do_postselect_cleanup` was still not called.
>>>
>>>> Current patch enables this support only for x86 target, to get a feedback from community.
>>>
>>> Then how should generic vector operands be ported to other targets?
>>>
>>> Thanks,
>>>
>>> Leslie Zhai
>>>
>>>
>>> On 2019-08-22 14:49, Bhateja, Jatin wrote:
>>>> Hi All,
>>>>
>>>> Please find below a patch for generic vector operands[1] support
>>>> during instruction selection.
>>>>
>>>> Motivation behind the patch is to reduce the number of vector
>>>> selection patterns whose operands differ only in vector length.
>>>> This will not only result in less code being generated by ADLC,
>>>> which effectively translates to a size reduction in libjvm.so, but also
>>>> help in better maintenance of AD files.
>>>>
>>>> Using generic operands we were able to collapse multiple vector
>>>> patterns over mainline
>>>>     Initial number of vector instruction patterns (vec[XYZSD] + legVec[ZXYSD]) : 510
>>>>     Reduced vector instruction patterns (vecG + legVecG)                       : 222
>>>>
>>>> With this we could see around 1MB size reduction in libjvm.so.
>>>>
>>>> In order to have minimal impact over downstream compiler passes, a
>>>> post-selection pass has been introduced (currently enabled only for
>>>> X86 target)
>>>> which replaces these generic operands with their corresponding
>>>> concrete vector length variants.
>>>>
>>>> JBS   : https://bugs.openjdk.java.net/browse/JDK-8230015
>>>> Patch : http://cr.openjdk.java.net/~jbhateja/genericVectorOperands/webrev.00/
>>>>
>>>> Kindly review and share your feedback.
>>>>
>>>> Best Regards,
>>>> Jatin Bhateja
>>>>
>>>> [1] http://cr.openjdk.java.net/~jbhateja/genericVectorOperands/generic_operands_support_v1.0.pdf
>>>>
>>>>
>>>
>

From whuang at openjdk.java.net Thu Apr 15 11:46:43 2021
From: whuang at openjdk.java.net (Wang Huang)
Date: Thu, 15 Apr 2021 11:46:43 GMT
Subject: RFR: 8263006: Add optimization for Max(*)Node and Min(*)Node
Message-ID:

* I optimize `max` and `min` using these identities
  - op(max(a,b), min(a,b)) === op(a,b)
  - if op is commutative
  - examples:
    - max(a,b) + min(a,b) === a + b // op = add
    - max(a,b) * min(a,b) === a * b // op = mul
    - max(max(a,b), min(a,b)) === max(a,b) // op = max()
    - min(max(a,b), min(a,b)) === min(a,b) // op = min()
* Test case

```java
/*
 * Copyright (c) 2021, Huawei Technologies Co. Ltd. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */
package org.sample;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.*;
import java.util.Random;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.infra.Blackhole;

@BenchmarkMode({Mode.AverageTime})
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class MyBenchmark {
    static int length = 100000;
    static double[] data1 = new double[length];
    static double[] data2 = new double[length];
    static Random random = new Random();

    static {
        for (int i = 0; i < length; ++i) {
            data1[i] = random.nextDouble();
            data2[i] = random.nextDouble();
        }
    }

    @Benchmark
    public void testAdd(Blackhole bh) {
        double sum = 0;
        for (int i = 0; i < length; i++) {
            sum += Math.max(data1[i], data2[i]) + Math.min(data1[i], data2[i]);
        }
        bh.consume(sum);
    }

    @Benchmark
    public void testMax(Blackhole bh) {
        double sum = 0;
        for (int i = 0; i < length; i++) {
            sum += Math.max(Math.max(data1[i], data2[i]), Math.min(data1[i], data2[i]));
        }
        bh.consume(sum);
    }

    @Benchmark
    public void testMin(Blackhole bh) {
        double sum = 0;
        for (int i = 0; i < length; i++) {
            sum += Math.min(Math.max(data1[i], data2[i]), Math.min(data1[i], data2[i]));
        }
        bh.consume(sum);
    }

    @Benchmark
    public void testMul(Blackhole bh) {
        double sum = 0;
        for (int i = 0; i < length; i++) {
            sum += (Math.max(data1[i], data2[i]) * Math.min(data1[i], data2[i]));
        }
        bh.consume(sum);
    }
}
```

* The result is listed here:

before:

Benchmark                Mode  Samples    Score  Score error  Units
o.s.MyBenchmark.testAdd  avgt       10  556.048       32.368  us/op
o.s.MyBenchmark.testMax  avgt       10  543.065       54.221  us/op
o.s.MyBenchmark.testMin  avgt       10  570.731       37.630  us/op
o.s.MyBenchmark.testMul  avgt       10  531.906       20.518  us/op

after:

Benchmark                Mode  Samples    Score  Score error  Units
o.s.MyBenchmark.testAdd  avgt       10  319.350        9.248  us/op
o.s.MyBenchmark.testMax  avgt       10  356.138       10.736  us/op
o.s.MyBenchmark.testMin  avgt       10  323.731       16.621  us/op
o.s.MyBenchmark.testMul  avgt       10  338.458       23.755  us/op

-------------

Commit messages:
 - 8263006: Add optimization for Max(*)Node and Min(*)Node

Changes: https://git.openjdk.java.net/jdk/pull/3513/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3513&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8263006
  Stats: 117 lines in 4 files changed: 107 ins; 4 del; 6 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3513.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3513/head:pull/3513

PR: https://git.openjdk.java.net/jdk/pull/3513

From rehn at openjdk.java.net Thu Apr 15 13:02:52 2021
From: rehn at openjdk.java.net (Robbin Ehn)
Date: Thu, 15 Apr 2021 13:02:52 GMT
Subject: RFR: 8264480: Unreachable code in nmethod.cpp inside #ifdef DEBUG
Message-ID:

Fixed ifdef to use correct define.

Stress tested locally, running t1-5.

-------------

Commit messages:
 - Fixed ifdef

Changes: https://git.openjdk.java.net/jdk/pull/3517/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3517&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8264480
  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3517.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3517/head:pull/3517

PR: https://git.openjdk.java.net/jdk/pull/3517

From chagedorn at openjdk.java.net Thu Apr 15 13:15:34 2021
From: chagedorn at openjdk.java.net (Christian Hagedorn)
Date: Thu, 15 Apr 2021 13:15:34 GMT
Subject: RFR: 8264480: Unreachable code in nmethod.cpp inside #ifdef DEBUG
In-Reply-To:
References:
Message-ID:

On Thu, 15 Apr 2021 12:50:14 GMT, Robbin Ehn wrote:

> Fixed ifdef to use correct define.
>
> Stress tested locally, running t1-5.

Looks good.

------------- Marked as reviewed by chagedorn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3517 From chagedorn at openjdk.java.net Thu Apr 15 13:19:56 2021 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Thu, 15 Apr 2021 13:19:56 GMT Subject: RFR: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests [v2] In-Reply-To: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com> References: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com> Message-ID: > This RFE provides an IR test framework to perform regex-based checks on the C2 IR shape of test methods emitted by the VM flags `-XX:+PrintIdeal` and `-XX:+PrintOptoAssembly`. The framework can also be used for other non-IR matching (and non-compiler) tests by providing easy to use annotations for commonly used testing patterns and compiler control flags. > > The framework is based on the ideas of the currently present IR test framework in [Valhalla](https://github.com/openjdk/valhalla/blob/e9c78ce4fcfd01361c35883e0d68f9ae5a80d079/test/hotspot/jtreg/compiler/valhalla/inlinetypes/InlineTypeTest.java) (mainly implemented by @TobiHartmann) which is being used with great success. This new framework aims to replace the old one in Valhalla at some point. > > A detailed description about how this new IR test framework works and how it is used is provided in the [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) file and in the [Javadocs](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc/jdk/test/lib/hotspot/ir_framework/package-summary.html) written for the framework classes. 
> > To finish a first version of this framework for JDK 17, I decided to leave some improvement possibilities and ideas to be followed up on in additional RFEs. Some ideas are mentioned in "Future Work" in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) and were also created as subtasks of this RFE. > > Testing (also described in "Internal Framework Tests in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md)): > There are various tests to verify the correctness of the test framework which can be found as JTreg tests in the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) folder. Additional testing was performed by converting all compiler Inline Types test of project Valhalla (done by @katyapav in [JDK-8263024](https://bugs.openjdk.java.net/browse/JDK-8263024)) that used the old framework to the new framework. This provided additional testing for the framework itself. We ran the converted tests with all the flag settings used in hs-tier1-9 and hs-precheckin-comp. For sanity checking, this was also done with a sample IR test in mainline. > > Some stats about the framework code added to [ir_framework](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework): > > - without the [Javadocs files](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc) : 60 changed files, 13212 insertions, 0 deletions. 
> - without the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) and [examples](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/examples) folder: 40 files changed, 6781 insertions > - comments: 2399 insertions (calculated with `git diff --cached !(tests|examples) | grep -c -E "(^[+-]\s*(/)?*)|(^[+-]\s*//)"`) > - which leaves 4382 lines of code inserted > > Big thanks to: > - @TobiHartmann for all his help by discussing the new framework and for providing insights from his IR test framework in Valhalla. > - @katyapav for converting the Valhalla tests to use the new framework which found some harder to catch bugs in the framework and also some actual C2 bugs. > - @iignatev for helping to simplify the framework usage with JTreg and with the framework internal VM calling structure. > - and others who provided valuable feedback. > > Thanks, > Christian Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: Adjust whitelist ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3508/files - new: https://git.openjdk.java.net/jdk/pull/3508/files/e2843410..7ed789dc Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3508&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3508&range=00-01 Stats: 9 lines in 1 file changed: 2 ins; 6 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3508.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3508/head:pull/3508 PR: https://git.openjdk.java.net/jdk/pull/3508 From kvn at openjdk.java.net Thu Apr 15 16:52:34 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 15 Apr 2021 16:52:34 GMT Subject: RFR: 8264480: Unreachable code in nmethod.cpp inside #ifdef DEBUG In-Reply-To: References: Message-ID: 
<-Hx4zlH3PdtjGg-iyXFXscPQw-Aa1gPTeN7QrnZeU98=.6e52ab74-90aa-4833-ac07-09eccd1b836e@github.com> On Thu, 15 Apr 2021 12:50:14 GMT, Robbin Ehn wrote: > Fixed ifdef to use correct define. > > Stress tested locally, running t1-5. Marked as reviewed by kvn (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3517 From mikael at openjdk.java.net Thu Apr 15 17:33:58 2021 From: mikael at openjdk.java.net (Mikael Vidstedt) Date: Thu, 15 Apr 2021 17:33:58 GMT Subject: Integrated: 8265302: ProblemList runtime/logging/RedefineClasses.java on linux-x64 -Xcomp In-Reply-To: References: Message-ID: <2qJ24ep5dCgvGDB2hp3IDLGDlEUD2dmQGFO-R33W6Og=.f0195c38-0635-43eb-8444-2ed4b1f340fe@github.com> On Thu, 15 Apr 2021 17:23:15 GMT, Daniel D. Daugherty wrote: > A trivial fix to ProblemList runtime/logging/RedefineClasses.java on linux-x64 -Xcomp. Marked as reviewed by mikael (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3525 From dcubed at openjdk.java.net Thu Apr 15 17:33:58 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Thu, 15 Apr 2021 17:33:58 GMT Subject: Integrated: 8265302: ProblemList runtime/logging/RedefineClasses.java on linux-x64 -Xcomp Message-ID: A trivial fix to ProblemList runtime/logging/RedefineClasses.java on linux-x64 -Xcomp. 
------------- Commit messages: - 8265302: ProblemList runtime/logging/RedefineClasses.java on linux-x64 -Xcomp Changes: https://git.openjdk.java.net/jdk/pull/3525/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3525&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8265302 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3525.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3525/head:pull/3525 PR: https://git.openjdk.java.net/jdk/pull/3525 From dcubed at openjdk.java.net Thu Apr 15 17:33:58 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Thu, 15 Apr 2021 17:33:58 GMT Subject: Integrated: 8265302: ProblemList runtime/logging/RedefineClasses.java on linux-x64 -Xcomp In-Reply-To: <2qJ24ep5dCgvGDB2hp3IDLGDlEUD2dmQGFO-R33W6Og=.f0195c38-0635-43eb-8444-2ed4b1f340fe@github.com> References: <2qJ24ep5dCgvGDB2hp3IDLGDlEUD2dmQGFO-R33W6Og=.f0195c38-0635-43eb-8444-2ed4b1f340fe@github.com> Message-ID: On Thu, 15 Apr 2021 17:27:40 GMT, Mikael Vidstedt wrote: >> A trivial fix to ProblemList runtime/logging/RedefineClasses.java on linux-x64 -Xcomp. > > Marked as reviewed by mikael (Reviewer). @vidmik - Thanks for the lightning fast review! ------------- PR: https://git.openjdk.java.net/jdk/pull/3525 From dcubed at openjdk.java.net Thu Apr 15 17:33:59 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Thu, 15 Apr 2021 17:33:59 GMT Subject: Integrated: 8265302: ProblemList runtime/logging/RedefineClasses.java on linux-x64 -Xcomp In-Reply-To: References: Message-ID: On Thu, 15 Apr 2021 17:23:15 GMT, Daniel D. Daugherty wrote: > A trivial fix to ProblemList runtime/logging/RedefineClasses.java on linux-x64 -Xcomp. This pull request has now been integrated. Changeset: c7da64a4 Author: Daniel D. 
Daugherty URL: https://git.openjdk.java.net/jdk/commit/c7da64a4 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod 8265302: ProblemList runtime/logging/RedefineClasses.java on linux-x64 -Xcomp Reviewed-by: mikael ------------- PR: https://git.openjdk.java.net/jdk/pull/3525 From iignatyev at openjdk.java.net Thu Apr 15 17:52:34 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Thu, 15 Apr 2021 17:52:34 GMT Subject: RFR: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests [v2] In-Reply-To: References: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com> Message-ID: On Thu, 15 Apr 2021 13:19:56 GMT, Christian Hagedorn wrote: >> This RFE provides an IR test framework to perform regex-based checks on the C2 IR shape of test methods emitted by the VM flags `-XX:+PrintIdeal` and `-XX:+PrintOptoAssembly`. The framework can also be used for other non-IR matching (and non-compiler) tests by providing easy to use annotations for commonly used testing patterns and compiler control flags. >> >> The framework is based on the ideas of the currently present IR test framework in [Valhalla](https://github.com/openjdk/valhalla/blob/e9c78ce4fcfd01361c35883e0d68f9ae5a80d079/test/hotspot/jtreg/compiler/valhalla/inlinetypes/InlineTypeTest.java) (mainly implemented by @TobiHartmann) which is being used with great success. This new framework aims to replace the old one in Valhalla at some point. >> >> A detailed description about how this new IR test framework works and how it is used is provided in the [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) file and in the [Javadocs](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc/jdk/test/lib/hotspot/ir_framework/package-summary.html) written for the framework classes. 
>> >> To finish a first version of this framework for JDK 17, I decided to leave some improvement possibilities and ideas to be followed up on in additional RFEs. Some ideas are mentioned in "Future Work" in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) and were also created as subtasks of this RFE. >> >> Testing (also described in "Internal Framework Tests in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md)): >> There are various tests to verify the correctness of the test framework which can be found as JTreg tests in the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) folder. Additional testing was performed by converting all compiler Inline Types test of project Valhalla (done by @katyapav in [JDK-8263024](https://bugs.openjdk.java.net/browse/JDK-8263024)) that used the old framework to the new framework. This provided additional testing for the framework itself. We ran the converted tests with all the flag settings used in hs-tier1-9 and hs-precheckin-comp. For sanity checking, this was also done with a sample IR test in mainline. >> >> Some stats about the framework code added to [ir_framework](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework): >> >> - without the [Javadocs files](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc) : 60 changed files, 13212 insertions, 0 deletions. 
>> - without the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) and [examples](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/examples) folder: 40 files changed, 6781 insertions >> - comments: 2399 insertions (calculated with `git diff --cached !(tests|examples) | grep -c -E "(^[+-]\s*(/)?*)|(^[+-]\s*//)"`) >> - which leaves 4382 lines of code inserted >> >> Big thanks to: >> - @TobiHartmann for all his help by discussing the new framework and for providing insights from his IR test framework in Valhalla. >> - @katyapav for converting the Valhalla tests to use the new framework which found some harder to catch bugs in the framework and also some actual C2 bugs. >> - @iignatev for helping to simplify the framework usage with JTreg and with the framework internal VM calling structure. >> - and others who provided valuable feedback. >> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: > > Adjust whitelist Hi Christian, kudos to you, Tobias, Katya, and all involved, I have really high hopes that this framework will improve the quality of both JVM and our testing. I'll look at the code later, a few general comments: - although having javadoc for testlibraries is highly desirable, I don't think we should check in the generated HTML files - the same goes for `README.html` generated from `README.md` - this library is hotspot-centric, I highly doubt that it will be used by any tests outside of the hotspot test base, hence the more appropriate location for it would be inside `test/hotspot/jtreg/testlibrary`. 
- I assume `test/lib/jdk/test/lib/hotspot/ir_framework/tests/` are the tests for the framework, in that case they should be in `test/lib-test`, if we stick to `test/lib` as the location for the library, or in `test/hotspot/jtreg/testlibrary_tests`, if we move it to `test/hotspot` Thanks, -- Igor ------------- PR: https://git.openjdk.java.net/jdk/pull/3508 From kvn at openjdk.java.net Thu Apr 15 18:08:38 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 15 Apr 2021 18:08:38 GMT Subject: RFR: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests [v2] In-Reply-To: References: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com> Message-ID: On Thu, 15 Apr 2021 13:19:56 GMT, Christian Hagedorn wrote: >> This RFE provides an IR test framework to perform regex-based checks on the C2 IR shape of test methods emitted by the VM flags `-XX:+PrintIdeal` and `-XX:+PrintOptoAssembly`. The framework can also be used for other non-IR matching (and non-compiler) tests by providing easy to use annotations for commonly used testing patterns and compiler control flags. >> >> The framework is based on the ideas of the currently present IR test framework in [Valhalla](https://github.com/openjdk/valhalla/blob/e9c78ce4fcfd01361c35883e0d68f9ae5a80d079/test/hotspot/jtreg/compiler/valhalla/inlinetypes/InlineTypeTest.java) (mainly implemented by @TobiHartmann) which is being used with great success. This new framework aims to replace the old one in Valhalla at some point. 
>> >> A detailed description about how this new IR test framework works and how it is used is provided in the [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) file and in the [Javadocs](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc/jdk/test/lib/hotspot/ir_framework/package-summary.html) written for the framework classes. >> >> To finish a first version of this framework for JDK 17, I decided to leave some improvement possibilities and ideas to be followed up on in additional RFEs. Some ideas are mentioned in "Future Work" in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) and were also created as subtasks of this RFE. >> >> Testing (also described in "Internal Framework Tests in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md)): >> There are various tests to verify the correctness of the test framework which can be found as JTreg tests in the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) folder. Additional testing was performed by converting all compiler Inline Types test of project Valhalla (done by @katyapav in [JDK-8263024](https://bugs.openjdk.java.net/browse/JDK-8263024)) that used the old framework to the new framework. This provided additional testing for the framework itself. We ran the converted tests with all the flag settings used in hs-tier1-9 and hs-precheckin-comp. For sanity checking, this was also done with a sample IR test in mainline. 
>> >> Some stats about the framework code added to [ir_framework](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework): >> >> - without the [Javadocs files](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc) : 60 changed files, 13212 insertions, 0 deletions. >> - without the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) and [examples](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/examples) folder: 40 files changed, 6781 insertions >> - comments: 2399 insertions (calculated with `git diff --cached !(tests|examples) | grep -c -E "(^[+-]\s*(/)?*)|(^[+-]\s*//)"`) >> - which leaves 4382 lines of code inserted >> >> Big thanks to: >> - @TobiHartmann for all his help by discussing the new framework and for providing insights from his IR test framework in Valhalla. >> - @katyapav for converting the Valhalla tests to use the new framework which found some harder to catch bugs in the framework and also some actual C2 bugs. >> - @iignatev for helping to simplify the framework usage with JTreg and with the framework internal VM calling structure. >> - and others who provided valuable feedback. >> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: > > Adjust whitelist I understand the desire to use this framework for more then IR verification but it is main goal. Why you did not place it into `test/lib/sun/hotspot/code/`? I also don't think you should include javadoc generated files into these changes. test/lib/jdk/test/lib/hotspot/ir_framework/CompLevel.java line 62: > 60: * > 61: */ > 62: ANY(-2), This will change (`-2` -> `-1`) after AOT is removed (8264805). 
test/lib/jdk/test/lib/hotspot/ir_framework/CompLevel.java line 63: > 61: */ > 62: ANY(-2), > 63: /** For completeness, maybe add the value `NONE(0)` for the interpreter-only case. test/lib/jdk/test/lib/hotspot/ir_framework/IRMatcher.java line 478: > 476: * Helper class to store information about a method that needs to be IR matched. > 477: */ > 478: class IRMethod { Can you move this class `IRMethod` into a separate file? This file is already big. test/lib/jdk/test/lib/hotspot/ir_framework/examples/TEST.ROOT line 1: > 1: # Minimal TEST.ROOT file to run the examples tests as if the examples would have been placed inside Missing Copyright header. test/lib/jdk/test/lib/hotspot/ir_framework/tests/README.md line 7: > 5: > 6: Additional testing should be performed with the converted Valhalla tests (see [JDK-8263024](https://bugs.openjdk.java.net/browse/JDK-8263024)) to make sure a changeset is correct (these are part of the Valhalla CI). > 7: I don't think you should reference a particular RFE which will be resolved eventually. The statement itself is fine. test/lib/jdk/test/lib/hotspot/ir_framework/tests/TEST.ROOT line 1: > 1: # Minimal TEST.ROOT file to run the internal framework tests as if they would have been placed inside Missing Copyright header. ------------- PR: https://git.openjdk.java.net/jdk/pull/3508 From github.com+2249648+johntortugo at openjdk.java.net Thu Apr 15 18:28:02 2021 From: github.com+2249648+johntortugo at openjdk.java.net (John Tortugo) Date: Thu, 15 Apr 2021 18:28:02 GMT Subject: RFR: 8241502: Migrate x86_64.ad to MacroAssembler [v7] In-Reply-To: References: Message-ID: > Relates to: https://bugs.openjdk.java.net/browse/JDK-8241502 > Tested on: Linux tier1, 2 and 3 > > Can you please take a look at whether these changes are going in the expected direction? If they are, I'll continue working on `JDK-8241502`, but I'd like to split it into a few PRs since it's a lot of changes.
John Tortugo has updated the pull request incrementally with one additional commit since the last revision: Use cdql/cdqq implemented in MacroAssembler. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2420/files - new: https://git.openjdk.java.net/jdk/pull/2420/files/6fecb5dc..ef7a993c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2420&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2420&range=05-06 Stats: 69 lines in 1 file changed: 0 ins; 63 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/2420.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2420/head:pull/2420 PR: https://git.openjdk.java.net/jdk/pull/2420 From github.com+2249648+johntortugo at openjdk.java.net Thu Apr 15 18:30:37 2021 From: github.com+2249648+johntortugo at openjdk.java.net (John Tortugo) Date: Thu, 15 Apr 2021 18:30:37 GMT Subject: RFR: 8241502: Migrate x86_64.ad to MacroAssembler [v7] In-Reply-To: References: <00CZum78AaG7wa9LLxbJNiIsYVqT_Ice0Gl3ucxglW4=.b94a74d7-0c88-4ca8-9986-5b5c67929297@github.com> Message-ID: On Mon, 29 Mar 2021 10:44:12 GMT, Vladimir Ivanov wrote: >> John Tortugo has updated the pull request incrementally with one additional commit since the last revision: >> >> Use cdql/cdqq implemented in MacroAssembler. > > src/hotspot/cpu/x86/x86_64.ad line 8329: > >> 8327: "done:" %} >> 8328: opcode(0xF7, 0x7); /* Opcode F7 /7 */ >> 8329: ins_encode(cdql_enc(div), REX_reg(div), OpcP, reg_opc(div)); > > `cdql_enc`/`cdqq_enc` contain extensive comments about the logic. What about migrating them to dedicated methods in `MacroAssembler`/`C2_MacroAssembler` (`idivrem`/`ldivrem`) and calling those from `divI_rReg` / `divL_rReg`/`divModL_rReg_divmod`/...? Hi Vladimir, thanks for the comment. Actually, there is already an implementation of cdql/cdqq in MacroAssembler, the names are slightly different though: corrected_idivl and corrected_idivq. 
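For context on why a "corrected" divide exists at all: the x86 `idiv` instruction traps when the minimum integer is divided by -1, whereas Java defines a result for that case. The sketch below only demonstrates the Java-level semantics that such a special case has to preserve. The class and method names are made up for illustration and are not HotSpot code.

```java
public class DivCornerCase {
    // Plain Java int division and remainder; a JIT must reproduce these results.
    static int divInt(int a, int b) { return a / b; }
    static int remInt(int a, int b) { return a % b; }

    // The same operations for long operands.
    static long divLong(long a, long b) { return a / b; }
    static long remLong(long a, long b) { return a % b; }

    public static void main(String[] args) {
        // Overflow case: the quotient wraps to MIN_VALUE and the remainder is 0,
        // and no exception is thrown.
        System.out.println(divInt(Integer.MIN_VALUE, -1) == Integer.MIN_VALUE);
        System.out.println(remInt(Integer.MIN_VALUE, -1) == 0);
        System.out.println(divLong(Long.MIN_VALUE, -1L) == Long.MIN_VALUE);
        System.out.println(remLong(Long.MIN_VALUE, -1L) == 0L);
    }
}
```

Only division by zero throws `ArithmeticException` at the Java level; the overflow case has to be handled silently by the generated code.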
I followed your suggestion and replaced the encoding of div*_rReg / divMod*_rReg_divmod and mod*_rReg with calls to these two functions. ------------- PR: https://git.openjdk.java.net/jdk/pull/2420 From dnsimon at openjdk.java.net Thu Apr 15 19:05:33 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Thu, 15 Apr 2021 19:05:33 GMT Subject: RFR: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests [v2] In-Reply-To: References: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com> Message-ID: On Thu, 15 Apr 2021 13:19:56 GMT, Christian Hagedorn wrote: >> This RFE provides an IR test framework to perform regex-based checks on the C2 IR shape of test methods emitted by the VM flags `-XX:+PrintIdeal` and `-XX:+PrintOptoAssembly`. The framework can also be used for other non-IR matching (and non-compiler) tests by providing easy to use annotations for commonly used testing patterns and compiler control flags. >> >> The framework is based on the ideas of the currently present IR test framework in [Valhalla](https://github.com/openjdk/valhalla/blob/e9c78ce4fcfd01361c35883e0d68f9ae5a80d079/test/hotspot/jtreg/compiler/valhalla/inlinetypes/InlineTypeTest.java) (mainly implemented by @TobiHartmann) which is being used with great success. This new framework aims to replace the old one in Valhalla at some point. >> >> A detailed description about how this new IR test framework works and how it is used is provided in the [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) file and in the [Javadocs](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc/jdk/test/lib/hotspot/ir_framework/package-summary.html) written for the framework classes. 
>> >> To finish a first version of this framework for JDK 17, I decided to leave some improvement possibilities and ideas to be followed up on in additional RFEs. Some ideas are mentioned in "Future Work" in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) and were also created as subtasks of this RFE. >> >> Testing (also described in "Internal Framework Tests in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md)): >> There are various tests to verify the correctness of the test framework which can be found as JTreg tests in the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) folder. Additional testing was performed by converting all compiler Inline Types test of project Valhalla (done by @katyapav in [JDK-8263024](https://bugs.openjdk.java.net/browse/JDK-8263024)) that used the old framework to the new framework. This provided additional testing for the framework itself. We ran the converted tests with all the flag settings used in hs-tier1-9 and hs-precheckin-comp. For sanity checking, this was also done with a sample IR test in mainline. >> >> Some stats about the framework code added to [ir_framework](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework): >> >> - without the [Javadocs files](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc) : 60 changed files, 13212 insertions, 0 deletions. 
>> - without the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) and [examples](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/examples) folder: 40 files changed, 6781 insertions >> - comments: 2399 insertions (calculated with `git diff --cached !(tests|examples) | grep -c -E "(^[+-]\s*(/)?*)|(^[+-]\s*//)"`) >> - which leaves 4382 lines of code inserted >> >> Big thanks to: >> - @TobiHartmann for all his help by discussing the new framework and for providing insights from his IR test framework in Valhalla. >> - @katyapav for converting the Valhalla tests to use the new framework which found some harder to catch bugs in the framework and also some actual C2 bugs. >> - @iignatev for helping to simplify the framework usage with JTreg and with the framework internal VM calling structure. >> - and others who provided valuable feedback. >> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: > > Adjust whitelist test/lib/jdk/test/lib/hotspot/ir_framework/README.html line 1: > 1:
IR Test Framework
Just curious: why HTML instead of markdown? ------------- PR: https://git.openjdk.java.net/jdk/pull/3508 From iignatyev at openjdk.java.net Fri Apr 16 00:44:36 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 16 Apr 2021 00:44:36 GMT Subject: RFR: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests [v2] In-Reply-To: References: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com> Message-ID: On Thu, 15 Apr 2021 18:05:16 GMT, Vladimir Kozlov wrote: > Why you did not place it into `test/lib/sun/hotspot/code/`? there is an RFE to rename `sun.hotspot` testlibrary classes, so I wouldn't put new stuff there. ------------- PR: https://git.openjdk.java.net/jdk/pull/3508 From iignatyev at openjdk.java.net Fri Apr 16 01:33:37 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 16 Apr 2021 01:33:37 GMT Subject: RFR: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests [v2] In-Reply-To: References: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com> Message-ID: On Thu, 15 Apr 2021 13:19:56 GMT, Christian Hagedorn wrote: >> This RFE provides an IR test framework to perform regex-based checks on the C2 IR shape of test methods emitted by the VM flags `-XX:+PrintIdeal` and `-XX:+PrintOptoAssembly`. The framework can also be used for other non-IR matching (and non-compiler) tests by providing easy to use annotations for commonly used testing patterns and compiler control flags. >> >> The framework is based on the ideas of the currently present IR test framework in [Valhalla](https://github.com/openjdk/valhalla/blob/e9c78ce4fcfd01361c35883e0d68f9ae5a80d079/test/hotspot/jtreg/compiler/valhalla/inlinetypes/InlineTypeTest.java) (mainly implemented by @TobiHartmann) which is being used with great success. This new framework aims to replace the old one in Valhalla at some point. 
>> >> A detailed description about how this new IR test framework works and how it is used is provided in the [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) file and in the [Javadocs](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc/jdk/test/lib/hotspot/ir_framework/package-summary.html) written for the framework classes. >> >> To finish a first version of this framework for JDK 17, I decided to leave some improvement possibilities and ideas to be followed up on in additional RFEs. Some ideas are mentioned in "Future Work" in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) and were also created as subtasks of this RFE. >> >> Testing (also described in "Internal Framework Tests in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md)): >> There are various tests to verify the correctness of the test framework which can be found as JTreg tests in the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) folder. Additional testing was performed by converting all compiler Inline Types test of project Valhalla (done by @katyapav in [JDK-8263024](https://bugs.openjdk.java.net/browse/JDK-8263024)) that used the old framework to the new framework. This provided additional testing for the framework itself. We ran the converted tests with all the flag settings used in hs-tier1-9 and hs-precheckin-comp. For sanity checking, this was also done with a sample IR test in mainline. 
>> >> Some stats about the framework code added to [ir_framework](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework): >> >> - without the [Javadocs files](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc) : 60 changed files, 13212 insertions, 0 deletions. >> - without the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) and [examples](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/examples) folder: 40 files changed, 6781 insertions >> - comments: 2399 insertions (calculated with `git diff --cached !(tests|examples) | grep -c -E "(^[+-]\s*(/)?*)|(^[+-]\s*//)"`) >> - which leaves 4382 lines of code inserted >> >> Big thanks to: >> - @TobiHartmann for all his help by discussing the new framework and for providing insights from his IR test framework in Valhalla. >> - @katyapav for converting the Valhalla tests to use the new framework which found some harder to catch bugs in the framework and also some actual C2 bugs. >> - @iignatev for helping to simplify the framework usage with JTreg and with the framework internal VM calling structure. >> - and others who provided valuable feedback. >> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: > > Adjust whitelist Hi Christian, here is the 1st portion of my comments. I'll return to reviewing it 1st thing tomorrow... Cheers, -- Igor test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 107: > 105: */ > 106: public static final Set JTREG_WHITELIST_FLAGS = new HashSet<>( > 107: Arrays.asList( it doesn't seem to be a comprehensive list of flags that don't affect IR verification work, e.g. different GCs. 
I think it might be easy to go with the blacklist instead, and possibly warning people if there are any flags in `test.*.opts`. test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 146: > 144: private List> helperClasses = null; > 145: private List scenarios = null; > 146: private final List flags = new ArrayList<>(); why did you decide to eagerly create list for `flags`, but not for `scenarios` / `helpersClasses`? test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 166: > 164: if (VERBOSE) { > 165: System.out.println("Test class: " + testClass); > 166: } this c-tor can be replaced by: Suggestion: this(StackWalker.getInstance(StackWalker.Option.RETAIN_CLASS_REFERENCE).getCallerClass()); and actually, I don't see it being used (outside of the tests for this testlibrary) and I don't think it will ever be used by actual tests that would use this framework, so I think we can safely remove it. test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 211: > 209: } > 210: > 211: /** I'm not sure how much of the benefits all these different `run.*` bring. I fully understand the desire to simplify the most common use-case (i.e. no-arg `run`), but for the rest I'd assume the users will be capable of writing, for example, `new TestFramework(testClass).addScenarios(scenarios).start();` instead of `TestFramework .runWithScenarios(testClass, scenarios);` test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 269: > 267: *
> 268: * <li>If a helper class is not in the same file as the test class, make sure that JTreg compiles it by using > 269: * {@literal @}compile in the JTreg header comment block.
  • you don't need `@compile` to compile a helper class, 1st of all, you shouldn't use `@compile` when you want to compile a class in your test, you should use `@build`, 2nd, in this particular case, the class will be automatically built as part of your test b/c jtreg (or rather javac) will be able to statically detect it as a dependency of the code that calls `runWithHelperClasses(Class, Class...)` test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 366: > 364: } > 365: > 366: for (Class helperClass : helperClasses) { nit: I'd use `var` here test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 367: > 365: > 366: for (Class helperClass : helperClasses) { > 367: TestRun.check(!this.helperClasses.contains(helperClass), "Cannot add the same class twice: " + helperClass); won't it be easy to use `Set` to store `helperClasses`? test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 407: > 405: start(null); > 406: } catch (TestVMException e) { > 407: System.err.println("\n" + e.getExceptionInfo()); you shouldn't use "\n", as it might not be the right line-separator. you can either do: Suggestion: System.err.println(); System.err.println(e.getExceptionInfo()); or Suggestion: System.err.printf("%n%s%n", e.getExceptionInfo()); test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 604: > 602: * Disable IR verification completely in certain cases. > 603: */ > 604: private void maybeDisableIRVerificationCompletely() { nit: Suggestion: private void disableIRVerificationIfNotFeasible() { test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 617: > 615: "and PrintOptoAssembly), running with -Xint, or -Xcomp (use warm-up of 0 instead)"); > 616: return; > 617: } I'd reorder them as "platform" checks are faster than `hasIRAnnotations` check and, I'd guess, will be a more common culprit to disable IR verification. 
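To illustrate the line-separator comment on line 407 above, here is a small sketch that puts the suggested `printf` form next to `System.lineSeparator()`. The class name and the sample message are placeholders, not framework code.

```java
public class LineSeparatorDemo {
    // %n in a format string expands to the platform line separator.
    static String withFormat(String info) {
        return String.format("%n%s%n", info);
    }

    // Equivalent construction using System.lineSeparator() explicitly.
    static String withSeparator(String info) {
        return System.lineSeparator() + info + System.lineSeparator();
    }

    public static void main(String[] args) {
        String info = "sample exception info"; // placeholder text
        // The form suggested in the review comment.
        System.err.printf("%n%s%n", info);
        // Both constructions produce the same string on any platform.
        System.out.println(withFormat(info).equals(withSeparator(info)));
    }
}
```

A hard-coded "\n" would give the same output on Linux and macOS but not on Windows, which is the concern raised in the comment.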
test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 669: > 667: } > 668: if (e instanceof IRViolationException) { > 669: IRViolationException irException = (IRViolationException) e; nit: Suggestion: if (e instanceof IRViolationException irException) { test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 674: > 672: + "Compilation(s) of failed matche(s):"); > 673: System.out.println(irException.getCompilations()); > 674: builder.append(errorMsg).append("\n").append(irException.getExceptionInfo()).append(e.getMessage()); as `builder.toString` will be printed out to cout/cerr, you should use platform-specific line-separator instead of `\n` ------------- Changes requested by iignatyev (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3508 From iignatyev at openjdk.java.net Fri Apr 16 01:33:38 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 16 Apr 2021 01:33:38 GMT Subject: RFR: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests [v2] In-Reply-To: References: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com> Message-ID: On Fri, 16 Apr 2021 01:08:28 GMT, Igor Ignatyev wrote: >> Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: >> >> Adjust whitelist > > test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 367: > >> 365: >> 366: for (Class helperClass : helperClasses) { >> 367: TestRun.check(!this.helperClasses.contains(helperClass), "Cannot add the same class twice: " + helperClass); > > won't it be easy to use `Set` to store `helperClasses`? you might also want to update javadoc for this method to state that there can be duplicates. 
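A sketch of the `Set` idea, under the assumption that the insertion order of helper classes should be preserved (hence `LinkedHashSet`). `HelperRegistry` and `addHelperClasses` are hypothetical names, not part of the framework. Since `Set.add` returns `false` for an element that is already present, the explicit `contains` check is no longer needed.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class HelperRegistry {
    // LinkedHashSet keeps insertion order and rejects duplicates.
    private final Set<Class<?>> helperClasses = new LinkedHashSet<>();

    // Adds the given classes; fails on the first class that was already registered.
    public HelperRegistry addHelperClasses(Class<?>... classes) {
        for (var helperClass : classes) {
            if (!helperClasses.add(helperClass)) {
                throw new IllegalArgumentException("Cannot add the same class twice: " + helperClass);
            }
        }
        return this;
    }

    public int size() {
        return helperClasses.size();
    }

    public static void main(String[] args) {
        HelperRegistry registry = new HelperRegistry().addHelperClasses(String.class, Integer.class);
        System.out.println(registry.size());
        try {
            registry.addHelperClasses(String.class);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```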
------------- PR: https://git.openjdk.java.net/jdk/pull/3508 From whuang at openjdk.java.net Fri Apr 16 01:40:01 2021 From: whuang at openjdk.java.net (Wang Huang) Date: Fri, 16 Apr 2021 01:40:01 GMT Subject: RFR: 8263006: Add optimization for Max(*)Node and Min(*)Node [v2] In-Reply-To: References: Message-ID: > * I optimize `max` and `min` by using these identities > - op(max(a,b), min(a,b)) === op(a,b) > - if op is commutative > - examples: > - max(a,b) + min(a,b) === a + b // op = add > - max(a,b) * min(a,b) === a * b // op = mul > - max(max(a,b), min(a,b)) === max(a,b) // op = max() > - min(max(a,b), min(a,b)) === min(a,b) // op = min() > * Test case > ```java > /* > * Copyright (c) 2021, Huawei Technologies Co. Ltd. All rights reserved. > * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. > * > * This code is free software; you can redistribute it and/or modify it > * under the terms of the GNU General Public License version 2 only, as > * published by the Free Software Foundation. > * > * This code is distributed in the hope that it will be useful, but WITHOUT > * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License > * version 2 for more details (a copy is included in the LICENSE file that > * accompanied this code). > * > * You should have received a copy of the GNU General Public License version > * 2 along with this work; if not, write to the Free Software Foundation, > * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. > * > * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA > * or visit www.oracle.com if you need additional information or have any > * questions.
> */ > package org.sample; > > import org.openjdk.jmh.annotations.Benchmark; > import org.openjdk.jmh.annotations.*; > > import java.util.Random; > import java.util.concurrent.TimeUnit; > import org.openjdk.jmh.infra.Blackhole; > > @BenchmarkMode({Mode.AverageTime}) > @OutputTimeUnit(TimeUnit.MICROSECONDS) > public class MyBenchmark { > > static int length = 100000; > static double[] data1 = new double[length]; > static double[] data2 = new double[length]; > static Random random = new Random(); > > static { > for(int i = 0; i < length; ++i) { > data1[i] = random.nextDouble(); > data2[i] = random.nextDouble(); > } > } > > @Benchmark > public void testAdd(Blackhole bh) { > double sum = 0; > for (int i = 0; i < length; i++) { > sum += Math.max(data1[i], data2[i]) + Math.min(data1[i], data2[i]); > } > bh.consume(sum); > } > > @Benchmark > public void testMax(Blackhole bh) { > double sum = 0; > for (int i = 0; i < length; i++) { > sum += Math.max(Math.max(data1[i], data2[i]), Math.min(data1[i], data2[i])); > } > bh.consume(sum); > } > > @Benchmark > public void testMin(Blackhole bh) { > double sum = 0; > for (int i = 0; i < length; i++) { > sum += Math.min(Math.max(data1[i], data2[i]), Math.min(data1[i], data2[i])); > } > bh.consume(sum); > } > > @Benchmark > public void testMul(Blackhole bh) { > double sum = 0; > for (int i = 0; i < length; i++) { > sum += (Math.max(data1[i], data2[i]) * Math.min(data1[i], data2[i])); > } > bh.consume(sum); > } > } > ``` > > * The result is listed here (aarch64): > > before: > > |Benchmark| Mode| Samples| Score| Score error| Units| > |---| ---| ---| ---| --- | ---| > |o.s.MyBenchmark.testAdd |avgt | 10 | 556.048 | 32.368 | us/op | > | o.s.MyBenchmark.testMax | avgt | 10 |543.065 | 54.221 | us/op | > | o.s.MyBenchmark.testMin | avgt |10 |570.731 | 37.630 | us/op | > | o.s.MyBenchmark.testMul | avgt | 10 | 531.906 | 20.518 | us/op | > > after: > > |Benchmark| Mode| Samples| Score| Score error| Units| > |---| ---| ---| ---| --- | ---| > | 
o.s.MyBenchmark.testAdd | avgt | 10 | 319.350 | 9.248 | us/op | > | o.s.MyBenchmark.testMax | avgt | 10 | 356.138 | 10.736 | us/op | > | o.s.MyBenchmark.testMin | avgt | 10 | 323.731 | 16.621 | us/op | > | o.s.MyBenchmark.testMul | avgt | 10 | 338.458 | 23.755 | us/op | Wang Huang has updated the pull request incrementally with one additional commit since the last revision: adjust code style ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3513/files - new: https://git.openjdk.java.net/jdk/pull/3513/files/f132222e..7579586f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3513&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3513&range=00-01 Stats: 4 lines in 2 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/3513.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3513/head:pull/3513 PR: https://git.openjdk.java.net/jdk/pull/3513 From xgong at openjdk.java.net Fri Apr 16 01:46:32 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Fri, 16 Apr 2021 01:46:32 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v4] In-Reply-To: References: Message-ID: On Thu, 15 Apr 2021 09:53:42 GMT, Jatin Bhateja wrote: > Thanks for addressing it. as Vladimir suggested in long term we can just emit this node instead of VectorStoreMask + VectorLoadMask combination when mask types are non-conformal to emit efficient instruction sequence using input mask packing/unpacking. Yeah, previously I thought about this. However, considering there are some optimizations like GVN for `VectorStoreMask`, currently I prefer to keep the code as it is. Maybe we can reconsider this in future. Thanks! 
------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From whuang at openjdk.java.net Fri Apr 16 02:10:10 2021 From: whuang at openjdk.java.net (Wang Huang) Date: Fri, 16 Apr 2021 02:10:10 GMT Subject: RFR: 8265244: assert(false) failed: bad AD file [v2] In-Reply-To: References: Message-ID: > * AArch64 only supports `VectorReinterpret` on 64-bit and 128-bit vectors. > * I will fix this bug by adding a rule for `VectorReinterpret` in `match_rule_supported_vector`. > * After exchanging notes with @nsjian and @XiaohongGong, I think the two checks in `inline_vector_convert` are no longer needed. However, these checks affect other CPUs, so I need more reviewers. Wang Huang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: - Merge branch 'master' into JDK-8265244 - 8265244: assert(false) failed: bad AD file ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3507/files - new: https://git.openjdk.java.net/jdk/pull/3507/files/f65c8671..522b256f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3507&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3507&range=00-01 Stats: 1045 lines in 44 files changed: 798 ins; 153 del; 94 mod Patch: https://git.openjdk.java.net/jdk/pull/3507.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3507/head:pull/3507 PR: https://git.openjdk.java.net/jdk/pull/3507 From yyang at openjdk.java.net Fri Apr 16 03:16:04 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Fri, 16 Apr 2021 03:16:04 GMT Subject: RFR: 8265322: C2: Simplify control inputs for BarrierSetC2::obj_allocate Message-ID: <_NicB3VpqFwTiu2-bKLNhfnQAxJmbtND66g3Md5ck5g=.f9b4edcd-fa2b-4917-9092-bccc38b202df@github.com> This PR simplifies control inputs for BarrierSetC2::obj_allocate.
In most cases this changes nothing, since `toobig_false` is equivalent to `ctrl`. In the rare case where `toobig_false` is created for Unsafe.allocateInstance and the instance size is not statically known, `ctrl` becomes the control input of an IfNode whose projections are `toobig_false` and `toobig_true`; old eden_end and old_eden_top can then simply take `toobig_false` as their control input rather than `ctrl`. ------------- Commit messages: - simplify Changes: https://git.openjdk.java.net/jdk/pull/3529/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3529&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8265322 Stats: 5 lines in 3 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/3529.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3529/head:pull/3529 PR: https://git.openjdk.java.net/jdk/pull/3529 From yyang at openjdk.java.net Fri Apr 16 03:16:58 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Fri, 16 Apr 2021 03:16:58 GMT Subject: RFR: 8265245: depChecker_ don't have any functionalities Message-ID: <1JKPxPdQ9XoTDXdIP8P_eBy6EqYZhEoD8SACCs0MfSs=.d5cd2e5f-6c11-4be8-8651-3931368d1471@github.com> Yet another leftover from OpenJDK 6. src/hotspot/cpu//depChecker_.hpp/cpp are included by src/hotspot/share/compiler/disassembler.cpp, but they don't provide any functionality. In the absence of strong demand from either existing or future architectures, I think we can remove them.
------------- Commit messages: - remove depcheck_ Changes: https://git.openjdk.java.net/jdk/pull/3531/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3531&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8265245 Stats: 305 lines in 13 files changed: 0 ins; 305 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3531.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3531/head:pull/3531 PR: https://git.openjdk.java.net/jdk/pull/3531 From yyang at openjdk.java.net Fri Apr 16 03:23:06 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Fri, 16 Apr 2021 03:23:06 GMT Subject: RFR: 8265323: Leftover local variables in PcDesc Message-ID: Leftover local variables in PcDesc. Remove it to save electric power. ------------- Commit messages: - remove unused local vars Changes: https://git.openjdk.java.net/jdk/pull/3532/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3532&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8265323 Stats: 6 lines in 1 file changed: 0 ins; 5 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3532.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3532/head:pull/3532 PR: https://git.openjdk.java.net/jdk/pull/3532 From xgong at openjdk.java.net Fri Apr 16 03:36:02 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Fri, 16 Apr 2021 03:36:02 GMT Subject: RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v6] In-Reply-To: References: Message-ID: <0Oh1jRqGYglCqibmgxl3D5_H99UFu2if2VYuhwSb-Lk=.29a8f263-4375-46f9-811c-6a68d8f8a189@github.com> > The Vector API defines different element types for floating point VectorMask. For example, the bitwise related APIs use "`long/int`", while data related APIs use "`double/float`". This makes some optimizations that based on the type checking cannot work well. 
> > For example, the VectorBox/Unbox elimination like `"VectorUnbox (VectorBox v) ==> v"` requires the types of output and > input are equal. Normally this is necessary. However, due to the different element type for floating point VectorMask with the same species, the VectorBox/Unbox pattern is optimized to: > > VectorLoadMask (VectorStoreMask vmask) > > Actually the types can be treated as the same one for such cases. And considering the vector mask representation is the same for > vectors with the same element size and vector length, it's safe to do the optimization: > > VectorLoadMask (VectorStoreMask vmask) ==> vmask > > The same issue exists for GVN which is based on the type of a node. Making the mask node's` hash()/cmp()` methods depends on the element size instead of the element type can fix it. Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Merge jdk 'master' into JDK-8264104 - Add type restrict in the match rule predicate - Use "Matcher::match_rule_supported_vector" for vector nodes checking - Revert changes to VectorLoadMask and add a VectorMaskCast for the same element size mask casting - 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3238/files - new: https://git.openjdk.java.net/jdk/pull/3238/files/8232bd96..e4352b39 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3238&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3238&range=04-05 Stats: 96154 lines in 2370 files changed: 53099 ins; 34862 del; 8193 mod Patch: https://git.openjdk.java.net/jdk/pull/3238.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3238/head:pull/3238 PR: https://git.openjdk.java.net/jdk/pull/3238 From jiefu at 
openjdk.java.net Fri Apr 16 06:58:00 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 16 Apr 2021 06:58:00 GMT Subject: RFR: 8265325: Optimize StubRoutines::dpow() for Math.pow(x, 0.5) Message-ID: Hi all, I'd like to optimize the StubRoutines::dpow() for Math.pow(x, 0.5). In the pow and sqrt discussion [1], Joe taught me that the Java library implementation of pow has been optimized for pow(x, 2.0) [2] and pow(x, 0.5) [3]. However, the hotspot StubRoutines::dpow() only implements the same opt for pow(x, 2.0), but still not for pow(x, 0.5). This patch optimizes StubRoutines::dpow() for pow(x, 0.5). Although not all Math.pow(x, 0.5) can be replaced with sqrt(x), we can still do it safely for the following cases: 1) x >= 0.0 (fully implemented) 2) x is +Inf (fully implemented) 3) x is NaN (can be further divided into +NaN and -NaN and only +NaN is implemented) The effect of this opt has been tested on serveral platforms showing 3.0x ~ 6.3x performance improvement. And no performance drop was observed. Testing: - tier1 ~ tier3 on Linux/x64 Thanks. Best regards, Jie [1] https://mail.openjdk.java.net/pipermail/core-libs-dev/2021-April/076220.html [2] https://github.com/openjdk/jdk/blob/d84a7e55be40eae57b6c322694d55661a5053a55/src/java.base/share/classes/java/lang/FdLibm.java#L362 [3] https://github.com/openjdk/jdk/blob/d84a7e55be40eae57b6c322694d55661a5053a55/src/java.base/share/classes/java/lang/FdLibm.java#L364 Detailed performance numbers: * Linux/Intel --------- Before ----------- Benchmark (seed) Mode Cnt Score Error Units MathBench.powDouble 0 thrpt 8 218783.605 ? 838.379 ops/ms MathBench.powDouble0Dot5 0 thrpt 8 45498.351 ? 7.558 ops/ms MathBench.powDouble0Dot5Const 0 thrpt 8 45243.530 ? 1097.100 ops/ms MathBench.powDouble0Dot5Loop 0 thrpt 8 0.031 ? 0.001 ops/ms MathBench.powDoubleLoop 0 thrpt 8 0.031 ? 0.001 ops/ms StrictMathBench.powDouble N/A thrpt 8 176106.602 ? 
13127.650 ops/ms ---------------------------- --------- After ----------- Benchmark (seed) Mode Cnt Score Error Units MathBench.powDouble 0 thrpt 8 219930.462 ? 181.922 ops/ms MathBench.powDouble0Dot5 0 thrpt 8 204966.834 ? 329.032 ops/ms <-- 4.5x up MathBench.powDouble0Dot5Const 0 thrpt 8 203004.302 ? 684.072 ops/ms MathBench.powDouble0Dot5Loop 0 thrpt 8 0.121 ? 0.001 ops/ms <-- 3.9x up MathBench.powDoubleLoop 0 thrpt 8 0.031 ? 0.001 ops/ms StrictMathBench.powDouble N/A thrpt 8 178818.861 ? 16235.465 ops/ms ---------------------------- * Linux/AMD --------- Before ----------- Benchmark (seed) Mode Cnt Score Error Units MathBench.powDouble 0 thrpt 8 100741.348 ? 207.766 ops/ms MathBench.powDouble0Dot5 0 thrpt 8 33896.623 ? 103.352 ops/ms MathBench.powDouble0Dot5Const 0 thrpt 8 34195.944 ? 230.703 ops/ms MathBench.powDouble0Dot5Loop 0 thrpt 8 0.039 ? 0.001 ops/ms MathBench.powDoubleLoop 0 thrpt 8 0.038 ? 0.001 ops/ms StrictMathBench.powDouble N/A thrpt 8 72000.166 ? 135.002 ops/ms ---------------------------- --------- After ----------- Benchmark (seed) Mode Cnt Score Error Units MathBench.powDouble 0 thrpt 8 100738.866 ? 222.820 ops/ms MathBench.powDouble0Dot5 0 thrpt 8 100799.098 ? 95.537 ops/ms <-- 3.0x up MathBench.powDouble0Dot5Const 0 thrpt 8 100765.571 ? 178.436 ops/ms MathBench.powDouble0Dot5Loop 0 thrpt 8 0.244 ? 0.002 ops/ms <-- 6.3x up MathBench.powDoubleLoop 0 thrpt 8 0.038 ? 0.001 ops/ms StrictMathBench.powDouble N/A thrpt 8 71758.725 ? 339.660 ops/ms ---------------------------- * MacOS/Intel --------- Before ----------- Benchmark (seed) Mode Cnt Score Error Units MathBench.powDouble 0 thrpt 8 238064.722 ? 5181.318 ops/ms MathBench.powDouble0Dot5 0 thrpt 8 59235.979 ? 2046.519 ops/ms MathBench.powDouble0Dot5Const 0 thrpt 8 59695.014 ? 1079.692 ops/ms MathBench.powDouble0Dot5Loop 0 thrpt 8 0.040 ? 0.001 ops/ms MathBench.powDoubleLoop 0 thrpt 8 0.041 ? 0.001 ops/ms StrictMathBench.powDouble N/A thrpt 8 238391.026 ? 
2743.385 ops/ms ---------------------------- --------- After ----------- Benchmark (seed) Mode Cnt Score Error Units MathBench.powDouble 0 thrpt 8 238582.414 ? 3661.261 ops/ms MathBench.powDouble0Dot5 0 thrpt 8 224102.701 ? 2846.892 ops/ms <-- 3.8x up MathBench.powDouble0Dot5Const 0 thrpt 8 224542.331 ? 19027.596 ops/ms MathBench.powDouble0Dot5Loop 0 thrpt 8 0.158 ? 0.002 ops/ms <-- 4.0x up MathBench.powDoubleLoop 0 thrpt 8 0.041 ? 0.001 ops/ms StrictMathBench.powDouble N/A thrpt 8 233689.504 ? 10141.034 ops/ms ---------------------------- ------------- Commit messages: - 8265325: Optimize StubRoutines::dpow() for Math.pow(x, 0.5) Changes: https://git.openjdk.java.net/jdk/pull/3536/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3536&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8265325 Stats: 144 lines in 3 files changed: 142 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3536.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3536/head:pull/3536 PR: https://git.openjdk.java.net/jdk/pull/3536 From jiefu at openjdk.java.net Fri Apr 16 07:08:35 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 16 Apr 2021 07:08:35 GMT Subject: RFR: 8264945: Optimize the code-gen for Math.pow(x, 0.5) In-Reply-To: References: Message-ID: On Fri, 9 Apr 2021 22:39:44 GMT, Vladimir Kozlov wrote: >> Hi all, >> >> I'd like to optimize the code-gen for Math.pow(x, 0.5). >> And 7x ~ 14x performance improvement is observed by the jmh micro-benchmarks. >> >> While I was optimizing a machine learning program, I found both Math.pow(x, 2) and Math.pow(x, 0.5) are used. >> To my surprise, C2 just optimizes the case for Math.pow(x, 2) [1], but still not for Math.pow(x, 0.5) yet. >> >> The patch just replace Math.pow(x, 0.5) with Math.sqrt(x). >> >> Before: >> >> Benchmark (seed) Mode Cnt Score Error Units >> MathBench.powDouble0Dot5 0 thrpt 8 45525.117 ? 11.686 ops/ms >> MathBench.powDouble0Dot5Loop 0 thrpt 8 0.031 ? 
0.001 ops/ms >> >> Benchmark (seed) Mode Cnt Score Error Units >> MathBench.powDouble0Dot5 0 thrpt 8 45509.317 ? 6.581 ops/ms >> MathBench.powDouble0Dot5Loop 0 thrpt 8 0.031 ? 0.001 ops/ms >> >> >> After: >> >> Benchmark (seed) Mode Cnt Score Error Units >> MathBench.powDouble0Dot5 0 thrpt 8 343354.892 ? 362.900 ops/ms >> MathBench.powDouble0Dot5Loop 0 thrpt 8 0.457 ? 0.001 ops/ms >> >> Benchmark (seed) Mode Cnt Score Error Units >> MathBench.powDouble0Dot5 0 thrpt 8 343421.559 ? 49.326 ops/ms >> MathBench.powDouble0Dot5Loop 0 thrpt 8 0.457 ? 0.001 ops/ms >> >> >> Testing: >> - tier1~3 on Linux/x64 >> >> Thanks, >> Best regards, >> Jie >> >> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/library_call.cpp#L1680 > > Please verify that the result is the same when run with -Xint (interpreter only) and -XX:TieredStopAtLevel=1 (C1 only). Maybe they need the same optimization. > Hi @vnkozlov @neliasso and @huishi-hs , > > Thanks for your review and comments. > While I was implementing the opt for C1 and the interpreter, I found that Math.pow(x, 0.5) and Math.sqrt(x) compute different values for x={-0.0, Double.NEGATIVE_INFINITY} [1]. > Let's put this issue on hold until we reach a conclusion on that question. > Thanks. > > [1] https://mail.openjdk.java.net/pipermail/core-libs-dev/2021-April/076195.html Hi all, According to the discussion [1], we can still perform pow(x, 0.5) => sqrt(x) when x >= 0.0 or x is +Inf/NaN. To make the code review easier, the whole optimization has been split into JDK-8265325 and JDK-8264945. - 1) JDK-8265325: Optimize StubRoutines::dpow() for Math.pow(x, 0.5) - 2) JDK-8264945: Optimize the code-gen for Math.pow(x, 0.5) I'll update this PR once JDK-8265325 is finished, since this PR depends on it. Thanks.
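The two problem inputs and the guarded replacement can be checked at the Java level with a small sketch like the one below. The class and the names `sqrtIsSafe` and `powHalf` are hypothetical and only illustrate the condition discussed in this thread; this is not the stub or C2 code. It assumes that -0.0 must be excluded from the "x >= 0.0" case, because `-0.0 >= 0.0` is true in Java.

```java
final class PowHalfSketch {
    // True for inputs where sqrt(x) and pow(x, 0.5) agree: x > 0 (including +Inf), +0.0, or NaN.
    static boolean sqrtIsSafe(double x) {
        // The raw bit pattern of +0.0 is 0L; -0.0 has the sign bit set and is rejected here.
        return x > 0.0 || Double.doubleToRawLongBits(x) == 0L || Double.isNaN(x);
    }

    // pow(x, 0.5) with the sqrt shortcut applied only when it is safe.
    static double powHalf(double x) {
        return sqrtIsSafe(x) ? Math.sqrt(x) : Math.pow(x, 0.5);
    }

    public static void main(String[] args) {
        // The two inputs where the plain replacement would change the result:
        System.out.println(Math.pow(-0.0, 0.5) + " vs " + Math.sqrt(-0.0));
        System.out.println(Math.pow(Double.NEGATIVE_INFINITY, 0.5)
                + " vs " + Math.sqrt(Double.NEGATIVE_INFINITY));
        System.out.println(powHalf(4.0));
    }
}
```

The bit-pattern comparison is used because `==` cannot distinguish +0.0 from -0.0.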
Best regards, Jie [1] https://mail.openjdk.java.net/pipermail/core-libs-dev/2021-April/076220.html ------------- PR: https://git.openjdk.java.net/jdk/pull/3404 From xgong at openjdk.java.net Fri Apr 16 07:16:36 2021 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Fri, 16 Apr 2021 07:16:36 GMT Subject: Integrated: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask In-Reply-To: References: Message-ID: On Mon, 29 Mar 2021 06:00:46 GMT, Xiaohong Gong wrote: > The Vector API defines different element types for floating point VectorMask. For example, the bitwise related APIs use "`long/int`", while data related APIs use "`double/float`". This makes some optimizations that based on the type checking cannot work well. > > For example, the VectorBox/Unbox elimination like `"VectorUnbox (VectorBox v) ==> v"` requires the types of output and > input are equal. Normally this is necessary. However, due to the different element type for floating point VectorMask with the same species, the VectorBox/Unbox pattern is optimized to: > > VectorLoadMask (VectorStoreMask vmask) > > Actually the types can be treated as the same one for such cases. And considering the vector mask representation is the same for > vectors with the same element size and vector length, it's safe to do the optimization: > > VectorLoadMask (VectorStoreMask vmask) ==> vmask > > The same issue exists for GVN which is based on the type of a node. Making the mask node's` hash()/cmp()` methods depends on the element size instead of the element type can fix it. This pull request has now been integrated. 
Changeset: e0151a6f Author: Xiaohong Gong Committer: Ningsheng Jian URL: https://git.openjdk.java.net/jdk/commit/e0151a6f Stats: 103 lines in 9 files changed: 102 ins; 0 del; 1 mod 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask Reviewed-by: kvn, vlivanov ------------- PR: https://git.openjdk.java.net/jdk/pull/3238 From rehn at openjdk.java.net Fri Apr 16 07:18:36 2021 From: rehn at openjdk.java.net (Robbin Ehn) Date: Fri, 16 Apr 2021 07:18:36 GMT Subject: RFR: 8264480: Unreachable code in nmethod.cpp inside #ifdef DEBUG In-Reply-To: References: Message-ID: On Thu, 15 Apr 2021 12:50:14 GMT, Robbin Ehn wrote: > Fixed ifdef to use correct define. > > Stress tested locally, running t1-5. Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/3517 From rehn at openjdk.java.net Fri Apr 16 07:22:34 2021 From: rehn at openjdk.java.net (Robbin Ehn) Date: Fri, 16 Apr 2021 07:22:34 GMT Subject: RFR: 8264480: Unreachable code in nmethod.cpp inside #ifdef DEBUG In-Reply-To: References: Message-ID: On Thu, 15 Apr 2021 12:50:14 GMT, Robbin Ehn wrote: > Fixed ifdef to use correct define. > > Stress tested locally, running t1-5. Passed t1-5. ------------- PR: https://git.openjdk.java.net/jdk/pull/3517 From rehn at openjdk.java.net Fri Apr 16 07:22:34 2021 From: rehn at openjdk.java.net (Robbin Ehn) Date: Fri, 16 Apr 2021 07:22:34 GMT Subject: Integrated: 8264480: Unreachable code in nmethod.cpp inside #ifdef DEBUG In-Reply-To: References: Message-ID: On Thu, 15 Apr 2021 12:50:14 GMT, Robbin Ehn wrote: > Fixed ifdef to use correct define. > > Stress tested locally, running t1-5. This pull request has now been integrated. 
Changeset: 50f3da8d Author: Robbin Ehn URL: https://git.openjdk.java.net/jdk/commit/50f3da8d Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8264480: Unreachable code in nmethod.cpp inside #ifdef DEBUG Reviewed-by: chagedorn, kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/3517 From jiefu at openjdk.java.net Fri Apr 16 07:30:11 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 16 Apr 2021 07:30:11 GMT Subject: RFR: 8265325: Optimize StubRoutines::dpow() for Math.pow(x, 0.5) [v2] In-Reply-To: References: Message-ID: > Hi all, > > I'd like to optimize the StubRoutines::dpow() for Math.pow(x, 0.5). > > In the pow and sqrt discussion [1], Joe taught me that the Java library implementation of pow has been optimized for pow(x, 2.0) [2] and pow(x, 0.5) [3]. > However, the hotspot StubRoutines::dpow() only implements the same opt for pow(x, 2.0), but still not for pow(x, 0.5). > This patch optimizes StubRoutines::dpow() for pow(x, 0.5). > > Although not all Math.pow(x, 0.5) can be replaced with sqrt(x), we can still do it safely for the following cases: > 1) x >= 0.0 (fully implemented) > 2) x is +Inf (fully implemented) > 3) x is NaN (can be further divided into +NaN and -NaN and only +NaN is implemented) > > The effect of this opt has been tested on serveral platforms showing 3.0x ~ 6.3x performance improvement. > And no performance drop was observed. > > Testing: > - tier1 ~ tier3 on Linux/x64 > > Thanks. 
> Best regards, > Jie > > [1] https://mail.openjdk.java.net/pipermail/core-libs-dev/2021-April/076220.html > [2] https://github.com/openjdk/jdk/blob/d84a7e55be40eae57b6c322694d55661a5053a55/src/java.base/share/classes/java/lang/FdLibm.java#L362 > [3] https://github.com/openjdk/jdk/blob/d84a7e55be40eae57b6c322694d55661a5053a55/src/java.base/share/classes/java/lang/FdLibm.java#L364 > > Detailed performance numbers: > * Linux/Intel > > --------- Before ----------- > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble 0 thrpt 8 218783.605 ? 838.379 ops/ms > MathBench.powDouble0Dot5 0 thrpt 8 45498.351 ? 7.558 ops/ms > MathBench.powDouble0Dot5Const 0 thrpt 8 45243.530 ? 1097.100 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.031 ? 0.001 ops/ms > MathBench.powDoubleLoop 0 thrpt 8 0.031 ? 0.001 ops/ms > StrictMathBench.powDouble N/A thrpt 8 176106.602 ? 13127.650 ops/ms > ---------------------------- > > --------- After ----------- > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble 0 thrpt 8 219930.462 ? 181.922 ops/ms > MathBench.powDouble0Dot5 0 thrpt 8 204966.834 ? 329.032 ops/ms <-- 4.5x up > MathBench.powDouble0Dot5Const 0 thrpt 8 203004.302 ? 684.072 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.121 ? 0.001 ops/ms <-- 3.9x up > MathBench.powDoubleLoop 0 thrpt 8 0.031 ? 0.001 ops/ms > StrictMathBench.powDouble N/A thrpt 8 178818.861 ? 16235.465 ops/ms > ---------------------------- > > > * Linux/AMD > > --------- Before ----------- > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble 0 thrpt 8 100741.348 ? 207.766 ops/ms > MathBench.powDouble0Dot5 0 thrpt 8 33896.623 ? 103.352 ops/ms > MathBench.powDouble0Dot5Const 0 thrpt 8 34195.944 ? 230.703 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.039 ? 0.001 ops/ms > MathBench.powDoubleLoop 0 thrpt 8 0.038 ? 0.001 ops/ms > StrictMathBench.powDouble N/A thrpt 8 72000.166 ? 
135.002 ops/ms > ---------------------------- > > --------- After ----------- > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble 0 thrpt 8 100738.866 ? 222.820 ops/ms > MathBench.powDouble0Dot5 0 thrpt 8 100799.098 ? 95.537 ops/ms <-- 3.0x up > MathBench.powDouble0Dot5Const 0 thrpt 8 100765.571 ? 178.436 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.244 ? 0.002 ops/ms <-- 6.3x up > MathBench.powDoubleLoop 0 thrpt 8 0.038 ? 0.001 ops/ms > StrictMathBench.powDouble N/A thrpt 8 71758.725 ? 339.660 ops/ms > ---------------------------- > > > * MacOS/Intel > > --------- Before ----------- > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble 0 thrpt 8 238064.722 ? 5181.318 ops/ms > MathBench.powDouble0Dot5 0 thrpt 8 59235.979 ? 2046.519 ops/ms > MathBench.powDouble0Dot5Const 0 thrpt 8 59695.014 ? 1079.692 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.040 ? 0.001 ops/ms > MathBench.powDoubleLoop 0 thrpt 8 0.041 ? 0.001 ops/ms > StrictMathBench.powDouble N/A thrpt 8 238391.026 ? 2743.385 ops/ms > ---------------------------- > > --------- After ----------- > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble 0 thrpt 8 238582.414 ? 3661.261 ops/ms > MathBench.powDouble0Dot5 0 thrpt 8 224102.701 ? 2846.892 ops/ms <-- 3.8x up > MathBench.powDouble0Dot5Const 0 thrpt 8 224542.331 ? 19027.596 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.158 ? 0.002 ops/ms <-- 4.0x up > MathBench.powDoubleLoop 0 thrpt 8 0.041 ? 0.001 ops/ms > StrictMathBench.powDouble N/A thrpt 8 233689.504 ? 
10141.034 ops/ms > ---------------------------- Jie Fu has updated the pull request incrementally with one additional commit since the last revision: Add test for x=0.0 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3536/files - new: https://git.openjdk.java.net/jdk/pull/3536/files/392a5b92..a97cb957 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3536&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3536&range=00-01 Stats: 4 lines in 1 file changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3536.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3536/head:pull/3536 PR: https://git.openjdk.java.net/jdk/pull/3536 From thartmann at openjdk.java.net Fri Apr 16 08:11:40 2021 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Fri, 16 Apr 2021 08:11:40 GMT Subject: RFR: 8265323: Leftover local variables in PcDesc In-Reply-To: References: Message-ID: On Fri, 16 Apr 2021 03:16:30 GMT, Yi Yang wrote: > Leftover local variables in PcDesc. Remove it to save electric power. Marked as reviewed by thartmann (Reviewer). src/hotspot/share/code/pcDesc.cpp line 46: > 44: #ifndef PRODUCT > 45: ResourceMark rm; > 46: st->print_cr("PcDesc(pc=" PTR_FORMAT " offset=%x bits=%x):", p2i(real_pc(code)), pc_offset(), _flags); This changes the output from PcDesc(pc=0x00007fe00d12475f offset=ffffffff bits=0): PcDesc(pc=0x00007fe00d12478c offset=2c bits=0): Test::test3 at -1 (line 33) ``` to PcDesc(pc=0x00007f4fc9035a5f offset=ffffffff bits=0): PcDesc(pc=0x00007f4fc9035a8c offset=2c bits=0): Test::test3 at -1 (line 33) ``` But since that is more inline with the output of `ScopeDesc::print_on`, that's fine with me. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3532 From thartmann at openjdk.java.net Fri Apr 16 08:13:34 2021 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Fri, 16 Apr 2021 08:13:34 GMT Subject: RFR: 8265245: depChecker_ don't have any functionalities In-Reply-To: <1JKPxPdQ9XoTDXdIP8P_eBy6EqYZhEoD8SACCs0MfSs=.d5cd2e5f-6c11-4be8-8651-3931368d1471@github.com> References: <1JKPxPdQ9XoTDXdIP8P_eBy6EqYZhEoD8SACCs0MfSs=.d5cd2e5f-6c11-4be8-8651-3931368d1471@github.com> Message-ID: <7KW8AJT9EL9Mt0Tebhtooow-7zbDcwEbeoG21HURdG8=.12c7367e-11fc-4a45-b9c1-23f97321799d@github.com> On Fri, 16 Apr 2021 03:10:12 GMT, Yi Yang wrote: > Yet another OpenJDK6 old guy. > > src/hotspot/cpu//depChecker_.hpp/cpp are included by src/hotspot/share/compiler/disassembler.cpp, they don't provide any functionality. > > In the absence of strong demand either in existing ARCHs or future ARCHs, I think we can remove it. Looks good to me. ------------- Marked as reviewed by thartmann (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3531 From roland at openjdk.java.net Fri Apr 16 08:45:06 2021 From: roland at openjdk.java.net (Roland Westrelin) Date: Fri, 16 Apr 2021 08:45:06 GMT Subject: RFR: 8264958: C2 compilation fails with assert "n is later than its clone" Message-ID: <3LVvNT_SSyzcC2M7fT-WGZ0nf_0a1QOSgdG3uUjmVaI=.2d00656f-ecd7-4635-a81d-a0a3f8e913b9@github.com> JDK-8229483 added logic to hoist a load that would wrongly end up in an outer strip mined loop. That logic checks that it's legal to do so with: is_dominator(n_ctrl, x_head) but that test passes for n_ctrl == x_head when it's not legal to hoist the load i.e. the test we want is for strict domination. The fix I propose is to add an explicit check for that case. 
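The difference between domination and strict domination can be illustrated with a toy dominator tree. This is a sketch only: the class, the string node names and the `idom` map are hypothetical and unrelated to HotSpot's actual `PhaseIdealLoop` data structures. The only property taken from the description above is that the dominance test passes when both arguments are the same node.

```java
import java.util.HashMap;
import java.util.Map;

final class DominatorSketch {
    // Maps each node to its immediate dominator; the root has no entry.
    private final Map<String, String> idom = new HashMap<>();

    void setIdom(String node, String dominator) {
        idom.put(node, dominator);
    }

    // Reflexive test: walks up the dominator tree from n and also accepts d == n.
    boolean isDominator(String d, String n) {
        String current = n;
        while (current != null) {
            if (current.equals(d)) {
                return true;
            }
            current = idom.get(current);
        }
        return false;
    }

    // Strict test: the same walk, but with the d == n case rejected explicitly.
    boolean isStrictDominator(String d, String n) {
        return !d.equals(n) && isDominator(d, n);
    }

    public static void main(String[] args) {
        DominatorSketch tree = new DominatorSketch();
        tree.setIdom("head", "root");
        tree.setIdom("body", "head");
        System.out.println(tree.isDominator("head", "head"));       // reflexive case
        System.out.println(tree.isStrictDominator("head", "head")); // rejected by the explicit check
    }
}
```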
------------- Commit messages: - test - fix Changes: https://git.openjdk.java.net/jdk/pull/3539/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3539&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264958 Stats: 58 lines in 2 files changed: 57 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3539.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3539/head:pull/3539 PR: https://git.openjdk.java.net/jdk/pull/3539 From neliasso at openjdk.java.net Fri Apr 16 09:18:40 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Fri, 16 Apr 2021 09:18:40 GMT Subject: RFR: 8264958: C2 compilation fails with assert "n is later than its clone" In-Reply-To: <3LVvNT_SSyzcC2M7fT-WGZ0nf_0a1QOSgdG3uUjmVaI=.2d00656f-ecd7-4635-a81d-a0a3f8e913b9@github.com> References: <3LVvNT_SSyzcC2M7fT-WGZ0nf_0a1QOSgdG3uUjmVaI=.2d00656f-ecd7-4635-a81d-a0a3f8e913b9@github.com> Message-ID: <10meL3nI3zWykDGjFlH6L7N1hU716jl9SJ3VOSRfASY=.ab2873db-c091-498f-bd05-7103efde5da0@github.com> On Fri, 16 Apr 2021 08:36:31 GMT, Roland Westrelin wrote: > JDK-8229483 added logic to hoist a load that would wrongly end up in > an outer strip mined loop. That logic checks that it's legal to do so > with: > > is_dominator(n_ctrl, x_head) > > but that test passes for n_ctrl == x_head when it's not legal to hoist > the load i.e. the test we want is for strict domination. The fix I > propose is to add an explicit check for that case. Looks good! ------------- Marked as reviewed by neliasso (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3539 From neliasso at openjdk.java.net Fri Apr 16 09:21:36 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Fri, 16 Apr 2021 09:21:36 GMT Subject: RFR: 8265245: depChecker_ don't have any functionalities In-Reply-To: <1JKPxPdQ9XoTDXdIP8P_eBy6EqYZhEoD8SACCs0MfSs=.d5cd2e5f-6c11-4be8-8651-3931368d1471@github.com> References: <1JKPxPdQ9XoTDXdIP8P_eBy6EqYZhEoD8SACCs0MfSs=.d5cd2e5f-6c11-4be8-8651-3931368d1471@github.com> Message-ID: On Fri, 16 Apr 2021 03:10:12 GMT, Yi Yang wrote: > Yet another OpenJDK6 old guy. > > src/hotspot/cpu//depChecker_.hpp/cpp are included by src/hotspot/share/compiler/disassembler.cpp, they don't provide any functionality. > > In the absence of strong demand either in existing ARCHs or future ARCHs, I think we can remove it. Nice cleanup. Approved. ------------- Marked as reviewed by neliasso (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3531 From neliasso at openjdk.java.net Fri Apr 16 09:30:33 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Fri, 16 Apr 2021 09:30:33 GMT Subject: RFR: 8265323: Leftover local variables in PcDesc In-Reply-To: References: Message-ID: On Fri, 16 Apr 2021 03:16:30 GMT, Yi Yang wrote: > Leftover local variables in PcDesc. Remove it to save electric power. Approved. ------------- Marked as reviewed by neliasso (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3532 From rcastanedalo at openjdk.java.net Fri Apr 16 10:03:06 2021 From: rcastanedalo at openjdk.java.net (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Fri, 16 Apr 2021 10:03:06 GMT Subject: RFR: 8262462: IGV: cannot remove specific groups imported via network Message-ID: <4rcLWv7_UPyzmG6NRSLwBQs78yKuQyyveKYg4LzvMWY=.0bb709f7-d313-4bbf-ade3-278fbeb3fc3b@github.com> This change connects groups to their `GraphDocument` parents when the groups are imported via network, making it possible to remove the imported groups later. 
Before the change, this connection was made only when groups are imported from a file: https://github.com/openjdk/jdk/blob/1d66a155c711906fbb5e8e976fd6585ef491ea68/src/utils/IdealGraphVisualizer/Coordinator/src/main/java/com/sun/hotspot/igv/coordinator/actions/ImportAction.java#L147 Tested manually on JDK 8, 11, and 15 (latest version currently supported by IGV) by following the steps in the [bug report](https://bugs.openjdk.java.net/browse/JDK-8262462). ------------- Commit messages: - Set group's parent when a group is added to the graph document Changes: https://git.openjdk.java.net/jdk/pull/3515/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3515&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8262462 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3515.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3515/head:pull/3515 PR: https://git.openjdk.java.net/jdk/pull/3515 From neliasso at openjdk.java.net Fri Apr 16 10:12:34 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Fri, 16 Apr 2021 10:12:34 GMT Subject: RFR: 8262462: IGV: cannot remove specific groups imported via network In-Reply-To: <4rcLWv7_UPyzmG6NRSLwBQs78yKuQyyveKYg4LzvMWY=.0bb709f7-d313-4bbf-ade3-278fbeb3fc3b@github.com> References: <4rcLWv7_UPyzmG6NRSLwBQs78yKuQyyveKYg4LzvMWY=.0bb709f7-d313-4bbf-ade3-278fbeb3fc3b@github.com> Message-ID: On Thu, 15 Apr 2021 12:39:09 GMT, Roberto Castañeda Lozano wrote: > This change connects groups to their `GraphDocument` parents when the groups are imported via network, making it possible to remove the imported groups later.
Before the change, this connection was made only when groups are imported from a file: > > https://github.com/openjdk/jdk/blob/1d66a155c711906fbb5e8e976fd6585ef491ea68/src/utils/IdealGraphVisualizer/Coordinator/src/main/java/com/sun/hotspot/igv/coordinator/actions/ImportAction.java#L147 > > Tested manually on JDK 8, 11, and 15 (latest version currently supported by IGV) by following the steps in the [bug report](https://bugs.openjdk.java.net/browse/JDK-8262462). Looks good! You can consider this a 'trivial' change. ------------- Marked as reviewed by neliasso (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3515 From rcastanedalo at openjdk.java.net Fri Apr 16 10:18:44 2021 From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano) Date: Fri, 16 Apr 2021 10:18:44 GMT Subject: RFR: 8262462: IGV: cannot remove specific groups imported via network In-Reply-To: References: <4rcLWv7_UPyzmG6NRSLwBQs78yKuQyyveKYg4LzvMWY=.0bb709f7-d313-4bbf-ade3-278fbeb3fc3b@github.com> Message-ID: On Fri, 16 Apr 2021 10:09:39 GMT, Nils Eliasson wrote: > Looks good! > > You can consider this a 'trivial' change. Thanks for reviewing, Nils! ------------- PR: https://git.openjdk.java.net/jdk/pull/3515 From thartmann at openjdk.java.net Fri Apr 16 10:45:39 2021 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Fri, 16 Apr 2021 10:45:39 GMT Subject: RFR: 8264958: C2 compilation fails with assert "n is later than its clone" In-Reply-To: <3LVvNT_SSyzcC2M7fT-WGZ0nf_0a1QOSgdG3uUjmVaI=.2d00656f-ecd7-4635-a81d-a0a3f8e913b9@github.com> References: <3LVvNT_SSyzcC2M7fT-WGZ0nf_0a1QOSgdG3uUjmVaI=.2d00656f-ecd7-4635-a81d-a0a3f8e913b9@github.com> Message-ID: On Fri, 16 Apr 2021 08:36:31 GMT, Roland Westrelin wrote: > JDK-8229483 added logic to hoist a load that would wrongly end up in > an outer strip mined loop.
That logic checks that it's legal to do so > with: > > is_dominator(n_ctrl, x_head) > > but that test passes for n_ctrl == x_head when it's not legal to hoist > the load i.e. the test we want is for strict domination. The fix I > propose is to add an explicit check for that case. Looks good to me. ------------- Marked as reviewed by thartmann (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3539 From thartmann at openjdk.java.net Fri Apr 16 10:47:36 2021 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Fri, 16 Apr 2021 10:47:36 GMT Subject: RFR: 8265105: gc/arguments/TestSelectDefaultGC.java fails when compiler1 is disabled In-Reply-To: References: Message-ID: On Tue, 13 Apr 2021 05:46:15 GMT, SUN Guoyun wrote: > On MIPS64 platform has not impliment C1,only has C2. > so when tiered compilation is off, it is unnecessary to set client emulation mode flags. > perhaps this bug be included by https://bugs.openjdk.java.net/browse/JDK-8251462 @veresov should have a look at this. ------------- PR: https://git.openjdk.java.net/jdk/pull/3449 From roland at openjdk.java.net Fri Apr 16 11:37:38 2021 From: roland at openjdk.java.net (Roland Westrelin) Date: Fri, 16 Apr 2021 11:37:38 GMT Subject: RFR: 8264958: C2 compilation fails with assert "n is later than its clone" In-Reply-To: <10meL3nI3zWykDGjFlH6L7N1hU716jl9SJ3VOSRfASY=.ab2873db-c091-498f-bd05-7103efde5da0@github.com> References: <3LVvNT_SSyzcC2M7fT-WGZ0nf_0a1QOSgdG3uUjmVaI=.2d00656f-ecd7-4635-a81d-a0a3f8e913b9@github.com> <10meL3nI3zWykDGjFlH6L7N1hU716jl9SJ3VOSRfASY=.ab2873db-c091-498f-bd05-7103efde5da0@github.com> Message-ID: On Fri, 16 Apr 2021 09:15:39 GMT, Nils Eliasson wrote: >> JDK-8229483 added logic to hoist a load that would wrongly end up in >> an outer strip mined loop. That logic checks that it's legal to do so >> with: >> >> is_dominator(n_ctrl, x_head) >> >> but that test passes for n_ctrl == x_head when it's not legal to hoist >> the load i.e. the test we want is for strict domination. 
The fix I >> propose is to add an explicit check for that case. > > Looks good! @neliasso @TobiHartmann thanks for the reviews ------------- PR: https://git.openjdk.java.net/jdk/pull/3539 From roland at openjdk.java.net Fri Apr 16 11:37:41 2021 From: roland at openjdk.java.net (Roland Westrelin) Date: Fri, 16 Apr 2021 11:37:41 GMT Subject: Integrated: 8264958: C2 compilation fails with assert "n is later than its clone" In-Reply-To: <3LVvNT_SSyzcC2M7fT-WGZ0nf_0a1QOSgdG3uUjmVaI=.2d00656f-ecd7-4635-a81d-a0a3f8e913b9@github.com> References: <3LVvNT_SSyzcC2M7fT-WGZ0nf_0a1QOSgdG3uUjmVaI=.2d00656f-ecd7-4635-a81d-a0a3f8e913b9@github.com> Message-ID: On Fri, 16 Apr 2021 08:36:31 GMT, Roland Westrelin wrote: > JDK-8229483 added logic to hoist a load that would wrongly end up in > an outer strip mined loop. That logic checks that it's legal to do so > with: > > is_dominator(n_ctrl, x_head) > > but that test passes for n_ctrl == x_head when it's not legal to hoist > the load i.e. the test we want is for strict domination. The fix I > propose is to add an explicit check for that case. This pull request has now been integrated. 
Changeset: 71373280 Author: Roland Westrelin URL: https://git.openjdk.java.net/jdk/commit/71373280 Stats: 58 lines in 2 files changed: 57 ins; 0 del; 1 mod 8264958: C2 compilation fails with assert "n is later than its clone" Reviewed-by: neliasso, thartmann ------------- PR: https://git.openjdk.java.net/jdk/pull/3539 From rcastanedalo at openjdk.java.net Fri Apr 16 11:52:36 2021 From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano) Date: Fri, 16 Apr 2021 11:52:36 GMT Subject: Integrated: 8262462: IGV: cannot remove specific groups imported via network In-Reply-To: <4rcLWv7_UPyzmG6NRSLwBQs78yKuQyyveKYg4LzvMWY=.0bb709f7-d313-4bbf-ade3-278fbeb3fc3b@github.com> References: <4rcLWv7_UPyzmG6NRSLwBQs78yKuQyyveKYg4LzvMWY=.0bb709f7-d313-4bbf-ade3-278fbeb3fc3b@github.com> Message-ID: On Thu, 15 Apr 2021 12:39:09 GMT, Roberto Castañeda Lozano wrote: > This change connects groups to their `GraphDocument` parents when the groups are imported via network, making it possible to remove the imported groups later. Before the change, this connection was made only when groups are imported from a file: > > https://github.com/openjdk/jdk/blob/1d66a155c711906fbb5e8e976fd6585ef491ea68/src/utils/IdealGraphVisualizer/Coordinator/src/main/java/com/sun/hotspot/igv/coordinator/actions/ImportAction.java#L147 > > Tested manually on JDK 8, 11, and 15 (latest version currently supported by IGV) by following the steps in the [bug report](https://bugs.openjdk.java.net/browse/JDK-8262462). This pull request has now been integrated.
Changeset: 10ec38f8 Author: Roberto Castañeda Lozano URL: https://git.openjdk.java.net/jdk/commit/10ec38f8 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod 8262462: IGV: cannot remove specific groups imported via network Reviewed-by: neliasso ------------- PR: https://git.openjdk.java.net/jdk/pull/3515 From rcastanedalo at openjdk.java.net Fri Apr 16 12:04:43 2021 From: rcastanedalo at openjdk.java.net (Roberto Castañeda Lozano) Date: Fri, 16 Apr 2021 12:04:43 GMT Subject: RFR: 8262725: IGV: crash when removing all graphs in a group Message-ID: This change makes IGV close graph views when their groups are either a) removed or b) emptied (all graphs are removed but the group remains), and avoids faulty graph view computation in case b). Tested the following scenarios manually (on groups loaded both via network and from a file, and on JDK 8, 11, and 15): 1. open a graph, then remove all graphs in the group (as described in the [bug report](https://bugs.openjdk.java.net/browse/JDK-8262725)); 2. open a graph, then remove its group; and 3. open a graph, then remove all graphs and groups (File -> Remove all graphs and groups).
------------- Commit messages: - Close graph views when their groups are removed or emptied Changes: https://git.openjdk.java.net/jdk/pull/3540/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3540&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8262725 Stats: 33 lines in 2 files changed: 31 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3540.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3540/head:pull/3540 PR: https://git.openjdk.java.net/jdk/pull/3540 From iveresov at openjdk.java.net Fri Apr 16 14:22:37 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Fri, 16 Apr 2021 14:22:37 GMT Subject: RFR: 8265105: gc/arguments/TestSelectDefaultGC.java fails when compiler1 is disabled In-Reply-To: References: Message-ID: On Tue, 13 Apr 2021 05:46:15 GMT, SUN Guoyun wrote: > On MIPS64 platform has not impliment C1,only has C2. > so when tiered compilation is off, it is unnecessary to set client emulation mode flags. > perhaps this bug be included by https://bugs.openjdk.java.net/browse/JDK-8251462 So, you want the VM to ignore the ```NeverActAsServerClassMachine``` if it has C2 and not C1? ------------- PR: https://git.openjdk.java.net/jdk/pull/3449 From enikitin at openjdk.java.net Fri Apr 16 14:36:38 2021 From: enikitin at openjdk.java.net (Evgeny Nikitin) Date: Fri, 16 Apr 2021 14:36:38 GMT Subject: Integrated: 8262060: compiler/whitebox/BlockingCompilation.java timed out In-Reply-To: References: Message-ID: On Fri, 26 Mar 2021 20:22:14 GMT, Evgeny Nikitin wrote: > Please review this small fix for a test. > > Test fails sometimes when run with UsageTracker enabled. For some reason, a loading of ThreadLocalRandom can happen during the test run, and this invalidates Random.nextInt method (because it's not the only implementation now). > > Fixed by pre-loading ThreadLocalRandom. Tested by multiple runs with UsageTracker enabled - approx. 
1 out of 20-30 test runs fails without the fix and no failures spotted with the fix.

This pull request has now been integrated.

Changeset: 694e1cdc
Author:    Evgeny Nikitin
Committer: Igor Ignatyev
URL:       https://git.openjdk.java.net/jdk/commit/694e1cdc
Stats:     3 lines in 1 file changed: 0 ins; 2 del; 1 mod

8262060: compiler/whitebox/BlockingCompilation.java timed out

Reviewed-by: iignatyev

-------------

PR: https://git.openjdk.java.net/jdk/pull/3224

From kvn at openjdk.java.net Fri Apr 16 17:28:38 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Fri, 16 Apr 2021 17:28:38 GMT
Subject: RFR: 8265325: Optimize StubRoutines::dpow() for Math.pow(x, 0.5) [v2]
In-Reply-To:
References:
Message-ID:

On Fri, 16 Apr 2021 07:30:11 GMT, Jie Fu wrote:

>> Hi all,
>>
>> I'd like to optimize the StubRoutines::dpow() for Math.pow(x, 0.5).
>>
>> In the pow and sqrt discussion [1], Joe taught me that the Java library implementation of pow has been optimized for pow(x, 2.0) [2] and pow(x, 0.5) [3].
>> However, the hotspot StubRoutines::dpow() only implements the same opt for pow(x, 2.0), but still not for pow(x, 0.5).
>> This patch optimizes StubRoutines::dpow() for pow(x, 0.5).
>>
>> Although not all Math.pow(x, 0.5) can be replaced with sqrt(x), we can still do it safely for the following cases:
>> 1) x >= 0.0 (fully implemented)
>> 2) x is +Inf (fully implemented)
>> 3) x is NaN (can be further divided into +NaN and -NaN and only +NaN is implemented)
>>
>> The effect of this opt has been tested on several platforms showing 3.0x ~ 6.3x performance improvement.
>> And no performance drop was observed.
>>
>> Testing:
>> - tier1 ~ tier3 on Linux/x64
>>
>> Thanks.
>> Best regards,
>> Jie
>>
>> [1] https://mail.openjdk.java.net/pipermail/core-libs-dev/2021-April/076220.html
>> [2] https://github.com/openjdk/jdk/blob/d84a7e55be40eae57b6c322694d55661a5053a55/src/java.base/share/classes/java/lang/FdLibm.java#L362
>> [3] https://github.com/openjdk/jdk/blob/d84a7e55be40eae57b6c322694d55661a5053a55/src/java.base/share/classes/java/lang/FdLibm.java#L364
>>
>> Detailed performance numbers:
>> * Linux/Intel
>>
>> --------- Before -----------
>> Benchmark                      (seed)   Mode  Cnt       Score       Error   Units
>> MathBench.powDouble                 0  thrpt    8  218783.605 ±   838.379  ops/ms
>> MathBench.powDouble0Dot5            0  thrpt    8   45498.351 ±     7.558  ops/ms
>> MathBench.powDouble0Dot5Const       0  thrpt    8   45243.530 ±  1097.100  ops/ms
>> MathBench.powDouble0Dot5Loop        0  thrpt    8       0.031 ±     0.001  ops/ms
>> MathBench.powDoubleLoop             0  thrpt    8       0.031 ±     0.001  ops/ms
>> StrictMathBench.powDouble         N/A  thrpt    8  176106.602 ± 13127.650  ops/ms
>> ----------------------------
>>
>> --------- After -----------
>> Benchmark                      (seed)   Mode  Cnt       Score       Error   Units
>> MathBench.powDouble                 0  thrpt    8  219930.462 ±   181.922  ops/ms
>> MathBench.powDouble0Dot5            0  thrpt    8  204966.834 ±   329.032  ops/ms  <-- 4.5x up
>> MathBench.powDouble0Dot5Const       0  thrpt    8  203004.302 ±   684.072  ops/ms
>> MathBench.powDouble0Dot5Loop        0  thrpt    8       0.121 ±     0.001  ops/ms  <-- 3.9x up
>> MathBench.powDoubleLoop             0  thrpt    8       0.031 ±     0.001  ops/ms
>> StrictMathBench.powDouble         N/A  thrpt    8  178818.861 ± 16235.465  ops/ms
>> ----------------------------
>>
>>
>> * Linux/AMD
>>
>> --------- Before -----------
>> Benchmark                      (seed)   Mode  Cnt       Score       Error   Units
>> MathBench.powDouble                 0  thrpt    8  100741.348 ±   207.766  ops/ms
>> MathBench.powDouble0Dot5            0  thrpt    8   33896.623 ±   103.352  ops/ms
>> MathBench.powDouble0Dot5Const       0  thrpt    8   34195.944 ±   230.703  ops/ms
>> MathBench.powDouble0Dot5Loop        0  thrpt    8       0.039 ±     0.001  ops/ms
>> MathBench.powDoubleLoop             0  thrpt    8       0.038 ±     0.001  ops/ms
>> StrictMathBench.powDouble         N/A  thrpt    8   72000.166 ±   135.002  ops/ms
>> ----------------------------
>>
>> --------- After -----------
>> Benchmark                      (seed)   Mode  Cnt       Score       Error   Units
>> MathBench.powDouble                 0  thrpt    8  100738.866 ±   222.820  ops/ms
>> MathBench.powDouble0Dot5            0  thrpt    8  100799.098 ±    95.537  ops/ms  <-- 3.0x up
>> MathBench.powDouble0Dot5Const       0  thrpt    8  100765.571 ±   178.436  ops/ms
>> MathBench.powDouble0Dot5Loop        0  thrpt    8       0.244 ±     0.002  ops/ms  <-- 6.3x up
>> MathBench.powDoubleLoop             0  thrpt    8       0.038 ±     0.001  ops/ms
>> StrictMathBench.powDouble         N/A  thrpt    8   71758.725 ±   339.660  ops/ms
>> ----------------------------
>>
>>
>> * MacOS/Intel
>>
>> --------- Before -----------
>> Benchmark                      (seed)   Mode  Cnt       Score       Error   Units
>> MathBench.powDouble                 0  thrpt    8  238064.722 ±  5181.318  ops/ms
>> MathBench.powDouble0Dot5            0  thrpt    8   59235.979 ±  2046.519  ops/ms
>> MathBench.powDouble0Dot5Const       0  thrpt    8   59695.014 ±  1079.692  ops/ms
>> MathBench.powDouble0Dot5Loop        0  thrpt    8       0.040 ±     0.001  ops/ms
>> MathBench.powDoubleLoop             0  thrpt    8       0.041 ±     0.001  ops/ms
>> StrictMathBench.powDouble         N/A  thrpt    8  238391.026 ±  2743.385  ops/ms
>> ----------------------------
>>
>> --------- After -----------
>> Benchmark                      (seed)   Mode  Cnt       Score       Error   Units
>> MathBench.powDouble                 0  thrpt    8  238582.414 ±  3661.261  ops/ms
>> MathBench.powDouble0Dot5            0  thrpt    8  224102.701 ±  2846.892  ops/ms  <-- 3.8x up
>> MathBench.powDouble0Dot5Const       0  thrpt    8  224542.331 ± 19027.596  ops/ms
>> MathBench.powDouble0Dot5Loop        0  thrpt    8       0.158 ±     0.002  ops/ms  <-- 4.0x up
>> MathBench.powDoubleLoop             0  thrpt    8       0.041 ±     0.001  ops/ms
>> StrictMathBench.powDouble         N/A  thrpt    8  233689.504 ± 10141.034  ops/ms
>> ----------------------------
>
> Jie Fu has updated the pull request incrementally with one additional commit since the last revision:
>
>   Add test for x=0.0

Thank you for testing all modes.
Note, when StubRoutines::dpow() is not used (-XX:-UseLibmIntrinsic or -XX:DisableIntrinsic=_pow) we use C code which also has this (and other) optimizations already:
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntimeTrans.cpp#L483

src/hotspot/cpu/x86/macroAssembler_x86_pow.cpp line 2:

> 1: /*
> 2: * Copyright (c) 2016, 2021, Intel Corporation.

Please don't modify the copyright line of another company. Add a new line for your company.

test/hotspot/jtreg/compiler/intrinsics/math/TestPow0Dot5Opt.java line 41:

> 39:         if (a < 0.0) return;
> 40:
> 41:         double r1 = Math.sqrt(a);

`r1` should be a static value to avoid compiling the `sqrt()` method as an intrinsic.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3536

From kvn at openjdk.java.net Fri Apr 16 17:31:37 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Fri, 16 Apr 2021 17:31:37 GMT
Subject: RFR: 8265322: C2: Simplify control inputs for BarrierSetC2::obj_allocate
In-Reply-To: <_NicB3VpqFwTiu2-bKLNhfnQAxJmbtND66g3Md5ck5g=.f9b4edcd-fa2b-4917-9092-bccc38b202df@github.com>
References: <_NicB3VpqFwTiu2-bKLNhfnQAxJmbtND66g3Md5ck5g=.f9b4edcd-fa2b-4917-9092-bccc38b202df@github.com>
Message-ID:

On Fri, 16 Apr 2021 03:06:19 GMT, Yi Yang wrote:

> This PR simplifies control inputs for BarrierSetC2::obj_allocate. In most cases, it doesn't change anything since `toobig_false` is equivalent to `ctrl`. In rare case, `toobig_false` is created for Unsafe.allocateInstance while instance size is not statically known, `ctrl` would become control input of IfNode whose projects are `toobig_false` and `toobig_true`, old eden_end and old_eden_top can simply accept `toobig_false` as their control input rather than `ctrl`.

This should be reviewed by the GC group, who own this code.
-------------

PR: https://git.openjdk.java.net/jdk/pull/3529

From kvn at openjdk.java.net Fri Apr 16 17:39:40 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Fri, 16 Apr 2021 17:39:40 GMT
Subject: RFR: 8265322: C2: Simplify control inputs for BarrierSetC2::obj_allocate
In-Reply-To: <_NicB3VpqFwTiu2-bKLNhfnQAxJmbtND66g3Md5ck5g=.f9b4edcd-fa2b-4917-9092-bccc38b202df@github.com>
References: <_NicB3VpqFwTiu2-bKLNhfnQAxJmbtND66g3Md5ck5g=.f9b4edcd-fa2b-4917-9092-bccc38b202df@github.com>
Message-ID:

On Fri, 16 Apr 2021 03:06:19 GMT, Yi Yang wrote:

> This PR simplifies control inputs for BarrierSetC2::obj_allocate. In most cases, it doesn't change anything since `toobig_false` is equivalent to `ctrl`. In rare case, `toobig_false` is created for Unsafe.allocateInstance while instance size is not statically known, `ctrl` would become control input of IfNode whose projects are `toobig_false` and `toobig_true`, old eden_end and old_eden_top can simply accept `toobig_false` as their control input rather than `ctrl`.

From the compiler code POV the fix is reasonable and correct.

Note, the path when `initial_slow_test != NULL` is not rare. It is frequent for array allocation when `length` is not constant.

-------------

Marked as reviewed by kvn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3529

From iignatyev at openjdk.java.net Fri Apr 16 18:27:39 2021
From: iignatyev at openjdk.java.net (Igor Ignatyev)
Date: Fri, 16 Apr 2021 18:27:39 GMT
Subject: RFR: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests [v2]
In-Reply-To:
References: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com>
Message-ID:

On Thu, 15 Apr 2021 13:19:56 GMT, Christian Hagedorn wrote:

>> This RFE provides an IR test framework to perform regex-based checks on the C2 IR shape of test methods emitted by the VM flags `-XX:+PrintIdeal` and `-XX:+PrintOptoAssembly`.
The framework can also be used for other non-IR matching (and non-compiler) tests by providing easy to use annotations for commonly used testing patterns and compiler control flags. >> >> The framework is based on the ideas of the currently present IR test framework in [Valhalla](https://github.com/openjdk/valhalla/blob/e9c78ce4fcfd01361c35883e0d68f9ae5a80d079/test/hotspot/jtreg/compiler/valhalla/inlinetypes/InlineTypeTest.java) (mainly implemented by @TobiHartmann) which is being used with great success. This new framework aims to replace the old one in Valhalla at some point. >> >> A detailed description about how this new IR test framework works and how it is used is provided in the [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) file and in the [Javadocs](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc/jdk/test/lib/hotspot/ir_framework/package-summary.html) written for the framework classes. >> >> To finish a first version of this framework for JDK 17, I decided to leave some improvement possibilities and ideas to be followed up on in additional RFEs. Some ideas are mentioned in "Future Work" in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) and were also created as subtasks of this RFE. >> >> Testing (also described in "Internal Framework Tests in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md)): >> There are various tests to verify the correctness of the test framework which can be found as JTreg tests in the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) folder. 
Additional testing was performed by converting all compiler Inline Types test of project Valhalla (done by @katyapav in [JDK-8263024](https://bugs.openjdk.java.net/browse/JDK-8263024)) that used the old framework to the new framework. This provided additional testing for the framework itself. We ran the converted tests with all the flag settings used in hs-tier1-9 and hs-precheckin-comp. For sanity checking, this was also done with a sample IR test in mainline. >> >> Some stats about the framework code added to [ir_framework](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework): >> >> - without the [Javadocs files](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc) : 60 changed files, 13212 insertions, 0 deletions. >> - without the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) and [examples](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/examples) folder: 40 files changed, 6781 insertions >> - comments: 2399 insertions (calculated with `git diff --cached !(tests|examples) | grep -c -E "(^[+-]\s*(/)?*)|(^[+-]\s*//)"`) >> - which leaves 4382 lines of code inserted >> >> Big thanks to: >> - @TobiHartmann for all his help by discussing the new framework and for providing insights from his IR test framework in Valhalla. >> - @katyapav for converting the Valhalla tests to use the new framework which found some harder to catch bugs in the framework and also some actual C2 bugs. >> - @iignatev for helping to simplify the framework usage with JTreg and with the framework internal VM calling structure. >> - and others who provided valuable feedback. 
>>
>> Thanks,
>> Christian
>
> Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision:
>
>   Adjust whitelist

next portion of my comments, stay tuned for more to come :)

-- Igor

PS I must say that for some reason GitHub's 'Files changed' tab is unbelievably slow for me in this particular PR, so it's somewhat painful to even just find the place I want to associate my comment with (not to speak of actually using it to browse/read the code)

test/lib/jdk/test/lib/hotspot/ir_framework/AbstractInfo.java line 41:

> 39:  */
> 40: abstract public class AbstractInfo {
> 41:     private static final Random random = new Random();

you shouldn't use Random w/o a predefined seed as it might make failures harder to reproduce, please consider using `jdk.test.lib.Utils.getRandomInstance` or `jdk.test.lib.RandomFactory.getRandom` here

test/lib/jdk/test/lib/hotspot/ir_framework/AbstractInfo.java line 57:

> 55:      * @return an inverted boolean of the result of the last invocation of this method.
> 56:      */
> 57:     public boolean toggleBoolean() {

I don't think `toggleBoolean` really belongs to `AbstractInfo`, I'd rather move it into a (separate) aux class.

test/lib/jdk/test/lib/hotspot/ir_framework/ArgumentValue.java line 35:

> 33:  */
> 34: class ArgumentValue {
> 35:     private static final Random random = new Random();

the same comment as in `AbstractInfo`, please use the reproducible rng here.
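
The reproducibility concern raised above can be illustrated with a minimal, self-contained sketch. It does not use `jdk.test.lib.Utils.getRandomInstance` or `jdk.test.lib.RandomFactory.getRandom` (those live in the JDK test library, not in the standard API); the class name `ReproducibleRandom` and the property name `sketch.seed` are hypothetical and exist only for this example. The idea is simply to pick a seed, print it, and allow a rerun with the same seed.

```java
import java.util.Random;

final class ReproducibleRandom {
    // Hypothetical property name, chosen for this sketch only.
    static final String SEED_PROPERTY = "sketch.seed";

    /**
     * Returns a Random seeded from the system property if it is set,
     * otherwise from a freshly chosen seed. The seed is always printed
     * so that a failing run can be repeated with the same sequence.
     */
    static Random create() {
        long seed = Long.getLong(SEED_PROPERTY, new Random().nextLong());
        System.out.println("Using seed " + seed
                + " (rerun with -D" + SEED_PROPERTY + "=" + seed + ")");
        return new Random(seed);
    }

    public static void main(String[] args) {
        // Pin the seed to show that two generators then agree.
        System.setProperty(SEED_PROPERTY, "42");
        Random first = create();
        Random second = create();
        System.out.println(first.nextInt() == second.nextInt());
    }
}
```

Printing the seed on every run is the important part: without it, a failure that depends on the random sequence cannot be replayed.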
test/lib/jdk/test/lib/hotspot/ir_framework/IREncodingPrinter.java line 213:

> 211:
> 212:     private boolean checkBooleanFlag(String flag, String value, Boolean actualFlagValue) {
> 213:         boolean actualBooleanFlagValue = actualFlagValue;

as `actualFlagValue` can't be null, you don't need to use a boxed type here:
Suggestion:

    private boolean checkBooleanFlag(String flag, String value, boolean actualBooleanFlagValue) {

the same for `checkLongFlag`, `checkDoubleFlag`

test/lib/jdk/test/lib/hotspot/ir_framework/IRMatcher.java line 47:

> 45:
> 46:     public IRMatcher(String hotspotPidFileName, String irEncoding, Class testClass) {
> 47:         this.compilations = new LinkedHashMap<>();

why do we use LinkedHashMap here (as opposed to HashMap)? I haven't found the place where you need to traverse it in the insertion order.

test/lib/jdk/test/lib/hotspot/ir_framework/IRMatcher.java line 88:

> 86:         Map irRulesMap = new HashMap<>();
> 87:         String patternString = "(?<=" + IREncodingPrinter.START + "\\R)[\\s\\S]*(?=" + IREncodingPrinter.END + ")";
> 88:         Pattern pattern = Pattern.compile(patternString);

`patternString` is effectively a constant here, but you will compile it into a `java.util.regex.Pattern` on each invocation of `parseIREncoding`, it might be a better idea to introduce a static field that holds a precompiled pattern.

test/lib/jdk/test/lib/hotspot/ir_framework/IRMatcher.java line 98:

> 96:                 String[] splitComma = line.split(",");
> 97:                 if (splitComma.length < 2) {
> 98:                     throw new TestFrameworkException("Invalid IR match rule encoding");

will it make sense to include the offending line in the exception message here?
test/lib/jdk/test/lib/hotspot/ir_framework/IRMatcher.java line 101:

> 99:                 }
> 100:                 String testName = splitComma[0];
> 101:                 Integer[] irRulesIdx = new Integer[splitComma.length - 1];

you can actually use an array of int here, so there will be less wasted memory and no unboxing later on

test/lib/jdk/test/lib/hotspot/ir_framework/IRMatcher.java line 116:

> 114:     private void parseHotspotPidFile() {
> 115:         Map compileIdMap = new HashMap<>();
> 116:         try (BufferedReader br = new BufferedReader(new FileReader(new File(System.getProperty("user.dir") + File.separator + hotspotPidFileName)))) {

a more idiomatic/modern version would be
Suggestion:

        try (BufferedReader br = Files.newBufferedReader(Paths.get(System.getProperty("user.dir"), hotspotPidFileName))) {

I'm actually not sure if you really need `$user.dir`, won't the hotspot_pid file get generated in the current dir?

test/lib/jdk/test/lib/hotspot/ir_framework/IRMatcher.java line 207:

> 205:     private void addTestMethodCompileId(Map compileIdMap, String line) {
> 206:         Pattern pattern = Pattern.compile("compile_id='(\\d+)'.*" + Pattern.quote(testClass.getCanonicalName()) + " (\\S+)");
> 207:         Matcher matcher = pattern.matcher(line);

similarly to `parseIREncoding`, `pattern` here can be precomputed in `IRMatcher::IRMatcher` and stored into a final instance field.

test/lib/jdk/test/lib/hotspot/ir_framework/IRMatcher.java line 243:

> 241:     private String getMethodName(Map compileIdMap, String line) {
> 242:         Pattern pattern = Pattern.compile("compile_id='(\\d+)'");
> 243:         Matcher matcher = pattern.matcher(line);

again, the precompiled pattern can be saved into a (in this case) static field and reused.
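
The precompiled-pattern suggestion in the comments above can be sketched as follows. This is not the `IRMatcher` code: the class `CompileIdParser` and its method are hypothetical, and only the regular expression `compile_id='(\d+)'` is taken from the quoted source. The point is that a pattern whose text never changes is compiled once, when the class is initialized, rather than on every call.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CompileIdParser {
    // Compiled once at class initialization instead of on every invocation.
    private static final Pattern COMPILE_ID = Pattern.compile("compile_id='(\\d+)'");

    /** Extracts the first compile id found in the line, if there is one. */
    static Optional<Integer> compileId(String line) {
        Matcher matcher = COMPILE_ID.matcher(line);
        return matcher.find()
                ? Optional.of(Integer.parseInt(matcher.group(1)))
                : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(compileId("<task compile_id='42' method='Foo bar'>"));
        System.out.println(compileId("a line without an id"));
    }
}
```

A pattern that depends on instance state, such as the one built from `testClass.getCanonicalName()`, cannot be a static constant; as the review notes, it would instead be built once in the constructor and kept in a final instance field.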
test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 639:

> 637:             TestFormat.check(!scenarioIndices.contains(scenarioIndex),
> 638:                              "Cannot define two scenarios with the same index " + scenarioIndex);
> 639:             scenarioIndices.add(scenarioIndex);

you can use `Set::add` to verify that the element isn't in a set:
Suggestion:

            TestFormat.check(scenarioIndices.add(scenarioIndex),
                             "Cannot define two scenarios with the same index " + scenarioIndex);

test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 795:

> 793:     private void checkFlagVMExitCode(OutputAnalyzer oa) {
> 794:         String flagVMOutput = oa.getOutput();
> 795:         final int exitCode = oa.getExitValue();

nit: there is no need for this `final`

test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 804:

> 802:             System.err.println("--- OUTPUT TestFramework flag VM ---");
> 803:             System.err.println(flagVMOutput);
> 804:             throw new RuntimeException("\nTestFramework flag VM exited with " + exitCode);

what's the reason for `\n` at the beginning of this exception message?

test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 958:

> 956:             throw new TestFrameworkException("Internal Test Framework exception - please file a bug:\n" + failureMessage, e);
> 957:         }
> 958:     }

I am not convinced that we really need these guys when we already have `TestFormat::check` and `TestRun::check` (I'm actually not 100% convinced that we need check/fail in both TestFormat and TestRun)

test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 1027:

> 1025:     }
> 1026:
> 1027:     public static String getRerunHint() {

why is this a method and not just a public final static String?

test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 1043:

> 1041:  * Dedicated socket to send data from the flag and test VM back to the driver VM.
> 1042:  */
> 1043: class TestFrameworkSocket {

could you please move it to a separate .java file?
test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 1043:

> 1041:  * Dedicated socket to send data from the flag and test VM back to the driver VM.
> 1042:  */
> 1043: class TestFrameworkSocket {

I guess it should implement `AutoCloseable`, and then your try-catch-finally in `TestFramework::start` could be replaced by try-with-resources. btw, I don't like the fact that `socket` is a field of `TestFramework` with the lifetime bounded to the `start` method.

-------------

Changes requested by iignatyev (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3508

From iignatyev at openjdk.java.net Fri Apr 16 18:27:40 2021
From: iignatyev at openjdk.java.net (Igor Ignatyev)
Date: Fri, 16 Apr 2021 18:27:40 GMT
Subject: RFR: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests [v2]
In-Reply-To:
References: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com>
Message-ID:

On Fri, 16 Apr 2021 01:48:49 GMT, Igor Ignatyev wrote:

>> Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Adjust whitelist
>
> test/lib/jdk/test/lib/hotspot/ir_framework/TestFramework.java line 639:
>
>> 637:             TestFormat.check(!scenarioIndices.contains(scenarioIndex),
>> 638:                              "Cannot define two scenarios with the same index " + scenarioIndex);
>> 639:             scenarioIndices.add(scenarioIndex);
>
> you can use `Set::add` to verify that the element isn't in a set:
> Suggestion:
>
>             TestFormat.check(scenarioIndices.add(scenarioIndex),
>                              "Cannot define two scenarios with the same index " + scenarioIndex);

also, shouldn't this check be done as part of `addScenarios`?
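
The `Set::add` suggestion quoted above relies on the documented contract that `add` returns `false` when the element is already present, so the separate `contains` call is redundant. A minimal sketch, assuming a hypothetical helper class `ScenarioIndexCheck` that stands in for the framework's `TestFormat.check` call:

```java
import java.util.HashSet;
import java.util.Set;

final class ScenarioIndexCheck {
    /**
     * Throws if any index occurs more than once. Set.add returns false
     * for an element that is already in the set, which makes it both the
     * membership test and the insertion.
     */
    static void checkUnique(int... indices) {
        Set<Integer> seen = new HashSet<>();
        for (int index : indices) {
            if (!seen.add(index)) {
                throw new IllegalArgumentException(
                        "Cannot define two scenarios with the same index " + index);
            }
        }
    }

    public static void main(String[] args) {
        checkUnique(0, 1, 2);           // passes silently
        try {
            checkUnique(0, 1, 0);       // duplicate index 0
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```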
------------- PR: https://git.openjdk.java.net/jdk/pull/3508 From sviswanathan at openjdk.java.net Fri Apr 16 18:52:39 2021 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Fri, 16 Apr 2021 18:52:39 GMT Subject: RFR: 8265154: vinserti128 operand mix up for KNL platforms In-Reply-To: References: Message-ID: On Wed, 14 Apr 2021 15:23:26 GMT, Vladimir Kozlov wrote: >> There is a bug in macro assembler in vinserti128 special handling for platforms like KNL that do not support AVX512VL. >> >> The following: >> void vinserti128(XMMRegister dst, XMMRegister nds, XMMRegister src, uint8_t imm8) { >> if (UseAVX > 2 && VM_Version::supports_avx512novl()) { >> Assembler::vinserti32x4(dst, dst, src, imm8); >> } >> ... >> } >> >> Should have been: >> void vinserti128(XMMRegister dst, XMMRegister nds, XMMRegister src, uint8_t imm8) { >> if (UseAVX > 2 && VM_Version::supports_avx512novl()) { >> Assembler::vinserti32x4(dst, nds, src, imm8); >> } >> ... >> } >> >> Best Regards, >> Sandhya > > What problems this is causing? Would be nice to have a test to show the issue (if possible). @vnkozlov @TobiHartmann Thanks a lot for the review. I noticed this bug while running the vector api tests on KNL platform. The specific tests that are failing without this fix are: jdk/incubator/vector/Float256VectorTests.java (withFloat256VectorTests) jdk/incubator/vector/Double256VectorTests.java (withDouble256VectorTests) Majority of the places where vinserti128 is used in code gen, the dst and nds are passed as same register. Only in the implementation of VectorInsert node for vector api, the dst and nds are different. The vector api tests are run as part of tier 3. 
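To illustrate why the mix-up only shows when `dst` and `nds` are different registers, the following is a rough Java model of the insert operation, assuming the usual description of `vinserti128`: the result is a copy of the first source (`nds`) with one 128-bit half replaced by `src`, selected by bit 0 of `imm8`. A 256-bit register is modeled as four `long` lanes. The class and method names are invented for this sketch and do not correspond to any HotSpot code.

```java
import java.util.Arrays;

// Hypothetical model of a 256-bit insert; lanes are 64-bit longs.
final class VectorInsertModel {

    // Result = copy of nds with the 128-bit half chosen by imm8 bit 0
    // overwritten by the two lanes of src.
    static long[] insert128(long[] nds, long[] src, int imm8) {
        long[] result = nds.clone();
        int base = (imm8 & 1) * 2;
        result[base] = src[0];
        result[base + 1] = src[1];
        return result;
    }

    // Models the corrected call: the untouched half comes from nds.
    static long[] fixedPath(long[] dst, long[] nds, long[] src, int imm8) {
        return insert128(nds, src, imm8);
    }

    // Models the mixed-up call: the untouched half comes from dst's old value.
    static long[] buggyPath(long[] dst, long[] nds, long[] src, int imm8) {
        return insert128(dst, src, imm8);
    }

    public static void main(String[] args) {
        long[] dst = {1, 2, 3, 4};
        long[] nds = {5, 6, 7, 8};
        long[] src = {9, 10};
        // Same register for dst and nds: both paths agree.
        System.out.println(Arrays.equals(fixedPath(dst, dst, src, 1),
                                         buggyPath(dst, dst, src, 1)));
        // Different registers: the results diverge in the untouched half.
        System.out.println(Arrays.toString(fixedPath(dst, nds, src, 1)));
        System.out.println(Arrays.toString(buggyPath(dst, nds, src, 1)));
    }
}
```

This matches the observation in the message above that most code-generation sites pass the same register for both operands, which would hide the problem.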
------------- PR: https://git.openjdk.java.net/jdk/pull/3480 From dcubed at openjdk.java.net Fri Apr 16 19:35:04 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 16 Apr 2021 19:35:04 GMT Subject: RFR: 8265358: ProblemList jdk/jshell/ToolBasicTest.java on macOS-aarch64 Message-ID: <_-5JlNnDNuQFLLctYTMcmdAgWJOALgf4C9y-VyWiPVc=.86c6e1e3-14ae-474a-ab65-c0a0574ae96e@github.com> A set of trivial ProblemListing for macos-aarch64 Tier2 test failures: - JDK-8265358 ProblemList jdk/jshell/ToolBasicTest.java on macOS-aarch64 - JDK-8265361 ProblemList a few compiler/whitebox tests on macos-aarch64 - JDK-8265363 ProblemList java/net/Socket/UdpSocket.java on macos-aarch64 - JDK-8265366 ProblemList 2 javax/net/ssl/DTLS tests on macos-aarch64 - JDK-8265368 ProblemList 3 java/net/httpclient/websocket tests on macos-aarch64 - JDK-8265370 ProblemList java/net/MulticastSocket/Promiscuous.java on macos-aarch64 ------------- Commit messages: - 8265370: ProblemList java/net/MulticastSocket/Promiscuous.java on macos-aarch64 - 8265368: ProblemList 3 java/net/httpclient/websocket tests on macos-aarch64 - 8265366 ProblemList 2 javax/net/ssl/DTLS tests on macos-aarch64 - 8265363: ProblemList java/net/Socket/UdpSocket.java on macos-aarch64 - 8265361: ProblemList a few compiler/whitebox tests on macos-aarch64 - 8265358: ProblemList jdk/jshell/ToolBasicTest.java on macOS-aarch64 Changes: https://git.openjdk.java.net/jdk/pull/3549/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3549&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8265358 Stats: 16 lines in 3 files changed: 16 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3549.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3549/head:pull/3549 PR: https://git.openjdk.java.net/jdk/pull/3549 From bpb at openjdk.java.net Fri Apr 16 19:44:33 2021 From: bpb at openjdk.java.net (Brian Burkhalter) Date: Fri, 16 Apr 2021 19:44:33 GMT Subject: RFR: 8265358: ProblemList 
jdk/jshell/ToolBasicTest.java on macOS-aarch64 In-Reply-To: <_-5JlNnDNuQFLLctYTMcmdAgWJOALgf4C9y-VyWiPVc=.86c6e1e3-14ae-474a-ab65-c0a0574ae96e@github.com> References: <_-5JlNnDNuQFLLctYTMcmdAgWJOALgf4C9y-VyWiPVc=.86c6e1e3-14ae-474a-ab65-c0a0574ae96e@github.com> Message-ID: On Fri, 16 Apr 2021 18:07:01 GMT, Daniel D. Daugherty wrote: > A set of trivial ProblemListing for macos-aarch64 Tier2 test failures: > > - JDK-8265358 ProblemList jdk/jshell/ToolBasicTest.java on macOS-aarch64 > - JDK-8265361 ProblemList a few compiler/whitebox tests on macos-aarch64 > - JDK-8265363 ProblemList java/net/Socket/UdpSocket.java on macos-aarch64 > - JDK-8265366 ProblemList 2 javax/net/ssl/DTLS tests on macos-aarch64 > - JDK-8265368 ProblemList 3 java/net/httpclient/websocket tests on macos-aarch64 > - JDK-8265370 ProblemList java/net/MulticastSocket/Promiscuous.java on macos-aarch64 Marked as reviewed by bpb (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3549 From mikael at openjdk.java.net Fri Apr 16 19:49:33 2021 From: mikael at openjdk.java.net (Mikael Vidstedt) Date: Fri, 16 Apr 2021 19:49:33 GMT Subject: RFR: 8265358: ProblemList jdk/jshell/ToolBasicTest.java on macOS-aarch64 In-Reply-To: <_-5JlNnDNuQFLLctYTMcmdAgWJOALgf4C9y-VyWiPVc=.86c6e1e3-14ae-474a-ab65-c0a0574ae96e@github.com> References: <_-5JlNnDNuQFLLctYTMcmdAgWJOALgf4C9y-VyWiPVc=.86c6e1e3-14ae-474a-ab65-c0a0574ae96e@github.com> Message-ID: On Fri, 16 Apr 2021 18:07:01 GMT, Daniel D. 
Daugherty wrote: > A set of trivial ProblemListing for macos-aarch64 Tier2 test failures: > > - JDK-8265358 ProblemList jdk/jshell/ToolBasicTest.java on macOS-aarch64 > - JDK-8265361 ProblemList a few compiler/whitebox tests on macos-aarch64 > - JDK-8265363 ProblemList java/net/Socket/UdpSocket.java on macos-aarch64 > - JDK-8265366 ProblemList 2 javax/net/ssl/DTLS tests on macos-aarch64 > - JDK-8265368 ProblemList 3 java/net/httpclient/websocket tests on macos-aarch64 > - JDK-8265370 ProblemList java/net/MulticastSocket/Promiscuous.java on macos-aarch64 Marked as reviewed by mikael (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3549 From mikael at openjdk.java.net Fri Apr 16 20:16:06 2021 From: mikael at openjdk.java.net (Mikael Vidstedt) Date: Fri, 16 Apr 2021 20:16:06 GMT Subject: RFR: 8265358: ProblemList jdk/jshell/ToolBasicTest.java on macOS-aarch64 [v2] In-Reply-To: References: <_-5JlNnDNuQFLLctYTMcmdAgWJOALgf4C9y-VyWiPVc=.86c6e1e3-14ae-474a-ab65-c0a0574ae96e@github.com> Message-ID: On Fri, 16 Apr 2021 20:12:45 GMT, Daniel D. Daugherty wrote: >> A set of trivial ProblemListing for macos-aarch64 Tier2 test failures: >> >> - JDK-8265358 ProblemList jdk/jshell/ToolBasicTest.java on macOS-aarch64 >> - JDK-8265361 ProblemList a few compiler/whitebox tests on macos-aarch64 >> - JDK-8265363 ProblemList java/net/Socket/UdpSocket.java on macos-aarch64 >> - JDK-8265368 ProblemList 3 java/net/httpclient/websocket tests on macos-aarch64 >> - JDK-8265370 ProblemList java/net/MulticastSocket/Promiscuous.java on macos-aarch64 > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > Remove changes for JDK-8265366 at @fguallini's request. Marked as reviewed by mikael (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/3549 From dcubed at openjdk.java.net Fri Apr 16 20:16:06 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 16 Apr 2021 20:16:06 GMT Subject: RFR: 8265358: ProblemList jdk/jshell/ToolBasicTest.java on macOS-aarch64 [v2] In-Reply-To: <_-5JlNnDNuQFLLctYTMcmdAgWJOALgf4C9y-VyWiPVc=.86c6e1e3-14ae-474a-ab65-c0a0574ae96e@github.com> References: <_-5JlNnDNuQFLLctYTMcmdAgWJOALgf4C9y-VyWiPVc=.86c6e1e3-14ae-474a-ab65-c0a0574ae96e@github.com> Message-ID: > A set of trivial ProblemListing for macos-aarch64 Tier2 test failures: > > - JDK-8265358 ProblemList jdk/jshell/ToolBasicTest.java on macOS-aarch64 > - JDK-8265361 ProblemList a few compiler/whitebox tests on macos-aarch64 > - JDK-8265363 ProblemList java/net/Socket/UdpSocket.java on macos-aarch64 > - JDK-8265368 ProblemList 3 java/net/httpclient/websocket tests on macos-aarch64 > - JDK-8265370 ProblemList java/net/MulticastSocket/Promiscuous.java on macos-aarch64 Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: Remove changes for JDK-8265366 at @fguallini's request. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3549/files - new: https://git.openjdk.java.net/jdk/pull/3549/files/5fb3de6a..7f245f4e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3549&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3549&range=00-01 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3549.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3549/head:pull/3549 PR: https://git.openjdk.java.net/jdk/pull/3549 From dcubed at openjdk.java.net Fri Apr 16 20:16:07 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 16 Apr 2021 20:16:07 GMT Subject: RFR: 8265358: ProblemList jdk/jshell/ToolBasicTest.java on macOS-aarch64 [v2] In-Reply-To: References: <_-5JlNnDNuQFLLctYTMcmdAgWJOALgf4C9y-VyWiPVc=.86c6e1e3-14ae-474a-ab65-c0a0574ae96e@github.com> Message-ID: On Fri, 16 Apr 2021 19:42:06 GMT, Brian Burkhalter wrote: >> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove changes for JDK-8265366 at @fguallini's request. > > Marked as reviewed by bpb (Reviewer). @bplb and @vidmik - Thanks for the reviews. I'm removing the changes for JDK-8265366 at @fguallini's request. ------------- PR: https://git.openjdk.java.net/jdk/pull/3549 From dcubed at openjdk.java.net Fri Apr 16 20:24:37 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 16 Apr 2021 20:24:37 GMT Subject: Integrated: 8265358: ProblemList jdk/jshell/ToolBasicTest.java on macOS-aarch64 In-Reply-To: <_-5JlNnDNuQFLLctYTMcmdAgWJOALgf4C9y-VyWiPVc=.86c6e1e3-14ae-474a-ab65-c0a0574ae96e@github.com> References: <_-5JlNnDNuQFLLctYTMcmdAgWJOALgf4C9y-VyWiPVc=.86c6e1e3-14ae-474a-ab65-c0a0574ae96e@github.com> Message-ID: On Fri, 16 Apr 2021 18:07:01 GMT, Daniel D. 
Daugherty wrote: > A set of trivial ProblemListing for macos-aarch64 Tier2 test failures: > > - JDK-8265358 ProblemList jdk/jshell/ToolBasicTest.java on macOS-aarch64 > - JDK-8265361 ProblemList a few compiler/whitebox tests on macos-aarch64 > - JDK-8265363 ProblemList java/net/Socket/UdpSocket.java on macos-aarch64 > - JDK-8265368 ProblemList 3 java/net/httpclient/websocket tests on macos-aarch64 > - JDK-8265370 ProblemList java/net/MulticastSocket/Promiscuous.java on macos-aarch64 This pull request has now been integrated. Changeset: 888d80b5 Author: Daniel D. Daugherty URL: https://git.openjdk.java.net/jdk/commit/888d80b5 Stats: 13 lines in 3 files changed: 13 ins; 0 del; 0 mod 8265358: ProblemList jdk/jshell/ToolBasicTest.java on macOS-aarch64 8265361: ProblemList a few compiler/whitebox tests on macos-aarch64 8265363: ProblemList java/net/Socket/UdpSocket.java on macos-aarch64 8265368: ProblemList 3 java/net/httpclient/websocket tests on macos-aarch64 8265370: ProblemList java/net/MulticastSocket/Promiscuous.java on macos-aarch64 Reviewed-by: bpb, mikael ------------- PR: https://git.openjdk.java.net/jdk/pull/3549 From bpb at openjdk.java.net Fri Apr 16 21:09:57 2021 From: bpb at openjdk.java.net (Brian Burkhalter) Date: Fri, 16 Apr 2021 21:09:57 GMT Subject: RFR: 8265381: ProblemList runtime/logging/RedefineClasses.java on macos-x64 -Xcomp In-Reply-To: <1d3fanUzNcRsQq4yVYnO4l1lVLLpwSz2RP3RUrEehhY=.febbe716-4ffb-420e-ada2-54ef613e6ac5@github.com> References: <1d3fanUzNcRsQq4yVYnO4l1lVLLpwSz2RP3RUrEehhY=.febbe716-4ffb-420e-ada2-54ef613e6ac5@github.com> Message-ID: On Fri, 16 Apr 2021 21:01:33 GMT, Daniel D. Daugherty wrote: > A trivial change to ProblemList runtime/logging/RedefineClasses.java on macos-x64 -Xcomp. Marked as reviewed by bpb (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/3552 From dcubed at openjdk.java.net Fri Apr 16 21:09:57 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 16 Apr 2021 21:09:57 GMT Subject: RFR: 8265381: ProblemList runtime/logging/RedefineClasses.java on macos-x64 -Xcomp Message-ID: <1d3fanUzNcRsQq4yVYnO4l1lVLLpwSz2RP3RUrEehhY=.febbe716-4ffb-420e-ada2-54ef613e6ac5@github.com> A trivial change to ProblemList runtime/logging/RedefineClasses.java on macos-x64 -Xcomp. ------------- Commit messages: - 8265381: ProblemList runtime/logging/RedefineClasses.java on macos-x64 -Xcomp Changes: https://git.openjdk.java.net/jdk/pull/3552/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3552&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8265381 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3552.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3552/head:pull/3552 PR: https://git.openjdk.java.net/jdk/pull/3552 From kvn at openjdk.java.net Fri Apr 16 21:16:38 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 16 Apr 2021 21:16:38 GMT Subject: RFR: 8265154: vinserti128 operand mix up for KNL platforms In-Reply-To: References: Message-ID: On Wed, 14 Apr 2021 00:25:40 GMT, Sandhya Viswanathan wrote: > There is a bug in macro assembler in vinserti128 special handling for platforms like KNL that do not support AVX512VL. > > The following: > void vinserti128(XMMRegister dst, XMMRegister nds, XMMRegister src, uint8_t imm8) { > if (UseAVX > 2 && VM_Version::supports_avx512novl()) { > Assembler::vinserti32x4(dst, dst, src, imm8); > } > ... > } > > Should have been: > void vinserti128(XMMRegister dst, XMMRegister nds, XMMRegister src, uint8_t imm8) { > if (UseAVX > 2 && VM_Version::supports_avx512novl()) { > Assembler::vinserti32x4(dst, nds, src, imm8); > } > ... > } > > Best Regards, > Sandhya Good. So it was caught by existing tests. 
------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3480 From dcubed at openjdk.java.net Fri Apr 16 21:24:36 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 16 Apr 2021 21:24:36 GMT Subject: RFR: 8265381: ProblemList runtime/logging/RedefineClasses.java on macos-x64 -Xcomp In-Reply-To: References: <1d3fanUzNcRsQq4yVYnO4l1lVLLpwSz2RP3RUrEehhY=.febbe716-4ffb-420e-ada2-54ef613e6ac5@github.com> Message-ID: On Fri, 16 Apr 2021 21:05:41 GMT, Brian Burkhalter wrote: >> A trivial change to ProblemList runtime/logging/RedefineClasses.java on macos-x64 -Xcomp. > > Marked as reviewed by bpb (Reviewer). @bplb - Thanks for the fast review! ------------- PR: https://git.openjdk.java.net/jdk/pull/3552 From dcubed at openjdk.java.net Fri Apr 16 21:24:37 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 16 Apr 2021 21:24:37 GMT Subject: Integrated: 8265381: ProblemList runtime/logging/RedefineClasses.java on macos-x64 -Xcomp In-Reply-To: <1d3fanUzNcRsQq4yVYnO4l1lVLLpwSz2RP3RUrEehhY=.febbe716-4ffb-420e-ada2-54ef613e6ac5@github.com> References: <1d3fanUzNcRsQq4yVYnO4l1lVLLpwSz2RP3RUrEehhY=.febbe716-4ffb-420e-ada2-54ef613e6ac5@github.com> Message-ID: On Fri, 16 Apr 2021 21:01:33 GMT, Daniel D. Daugherty wrote: > A trivial change to ProblemList runtime/logging/RedefineClasses.java on macos-x64 -Xcomp. This pull request has now been integrated. Changeset: 2c4075cb Author: Daniel D. 
Daugherty URL: https://git.openjdk.java.net/jdk/commit/2c4075cb Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8265381: ProblemList runtime/logging/RedefineClasses.java on macos-x64 -Xcomp Reviewed-by: bpb ------------- PR: https://git.openjdk.java.net/jdk/pull/3552 From sviswanathan at openjdk.java.net Fri Apr 16 21:32:33 2021 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Fri, 16 Apr 2021 21:32:33 GMT Subject: Integrated: 8265154: vinserti128 operand mix up for KNL platforms In-Reply-To: References: Message-ID: <1-V080sqMJz9jo3Vnm23HdPTSXQ9tjoh3JfDsoWKFHQ=.fab57e93-0a6d-4cf8-bbba-e83b5ad1abfe@github.com> On Wed, 14 Apr 2021 00:25:40 GMT, Sandhya Viswanathan wrote: > There is a bug in macro assembler in vinserti128 special handling for platforms like KNL that do not support AVX512VL. > > The following: > void vinserti128(XMMRegister dst, XMMRegister nds, XMMRegister src, uint8_t imm8) { > if (UseAVX > 2 && VM_Version::supports_avx512novl()) { > Assembler::vinserti32x4(dst, dst, src, imm8); > } > ... > } > > Should have been: > void vinserti128(XMMRegister dst, XMMRegister nds, XMMRegister src, uint8_t imm8) { > if (UseAVX > 2 && VM_Version::supports_avx512novl()) { > Assembler::vinserti32x4(dst, nds, src, imm8); > } > ... > } > > Best Regards, > Sandhya This pull request has now been integrated. 
Changeset: c108e7ab Author: Sandhya Viswanathan URL: https://git.openjdk.java.net/jdk/commit/c108e7ab Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod 8265154: vinserti128 operand mix up for KNL platforms Reviewed-by: thartmann, kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/3480 From sviswanathan at openjdk.java.net Fri Apr 16 21:32:32 2021 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Fri, 16 Apr 2021 21:32:32 GMT Subject: RFR: 8265154: vinserti128 operand mix up for KNL platforms In-Reply-To: References: Message-ID: On Fri, 16 Apr 2021 21:13:56 GMT, Vladimir Kozlov wrote: >> There is a bug in macro assembler in vinserti128 special handling for platforms like KNL that do not support AVX512VL. >> >> The following: >> void vinserti128(XMMRegister dst, XMMRegister nds, XMMRegister src, uint8_t imm8) { >> if (UseAVX > 2 && VM_Version::supports_avx512novl()) { >> Assembler::vinserti32x4(dst, dst, src, imm8); >> } >> ... >> } >> >> Should have been: >> void vinserti128(XMMRegister dst, XMMRegister nds, XMMRegister src, uint8_t imm8) { >> if (UseAVX > 2 && VM_Version::supports_avx512novl()) { >> Assembler::vinserti32x4(dst, nds, src, imm8); >> } >> ... >> } >> >> Best Regards, >> Sandhya > > Good. > > So it was caught by existing tests. Thanks @vnkozlov. Yes, it was caught by existing tests. ------------- PR: https://git.openjdk.java.net/jdk/pull/3480 From jiefu at openjdk.java.net Fri Apr 16 23:04:56 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 16 Apr 2021 23:04:56 GMT Subject: RFR: 8265325: Optimize StubRoutines::dpow() for Math.pow(x, 0.5) [v3] In-Reply-To: References: Message-ID: > Hi all, > > I'd like to optimize the StubRoutines::dpow() for Math.pow(x, 0.5). > > In the pow and sqrt discussion [1], Joe taught me that the Java library implementation of pow has been optimized for pow(x, 2.0) [2] and pow(x, 0.5) [3]. 
> However, the hotspot StubRoutines::dpow() implements the same optimization only for pow(x, 2.0), not yet for pow(x, 0.5). > This patch optimizes StubRoutines::dpow() for pow(x, 0.5). > > Although not all Math.pow(x, 0.5) can be replaced with sqrt(x), we can still do it safely for the following cases: > 1) x >= 0.0 (fully implemented) > 2) x is +Inf (fully implemented) > 3) x is NaN (can be further divided into +NaN and -NaN and only +NaN is implemented) > > The effect of this optimization has been tested on several platforms, showing 3.0x ~ 6.3x performance improvement. > And no performance drop was observed. > > Testing: > - tier1 ~ tier3 on Linux/x64 > > Thanks. > Best regards, > Jie > > [1] https://mail.openjdk.java.net/pipermail/core-libs-dev/2021-April/076220.html > [2] https://github.com/openjdk/jdk/blob/d84a7e55be40eae57b6c322694d55661a5053a55/src/java.base/share/classes/java/lang/FdLibm.java#L362 > [3] https://github.com/openjdk/jdk/blob/d84a7e55be40eae57b6c322694d55661a5053a55/src/java.base/share/classes/java/lang/FdLibm.java#L364 > > Detailed performance numbers: > * Linux/Intel > > --------- Before ----------- > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble 0 thrpt 8 218783.605 ± 838.379 ops/ms > MathBench.powDouble0Dot5 0 thrpt 8 45498.351 ± 7.558 ops/ms > MathBench.powDouble0Dot5Const 0 thrpt 8 45243.530 ± 1097.100 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.031 ± 0.001 ops/ms > MathBench.powDoubleLoop 0 thrpt 8 0.031 ± 0.001 ops/ms > StrictMathBench.powDouble N/A thrpt 8 176106.602 ± 13127.650 ops/ms > ---------------------------- > > --------- After ----------- > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble 0 thrpt 8 219930.462 ± 181.922 ops/ms > MathBench.powDouble0Dot5 0 thrpt 8 204966.834 ± 329.032 ops/ms <-- 4.5x up > MathBench.powDouble0Dot5Const 0 thrpt 8 203004.302 ± 684.072 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.121 ± 0.001 ops/ms <-- 3.9x up > MathBench.powDoubleLoop 0 thrpt 8 0.031 ±
0.001 ops/ms > StrictMathBench.powDouble N/A thrpt 8 178818.861 ± 16235.465 ops/ms > ---------------------------- > > > * Linux/AMD > > --------- Before ----------- > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble 0 thrpt 8 100741.348 ± 207.766 ops/ms > MathBench.powDouble0Dot5 0 thrpt 8 33896.623 ± 103.352 ops/ms > MathBench.powDouble0Dot5Const 0 thrpt 8 34195.944 ± 230.703 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.039 ± 0.001 ops/ms > MathBench.powDoubleLoop 0 thrpt 8 0.038 ± 0.001 ops/ms > StrictMathBench.powDouble N/A thrpt 8 72000.166 ± 135.002 ops/ms > ---------------------------- > > --------- After ----------- > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble 0 thrpt 8 100738.866 ± 222.820 ops/ms > MathBench.powDouble0Dot5 0 thrpt 8 100799.098 ± 95.537 ops/ms <-- 3.0x up > MathBench.powDouble0Dot5Const 0 thrpt 8 100765.571 ± 178.436 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.244 ± 0.002 ops/ms <-- 6.3x up > MathBench.powDoubleLoop 0 thrpt 8 0.038 ± 0.001 ops/ms > StrictMathBench.powDouble N/A thrpt 8 71758.725 ± 339.660 ops/ms > ---------------------------- > > > * MacOS/Intel > > --------- Before ----------- > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble 0 thrpt 8 238064.722 ± 5181.318 ops/ms > MathBench.powDouble0Dot5 0 thrpt 8 59235.979 ± 2046.519 ops/ms > MathBench.powDouble0Dot5Const 0 thrpt 8 59695.014 ± 1079.692 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.040 ± 0.001 ops/ms > MathBench.powDoubleLoop 0 thrpt 8 0.041 ± 0.001 ops/ms > StrictMathBench.powDouble N/A thrpt 8 238391.026 ± 2743.385 ops/ms > ---------------------------- > > --------- After ----------- > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble 0 thrpt 8 238582.414 ± 3661.261 ops/ms > MathBench.powDouble0Dot5 0 thrpt 8 224102.701 ± 2846.892 ops/ms <-- 3.8x up > MathBench.powDouble0Dot5Const 0 thrpt 8 224542.331 ± 19027.596 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.158 ±
0.002 ops/ms <-- 4.0x up > MathBench.powDoubleLoop 0 thrpt 8 0.041 ± 0.001 ops/ms > StrictMathBench.powDouble N/A thrpt 8 233689.504 ± 10141.034 ops/ms > ---------------------------- Jie Fu has updated the pull request incrementally with one additional commit since the last revision: Fix tests ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3536/files - new: https://git.openjdk.java.net/jdk/pull/3536/files/a97cb957..dc194975 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3536&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3536&range=01-02 Stats: 5 lines in 2 files changed: 3 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3536.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3536/head:pull/3536 PR: https://git.openjdk.java.net/jdk/pull/3536 From jiefu at openjdk.java.net Fri Apr 16 23:10:36 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 16 Apr 2021 23:10:36 GMT Subject: RFR: 8265325: Optimize StubRoutines::dpow() for Math.pow(x, 0.5) [v2] In-Reply-To: References: Message-ID: On Fri, 16 Apr 2021 17:22:32 GMT, Vladimir Kozlov wrote: > `r1` should be static value to avoid compiling `sqrt()` method as intrinsic. Hi @vnkozlov , Thanks for your review. Patch has been updated based on your comments. To be honest, I didn't get why r1 should be a static value. I think both static and non-static should be OK for the test. So what would happen if sqrt() is intrinsified? Could you please make it clearer? Thanks.
Best regards, Jie ------------- PR: https://git.openjdk.java.net/jdk/pull/3536 From kvn at openjdk.java.net Fri Apr 16 23:41:36 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 16 Apr 2021 23:41:36 GMT Subject: RFR: 8265325: Optimize StubRoutines::dpow() for Math.pow(x, 0.5) [v2] In-Reply-To: References: Message-ID: On Fri, 16 Apr 2021 23:07:20 GMT, Jie Fu wrote: >> test/hotspot/jtreg/compiler/intrinsics/math/TestPow0Dot5Opt.java line 41: >> >>> 39: if (a < 0.0) return; >>> 40: >>> 41: double r1 = Math.sqrt(a); >> >> `r1` should be static value to avoid compiling `sqrt()` method as intrinsic. > >> `r1` should be static value to avoid compiling `sqrt()` method as intrinsic. > > Hi @vnkozlov , > > Thanks for your review. > > Patch has been updated based on your comments. > > To be honest, I didn't get why r1 should be a static value. > I think both static and non-static should be OK for the test. > > So what would happen if sqrt() is intrinsified? > Could you please make it clearer? > > Thanks. > Best regards, > Jie Sorry, my bad. I was thinking of calculating `sqrt(a)` outside the `test` method so that only the `pow()` code is tested. Like calculating the `gold` value outside compiled code. But the test has to use sqrt() for each value `a`, and `sqrt` is 1 HW instruction, so my suggestion was stupid. Please revert this change. The original test code was fine. ------------- PR: https://git.openjdk.java.net/jdk/pull/3536 From jiefu at openjdk.java.net Fri Apr 16 23:51:06 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 16 Apr 2021 23:51:06 GMT Subject: RFR: 8265325: Optimize StubRoutines::dpow() for Math.pow(x, 0.5) [v4] In-Reply-To: References: Message-ID: <6Nww_kOuxx9sjowFZjd0t1rYAOIkSaDDrGb1vkMc2iY=.0c81042a-a3b8-49b4-b718-112d1e7348f4@github.com> > Hi all, > > I'd like to optimize the StubRoutines::dpow() for Math.pow(x, 0.5).
> > In the pow and sqrt discussion [1], Joe taught me that the Java library implementation of pow has been optimized for pow(x, 2.0) [2] and pow(x, 0.5) [3]. > However, the hotspot StubRoutines::dpow() implements the same optimization only for pow(x, 2.0), not yet for pow(x, 0.5). > This patch optimizes StubRoutines::dpow() for pow(x, 0.5). > > Although not all Math.pow(x, 0.5) can be replaced with sqrt(x), we can still do it safely for the following cases: > 1) x >= 0.0 (fully implemented) > 2) x is +Inf (fully implemented) > 3) x is NaN (can be further divided into +NaN and -NaN and only +NaN is implemented) > > The effect of this optimization has been tested on several platforms, showing 3.0x ~ 6.3x performance improvement. > And no performance drop was observed. > > Testing: > - tier1 ~ tier3 on Linux/x64 > > Thanks. > Best regards, > Jie > > [1] https://mail.openjdk.java.net/pipermail/core-libs-dev/2021-April/076220.html > [2] https://github.com/openjdk/jdk/blob/d84a7e55be40eae57b6c322694d55661a5053a55/src/java.base/share/classes/java/lang/FdLibm.java#L362 > [3] https://github.com/openjdk/jdk/blob/d84a7e55be40eae57b6c322694d55661a5053a55/src/java.base/share/classes/java/lang/FdLibm.java#L364 > > Detailed performance numbers: > * Linux/Intel > > --------- Before ----------- > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble 0 thrpt 8 218783.605 ± 838.379 ops/ms > MathBench.powDouble0Dot5 0 thrpt 8 45498.351 ± 7.558 ops/ms > MathBench.powDouble0Dot5Const 0 thrpt 8 45243.530 ± 1097.100 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.031 ± 0.001 ops/ms > MathBench.powDoubleLoop 0 thrpt 8 0.031 ± 0.001 ops/ms > StrictMathBench.powDouble N/A thrpt 8 176106.602 ± 13127.650 ops/ms > ---------------------------- > > --------- After ----------- > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble 0 thrpt 8 219930.462 ± 181.922 ops/ms > MathBench.powDouble0Dot5 0 thrpt 8 204966.834 ±
329.032 ops/ms <-- 4.5x up > MathBench.powDouble0Dot5Const 0 thrpt 8 203004.302 ± 684.072 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.121 ± 0.001 ops/ms <-- 3.9x up > MathBench.powDoubleLoop 0 thrpt 8 0.031 ± 0.001 ops/ms > StrictMathBench.powDouble N/A thrpt 8 178818.861 ± 16235.465 ops/ms > ---------------------------- > > > * Linux/AMD > > --------- Before ----------- > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble 0 thrpt 8 100741.348 ± 207.766 ops/ms > MathBench.powDouble0Dot5 0 thrpt 8 33896.623 ± 103.352 ops/ms > MathBench.powDouble0Dot5Const 0 thrpt 8 34195.944 ± 230.703 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.039 ± 0.001 ops/ms > MathBench.powDoubleLoop 0 thrpt 8 0.038 ± 0.001 ops/ms > StrictMathBench.powDouble N/A thrpt 8 72000.166 ± 135.002 ops/ms > ---------------------------- > > --------- After ----------- > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble 0 thrpt 8 100738.866 ± 222.820 ops/ms > MathBench.powDouble0Dot5 0 thrpt 8 100799.098 ± 95.537 ops/ms <-- 3.0x up > MathBench.powDouble0Dot5Const 0 thrpt 8 100765.571 ± 178.436 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.244 ± 0.002 ops/ms <-- 6.3x up > MathBench.powDoubleLoop 0 thrpt 8 0.038 ± 0.001 ops/ms > StrictMathBench.powDouble N/A thrpt 8 71758.725 ± 339.660 ops/ms > ---------------------------- > > > * MacOS/Intel > > --------- Before ----------- > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble 0 thrpt 8 238064.722 ± 5181.318 ops/ms > MathBench.powDouble0Dot5 0 thrpt 8 59235.979 ± 2046.519 ops/ms > MathBench.powDouble0Dot5Const 0 thrpt 8 59695.014 ± 1079.692 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.040 ± 0.001 ops/ms > MathBench.powDoubleLoop 0 thrpt 8 0.041 ± 0.001 ops/ms > StrictMathBench.powDouble N/A thrpt 8 238391.026 ± 2743.385 ops/ms > ---------------------------- > > --------- After ----------- > Benchmark (seed) Mode Cnt Score Error Units > MathBench.powDouble 0 thrpt 8 238582.414 ±
3661.261 ops/ms > MathBench.powDouble0Dot5 0 thrpt 8 224102.701 ± 2846.892 ops/ms <-- 3.8x up > MathBench.powDouble0Dot5Const 0 thrpt 8 224542.331 ± 19027.596 ops/ms > MathBench.powDouble0Dot5Loop 0 thrpt 8 0.158 ± 0.002 ops/ms <-- 4.0x up > MathBench.powDoubleLoop 0 thrpt 8 0.041 ± 0.001 ops/ms > StrictMathBench.powDouble N/A thrpt 8 233689.504 ± 10141.034 ops/ms > ---------------------------- Jie Fu has updated the pull request incrementally with one additional commit since the last revision: Revert TestPow0Dot5Opt.java change ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3536/files - new: https://git.openjdk.java.net/jdk/pull/3536/files/dc194975..e745e4ba Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3536&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3536&range=02-03 Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3536.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3536/head:pull/3536 PR: https://git.openjdk.java.net/jdk/pull/3536 From jiefu at openjdk.java.net Fri Apr 16 23:51:06 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 16 Apr 2021 23:51:06 GMT Subject: RFR: 8265325: Optimize StubRoutines::dpow() for Math.pow(x, 0.5) [v2] In-Reply-To: References: Message-ID: On Fri, 16 Apr 2021 23:38:38 GMT, Vladimir Kozlov wrote: > Sorry, my bad. I was thinking of calculating `sqrt(a)` outside the `test` method so that only the `pow()` code is tested. > Like calculating the `gold` value outside compiled code. > But the test has to use sqrt() for each value `a`, and `sqrt` is 1 HW instruction, so my suggestion was stupid. > Please revert this change. The original test code was fine. Thanks for your clarification. I got your point. Updated. Thanks.
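As background for the remark in the PR description that not every `Math.pow(x, 0.5)` can be replaced with `sqrt(x)`, the sketch below compares the two calls bit for bit on a few special inputs. It is an illustration written for this thread, not code from the patch; the class name `PowHalfVersusSqrt` is made up. It relies on the `java.lang.Math` specification as I read it: `pow(-0.0, 0.5)` is `+0.0` while `sqrt(-0.0)` is `-0.0`, and `pow(-Infinity, 0.5)` is `+Infinity` while `sqrt(-Infinity)` is NaN.

```java
// Hypothetical illustration class; not part of the JDK or of the patch.
final class PowHalfVersusSqrt {

    // Compares bit patterns so that +0.0 and -0.0 are told apart and
    // NaN compares equal to NaN (doubleToLongBits canonicalizes NaN).
    static boolean sameBits(double a, double b) {
        return Double.doubleToLongBits(a) == Double.doubleToLongBits(b);
    }

    // True when Math.pow(x, 0.5) and Math.sqrt(x) produce the same value.
    static boolean agrees(double x) {
        return sameBits(Math.pow(x, 0.5), Math.sqrt(x));
    }

    public static void main(String[] args) {
        double[] inputs = {
            0.0, -0.0, 4.0,
            Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY, Double.NaN
        };
        for (double x : inputs) {
            System.out.println(x + ": pow=" + Math.pow(x, 0.5)
                    + " sqrt=" + Math.sqrt(x) + " agree=" + agrees(x));
        }
    }
}
```

The comparison deliberately uses raw bit patterns rather than `==`, because `0.0 == -0.0` is true in Java and would mask exactly the signed-zero case that makes a blanket substitution unsafe.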
------------- PR: https://git.openjdk.java.net/jdk/pull/3536 From whuang at openjdk.java.net Sat Apr 17 01:51:36 2021 From: whuang at openjdk.java.net (Wang Huang) Date: Sat, 17 Apr 2021 01:51:36 GMT Subject: RFR: 8265244: assert(false) failed: bad AD file [v2] In-Reply-To: References: Message-ID: On Fri, 16 Apr 2021 02:10:10 GMT, Wang Huang wrote: >> * aarch64 can only accept `VectorReinterpret` with 64/128 bits. >> * I will fix this bug by adding a rule for `VectorReinterpret` in `match_rule_supported_vector` >> * after exchanging notes with @nsjian and @XiaohongGong , I think that the two checks in `inline_vector_conver` are unnecessary now. However, these checks impact other CPUs, so I need more reviewers. > > Wang Huang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge branch 'master' into JDK-8265244 > - 8265244: assert(false) failed: bad AD file @iwanowww Could you do me a favor and review this patch? Thank you. ------------- PR: https://git.openjdk.java.net/jdk/pull/3507 From kvn at openjdk.java.net Sat Apr 17 01:52:35 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Sat, 17 Apr 2021 01:52:35 GMT Subject: RFR: 8265325: Optimize StubRoutines::dpow() for Math.pow(x, 0.5) [v4] In-Reply-To: <6Nww_kOuxx9sjowFZjd0t1rYAOIkSaDDrGb1vkMc2iY=.0c81042a-a3b8-49b4-b718-112d1e7348f4@github.com> References: <6Nww_kOuxx9sjowFZjd0t1rYAOIkSaDDrGb1vkMc2iY=.0c81042a-a3b8-49b4-b718-112d1e7348f4@github.com> Message-ID: On Fri, 16 Apr 2021 23:51:06 GMT, Jie Fu wrote: >> Hi all, >> >> I'd like to optimize the StubRoutines::dpow() for Math.pow(x, 0.5). >> >> In the pow and sqrt discussion [1], Joe taught me that the Java library implementation of pow has been optimized for pow(x, 2.0) [2] and pow(x, 0.5) [3].
>> However, the hotspot StubRoutines::dpow() implements that optimization only for pow(x, 2.0), not yet for pow(x, 0.5).
>> This patch optimizes StubRoutines::dpow() for pow(x, 0.5).
>>
>> Although not all Math.pow(x, 0.5) can be replaced with sqrt(x), we can still do it safely for the following cases:
>> 1) x >= 0.0 (fully implemented)
>> 2) x is +Inf (fully implemented)
>> 3) x is NaN (can be further divided into +NaN and -NaN and only +NaN is implemented)
>>
>> The effect of this optimization has been tested on several platforms, showing a 3.0x ~ 6.3x performance improvement.
>> No performance drop was observed.
>>
>> Testing:
>> - tier1 ~ tier3 on Linux/x64
>>
>> Thanks.
>> Best regards,
>> Jie
>>
>> [1] https://mail.openjdk.java.net/pipermail/core-libs-dev/2021-April/076220.html
>> [2] https://github.com/openjdk/jdk/blob/d84a7e55be40eae57b6c322694d55661a5053a55/src/java.base/share/classes/java/lang/FdLibm.java#L362
>> [3] https://github.com/openjdk/jdk/blob/d84a7e55be40eae57b6c322694d55661a5053a55/src/java.base/share/classes/java/lang/FdLibm.java#L364
>>
>> Detailed performance numbers:
>> * Linux/Intel
>>
>> --------- Before -----------
>> Benchmark (seed) Mode Cnt Score Error Units
>> MathBench.powDouble 0 thrpt 8 218783.605 ± 838.379 ops/ms
>> MathBench.powDouble0Dot5 0 thrpt 8 45498.351 ± 7.558 ops/ms
>> MathBench.powDouble0Dot5Const 0 thrpt 8 45243.530 ± 1097.100 ops/ms
>> MathBench.powDouble0Dot5Loop 0 thrpt 8 0.031 ± 0.001 ops/ms
>> MathBench.powDoubleLoop 0 thrpt 8 0.031 ± 0.001 ops/ms
>> StrictMathBench.powDouble N/A thrpt 8 176106.602 ± 13127.650 ops/ms
>> ----------------------------
>>
>> --------- After -----------
>> Benchmark (seed) Mode Cnt Score Error Units
>> MathBench.powDouble 0 thrpt 8 219930.462 ± 181.922 ops/ms
>> MathBench.powDouble0Dot5 0 thrpt 8 204966.834 ± 329.032 ops/ms <-- 4.5x up
>> MathBench.powDouble0Dot5Const 0 thrpt 8 203004.302 ± 684.072 ops/ms
>> MathBench.powDouble0Dot5Loop 0 thrpt 8 0.121 ±
0.001 ops/ms <-- 3.9x up >> MathBench.powDoubleLoop 0 thrpt 8 0.031 ? 0.001 ops/ms >> StrictMathBench.powDouble N/A thrpt 8 178818.861 ? 16235.465 ops/ms >> ---------------------------- >> >> >> * Linux/AMD >> >> --------- Before ----------- >> Benchmark (seed) Mode Cnt Score Error Units >> MathBench.powDouble 0 thrpt 8 100741.348 ? 207.766 ops/ms >> MathBench.powDouble0Dot5 0 thrpt 8 33896.623 ? 103.352 ops/ms >> MathBench.powDouble0Dot5Const 0 thrpt 8 34195.944 ? 230.703 ops/ms >> MathBench.powDouble0Dot5Loop 0 thrpt 8 0.039 ? 0.001 ops/ms >> MathBench.powDoubleLoop 0 thrpt 8 0.038 ? 0.001 ops/ms >> StrictMathBench.powDouble N/A thrpt 8 72000.166 ? 135.002 ops/ms >> ---------------------------- >> >> --------- After ----------- >> Benchmark (seed) Mode Cnt Score Error Units >> MathBench.powDouble 0 thrpt 8 100738.866 ? 222.820 ops/ms >> MathBench.powDouble0Dot5 0 thrpt 8 100799.098 ? 95.537 ops/ms <-- 3.0x up >> MathBench.powDouble0Dot5Const 0 thrpt 8 100765.571 ? 178.436 ops/ms >> MathBench.powDouble0Dot5Loop 0 thrpt 8 0.244 ? 0.002 ops/ms <-- 6.3x up >> MathBench.powDoubleLoop 0 thrpt 8 0.038 ? 0.001 ops/ms >> StrictMathBench.powDouble N/A thrpt 8 71758.725 ? 339.660 ops/ms >> ---------------------------- >> >> >> * MacOS/Intel >> >> --------- Before ----------- >> Benchmark (seed) Mode Cnt Score Error Units >> MathBench.powDouble 0 thrpt 8 238064.722 ? 5181.318 ops/ms >> MathBench.powDouble0Dot5 0 thrpt 8 59235.979 ? 2046.519 ops/ms >> MathBench.powDouble0Dot5Const 0 thrpt 8 59695.014 ? 1079.692 ops/ms >> MathBench.powDouble0Dot5Loop 0 thrpt 8 0.040 ? 0.001 ops/ms >> MathBench.powDoubleLoop 0 thrpt 8 0.041 ? 0.001 ops/ms >> StrictMathBench.powDouble N/A thrpt 8 238391.026 ? 2743.385 ops/ms >> ---------------------------- >> >> --------- After ----------- >> Benchmark (seed) Mode Cnt Score Error Units >> MathBench.powDouble 0 thrpt 8 238582.414 ? 3661.261 ops/ms >> MathBench.powDouble0Dot5 0 thrpt 8 224102.701 ? 
2846.892 ops/ms <-- 3.8x up >> MathBench.powDouble0Dot5Const 0 thrpt 8 224542.331 ? 19027.596 ops/ms >> MathBench.powDouble0Dot5Loop 0 thrpt 8 0.158 ? 0.002 ops/ms <-- 4.0x up >> MathBench.powDoubleLoop 0 thrpt 8 0.041 ? 0.001 ops/ms >> StrictMathBench.powDouble N/A thrpt 8 233689.504 ? 10141.034 ops/ms >> ---------------------------- > > Jie Fu has updated the pull request incrementally with one additional commit since the last revision: > > Revert TestPow0Dot5Opt.java change Good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3536 From whuang at openjdk.java.net Sat Apr 17 03:39:36 2021 From: whuang at openjdk.java.net (Wang Huang) Date: Sat, 17 Apr 2021 03:39:36 GMT Subject: RFR: 8263006: Add optimization for Max(*)Node and Min(*)Node [v2] In-Reply-To: References: Message-ID: On Fri, 16 Apr 2021 01:40:01 GMT, Wang Huang wrote: >> * I optimize `max` and `min` by using these identities >> - op (max(a,b) , min(a,b))=== op(a,b) >> - if op is commutable >> - example : >> - max(a,b) + min(a,b))=== a + b // op = add >> - max(a,b) * min(a,b))=== a * b // op = mul >> - max( max(a,b) , min(a,b)))=== max(a,b) // op = max() >> - min( max(a,b) , min(a,b)))=== max(a,b) // op = min() >> * Test case >> ```java >> /* >> * Copyright (c) 2021, Huawei Technologies Co. Ltd. All rights reserved. >> * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. >> * >> * This code is free software; you can redistribute it and/or modify it >> * under the terms of the GNU General Public License version 2 only, as >> * published by the Free Software Foundation. >> * >> * This code is distributed in the hope that it will be useful, but WITHOUT >> * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or >> * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License >> * version 2 for more details (a copy is included in the LICENSE file that >> * accompanied this code). 
>> * >> * You should have received a copy of the GNU General Public License version >> * 2 along with this work; if not, write to the Free Software Foundation, >> * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. >> * >> * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA >> * or visit www.oracle.com if you need additional information or have any >> * questions. >> */ >> package org.sample; >> >> import org.openjdk.jmh.annotations.Benchmark; >> import org.openjdk.jmh.annotations.*; >> >> import java.util.Random; >> import java.util.concurrent.TimeUnit; >> import org.openjdk.jmh.infra.Blackhole; >> >> @BenchmarkMode({Mode.AverageTime}) >> @OutputTimeUnit(TimeUnit.MICROSECONDS) >> public class MyBenchmark { >> >> static int length = 100000; >> static double[] data1 = new double[length]; >> static double[] data2 = new double[length]; >> static Random random = new Random(); >> >> static { >> for(int i = 0; i < length; ++i) { >> data1[i] = random.nextDouble(); >> data2[i] = random.nextDouble(); >> } >> } >> >> @Benchmark >> public void testAdd(Blackhole bh) { >> double sum = 0; >> for (int i = 0; i < length; i++) { >> sum += Math.max(data1[i], data2[i]) + Math.min(data1[i], data2[i]); >> } >> bh.consume(sum); >> } >> >> @Benchmark >> public void testMax(Blackhole bh) { >> double sum = 0; >> for (int i = 0; i < length; i++) { >> sum += Math.max(Math.max(data1[i], data2[i]), Math.min(data1[i], data2[i])); >> } >> bh.consume(sum); >> } >> >> @Benchmark >> public void testMin(Blackhole bh) { >> double sum = 0; >> for (int i = 0; i < length; i++) { >> sum += Math.min(Math.max(data1[i], data2[i]), Math.min(data1[i], data2[i])); >> } >> bh.consume(sum); >> } >> >> @Benchmark >> public void testMul(Blackhole bh) { >> double sum = 0; >> for (int i = 0; i < length; i++) { >> sum += (Math.max(data1[i], data2[i]) * Math.min(data1[i], data2[i])); >> } >> bh.consume(sum); >> } >> } >> ``` >> >> * The result is listed here (aarch64): >> >> before: 
>>
>> | Benchmark | Mode | Samples | Score | Score error | Units |
>> |---|---|---|---|---|---|
>> | o.s.MyBenchmark.testAdd | avgt | 10 | 556.048 | 32.368 | us/op |
>> | o.s.MyBenchmark.testMax | avgt | 10 | 543.065 | 54.221 | us/op |
>> | o.s.MyBenchmark.testMin | avgt | 10 | 570.731 | 37.630 | us/op |
>> | o.s.MyBenchmark.testMul | avgt | 10 | 531.906 | 20.518 | us/op |
>>
>> after:
>>
>> | Benchmark | Mode | Samples | Score | Score error | Units |
>> |---|---|---|---|---|---|
>> | o.s.MyBenchmark.testAdd | avgt | 10 | 319.350 | 9.248 | us/op |
>> | o.s.MyBenchmark.testMax | avgt | 10 | 356.138 | 10.736 | us/op |
>> | o.s.MyBenchmark.testMin | avgt | 10 | 323.731 | 16.621 | us/op |
>> | o.s.MyBenchmark.testMul | avgt | 10 | 338.458 | 23.755 | us/op |
>>
>> * I have tested `NaN`, `INFINITY` and `-INFINITY` and got the same result (before/after)
>
> Wang Huang has updated the pull request incrementally with one additional commit since the last revision:
>
> adjust code style

@vnkozlov Could you do me a favor and review this optimization? Thank you very much.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3513

From github.com+40024232+sunny868 at openjdk.java.net Sat Apr 17 05:49:36 2021
From: github.com+40024232+sunny868 at openjdk.java.net (SUN Guoyun)
Date: Sat, 17 Apr 2021 05:49:36 GMT
Subject: RFR: 8265105: gc/arguments/TestSelectDefaultGC.java fails when compiler1 is disabled
In-Reply-To:
References:
Message-ID: <-Av02XcjAMlPB7vBxjMrVi9RQnUJtMj5As1wY7bkeEI=.038fd344-04b8-4140-b216-b020979cee74@github.com>

On Tue, 13 Apr 2021 05:46:15 GMT, SUN Guoyun wrote:

> The MIPS64 platform has not implemented C1; it only has C2.
> So when tiered compilation is off, it is unnecessary to set client emulation mode flags.
> Perhaps this bug is included in https://bugs.openjdk.java.net/browse/JDK-8251462

Maybe, yes. I am also not very clear on the relationship between TieredCompilation, NeverActAsServerClassMachine and AlwaysActAsServerClassMachine.
------------- PR: https://git.openjdk.java.net/jdk/pull/3449 From kvn at openjdk.java.net Sat Apr 17 05:58:35 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Sat, 17 Apr 2021 05:58:35 GMT Subject: RFR: 8263006: Add optimization for Max(*)Node and Min(*)Node [v2] In-Reply-To: References: Message-ID: On Fri, 16 Apr 2021 01:40:01 GMT, Wang Huang wrote: >> * I optimize `max` and `min` by using these identities >> - op (max(a,b) , min(a,b))=== op(a,b) >> - if op is commutable >> - example : >> - max(a,b) + min(a,b))=== a + b // op = add >> - max(a,b) * min(a,b))=== a * b // op = mul >> - max( max(a,b) , min(a,b)))=== max(a,b) // op = max() >> - min( max(a,b) , min(a,b)))=== max(a,b) // op = min() >> * Test case >> ```java >> /* >> * Copyright (c) 2021, Huawei Technologies Co. Ltd. All rights reserved. >> * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. >> * >> * This code is free software; you can redistribute it and/or modify it >> * under the terms of the GNU General Public License version 2 only, as >> * published by the Free Software Foundation. >> * >> * This code is distributed in the hope that it will be useful, but WITHOUT >> * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or >> * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License >> * version 2 for more details (a copy is included in the LICENSE file that >> * accompanied this code). >> * >> * You should have received a copy of the GNU General Public License version >> * 2 along with this work; if not, write to the Free Software Foundation, >> * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. >> * >> * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA >> * or visit www.oracle.com if you need additional information or have any >> * questions. 
>> */ >> package org.sample; >> >> import org.openjdk.jmh.annotations.Benchmark; >> import org.openjdk.jmh.annotations.*; >> >> import java.util.Random; >> import java.util.concurrent.TimeUnit; >> import org.openjdk.jmh.infra.Blackhole; >> >> @BenchmarkMode({Mode.AverageTime}) >> @OutputTimeUnit(TimeUnit.MICROSECONDS) >> public class MyBenchmark { >> >> static int length = 100000; >> static double[] data1 = new double[length]; >> static double[] data2 = new double[length]; >> static Random random = new Random(); >> >> static { >> for(int i = 0; i < length; ++i) { >> data1[i] = random.nextDouble(); >> data2[i] = random.nextDouble(); >> } >> } >> >> @Benchmark >> public void testAdd(Blackhole bh) { >> double sum = 0; >> for (int i = 0; i < length; i++) { >> sum += Math.max(data1[i], data2[i]) + Math.min(data1[i], data2[i]); >> } >> bh.consume(sum); >> } >> >> @Benchmark >> public void testMax(Blackhole bh) { >> double sum = 0; >> for (int i = 0; i < length; i++) { >> sum += Math.max(Math.max(data1[i], data2[i]), Math.min(data1[i], data2[i])); >> } >> bh.consume(sum); >> } >> >> @Benchmark >> public void testMin(Blackhole bh) { >> double sum = 0; >> for (int i = 0; i < length; i++) { >> sum += Math.min(Math.max(data1[i], data2[i]), Math.min(data1[i], data2[i])); >> } >> bh.consume(sum); >> } >> >> @Benchmark >> public void testMul(Blackhole bh) { >> double sum = 0; >> for (int i = 0; i < length; i++) { >> sum += (Math.max(data1[i], data2[i]) * Math.min(data1[i], data2[i])); >> } >> bh.consume(sum); >> } >> } >> ``` >> >> * The result is listed here (aarch64): >> >> before: >> >> |Benchmark| Mode| Samples| Score| Score error| Units| >> |---| ---| ---| ---| --- | ---| >> |o.s.MyBenchmark.testAdd |avgt | 10 | 556.048 | 32.368 | us/op | >> | o.s.MyBenchmark.testMax | avgt | 10 |543.065 | 54.221 | us/op | >> | o.s.MyBenchmark.testMin | avgt |10 |570.731 | 37.630 | us/op | >> | o.s.MyBenchmark.testMul | avgt | 10 | 531.906 | 20.518 | us/op | >> >> after: >> >> |Benchmark| 
Mode| Samples| Score| Score error| Units| >> |---| ---| ---| ---| --- | ---| >> | o.s.MyBenchmark.testAdd | avgt | 10 | 319.350 | 9.248 | us/op | >> | o.s.MyBenchmark.testMax | avgt | 10 | 356.138 | 10.736 | us/op | >> | o.s.MyBenchmark.testMin | avgt | 10 | 323.731 | 16.621 | us/op | >> | o.s.MyBenchmark.testMul | avgt | 10 | 338.458 | 23.755 | us/op | >> >> * I have tested `NaN` ` INFINITY` and `-INFINITY` and got same result (before/after) > > Wang Huang has updated the pull request incrementally with one additional commit since the last revision: > > adjust code style Do you have a real example in Java applications which benefit from this optimization? We should not add and **support** code which would never be used in real world. Optimization will not work for Integer because of `_min` and `_max` intrinsic which generates `cmove`: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/library_call.cpp#L1806 I am not sure if this optimization will always work for float/double because of NaN values. You need to verify results for all edge cases. ------------- PR: https://git.openjdk.java.net/jdk/pull/3513 From whuang at openjdk.java.net Sat Apr 17 06:42:40 2021 From: whuang at openjdk.java.net (Wang Huang) Date: Sat, 17 Apr 2021 06:42:40 GMT Subject: RFR: 8263006: Add optimization for Max(*)Node and Min(*)Node [v2] In-Reply-To: References: Message-ID: On Fri, 16 Apr 2021 01:40:01 GMT, Wang Huang wrote: >> * I optimize `max` and `min` by using these identities >> - op (max(a,b) , min(a,b))=== op(a,b) >> - if op is commutable >> - example : >> - max(a,b) + min(a,b))=== a + b // op = add >> - max(a,b) * min(a,b))=== a * b // op = mul >> - max( max(a,b) , min(a,b)))=== max(a,b) // op = max() >> - min( max(a,b) , min(a,b)))=== max(a,b) // op = min() >> * Test case >> ```java >> /* >> * Copyright (c) 2021, Huawei Technologies Co. Ltd. All rights reserved. >> * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
>> * >> * This code is free software; you can redistribute it and/or modify it >> * under the terms of the GNU General Public License version 2 only, as >> * published by the Free Software Foundation. >> * >> * This code is distributed in the hope that it will be useful, but WITHOUT >> * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or >> * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License >> * version 2 for more details (a copy is included in the LICENSE file that >> * accompanied this code). >> * >> * You should have received a copy of the GNU General Public License version >> * 2 along with this work; if not, write to the Free Software Foundation, >> * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. >> * >> * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA >> * or visit www.oracle.com if you need additional information or have any >> * questions. >> */ >> package org.sample; >> >> import org.openjdk.jmh.annotations.Benchmark; >> import org.openjdk.jmh.annotations.*; >> >> import java.util.Random; >> import java.util.concurrent.TimeUnit; >> import org.openjdk.jmh.infra.Blackhole; >> >> @BenchmarkMode({Mode.AverageTime}) >> @OutputTimeUnit(TimeUnit.MICROSECONDS) >> public class MyBenchmark { >> >> static int length = 100000; >> static double[] data1 = new double[length]; >> static double[] data2 = new double[length]; >> static Random random = new Random(); >> >> static { >> for(int i = 0; i < length; ++i) { >> data1[i] = random.nextDouble(); >> data2[i] = random.nextDouble(); >> } >> } >> >> @Benchmark >> public void testAdd(Blackhole bh) { >> double sum = 0; >> for (int i = 0; i < length; i++) { >> sum += Math.max(data1[i], data2[i]) + Math.min(data1[i], data2[i]); >> } >> bh.consume(sum); >> } >> >> @Benchmark >> public void testMax(Blackhole bh) { >> double sum = 0; >> for (int i = 0; i < length; i++) { >> sum += Math.max(Math.max(data1[i], data2[i]), Math.min(data1[i], data2[i])); >> 
} >> bh.consume(sum); >> } >> >> @Benchmark >> public void testMin(Blackhole bh) { >> double sum = 0; >> for (int i = 0; i < length; i++) { >> sum += Math.min(Math.max(data1[i], data2[i]), Math.min(data1[i], data2[i])); >> } >> bh.consume(sum); >> } >> >> @Benchmark >> public void testMul(Blackhole bh) { >> double sum = 0; >> for (int i = 0; i < length; i++) { >> sum += (Math.max(data1[i], data2[i]) * Math.min(data1[i], data2[i])); >> } >> bh.consume(sum); >> } >> } >> ``` >> >> * The result is listed here (aarch64): >> >> before: >> >> |Benchmark| Mode| Samples| Score| Score error| Units| >> |---| ---| ---| ---| --- | ---| >> |o.s.MyBenchmark.testAdd |avgt | 10 | 556.048 | 32.368 | us/op | >> | o.s.MyBenchmark.testMax | avgt | 10 |543.065 | 54.221 | us/op | >> | o.s.MyBenchmark.testMin | avgt |10 |570.731 | 37.630 | us/op | >> | o.s.MyBenchmark.testMul | avgt | 10 | 531.906 | 20.518 | us/op | >> >> after: >> >> |Benchmark| Mode| Samples| Score| Score error| Units| >> |---| ---| ---| ---| --- | ---| >> | o.s.MyBenchmark.testAdd | avgt | 10 | 319.350 | 9.248 | us/op | >> | o.s.MyBenchmark.testMax | avgt | 10 | 356.138 | 10.736 | us/op | >> | o.s.MyBenchmark.testMin | avgt | 10 | 323.731 | 16.621 | us/op | >> | o.s.MyBenchmark.testMul | avgt | 10 | 338.458 | 23.755 | us/op | >> >> * I have tested `NaN` ` INFINITY` and `-INFINITY` and got same result (before/after) > > Wang Huang has updated the pull request incrementally with one additional commit since the last revision: > > adjust code style Thank you for your review. > Do you have a real example in Java applications which benefit from this optimization? > We should not add and **support** code which would never be used in real world. > Yes. We refined this optimization from our internal software experience. For instance, the model `min( max(a,b) , min(a,b)))` exists in many source codes in some AI projects. 
> Optimization will not work for Integer because of `_min` and `_max` intrinsic which generates `cmove`: > https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/library_call.cpp#L1806 > Yes. Adding `MaxINode`'s `max_opcode` is just for `max_opcode` method is abstract. Our test cases is for float types only. > I am not sure if this optimization will always work for float/double because of NaN values. > > You need to verify results for all edge cases. I have tested that and showed in my comments. The test cases for NaN values and other special values are listed here import java.lang.Math; public class Test { public static void main(String[] args) throws Exception { Test m = new Test(); m.test(); } public void test() throws Exception { double[] num = new double[9]; num[0] = 1; num[1] = 0; num[2] = -0; num[3] = Double.POSITIVE_INFINITY; num[4] = Double.NEGATIVE_INFINITY; num[5] = Double.NaN; num[6] = Double.MAX_VALUE; num[7] = Double.MIN_VALUE; num[8] = Double.MIN_NORMAL; for(int i = 0; i < 9; i++) { for(int j = 0; j < 9; j++) { check(add_opt(num[i], num[j]), (num[i] + num[j])); check(mul_opt(num[i], num[j]), (num[i] * num[j])); check(max_opt(num[i], num[j]), Math.max(num[i], num[j])); check(min_opt(num[i], num[j]), Math.min(num[i], num[j])); } } } public void check(double a, double b) { if (a != b) { System.out.println("false"); System.out.println(a); System.out.println(b); System.out.println(); } } public double add_opt(double a, double b) throws Exception { return Math.max(a, b) + Math.min(a, b); } public double mul_opt(double a, double b) throws Exception { return Math.max(a, b) * Math.min(a, b); } public double max_opt(double a, double b) throws Exception { return Math.max(Math.max(a, b), Math.min(a, b)); } public double min_opt(double a, double b) throws Exception { return Math.min(Math.max(a, b), Math.min(a, b)); } } The `NaN` is a special case. Because `NaN == NaN` is false in Java, so I run the case and check the result. 
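
A note for readers on the test above: `a != b` is true whenever either operand is `NaN`, so `check` prints every `NaN` result even when both sides agree, and it cannot tell `0.0` from `-0.0`. In addition, the literal `-0` is an `int` zero, so `num[2]` ends up as `+0.0`. One possible way to automate the comparison, shown here only as a sketch, is to compare bit patterns with `Double.doubleToLongBits`. The class and method names below (`NaNAwareCheck`, `sameValue`, `countMismatches`) are hypothetical and not part of the PR.

```java
class NaNAwareCheck {
    // Bitwise equality: NaN matches NaN, and +0.0 is distinguished from -0.0.
    static boolean sameValue(double a, double b) {
        return Double.doubleToLongBits(a) == Double.doubleToLongBits(b);
    }

    // The four patterns from the PR description, written out explicitly.
    static double addOpt(double a, double b) { return Math.max(a, b) + Math.min(a, b); }
    static double mulOpt(double a, double b) { return Math.max(a, b) * Math.min(a, b); }
    static double maxOpt(double a, double b) { return Math.max(Math.max(a, b), Math.min(a, b)); }
    static double minOpt(double a, double b) { return Math.min(Math.max(a, b), Math.min(a, b)); }

    // Counts input pairs for which a pattern disagrees with its simplified form.
    static int countMismatches() {
        double[] num = {
            1.0, 0.0, -0.0, // -0.0 written as a double literal so the sign is kept
            Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY, Double.NaN,
            Double.MAX_VALUE, Double.MIN_VALUE, Double.MIN_NORMAL
        };
        int mismatches = 0;
        for (double a : num) {
            for (double b : num) {
                if (!sameValue(addOpt(a, b), a + b)) mismatches++;
                if (!sameValue(mulOpt(a, b), a * b)) mismatches++;
                if (!sameValue(maxOpt(a, b), Math.max(a, b))) mismatches++;
                if (!sameValue(minOpt(a, b), Math.min(a, b))) mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches());
    }
}
```

With this kind of check a run reports a single mismatch count, so the same program can be executed before and after the change without inspecting the `NaN` rows by hand.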
Should I add further test cases for `NaN`?

-------------

PR: https://git.openjdk.java.net/jdk/pull/3513

From iveresov at openjdk.java.net Sat Apr 17 19:12:38 2021
From: iveresov at openjdk.java.net (Igor Veresov)
Date: Sat, 17 Apr 2021 19:12:38 GMT
Subject: RFR: 8265105: gc/arguments/TestSelectDefaultGC.java fails when compiler1 is disabled
In-Reply-To:
References:
Message-ID:

On Tue, 13 Apr 2021 05:46:15 GMT, SUN Guoyun wrote:

> The MIPS64 platform has not implemented C1; it only has C2.
> So when tiered compilation is off, it is unnecessary to set client emulation mode flags.
> Perhaps this bug is included in https://bugs.openjdk.java.net/browse/JDK-8251462

Ok, I'll try to reproduce it and take a look.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3449

From iveresov at openjdk.java.net Sat Apr 17 20:11:34 2021
From: iveresov at openjdk.java.net (Igor Veresov)
Date: Sat, 17 Apr 2021 20:11:34 GMT
Subject: RFR: 8265105: gc/arguments/TestSelectDefaultGC.java fails when compiler1 is disabled
In-Reply-To:
References:
Message-ID:

On Tue, 13 Apr 2021 05:46:15 GMT, SUN Guoyun wrote:

> The MIPS64 platform has not implemented C1; it only has C2.
> So when tiered compilation is off, it is unnecessary to set client emulation mode flags.
> Perhaps this bug is included in https://bugs.openjdk.java.net/browse/JDK-8251462

So, I would propose doing it the following way. Basically, I think we should put `if (has_c1())` around the whole thing because `NeverActAsServerClassMachine` doesn't make any sense without C1.
diff --git a/src/hotspot/share/compiler/compilerDefinitions.cpp b/src/hotspot/share/compiler/compilerDefinitions.cpp
index f8eff0a6917..7db439373a1 100644
--- a/src/hotspot/share/compiler/compilerDefinitions.cpp
+++ b/src/hotspot/share/compiler/compilerDefinitions.cpp
@@ -159,7 +159,8 @@ intx CompilerConfig::scaled_freq_log(intx freq_log, double scale) {
   }
 }
 
-void set_client_emulation_mode_flags() {
+void CompilerConfig::set_client_emulation_mode_flags() {
+  assert(has_c1(), "Must have C1 compiler present");
   CompilationModeFlag::set_quick_only();
 
   FLAG_SET_ERGO(ProfileInterpreter, false);
@@ -560,17 +561,19 @@ void CompilerConfig::ergo_initialize() {
   return;
 #endif
 
-  if (!is_compilation_mode_selected()) {
+  if (has_c1()) {
+    if (!is_compilation_mode_selected()) {
 #if defined(_WINDOWS) && !defined(_LP64)
-    if (FLAG_IS_DEFAULT(NeverActAsServerClassMachine)) {
-      FLAG_SET_ERGO(NeverActAsServerClassMachine, true);
-    }
+      if (FLAG_IS_DEFAULT(NeverActAsServerClassMachine)) {
+        FLAG_SET_ERGO(NeverActAsServerClassMachine, true);
+      }
 #endif
-    if (NeverActAsServerClassMachine) {
+      if (NeverActAsServerClassMachine) {
+        set_client_emulation_mode_flags();
+      }
+    } else if (!has_c2() && !is_jvmci_compiler()) {
       set_client_emulation_mode_flags();
     }
-  } else if (!has_c2() && !is_jvmci_compiler()) {
-    set_client_emulation_mode_flags();
   }
 
   set_legacy_emulation_flags();
diff --git a/src/hotspot/share/compiler/compilerDefinitions.hpp b/src/hotspot/share/compiler/compilerDefinitions.hpp
index d87c892f091..2f8794ad43a 100644
--- a/src/hotspot/share/compiler/compilerDefinitions.hpp
+++ b/src/hotspot/share/compiler/compilerDefinitions.hpp
@@ -246,6 +246,7 @@ private:
   static void set_compilation_policy_flags();
   static void set_jvmci_specific_flags();
   static void set_legacy_emulation_flags();
+  static void set_client_emulation_mode_flags();
 };
 
 #endif // SHARE_COMPILER_COMPILERDEFINITIONS_HPP

-------------

PR: https://git.openjdk.java.net/jdk/pull/3449

From dnsimon at openjdk.java.net Sat Apr 17
20:48:07 2021
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Sat, 17 Apr 2021 20:48:07 GMT
Subject: RFR: 8252600: [JVMCI] update JVMCI code style and mx configuration
Message-ID: <5_y5r2sWg9i6sNhcJrg_GvODVPAPDs60KVqOw8z8jrI=.4e633372-2ba1-4b58-a9ca-04b38c9a91fa@github.com>

This PR updates the configuration files used to develop the JVMCI Java and C++ sources with mx and Eclipse.

-------------

Commit messages:
 - 8252600: [JVMCI] update JVMCI code style and mx configuration

Changes: https://git.openjdk.java.net/jdk/pull/3559/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3559&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8252600
  Stats: 2096 lines in 14 files changed: 752 ins; 1344 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3559.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3559/head:pull/3559

PR: https://git.openjdk.java.net/jdk/pull/3559

From alanb at openjdk.java.net Sun Apr 18 07:34:55 2021
From: alanb at openjdk.java.net (Alan Bateman)
Date: Sun, 18 Apr 2021 07:34:55 GMT
Subject: RFR: 8252600: [JVMCI] update JVMCI code style and mx configuration
In-Reply-To: <5_y5r2sWg9i6sNhcJrg_GvODVPAPDs60KVqOw8z8jrI=.4e633372-2ba1-4b58-a9ca-04b38c9a91fa@github.com>
References: <5_y5r2sWg9i6sNhcJrg_GvODVPAPDs60KVqOw8z8jrI=.4e633372-2ba1-4b58-a9ca-04b38c9a91fa@github.com>
Message-ID:

On Sat, 17 Apr 2021 20:37:08 GMT, Doug Simon wrote:

> This PR updates the configuration files used to develop the JVMCI Java and C++ sources with mx and Eclipse.

Are you sure it makes sense to have this dev config in the openjdk/jdk repo? I would think this is something for the downstream Graal repos.
-------------

PR: https://git.openjdk.java.net/jdk/pull/3559

From dnsimon at openjdk.java.net Sun Apr 18 09:53:41 2021
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Sun, 18 Apr 2021 09:53:41 GMT
Subject: RFR: 8252600: [JVMCI] update JVMCI code style and mx configuration
In-Reply-To: <5_y5r2sWg9i6sNhcJrg_GvODVPAPDs60KVqOw8z8jrI=.4e633372-2ba1-4b58-a9ca-04b38c9a91fa@github.com>
References: <5_y5r2sWg9i6sNhcJrg_GvODVPAPDs60KVqOw8z8jrI=.4e633372-2ba1-4b58-a9ca-04b38c9a91fa@github.com>
Message-ID:

On Sat, 17 Apr 2021 20:37:08 GMT, Doug Simon wrote:

> This PR updates the configuration files used to develop the JVMCI Java and C++ sources with mx and Eclipse.

Until we have downstream repos for JDK 17, it's very handy to have this config in the JDK. I've tried to keep it only in JVMCI related directories so as to not perturb non-JVMCI sources. However, if the consensus is that this config does not belong in the JDK at all, I will repurpose this PR to remove all such config, as it is currently broken as is.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3559

From lists at steffen-moser.de Sun Apr 18 11:53:11 2021
From: lists at steffen-moser.de (Steffen Moser)
Date: Sun, 18 Apr 2021 13:53:11 +0200
Subject: Backport of JDK-8249672 ("Include microcode revision in features_string") to JDK-11.0.10 breaks compiling it on Oracle Solaris 11.4
Message-ID: <473d105d-9cd3-475f-726a-97fe93c2bb4c@steffen-moser.de>

Hi all,

I am new to JDK contribution/bug reporting, so I really hope I've chosen the right way to report a bug - at least I could neither find a possibility to register for openjdk.java.net nor did my Oracle SSO account work on this site.

Bug report:

In JDK-8249672, the microcode version of an x86 CPU was added to the "features_string" printed in an hs_err_pidXXXXX log file. As far as I know, it was introduced in JDK-16 and back-ported to both JDK-15 and JDK-11 last year.
While Solaris support was (unfortunately) abandoned in JDK-15, JDK-11 still supports and should further support Solaris on both SPARC and x86 if my information is correct. I desperately need JDK-11 (and probably newer versions of Java) in order to run several modern open-source tools. When trying to compile JDK-11 on Solaris 11.4 SRU 31 on x86 according to [1] and [2], I run into the following compile error problem: "./src/hotspot/cpu/x86/vm_version_x86.cpp", line 753: Error: cpu_microcode_revision is not a member of os. 1 Error(s) detected. The reason is quite obvious. The method os::cpu_microcode_revision() is not defined for the Solaris platform as JDK-8249672 does not alter src/hotspot/os_cpu/solaris_x86/os_solaris_x86.hpp src/hotspot/os_cpu/solaris_x86/os_solaris_x86.cpp The question is: How should we fix it? Is it allowed to call Solaris' /usr/sbin/ucodeadm -v and parse this binary's results or do we have to get the contents from "/dev/ucode" (which seems to be the symlink to the pseudo-device "/devices/pseudo/ucode at 0:ucode" and is accessed by ucodeadm) manually? Any help to make JDK 11 compiling on Solaris 11.4/x86 again would be highly appreciated. Thank you very much in advance! Kind regards, Steffen [1] https://blogs.oracle.com/solaris/building-openjdk-12-using-jdk-8 [2] http://notallmicrosoft.blogspot.com/2020/04/building-openjdk-13-and-openjdk-14-on.html From github.com+40024232+sunny868 at openjdk.java.net Sun Apr 18 13:03:51 2021 From: github.com+40024232+sunny868 at openjdk.java.net (SUN Guoyun) Date: Sun, 18 Apr 2021 13:03:51 GMT Subject: RFR: 8265105: gc/arguments/TestSelectDefaultGC.java fails when compiler1 is disabled In-Reply-To: References: Message-ID: On Tue, 13 Apr 2021 05:46:15 GMT, SUN Guoyun wrote: > On MIPS64 platform has not impliment C1,only has C2. > so when tiered compilation is off, it is unnecessary to set client emulation mode flags. 
> perhaps this bug be included by https://bugs.openjdk.java.net/browse/JDK-8251462 OK, thank you. I'll update this patch later ------------- PR: https://git.openjdk.java.net/jdk/pull/3449 From dnsimon at openjdk.java.net Sun Apr 18 20:17:14 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Sun, 18 Apr 2021 20:17:14 GMT Subject: RFR: 8252600: remove mx configuration [v2] In-Reply-To: <5_y5r2sWg9i6sNhcJrg_GvODVPAPDs60KVqOw8z8jrI=.4e633372-2ba1-4b58-a9ca-04b38c9a91fa@github.com> References: <5_y5r2sWg9i6sNhcJrg_GvODVPAPDs60KVqOw8z8jrI=.4e633372-2ba1-4b58-a9ca-04b38c9a91fa@github.com> Message-ID: <1PPeQbsoW3AFDIt1L2dYfg5WwykGVftJ47L6AE0aWXE=.ded1e016-5781-44ea-9a3a-46a719c7f584@github.com> > This PR updates the configuration files used to develop the JVMCI Java and C++ sources with mx and Eclipse. Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: 8252600: [JVMCI] remove mx configuration ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3559/files - new: https://git.openjdk.java.net/jdk/pull/3559/files/7026605f..7c18d2d9 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3559&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3559&range=00-01 Stats: 754 lines in 8 files changed: 0 ins; 754 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3559.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3559/head:pull/3559 PR: https://git.openjdk.java.net/jdk/pull/3559 From dnsimon at openjdk.java.net Sun Apr 18 20:20:43 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Sun, 18 Apr 2021 20:20:43 GMT Subject: RFR: 8252600: remove mx configuration [v2] In-Reply-To: <1PPeQbsoW3AFDIt1L2dYfg5WwykGVftJ47L6AE0aWXE=.ded1e016-5781-44ea-9a3a-46a719c7f584@github.com> References: 
<5_y5r2sWg9i6sNhcJrg_GvODVPAPDs60KVqOw8z8jrI=.4e633372-2ba1-4b58-a9ca-04b38c9a91fa@github.com> <1PPeQbsoW3AFDIt1L2dYfg5WwykGVftJ47L6AE0aWXE=.ded1e016-5781-44ea-9a3a-46a719c7f584@github.com> Message-ID: <9BHFJ8Vurf3bVA12x8Fo0iaZGzQmUrXCbImXjY56Wtw=.daf237c0-2368-48dd-9fc0-5936431ddd56@github.com> On Sun, 18 Apr 2021 20:17:14 GMT, Doug Simon wrote: >> This PR removes the mx configuration files in the JDK as they do not really belong here. Instead, I've updated and moved them to https://github.com/dougxc/mx_jdk. > > Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > 8252600: [JVMCI] remove mx configuration On second thoughts, I concur with you Alan and am thus using this PR to remove the mx configuration files. ------------- PR: https://git.openjdk.java.net/jdk/pull/3559 From github.com+40024232+sunny868 at openjdk.java.net Mon Apr 19 01:23:06 2021 From: github.com+40024232+sunny868 at openjdk.java.net (SUN Guoyun) Date: Mon, 19 Apr 2021 01:23:06 GMT Subject: RFR: 8265105: gc/arguments/TestSelectDefaultGC.java fails when compiler1 is disabled [v2] In-Reply-To: References: Message-ID: > On MIPS64 platform has not impliment C1,only has C2. > so when tiered compilation is off, it is unnecessary to set client emulation mode flags. > perhaps this bug be included by https://bugs.openjdk.java.net/browse/JDK-8251462 SUN Guoyun has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since the last revision: 8265105: gc/arguments/TestSelectDefaultGC.java fails when compiler1 is disabled ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3449/files - new: https://git.openjdk.java.net/jdk/pull/3449/files/51dcbbf8..c21468f2 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3449&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3449&range=00-01 Stats: 14 lines in 2 files changed: 6 ins; 2 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/3449.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3449/head:pull/3449 PR: https://git.openjdk.java.net/jdk/pull/3449 From yyang at openjdk.java.net Mon Apr 19 02:12:35 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Mon, 19 Apr 2021 02:12:35 GMT Subject: RFR: 8265245: depChecker_ don't have any functionalities In-Reply-To: <7KW8AJT9EL9Mt0Tebhtooow-7zbDcwEbeoG21HURdG8=.12c7367e-11fc-4a45-b9c1-23f97321799d@github.com> References: <1JKPxPdQ9XoTDXdIP8P_eBy6EqYZhEoD8SACCs0MfSs=.d5cd2e5f-6c11-4be8-8651-3931368d1471@github.com> <7KW8AJT9EL9Mt0Tebhtooow-7zbDcwEbeoG21HURdG8=.12c7367e-11fc-4a45-b9c1-23f97321799d@github.com> Message-ID: On Fri, 16 Apr 2021 08:10:48 GMT, Tobias Hartmann wrote: >> Yet another OpenJDK6 old guy. >> >> src/hotspot/cpu//depChecker_.hpp/cpp are included by src/hotspot/share/compiler/disassembler.cpp, they don't provide any functionality. >> >> In the absence of strong demand either in existing ARCHs or future ARCHs, I think we can remove it. > > Looks good to me. Thanks @TobiHartmann and @neliasso for the reviews. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3531 From yyang at openjdk.java.net Mon Apr 19 02:12:34 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Mon, 19 Apr 2021 02:12:34 GMT Subject: RFR: 8265323: Leftover local variables in PcDesc In-Reply-To: References: Message-ID: On Fri, 16 Apr 2021 08:08:14 GMT, Tobias Hartmann wrote: >> Leftover local variables in PcDesc. Remove them to save electric power. > > Marked as reviewed by thartmann (Reviewer). Thanks @TobiHartmann and @neliasso for the reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/3532 From yyang at openjdk.java.net Mon Apr 19 02:22:34 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Mon, 19 Apr 2021 02:22:34 GMT Subject: RFR: 8265322: C2: Simplify control inputs for BarrierSetC2::obj_allocate In-Reply-To: References: <_NicB3VpqFwTiu2-bKLNhfnQAxJmbtND66g3Md5ck5g=.f9b4edcd-fa2b-4917-9092-bccc38b202df@github.com> Message-ID: On Fri, 16 Apr 2021 17:37:05 GMT, Vladimir Kozlov wrote: > From compiler code POV the fix is reasonable and correct. Thank you @vnkozlov! Do you think it's reasonable to move PhaseMacroExpand::set_eden_pointers to BarrierSetC2? It seems that how to set eden_top/eden_end, with or without UseTLAB enabled, is GC-specific knowledge. > Note, the path when `initial_slow_test != NULL` is not rare. It is frequent for array allocation when `length` is not constant. Yes, I missed the AllocateArrayNode when skimming the code. Thanks for pointing this out. Let's wait for other hotspot gc/compiler folks for more reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/3529 From yyang at openjdk.java.net Mon Apr 19 04:59:03 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Mon, 19 Apr 2021 04:59:03 GMT Subject: RFR: 8265106: IGV: Enforce en-US locale while parsing ideal graph Message-ID: IGV is designed to support parsing incomplete XML. However, it does not work for non-English users. See #3071 for the detailed reasons. This patch would address it. (P.S.
Locale.ENGLISH also does not work, see Philip Helger's [comment](https://stackoverflow.com/questions/18531633/locale-specific-messages-in-xerces-2-11-0-java) on the first answer.) ------------- Commit messages: - parse incomplete xml Changes: https://git.openjdk.java.net/jdk/pull/3563/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3563&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8265106 Stats: 9 lines in 2 files changed: 9 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3563.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3563/head:pull/3563 PR: https://git.openjdk.java.net/jdk/pull/3563 From yyang at openjdk.java.net Mon Apr 19 04:59:03 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Mon, 19 Apr 2021 04:59:03 GMT Subject: RFR: 8265106: IGV: Enforce en-US locale while parsing ideal graph In-Reply-To: References: Message-ID: On Mon, 19 Apr 2021 04:48:07 GMT, Yi Yang wrote: > IGV is designed to support parsing incomplete XML. However, it does not work for non-English users. See #3071 for the detailed reasons. This patch would address it. > > (P.S. Locale.ENGLISH also does not work, see Philip Helger's [comment](https://stackoverflow.com/questions/18531633/locale-specific-messages-in-xerces-2-11-0-java) on the first answer.) Thanks @robcasloz for suggesting this proposed fix! ------------- PR: https://git.openjdk.java.net/jdk/pull/3563 From yyang at openjdk.java.net Mon Apr 19 06:29:35 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Mon, 19 Apr 2021 06:29:35 GMT Subject: Integrated: 8265323: Leftover local variables in PcDesc In-Reply-To: References: Message-ID: On Fri, 16 Apr 2021 03:16:30 GMT, Yi Yang wrote: > Leftover local variables in PcDesc. Remove them to save electric power. This pull request has now been integrated.
Changeset: a2b0e0f4 Author: Yi Yang Committer: Tobias Hartmann URL: https://git.openjdk.java.net/jdk/commit/a2b0e0f4 Stats: 6 lines in 1 file changed: 0 ins; 5 del; 1 mod 8265323: Leftover local variables in PcDesc Reviewed-by: thartmann, neliasso ------------- PR: https://git.openjdk.java.net/jdk/pull/3532 From yyang at openjdk.java.net Mon Apr 19 06:30:41 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Mon, 19 Apr 2021 06:30:41 GMT Subject: Integrated: 8265245: depChecker_ don't have any functionalities In-Reply-To: <1JKPxPdQ9XoTDXdIP8P_eBy6EqYZhEoD8SACCs0MfSs=.d5cd2e5f-6c11-4be8-8651-3931368d1471@github.com> References: <1JKPxPdQ9XoTDXdIP8P_eBy6EqYZhEoD8SACCs0MfSs=.d5cd2e5f-6c11-4be8-8651-3931368d1471@github.com> Message-ID: <5HVUqQeZKsgXRGEmzpXbVxw-0vmHU0Ofg8ZrRh48rL8=.0b2794fd-33c6-4438-b910-ff848312e15f@github.com> On Fri, 16 Apr 2021 03:10:12 GMT, Yi Yang wrote: > Yet another OpenJDK6 old guy. > > src/hotspot/cpu//depChecker_.hpp/cpp are included by src/hotspot/share/compiler/disassembler.cpp, they don't provide any functionality. > > In the absence of strong demand either in existing ARCHs or future ARCHs, I think we can remove it. This pull request has now been integrated. Changeset: fa58aae8 Author: Yi Yang Committer: Tobias Hartmann URL: https://git.openjdk.java.net/jdk/commit/fa58aae8 Stats: 305 lines in 13 files changed: 0 ins; 305 del; 0 mod 8265245: depChecker_ don't have any functionalities Reviewed-by: thartmann, neliasso ------------- PR: https://git.openjdk.java.net/jdk/pull/3531 From rcastanedalo at openjdk.java.net Mon Apr 19 08:07:33 2021 From: rcastanedalo at openjdk.java.net (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Mon, 19 Apr 2021 08:07:33 GMT Subject: RFR: 8265106: IGV: Enforce en-US locale while parsing ideal graph In-Reply-To: References: Message-ID: On Mon, 19 Apr 2021 04:48:07 GMT, Yi Yang wrote: > IGV is designed to support parsing incomplete XML. However, it does not work for non-English users. 
See #3071 for detailed reasons. This patch would address it. > > (P.S. Locale.ENGLISH also does not work, see Philip Helger's [comment](https://stackoverflow.com/questions/18531633/locale-specific-messages-in-xerces-2-11-0-java) on the first answer.) Hi Yang, I tested this on Linux by opening a file missing closing `group` and `graphDocument` tags ([incomplete.xml](https://bugs.openjdk.java.net/secure/attachment/94215/incomplete.zip)), and it works for the following configurations: - JDK versions 8, 11, and 15 - `export LC_ALL=es_ES.UTF-8`, `export LC_ALL=en_US.UTF-8` ------------- PR: https://git.openjdk.java.net/jdk/pull/3563 From eliu at openjdk.java.net Mon Apr 19 08:54:49 2021 From: eliu at openjdk.java.net (Eric Liu) Date: Mon, 19 Apr 2021 08:54:49 GMT Subject: RFR: 8262916: Merge LShiftCntV and RShiftCntV into a single node [v2] In-Reply-To: References: Message-ID: > The vector shift count was defined by two separate nodes (LShiftCntV and > RShiftCntV), which would prevent them from being shared when the shift > counts are the same. > > > public static void test_shiftv(int sh) { > for (int i = 0; i < N; i+=1) { > a0[i] = a1[i] << sh; > b0[i] = b1[i] >> sh; > } > } > > > Given the example above, by merging the same shift counts into one > node, they could be shared by shift nodes (RShiftV or LShiftV) like > below: > > > Before: > 1184 LShiftCntV === _ 1189 [[ 1185 ... ]] > 1190 RShiftCntV === _ 1189 [[ 1191 ... ]] > 1185 LShiftVI === _ 1181 1184 [[ 1186 ]] > 1191 RShiftVI === _ 1187 1190 [[ 1192 ]] > > After: > 1190 ShiftCntV === _ 1189 [[ 1191 1204 ... ]] > 1204 LShiftVI === _ 1211 1190 [[ 1203 ]] > 1191 RShiftVI === _ 1187 1190 [[ 1192 ]] > > > The final code could remove one redundant 'dup' (scalar->vector), > with one register saved. > > > Before: > dup v16.16b, w12 > dup v17.16b, w12 > ...
> ldr q18, [x13, #16] > sshl v18.4s, v18.4s, v16.4s > add x18, x16, x12 ; iaload > > add x4, x15, x12 > str q18, [x4, #16] ; iastore > > ldr q18, [x18, #16] > add x12, x14, x12 > neg v19.16b, v17.16b > sshl v18.4s, v18.4s, v19.4s > str q18, [x12, #16] ; iastore > > After: > dup v16.16b, w11 > ... > ldr q17, [x13, #16] > sshl v17.4s, v17.4s, v16.4s > add x2, x22, x11 ; iaload > > add x4, x16, x11 > str q17, [x4, #16] ; iastore > > ldr q17, [x2, #16] > add x11, x21, x11 > neg v18.16b, v16.16b > sshl v17.4s, v17.4s, v18.4s > str q17, [x11, #16] ; iastore Eric Liu has updated the pull request incrementally with one additional commit since the last revision: code backup Change-Id: Ie9046b1d7e8f5e2669767756b6b074b564523039 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3371/files - new: https://git.openjdk.java.net/jdk/pull/3371/files/eece376d..c860c725 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3371&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3371&range=00-01 Stats: 826 lines in 10 files changed: 324 ins; 205 del; 297 mod Patch: https://git.openjdk.java.net/jdk/pull/3371.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3371/head:pull/3371 PR: https://git.openjdk.java.net/jdk/pull/3371 From yyang at openjdk.java.net Mon Apr 19 09:02:34 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Mon, 19 Apr 2021 09:02:34 GMT Subject: RFR: 8265106: IGV: Enforce en-US locale while parsing ideal graph In-Reply-To: References: Message-ID: On Mon, 19 Apr 2021 04:48:07 GMT, Yi Yang wrote: > IGV is designed to support parsing incomplete XML. However, it does not work for non-English users. See #3071 for detailed reasons. This patch would address it. > > (P.S. Locale.ENGLISH also does not work, see Philip Helger's [comment](https://stackoverflow.com/questions/18531633/locale-specific-messages-in-xerces-2-11-0-java) on the first answer.)
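For readers following the quoted description, here is a minimal sketch (not IGV's actual code) of what "parsing incomplete XML" can look like with the JDK's SAX API. The input is cut off before its closing `group` and `graphDocument` tags, the handler keeps whatever was seen before the parser gives up, and the exception message is only printed, never matched, because its text is localized. The class name `IncompleteXmlSketch`, the helper `openedElements`, and the sample document are hypothetical. Calling `Locale.setDefault(Locale.US)` is just one assumed way to pin the message language for experimentation; it changes the JVM-wide default and is not necessarily what the patch does.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class IncompleteXmlSketch {

    // Hypothetical helper: returns the names of all elements that were opened
    // before the parser reached the truncated end of the document.
    public static List<String> openedElements(String xml) throws Exception {
        List<String> seen = new ArrayList<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                seen.add(qName);
            }
        };
        try {
            parser.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                handler);
        } catch (SAXException e) {
            // The message text depends on the locale, so it is only logged here;
            // code that compares it against an English string needs a pinned locale.
            System.err.println("Parsing stopped early: " + e.getMessage());
        }
        return seen;
    }

    public static void main(String[] args) throws Exception {
        // Assumption for this sketch only: force English parser messages JVM-wide.
        Locale.setDefault(Locale.US);
        // Sample input missing its closing group and graphDocument tags.
        String truncated = "<graphDocument><group><graph name=\"g1\"/>";
        System.out.println(openedElements(truncated));
    }
}
```

The elements opened before the truncation point are still delivered to the handler, which is what makes partial recovery possible regardless of the language of the error message.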
Thank you, Roberto, for the additional testing :) It also works with my configuration: + JDK 15 + Both en-US and zh-CN (system locale, not the LC_ALL environment variable) ------------- PR: https://git.openjdk.java.net/jdk/pull/3563 From eliu at openjdk.java.net Mon Apr 19 09:02:40 2021 From: eliu at openjdk.java.net (Eric Liu) Date: Mon, 19 Apr 2021 09:02:40 GMT Subject: Withdrawn: 8262916: Merge LShiftCntV and RShiftCntV into a single node In-Reply-To: References: Message-ID: On Wed, 7 Apr 2021 07:28:08 GMT, Eric Liu wrote: > The vector shift count was defined by two separate nodes (LShiftCntV and > RShiftCntV), which would prevent them from being shared when the shift > counts are the same. > > > public static void test_shiftv(int sh) { > for (int i = 0; i < N; i+=1) { > a0[i] = a1[i] << sh; > b0[i] = b1[i] >> sh; > } > } > > > Given the example above, by merging the same shift counts into one > node, they could be shared by shift nodes (RShiftV or LShiftV) like > below: > > > Before: > 1184 LShiftCntV === _ 1189 [[ 1185 ... ]] > 1190 RShiftCntV === _ 1189 [[ 1191 ... ]] > 1185 LShiftVI === _ 1181 1184 [[ 1186 ]] > 1191 RShiftVI === _ 1187 1190 [[ 1192 ]] > > After: > 1190 ShiftCntV === _ 1189 [[ 1191 1204 ... ]] > 1204 LShiftVI === _ 1211 1190 [[ 1203 ]] > 1191 RShiftVI === _ 1187 1190 [[ 1192 ]] > > > The final code could remove one redundant 'dup' (scalar->vector), > with one register saved. > > > Before: > dup v16.16b, w12 > dup v17.16b, w12 > ... > ldr q18, [x13, #16] > sshl v18.4s, v18.4s, v16.4s > add x18, x16, x12 ; iaload > > add x4, x15, x12 > str q18, [x4, #16] ; iastore > > ldr q18, [x18, #16] > add x12, x14, x12 > neg v19.16b, v17.16b > sshl v18.4s, v18.4s, v19.4s > str q18, [x12, #16] ; iastore > > After: > dup v16.16b, w11 > ...
> ldr q17, [x13, #16] > sshl v17.4s, v17.4s, v16.4s > add x2, x22, x11 ; iaload > > add x4, x16, x11 > str q17, [x4, #16] ; iastore > > ldr q17, [x2, #16] > add x11, x21, x11 > neg v18.16b, v16.16b > sshl v17.4s, v17.4s, v18.4s > str q17, [x11, #16] ; iastore This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/3371 From eliu at openjdk.java.net Mon Apr 19 09:02:39 2021 From: eliu at openjdk.java.net (Eric Liu) Date: Mon, 19 Apr 2021 09:02:39 GMT Subject: RFR: 8262916: Merge LShiftCntV and RShiftCntV into a single node [v2] In-Reply-To: References: Message-ID: On Mon, 19 Apr 2021 08:54:49 GMT, Eric Liu wrote: >> The vector shift count was defined by two separate nodes(LShiftCntV and >> RShiftCntV), which would prevent them from being shared when the shift >> counts are the same. >> >> >> public static void test_shiftv(int sh) { >> for (int i = 0; i < N; i+=1) { >> a0[i] = a1[i] << sh; >> b0[i] = b1[i] >> sh; >> } >> } >> >> >> Given the example above, by merging the same shift counts into one >> node, they could be shared by shift nodes(RShiftV or LShiftV) like >> below: >> >> >> Before: >> 1184 LShiftCntV === _ 1189 [[ 1185 ... ]] >> 1190 RShiftCntV === _ 1189 [[ 1191 ... ]] >> 1185 LShiftVI === _ 1181 1184 [[ 1186 ]] >> 1191 RShiftVI === _ 1187 1190 [[ 1192 ]] >> >> After: >> 1190 ShiftCntV === _ 1189 [[ 1191 1204 ... ]] >> 1204 LShiftVI === _ 1211 1190 [[ 1203 ]] >> 1191 RShiftVI === _ 1187 1190 [[ 1192 ]] >> >> >> The final code could remove one redundant 'dup'(scalar->vector), >> with one register saved. >> >> >> Before: >> dup v16.16b, w12 >> dup v17.16b, w12 >> ... >> ldr q18, [x13, #16] >> sshl v18.4s, v18.4s, v16.4s >> add x18, x16, x12 ; iaload >> >> add x4, x15, x12 >> str q18, [x4, #16] ; iastore >> >> ldr q18, [x18, #16] >> add x12, x14, x12 >> neg v19.16b, v17.16b >> sshl v18.4s, v18.4s, v19.4s >> str q18, [x12, #16] ; iastore >> >> After: >> dup v16.16b, w11 >> ...
>> ldr q17, [x13, #16] >> sshl v17.4s, v17.4s, v16.4s >> add x2, x22, x11 ; iaload >> >> add x4, x16, x11 >> str q17, [x4, #16] ; iastore >> >> ldr q17, [x2, #16] >> add x11, x21, x11 >> neg v18.16b, v16.16b >> sshl v17.4s, v17.4s, v18.4s >> str q17, [x11, #16] ; iastore > > Eric Liu has updated the pull request incrementally with one additional commit since the last revision: > > code backup > > Change-Id: Ie9046b1d7e8f5e2669767756b6b074b564523039 The Vector API would *not* profit from keeping these two nodes separate, as the input of an RShiftVNode may *not* be an RShiftCntVNode [1]. Inserting a special 'VNEG' only for AArch64 in the mid-end might work but seems too ugly, and merging the two nodes would harm AArch32. It's quite hard to balance the benefits between AArch64 and other architectures. [1] https://github.com/openjdk/jdk/blob/jdk-17%2B18/src/hotspot/cpu/aarch64/aarch64_neon.ad#L5179 ------------- PR: https://git.openjdk.java.net/jdk/pull/3371 From jiefu at openjdk.java.net Mon Apr 19 09:26:35 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 19 Apr 2021 09:26:35 GMT Subject: RFR: 8265325: Optimize StubRoutines::dpow() for Math.pow(x, 0.5) [v4] In-Reply-To: R