RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v9]
Emanuel Peter
epeter at openjdk.org
Tue Jan 14 08:23:40 UTC 2025
On Mon, 13 Jan 2025 17:12:31 GMT, Galder Zamarreño <galder at openjdk.org> wrote:
>> This patch intrinsifies `Math.max(long, long)` and `Math.min(long, long)` in order to help improve vectorization performance.
>>
>> Currently vectorization does not kick in for loops containing either of these calls because of the following error:
>>
>>
>> VLoop::check_preconditions: failed: control flow in loop not allowed
>>
>>
>> The control flow is due to the java implementation for these methods, e.g.
>>
>>
>> public static long max(long a, long b) {
>> return (a >= b) ? a : b;
>> }
>>
>>
>> This patch intrinsifies the calls to replace the CmpL + Bool nodes for MaxL/MinL nodes respectively.
>> By doing this, vectorization no longer finds the control flow and so it can carry out the vectorization.
>> E.g.
>>
>>
>> SuperWord::transform_loop:
>> Loop: N518/N126 counted [int,int),+4 (1025 iters) main has_sfpt strip_mined
>> 518 CountedLoop === 518 246 126 [[ 513 517 518 242 521 522 422 210 ]] inner stride: 4 main of N518 strip mined !orig=[419],[247],[216],[193] !jvms: Test::test @ bci:14 (line 21)
>>
>>
>> Applying the same changes to `ReductionPerf` as in https://github.com/openjdk/jdk/pull/13056, we can compare the results before and after. Before the patch, on darwin/aarch64 (M1):
>>
>>
>> ==============================
>> Test summary
>> ==============================
>> TEST TOTAL PASS FAIL ERROR
>> jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java
>> 1 1 0 0
>> ==============================
>> TEST SUCCESS
>>
>> long min 1155
>> long max 1173
>>
>>
>> After the patch, on darwin/aarch64 (M1):
>>
>>
>> ==============================
>> Test summary
>> ==============================
>> TEST TOTAL PASS FAIL ERROR
>> jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java
>> 1 1 0 0
>> ==============================
>> TEST SUCCESS
>>
>> long min 1042
>> long max 1042
>>
>>
>> This patch does not add an platform-specific backend implementations for the MaxL/MinL nodes.
>> Therefore, it still relies on the macro expansion to transform those into CMoveL.
>>
>> I've run tier1 and hotspot compiler tests on darwin/aarch64 and got these results:
>>
>>
>> ==============================
>> Test summary
>> ==============================
>> TEST TOTAL PA...
>
> Galder Zamarreño has updated the pull request incrementally with one additional commit since the last revision:
>
> Make sure it runs with cpus with either avx512 or asimd
test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Long.java line 2:
> 1: /*
> 2: * Copyright (c) 2024, Red Hat, Inc. All rights reserved.
Should probably be 2025 by now, right? At least that is our policy for Oracle, not sure what you want to do at Red Hat
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1914414960
More information about the graal-dev
mailing list