RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v12]
Emanuel Peter
epeter at openjdk.org
Thu Mar 6 15:26:09 UTC 2025
On Thu, 27 Feb 2025 16:38:30 GMT, Galder Zamarreño <galder at openjdk.org> wrote:
>> Galder Zamarreño has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 44 additional commits since the last revision:
>>
>> - Merge branch 'master' into topic.intrinsify-max-min-long
>> - Fix typo
>> - Renaming methods and variables and add docu on algorithms
>> - Fix copyright years
>> - Make sure it runs with cpus with either avx512 or asimd
>> - Test can only run with 256 bit registers or bigger
>>
>> * Remove platform dependant check
>> and use platform independent configuration instead.
>> - Fix license header
>> - Tests should also run on aarch64 asimd=true envs
>> - Added comment around the assertions
>> - Adjust min/max identity IR test expectations after changes
>> - ... and 34 more: https://git.openjdk.org/jdk/compare/dfbb2ee6...a190ae68
>
> Also, I've started a [discussion on jmh-dev](https://mail.openjdk.org/pipermail/jmh-dev/2025-February/004094.html) to see if there's a way to minimise pollution of `Math.min(II)` compilation. As a follow-up to https://github.com/openjdk/jdk/pull/20098#issuecomment-2684701935 I looked at where the other `Math.min(II)` calls are coming from, and a big chunk seems to be related to the JMH infrastructure.
@galderz about:
> Additional performance improvement: extend backend capabilities for vectorization (see Regression 2 + 3). Optional.
I looked at `src/hotspot/cpu/x86/x86.ad`:

    bool Matcher::match_rule_supported_vector(int opcode, int vlen, BasicType bt) {
      ...
        case Op_MaxV:
        case Op_MinV:
          if (UseSSE < 4 && is_integral_type(bt)) {
            return false;
          }
      ...
So it seems that lanewise min/max is supported here for AVX2. But that seems to be different for reductions:
        case Op_MinReductionV:
        case Op_MaxReductionV:
          if ((bt == T_INT || is_subword_type(bt)) && UseSSE < 4) {
            return false;
          } else if (bt == T_LONG && (UseAVX < 3 || !VM_Version::supports_avx512vlbwdq())) {
            return false;
          }
      ...
So maybe we could improve the AVX2 coverage for reductions. But honestly, I will probably run into this issue again once I work on the other reductions above and run the benchmarks, and I think that will make it easier to investigate all of this. For example, I will adjust the IR rules, and then it will be apparent which cases are not covered.
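To make the lanewise vs. reduction distinction concrete, here is a rough sketch of the two loop shapes I have in mind. This is not code from the PR: the class and method names are made up for illustration, and whether C2 actually auto-vectorizes either loop (to `Op_MaxV` or `Op_MaxReductionV` respectively) depends on the CPU features, the flags, and the checks quoted above.

```java
import java.util.Arrays;

// Hypothetical illustration only; names are not taken from the PR or its tests.
final class MinMaxLoopShapes {

    // Lanewise shape: each output element depends only on the inputs at the
    // same index. This is the kind of loop that could map to Op_MaxV.
    static void lanewiseMax(long[] a, long[] b, long[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = Math.max(a[i], b[i]);
        }
    }

    // Reduction shape: a single accumulator is carried across iterations.
    // This is the kind of loop that could map to Op_MaxReductionV.
    static long reduceMax(long[] a) {
        long acc = Long.MIN_VALUE;
        for (int i = 0; i < a.length; i++) {
            acc = Math.max(acc, a[i]);
        }
        return acc;
    }

    public static void main(String[] args) {
        long[] a = {3L, -7L, 42L, 0L};
        long[] b = {5L, -9L, 41L, 0L};
        long[] out = new long[a.length];

        lanewiseMax(a, b, out);
        System.out.println(Arrays.toString(out)); // [5, -7, 42, 0]
        System.out.println(reduceMax(a));         // 42
    }
}
```

If I read the two checks correctly, on an AVX2-only machine the first loop passes the `Op_MaxV` check for `long`, while the second is rejected by the `T_LONG` branch of the reduction check.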
@galderz you said you would add some extra comments, then I will review again :)
-------------
PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2704159992
PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2704161929
More information about the core-libs-dev
mailing list