RFR: 8277426: Optimize mask reduction operations on x86 [v2]
Mai Đặng Quân Anh
duke at openjdk.java.net
Wed Nov 24 14:18:12 UTC 2021
On Wed, 24 Nov 2021 03:52:13 GMT, Jie Fu <jiefu at openjdk.org> wrote:
>> Mai Đặng Quân Anh has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
>>
>> - Merge branch 'master' into vectorMaskReduction
>> - reduce some dependencies with spare register
>> - improve mask reduction logic on AVX
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4065:
>
>> 4063: void C2_MacroAssembler::vector_mask_operation(int opc, Register dst, KRegister mask,
>> 4064: int masklen, int masksize, int vec_enc) {
>> 4065: assert(VM_Version::supports_popcnt() &&
>
> New instructions like `lzcntq` and `tzcntq` are used for the optimized code gen without detecting the availability.
> I'm a bit worried about that.
>
> So do all AVX512 platforms support them?
> Thanks.
Yes, you are right. I can't find concrete evidence that AVX512 implies BMI1. In addition, `VectorMaskGen` checks for AVX3 together with BMI2, so it seems safer to do the same as we did for AVX1 and AVX2. As a result, I refactored the reduction step into a separate function.
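To illustrate the concern, here is a minimal, self-contained sketch of a feature-guarded reduction step. It is not the code in this PR: `CpuFeatures`, the `model_*` helpers, `first_true` and `last_true` are hypothetical names that only model the instruction semantics in plain C++, and the sketch assumes no mask bits are set at or above `masklen`.

```cpp
#include <cstdint>

// Hypothetical stand-in for runtime CPU feature detection.
struct CpuFeatures {
    bool has_bmi1;   // TZCNT may be used
    bool has_lzcnt;  // LZCNT may be used
};

// Models TZCNT: trailing zero count, defined as 64 for a zero input.
static int model_tzcnt64(uint64_t x) {
    int n = 0;
    while (n < 64 && ((x >> n) & 1) == 0) n++;
    return n;
}

// Models LZCNT: leading zero count, defined as 64 for a zero input.
static int model_lzcnt64(uint64_t x) {
    int n = 0;
    while (n < 64 && ((x >> (63 - n)) & 1) == 0) n++;
    return n;
}

// Model BSF/BSR: the result is undefined for a zero input,
// so callers must rule out zero before using these.
static int model_bsf64(uint64_t x) { return model_tzcnt64(x); }
static int model_bsr64(uint64_t x) { return 63 - model_lzcnt64(x); }

// Index of the first set lane, or masklen if no lane is set.
int first_true(uint64_t mask, int masklen, const CpuFeatures& cpu) {
    if (cpu.has_bmi1) {
        // A sentinel bit at position masklen makes TZCNT return masklen
        // for an empty mask; for masklen == 64 TZCNT already returns 64.
        if (masklen < 64) mask |= (uint64_t{1} << masklen);
        return model_tzcnt64(mask);
    }
    // Fallback path: BSF needs an explicit zero check.
    return mask == 0 ? masklen : model_bsf64(mask);
}

// Index of the last set lane, or -1 if no lane is set.
int last_true(uint64_t mask, const CpuFeatures& cpu) {
    if (cpu.has_lzcnt) {
        return 63 - model_lzcnt64(mask);  // 63 - 64 == -1 for an empty mask
    }
    // Fallback path: BSR needs an explicit zero check.
    return mask == 0 ? -1 : model_bsr64(mask);
}
```

Both paths are meant to return the same result; the point is only that the fast path is selected by an explicit feature check rather than assumed from AVX512 support.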
Thank you very much.
-------------
PR: https://git.openjdk.java.net/jdk/pull/6447