RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v20]
Volodymyr Paprotski
duke at openjdk.org
Thu Nov 17 03:23:52 UTC 2022
On Wed, 16 Nov 2022 23:41:32 GMT, Volodymyr Paprotski <duke at openjdk.org> wrote:
>> Yes, please. And for the upper half of register file, just code it as a loop over register range:
>>
>> for (int rxmm_num = 16; rxmm_num < 32; rxmm_num++) {
>>   XMMRegister rxmm = as_XMMRegister(rxmm_num);
>>   __ vpxorq(rxmm, rxmm, rxmm, Assembler::AVX_512bit);
>> }
>>
>> or even
>>
>> // Zeroes zmm16-zmm31.
>> for (XMMRegister rxmm = xmm16; rxmm->is_valid(); rxmm = rxmm->successor()) {
>>   __ vpxorq(rxmm, rxmm, rxmm, Assembler::AVX_512bit);
>> }
>
> Will do.. ("loop" erm.. wow.. "duh, this isn't assembler!") Thanks!!
Done.
(Note: disassembler output showing the vzeroall encoding followed by the zmm16-zmm31 loop:
0x7fffed0022f8: vzeroall
0x7fffed0022fb: vpxorq zmm16,zmm16,zmm16
0x7fffed002301: vpxorq zmm17,zmm17,zmm17
0x7fffed002307: vpxorq zmm18,zmm18,zmm18
0x7fffed00230d: vpxorq zmm19,zmm19,zmm19
0x7fffed002313: vpxorq zmm20,zmm20,zmm20
0x7fffed002319: vpxorq zmm21,zmm21,zmm21
0x7fffed00231f: vpxorq zmm22,zmm22,zmm22
0x7fffed002325: vpxorq zmm23,zmm23,zmm23
0x7fffed00232b: vpxorq zmm24,zmm24,zmm24
0x7fffed002331: vpxorq zmm25,zmm25,zmm25
0x7fffed002337: vpxorq zmm26,zmm26,zmm26
0x7fffed00233d: vpxorq zmm27,zmm27,zmm27
0x7fffed002343: vpxorq zmm28,zmm28,zmm28
0x7fffed002349: vpxorq zmm29,zmm29,zmm29
0x7fffed00234f: vpxorq zmm30,zmm30,zmm30
0x7fffed002355: vpxorq zmm31,zmm31,zmm31
0x7fffed00235b: cmp ebx,0x10
0x7fffed00235e: jl 0x7fffed0023e6
)
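For readers without a HotSpot build at hand, here is a minimal standalone sketch of the same idea. It is not the code from the PR: `ToyAssembler`, `vpxorq_zmm` and `zero_all_vector_registers` are hypothetical names that only record mnemonics, and the byte sizes (3 for vzeroall, 6 for each vpxorq) are simply read off the addresses in the listing above. The separate loop is needed because vzeroall clears only zmm0-zmm15 in 64-bit mode, so zmm16-zmm31 have to be zeroed one by one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the HotSpot MacroAssembler: it records the
// mnemonic of each "emitted" instruction and a running byte count.
struct ToyAssembler {
    std::vector<std::string> listing;
    std::size_t bytes = 0;

    void vzeroall() {
        listing.push_back("vzeroall");
        bytes += 3;  // assumption: size taken from the listing (0x22f8 -> 0x22fb)
    }

    void vpxorq_zmm(int dst, int src1, int src2) {
        assert(dst >= 0 && dst <= 31);
        assert(src1 >= 0 && src1 <= 31);
        assert(src2 >= 0 && src2 <= 31);
        listing.push_back("vpxorq zmm" + std::to_string(dst) +
                          ",zmm" + std::to_string(src1) +
                          ",zmm" + std::to_string(src2));
        bytes += 6;  // assumption: size taken from the listing (0x22fb -> 0x2301)
    }
};

// Clears the lower half of the register file with one instruction, then
// loops over the upper half (zmm16..zmm31 inclusive).
void zero_all_vector_registers(ToyAssembler& masm) {
    masm.vzeroall();
    for (int n = 16; n <= 31; n++) {
        masm.vpxorq_zmm(n, n, n);
    }
}
```

Calling `zero_all_vector_registers` on a fresh `ToyAssembler` records 17 entries, and the byte count comes to 3 + 16 * 6 = 99, which matches the distance from the vzeroall at 0x7fffed0022f8 to the cmp at 0x7fffed00235b.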
-------------
PR: https://git.openjdk.org/jdk/pull/10582