RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v20]
Vladimir Ivanov
vlivanov at openjdk.org
Wed Nov 16 23:41:16 UTC 2022
On Wed, 16 Nov 2022 23:14:45 GMT, Volodymyr Paprotski <duke at openjdk.org> wrote:
>> Or simply switch to `vzeroall` for `xmm0` - `xmm15`.
>
> ah.. I remember thinking about doing that.. `vzeroall` isn't encoded yet, and I figured since I already have to do xmm16-29, I might as well do them all.. should I add that instruction too?
Yes, please. And for the upper half of the register file, just code it as a loop over the register range:
for (int rxmm_num = 16; rxmm_num < 30; rxmm_num++) {
  XMMRegister rxmm = as_XMMRegister(rxmm_num);
  __ vpxorq(rxmm, rxmm, rxmm, Assembler::AVX_512bit);
}
or even
// Zeroes zmm16-zmm31.
for (XMMRegister rxmm = xmm16; rxmm->is_valid(); rxmm = rxmm->successor()) {
  __ vpxorq(rxmm, rxmm, rxmm, Assembler::AVX_512bit);
}
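Note that the two loops do not cover the same range: the first stops at xmm29, while the second keeps going until the register is no longer valid. Below is a minimal, self-contained C++ sketch of that difference. `MockXMMRegister`, `as_mock_register`, and the two `cleared_by_*` helpers are hypothetical stand-ins, not HotSpot's `XMMRegister` API. The sketch assumes 32 vector registers (xmm0-xmm31), as on AVX-512 hardware, and records register numbers instead of emitting `vpxorq`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for HotSpot's XMMRegister; NOT the real class.
struct MockXMMRegister {
  int encoding;
  // Assumption: 32 vector registers (xmm0-xmm31), as with AVX-512.
  static constexpr int kNumRegisters = 32;

  bool is_valid() const { return encoding >= 0 && encoding < kNumRegisters; }
  // Returns the next register; the result may be invalid (one past the end).
  MockXMMRegister successor() const { return MockXMMRegister{encoding + 1}; }
};

// Hypothetical analogue of as_XMMRegister(): rejects out-of-range encodings.
MockXMMRegister as_mock_register(int encoding) {
  assert(encoding >= 0 && encoding < MockXMMRegister::kNumRegisters);
  return MockXMMRegister{encoding};
}

// Variant 1: explicit numeric range [first, limit).
std::vector<int> cleared_by_index_loop(int first, int limit) {
  std::vector<int> cleared;
  for (int n = first; n < limit; n++) {
    MockXMMRegister r = as_mock_register(n);
    cleared.push_back(r.encoding);  // stands in for vpxorq(r, r, r, ...)
  }
  return cleared;
}

// Variant 2: walk successors until the register is no longer valid.
std::vector<int> cleared_by_successor_loop(int first) {
  std::vector<int> cleared;
  for (MockXMMRegister r = as_mock_register(first); r.is_valid();
       r = r.successor()) {
    cleared.push_back(r.encoding);  // stands in for vpxorq(r, r, r, ...)
  }
  return cleared;
}
```

With these mocks, `cleared_by_index_loop(16, 30)` visits 14 registers (16 through 29), while `cleared_by_successor_loop(16)` visits 16 registers (16 through 31). Either is fine here as long as the comment next to the loop matches the range it actually clears.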
-------------
PR: https://git.openjdk.org/jdk/pull/10582