X25519 experiment: access to VPMULDQ
Adam Petcher
adam.petcher at oracle.com
Tue Jul 17 19:59:27 UTC 2018
I'm continuing with my experiment with X25519 on the vectorIntrinsics
branch, and I have a Vector API question. Is there a way to express a
vectorized 32x32->64 bit multiply? On AVX, this translates to the
VPMULDQ instruction. In other words, I think I'm looking for something
like IntVector::mul(Vector<Integer, S>) that returns a LongVector<S>.
I'm currently using LongVector::mul, but I don't have VPMULLQ on my
system, so the resulting assembly does some unnecessary work to
incorporate the high dwords (which are always zero) into the result.
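To make the requested operation concrete, here is a minimal scalar sketch in plain Java of what a per-lane 32x32->64 bit signed multiply computes. It does not use the Vector API. The class and method names are hypothetical, chosen only for illustration. It assumes each 64-bit lane holds a value that fits in a signed 32-bit integer, in which case the full 64-bit product and the widening product of the low dwords agree, so the extra work on the high dwords is redundant.

```java
final class WideningMulSketch {
    // Hypothetical helper: scalar model of a per-lane signed 32x32->64 multiply.
    // Only the low 32 bits of each 64-bit lane are used, sign-extended to 64 bits.
    static long[] mulLowDwordsSigned(long[] a, long[] b) {
        long[] r = new long[a.length];
        for (int i = 0; i < a.length; i++) {
            // The (int) cast keeps the low dword; widening back to long sign-extends it.
            r[i] = (long) (int) a[i] * (long) (int) b[i];
        }
        return r;
    }

    // Full 64x64->64 multiply per lane (low 64 bits of the product),
    // which is what a plain long multiply computes.
    static long[] mulFull(long[] a, long[] b) {
        long[] r = new long[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = a[i] * b[i];
        }
        return r;
    }

    public static void main(String[] args) {
        // All inputs fit in a signed 32-bit integer, so both methods agree.
        long[] a = {3L, -5L, (1L << 25) - 1, 1L << 26};
        long[] b = {7L, 11L, -(1L << 25), 19L};
        System.out.println(java.util.Arrays.equals(
                mulLowDwordsSigned(a, b), mulFull(a, b)));
    }
}
```

If a lane held a value outside the signed 32-bit range, the two methods would differ, because the widening form ignores the high dword entirely.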
For more background on my goal, I'm trying to implement a variant of
Sandy2x[1]. Specifically, I want to be able to do something like the
radix 2^25.5 multiplication/reduction in section 2.2. I'm using a
signed representation, though, so I would prefer to use VPMULDQ instead
of VPMULUDQ, but I could probably make it work either way.
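The difference between the two instructions can be sketched in scalar Java as follows. This is an illustrative model only, with hypothetical class and method names: the signed form sign-extends the low dword of each lane before multiplying, and the unsigned form zero-extends it. The two agree when both low dwords are non-negative as signed 32-bit values, and differ as soon as one limb is negative.

```java
final class SignedVsUnsignedMulSketch {
    // Scalar model of a signed widening multiply (VPMULDQ-style) for one lane:
    // sign-extend the low 32 bits of each operand, then multiply.
    static long mulSigned(long a, long b) {
        return (long) (int) a * (long) (int) b;
    }

    // Scalar model of an unsigned widening multiply (VPMULUDQ-style) for one lane:
    // zero-extend the low 32 bits of each operand, then multiply.
    static long mulUnsigned(long a, long b) {
        return (a & 0xFFFFFFFFL) * (b & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        // Non-negative limbs: both forms give the same product.
        System.out.println(mulSigned(3L, 7L) == mulUnsigned(3L, 7L));
        // A negative limb: the unsigned form reads -5 as 2^32 - 5.
        System.out.println(mulSigned(-5L, 11L));
        System.out.println(mulUnsigned(-5L, 11L));
    }
}
```

With only the unsigned form available, a signed limb representation would need some extra step (for example, an offset or a correction term) to recover the signed product, which is presumably the "make it work either way" case.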
[1] https://eprint.iacr.org/2015/943.pdf