Vector API - Tail vs. Dummy vector entries
Joachim.Schwarte at ps.rolls-royce.com
Sun Aug 14 16:34:10 UTC 2022
Dear Panama Team,
I found no "quiet" way to get in touch with just one of you, so I am asking the list:
Why is a (scalar) tail needed to finish off a vector algorithm when the last elements of the vector don't fill a register of whatever size?
If the assumption holds that SIMD vector operations are "almost" as fast as their scalar counterparts, wouldn't it (statistically) be faster to just fill the gap with some dummy values and mask that gap away before the result is written to the receiving result array?
On top of that, wouldn't it be much more elegant for those who use the Vector API?
Or do I just have to be a bit patient because this is an interim solution?
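To make the question concrete, here is a minimal sketch of the two variants as I understand them: an element-wise multiply with a scalar tail, and the same loop with the last partial vector handled by a mask. The class and method names (TailDemo, mulScalarTail, mulMasked) are made up for illustration, and the code assumes the incubating jdk.incubator.vector module, so it has to be compiled and run with --add-modules jdk.incubator.vector. It only shows the shape of the code and says nothing about which variant is faster.

```java
import java.util.Arrays;
import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.VectorMask;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical demo class; not taken from the JDK or its documentation.
class TailDemo {
    static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

    // Variant 1: full vectors in the main loop, leftovers in a scalar tail.
    static void mulScalarTail(float[] a, float[] b, float[] c) {
        int i = 0;
        // Largest multiple of the lane count that is <= a.length.
        int bound = SPECIES.loopBound(a.length);
        for (; i < bound; i += SPECIES.length()) {
            FloatVector va = FloatVector.fromArray(SPECIES, a, i);
            FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
            va.mul(vb).intoArray(c, i);
        }
        // Scalar tail for the elements that do not fill a whole vector.
        for (; i < a.length; i++) {
            c[i] = a[i] * b[i];
        }
    }

    // Variant 2: no scalar tail; a mask switches off the lanes past the end.
    static void mulMasked(float[] a, float[] b, float[] c) {
        for (int i = 0; i < a.length; i += SPECIES.length()) {
            // Lanes with index >= a.length are unset in the mask.
            VectorMask<Float> m = SPECIES.indexInRange(i, a.length);
            // Unset lanes are loaded as zero (the "dummy" values) ...
            FloatVector va = FloatVector.fromArray(SPECIES, a, i, m);
            FloatVector vb = FloatVector.fromArray(SPECIES, b, i, m);
            // ... and are not written back to the result array.
            va.mul(vb).intoArray(c, i, m);
        }
    }

    public static void main(String[] args) {
        float[] a = new float[11];
        float[] b = new float[11];
        for (int i = 0; i < a.length; i++) {
            a[i] = i + 1;
            b[i] = 2f;
        }
        float[] c1 = new float[a.length];
        float[] c2 = new float[a.length];
        mulScalarTail(a, b, c1);
        mulMasked(a, b, c2);
        System.out.println("variants agree: " + Arrays.equals(c1, c2));
    }
}
```

In the masked variant the mask is computed on every iteration; it could equally be applied only to the final partial vector after an unmasked main loop.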
Best regards,
Achim
PS: Is there a more appropriate way to pose such questions?