Vector API - Tail vs. Dummy vector entries

Joachim.Schwarte at ps.rolls-royce.com
Sun Aug 14 16:34:10 UTC 2022


Dear Panama Team,

I found no "silent" way to get in touch with just one of you, so I am asking here:

Why do you need a (scalar) tail to finish off a vector algorithm when the last elements of the vector don't fill the register, whatever its size?

If the assumption holds that SIMD vector operations are "almost" as fast as their scalar counterparts, wouldn't it (statistically) be faster to fill the gap with dummy values and mask that gap away before the result is written to the receiving result array?
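
To make the two alternatives concrete, here is a minimal sketch in plain Java. It does not use jdk.incubator.vector; the inner loops over a fixed block only model one full-width register operation, and the class name, the method names and the LANES constant are hypothetical choices for illustration. As far as I understand, the real API offers a comparable masked form through VectorSpecies.indexInRange together with the mask-accepting overloads of fromArray and intoArray, but the sketch makes no claim about how that performs.

```java
import java.util.Arrays;

final class MaskedTailSketch {
    // Hypothetical lane count standing in for the hardware register width.
    static final int LANES = 8;

    // Variant 1: process full blocks only, then finish with a scalar tail loop.
    static float[] addWithScalarTail(float[] a, float[] b) {
        float[] c = new float[a.length];
        int bound = a.length - (a.length % LANES); // last index covered by full blocks
        int i = 0;
        for (; i < bound; i += LANES) {
            // Models one full-width vector add.
            for (int l = 0; l < LANES; l++) {
                c[i + l] = a[i + l] + b[i + l];
            }
        }
        // Scalar tail: the leftover elements that do not fill a block.
        for (; i < a.length; i++) {
            c[i] = a[i] + b[i];
        }
        return c;
    }

    // Variant 2: every block is full width; the last one is padded with
    // dummy zeros on load, and only the valid lanes are stored.
    static float[] addWithMaskedTail(float[] a, float[] b) {
        float[] c = new float[a.length];
        float[] va = new float[LANES];
        float[] vb = new float[LANES];
        for (int i = 0; i < a.length; i += LANES) {
            int valid = Math.min(LANES, a.length - i); // lanes inside the array
            // "Masked load": lanes beyond the array keep the dummy value 0.
            Arrays.fill(va, 0f);
            Arrays.fill(vb, 0f);
            System.arraycopy(a, i, va, 0, valid);
            System.arraycopy(b, i, vb, 0, valid);
            // Full-width operation, dummy lanes included.
            for (int l = 0; l < LANES; l++) {
                va[l] += vb[l];
            }
            // "Masked store": the dummy lanes never reach the result array.
            System.arraycopy(va, 0, c, i, valid);
        }
        return c;
    }

    public static void main(String[] args) {
        float[] a = new float[11];
        float[] b = new float[11];
        for (int i = 0; i < a.length; i++) {
            a[i] = i;
            b[i] = 2 * i;
        }
        // Both variants should agree element by element.
        System.out.println(Arrays.equals(addWithScalarTail(a, b), addWithMaskedTail(a, b)));
    }
}
```

The second variant has a single loop and no separate tail, which is the form the question is about; whether it is also faster depends on how cheaply the platform can do the masked load and store, and the sketch cannot show that.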

On top of that, wouldn't it be much more elegant for those who use the Vector API?

Or do I just have to be a bit patient because this is an interim solution?

Best regards,

Achim

PS: Is there a more appropriate way to pose such questions?



More information about the panama-dev mailing list