[vectorIntrinsics] Feedback on Vector for Image Processing
Peter A
peter.abeles at gmail.com
Mon Mar 22 19:39:46 UTC 2021
I ported a few already optimized functions related to matrix multiplication
and image processing to the Vector API and posted the results here:
https://github.com/lessthanoptimal/VectorPerformance
Results look fairly good! In most cases performance was sped up by about
1.7x, but in a few cases it got worse. I'll just discuss image processing
here since I don't think this use case has come up yet.
1) Comparison operators need to support the unsigned byte and unsigned
short types. Based on comments in the JDK, it looks like this is planned.
This is a critical requirement for image processing.
2) Add support for output to the same primitive type as the input array for
Comparison operators. Right now there's only support for boolean[]. Booleans
are not ideal for image processing, which is why BoofCV uses byte[] for its
binary images.
3) Add a new lower level API which enables (nearly) allocation free usage.
Forcing memory allocations inside the innermost loop kills
performance, even if the code looks more elegant and is the Java way. This
is especially true for code which is optimized for small arrays. You can
see this in Linear Algebra libraries where all the highly performant ones
are basically written like C libraries in their lowest level functions.
Might be best to create a new thread for this comment. Could be an "easy"
30% performance boost.
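To make points 1 and 2 concrete, here is a minimal scalar sketch of the
operation in question: thresholding an unsigned 8-bit image into a byte[]
binary image. It is plain Java, not Vector API code, and the class and
method names (UnsignedThreshold, threshold) are hypothetical, chosen only
for illustration. It shows why an unsigned comparison matters (Java bytes
are signed) and what writing the result back as the input's primitive type
looks like.

```java
import java.util.Arrays;

// Hypothetical illustration; not taken from BoofCV or the benchmark repository.
final class UnsignedThreshold {

    /**
     * Writes 1 into output[i] when input[i], read as an unsigned 8-bit value
     * (0..255), is strictly greater than threshold, and 0 otherwise.
     */
    static void threshold(byte[] input, int threshold, byte[] output) {
        if (output.length < input.length) {
            throw new IllegalArgumentException("output is shorter than input");
        }
        for (int i = 0; i < input.length; i++) {
            // Java bytes are signed (-128..127); the mask recovers 0..255.
            int value = input[i] & 0xFF;
            output[i] = (byte) (value > threshold ? 1 : 0);
        }
    }

    public static void main(String[] args) {
        // (byte) 200 is -56 as a signed byte, so a signed compare would get it wrong.
        byte[] pixels = {0, 100, (byte) 200, (byte) 255};
        byte[] binary = new byte[pixels.length];
        threshold(pixels, 128, binary);
        System.out.println(Arrays.toString(binary)); // prints [0, 0, 1, 1]
    }
}
```

The output array is passed in by the caller, so the loop itself allocates
nothing, which is the same style point 3 argues for.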
Would also like to point out how much faster the manually unrolled image
convolution code was than even the Vectorized version.
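For readers unfamiliar with what "manually unrolled" means here, the sketch
below contrasts a generic 1D convolution with a version unrolled by hand for
a 3-tap kernel. This is an assumed, simplified illustration and not the code
from the repository linked above; the names (UnrolledConvolution,
convolveGeneric, convolve3) are hypothetical. The kernel is applied without
flipping, and border pixels are left untouched.

```java
import java.util.Arrays;

// Hypothetical illustration of manual unrolling; not the benchmarked code.
final class UnrolledConvolution {

    /** Generic version: an inner loop walks over every kernel tap. */
    static void convolveGeneric(float[] src, float[] kernel, float[] dst) {
        int radius = kernel.length / 2;
        for (int i = radius; i < src.length - radius; i++) {
            float sum = 0f;
            for (int k = 0; k < kernel.length; k++) {
                sum += src[i - radius + k] * kernel[k];
            }
            dst[i] = sum;
        }
    }

    /** Unrolled for exactly 3 taps: no inner loop, kernel values held in locals. */
    static void convolve3(float[] src, float[] kernel, float[] dst) {
        if (kernel.length != 3) {
            throw new IllegalArgumentException("kernel must have 3 taps");
        }
        float k0 = kernel[0], k1 = kernel[1], k2 = kernel[2];
        for (int i = 1; i < src.length - 1; i++) {
            dst[i] = src[i - 1] * k0 + src[i] * k1 + src[i + 1] * k2;
        }
    }

    public static void main(String[] args) {
        float[] row = {1f, 2f, 3f, 4f, 5f};
        float[] kernel = {1f, 2f, 1f};
        float[] a = new float[row.length];
        float[] b = new float[row.length];
        convolveGeneric(row, kernel, a);
        convolve3(row, kernel, b);
        System.out.println(Arrays.toString(a)); // prints [0.0, 8.0, 12.0, 16.0, 0.0]
        System.out.println(Arrays.equals(a, b)); // prints true
    }
}
```

Whether the unrolled form is actually faster depends on the JIT and the
hardware, so it should be measured rather than assumed.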
Cheers,
- Peter
--
"Now, now my good man, this is no time for making enemies." — Voltaire
(1694-1778), on his deathbed in response to a priest asking that he
renounce Satan.