Calculating integral images / cumulative add

Stefan Reich stefan.reich.maker.of.eye at googlemail.com
Wed Jan 6 22:20:16 UTC 2021


Hi Paul and John,

thank you for your replies! I think I may have found a workable compromise.
I will process the image in 8x8 blocks, calculating the integral image
values only at grid points ((x & 7) == 0 || (y & 7) == 0) and filling in a
block's interior only when the image recognition algorithm reading the
integral image requests it. So the conversion of a block is basically
just this:

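  import jdk.incubator.vector.IntVector;
  import jdk.incubator.vector.VectorOperators;
  import jdk.incubator.vector.VectorSpecies;

  // image is assumed to hold one 8x8 block as 64 ints, row-major
  VectorSpecies<Integer> species = IntVector.SPECIES_256;  // 8 int lanes = one block row
  int[] rowSums = new int[8];
  int[] colSums = new int[8];
  IntVector vColSums = IntVector.zero(species);
  for (int y = 0; y < 8; y++) {
    IntVector v = IntVector.fromArray(species, image, y*8);
    rowSums[y] = v.reduceLanes(VectorOperators.ADD);  // sum across row y
    vColSums = vColSums.add(v);                       // lane-wise running column sums
  }
  vColSums.intoArray(colSums, 0);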
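
To make the grid idea concrete, here is a rough scalar sketch (plain Java,
no Vector API) of how per-block row and column sums could be turned into
integral image values on the grid lines, with block interiors filled on
first access. The class name LazyIntegralImage, its methods, and the
(w+1) x (h+1) table with an exclusive-sum convention are illustrative
assumptions, and the sketch assumes width and height are multiples of 8.

```java
/** Scalar sketch: integral image on 8x8 grid lines first, block interiors on demand. */
final class LazyIntegralImage {
    static final int B = 8;      // block size
    final int w, h;              // image size, assumed to be multiples of 8
    final int[] img;             // grayscale pixels, row-major
    final int[] sum;             // (w+1) x (h+1) table: sum of img over [0,x) x [0,y)
    final boolean[] filled;      // per block: has the interior been computed?

    LazyIntegralImage(int[] img, int w, int h) {
        if (w % B != 0 || h % B != 0 || img.length != w * h)
            throw new IllegalArgumentException("sketch assumes w, h multiples of 8");
        this.img = img; this.w = w; this.h = h;
        sum = new int[(w + 1) * (h + 1)];
        filled = new boolean[(w / B) * (h / B)];
        // Row-major block order: each block finds its top and left lines already set.
        for (int y0 = 0; y0 < h; y0 += B)
            for (int x0 = 0; x0 < w; x0 += B)
                gridLines(x0, y0);
    }

    private int at(int x, int y) { return sum[y * (w + 1) + x]; }
    private void set(int x, int y, int v) { sum[y * (w + 1) + x] = v; }

    /** Sets the bottom and right edge of one block from its column and row sums. */
    private void gridLines(int x0, int y0) {
        int[] rowSums = new int[B], colSums = new int[B];
        for (int r = 0; r < B; r++)
            for (int c = 0; c < B; c++) {
                int p = img[(y0 + r) * w + x0 + c];
                rowSums[r] += p;
                colSums[c] += p;
            }
        for (int c = 0; c < B; c++)   // bottom edge, y = y0 + 8
            set(x0 + c + 1, y0 + B,
                at(x0 + c, y0 + B) + at(x0 + c + 1, y0) - at(x0 + c, y0) + colSums[c]);
        for (int r = 0; r < B; r++)   // right edge, x = x0 + 8
            set(x0 + B, y0 + r + 1,
                at(x0 + B, y0 + r) + at(x0, y0 + r + 1) - at(x0, y0 + r) + rowSums[r]);
    }

    /** Interior of one block via the usual recurrence, seeded by its top and left lines. */
    private void fillBlock(int x0, int y0) {
        for (int y = y0 + 1; y < y0 + B; y++)
            for (int x = x0 + 1; x < x0 + B; x++)
                set(x, y, img[(y - 1) * w + x - 1]
                        + at(x - 1, y) + at(x, y - 1) - at(x - 1, y - 1));
    }

    /** Integral value at (x, y) with 0 <= x <= w and 0 <= y <= h. */
    int get(int x, int y) {
        if (x % B != 0 && y % B != 0) {          // not on a grid line
            int bx = x / B, by = y / B, i = by * (w / B) + bx;
            if (!filled[i]) { fillBlock(bx * B, by * B); filled[i] = true; }
        }
        return at(x, y);
    }

    public static void main(String[] args) {
        int w = 16, h = 16;
        int[] img = new int[w * h];
        for (int i = 0; i < img.length; i++) img[i] = (i * 31 + 7) % 256;
        LazyIntegralImage ii = new LazyIntegralImage(img, w, h);
        // Sum over the rectangle [3,11) x [2,13) from four table lookups
        int viaTable = ii.get(11, 13) - ii.get(3, 13) - ii.get(11, 2) + ii.get(3, 2);
        int brute = 0;
        for (int y = 2; y < 13; y++)
            for (int x = 3; x < 11; x++) brute += img[y * w + x];
        System.out.println(viaTable + " vs. brute force " + brute);
    }
}
```

The column sums feed the horizontal grid lines and the row sums feed the
vertical ones, so the up-front pass never touches a block's interior
entries in the table.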


Turning a whole 1080p screenshot (in B/W) into an integral image already
takes less than 10 ms on two cores. Still, I'd love to bring this down even
further, just for the sake of it.

Here is a video about my nascent image recognition, if anyone is
interested:
<https://www.youtube.com/watch?v=pvDmT7uERH8&ab_channel=StefanReich>

Thanks

On Wed, 6 Jan 2021 at 23:10, John Rose <john.r.rose at oracle.com> wrote:

> On Jan 4, 2021, at 5:51 PM, Stefan Reich <
> stefan.reich.maker.of.eye at googlemail.com> wrote:
>
> > Oh I just saw this in the presentation
> > <http://cr.openjdk.java.net/~jrose/pres/201907-Vectors.pdf>:
> >
> >  ▪ Segmented scan (reduce with partials and mask-driven reset)
> >
> > That's probably what I want... so it's not there yet?
>
>
> You can do a lot of really nice algorithms, including parsing
> and nested parallelism, with good segmented scan primitives.
> (I learned this from Guy Steele in the Connection Machine days.)
>
> I hope the V-API will support this some day, but IMO we need
> some good use cases (like yours) to motivate the work, plus some
> pathfinding.  Also, to be honest, the SIMD hardware today is
> not yet up to the standards of the Connection Machine of the
> ‘80s, with respect to those use cases.  (The other thing today’s
> HW is missing, compared to the CM, is reverse permutations,
> AKA sorting networks, AKA parallel deposit.  Scatter is the
> current approximation, but it doesn’t handle collisions well.)
>
> One step at a time…  For pathfinding, I suggest building the
> primitives you need (using binary tree decomposition) on top
> of the existing stuff.  It won’t be as fast w/o JIT support, and
> your patience will be tried sorely by the lack of support for
> OO abstraction over vector computations.  (Function
> abstraction is likely to work better.)  I think that’s a direct
> route to exploring the opportunities here.
>
> — John
>
>
>
>
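
For reference, the segmented scan mentioned above can be modelled in
plain Java with a log-step (doubling) decomposition. This is only a
scalar sketch of the idea: the array stands in for the lanes of one
vector, each pass over the array corresponds to one shift-and-add step
across all lanes, and the names SegmentedScanSketch and segmentedScan are
made up for illustration. It does not use the Vector API.

```java
import java.util.Arrays;

final class SegmentedScanSketch {

    /**
     * Inclusive segmented prefix sum. reset[i] == true starts a new segment
     * at lane i. Takes about log2(n) doubling steps instead of a serial
     * pass over the lanes.
     */
    static int[] segmentedScan(int[] values, boolean[] reset) {
        int n = values.length;
        int[] v = values.clone();
        boolean[] f = reset.clone();
        for (int d = 1; d < n; d <<= 1) {
            // Read from the old arrays, write to new ones, so that all
            // lanes are updated "simultaneously" as they would be in SIMD.
            int[] nv = v.clone();
            boolean[] nf = f.clone();
            for (int i = d; i < n; i++) {
                if (!f[i]) nv[i] = v[i] + v[i - d];   // no reset seen yet: keep adding
                nf[i] = f[i] | f[i - d];              // a reset blocks lanes to the left
            }
            v = nv;
            f = nf;
        }
        return v;
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3, 4, 5, 6, 7, 8};
        boolean[] noReset = new boolean[8];
        boolean[] reset = {false, false, false, true, false, false, true, false};
        // prints [1, 3, 6, 10, 15, 21, 28, 36]
        System.out.println(Arrays.toString(segmentedScan(values, noReset)));
        // prints [1, 3, 6, 4, 9, 15, 7, 15]
        System.out.println(Arrays.toString(segmentedScan(values, reset)));
    }
}
```

With no reset flags set, this degenerates to the plain cumulative add
from the subject line.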

-- 
Stefan Reich
BotCompany.de // Java-based operating systems

