reduction / graal / kaveri / heterogeneous queueing ?
Jules Gosnell
jules_gosnell at yahoo.com
Tue Jun 17 20:43:27 UTC 2014
Guys,
I have been playing with a system of gpu based reduction that is
intended to work as follows:
e.g.
    public void kernel(Object[] input, Object[] output, int i) {
        output[i] = foo(input[i*2], input[(i*2)+1]);
    }
foo is the reducing function and is expected to return the reduction of
two elements of the sequence being reduced.
kernel would be called with e.g.
input=Object[2n], output=Object[n], and i ranging over 0..n-1 (one call
per output element).
The idea is to run a number of reduction steps, each one taking an input
of size 2x and producing an output of size x, which is then fed back into
the same kernel as input for the following round. Odds and ends (e.g. the
last element of an odd-length input) can be picked up by the cpu and
folded in at a suitable juncture. Repeat until the output array is too
small to be worth reducing further on the gpu, then finish up the
reduction on the cpu.
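To make the rounds concrete, here is a rough cpu-only sketch of the scheme above. It is only an illustration under assumptions: the class name ReductionSketch, the reduce method, the minGpuSize cut-off and the use of BinaryOperator for foo are all made up for this example, and the inner for loop merely stands in for what would be a gpu dispatch over 0..n-1. It assumes foo is associative; leftovers are folded in so that element order is preserved.

```java
import java.util.function.BinaryOperator;

// Hypothetical sketch: none of these names come from graal/okra/clumatra.
final class ReductionSketch {

    // One reduction step for index i, same shape as the kernel above.
    static void kernel(Object[] input, Object[] output, int i, BinaryOperator<Object> foo) {
        output[i] = foo.apply(input[i * 2], input[(i * 2) + 1]);
    }

    // Reduces input with foo. Rounds run while the array has at least
    // minGpuSize elements (assumed cut-off); the remainder is finished sequentially.
    static Object reduce(Object[] input, BinaryOperator<Object> foo, int minGpuSize) {
        if (input.length == 0) {
            throw new IllegalArgumentException("cannot reduce an empty input");
        }
        Object[] current = input;
        Object carry = null;      // odds and ends collected between rounds
        boolean hasCarry = false;

        while (current.length >= minGpuSize && current.length >= 2) {
            int n = current.length / 2;
            Object[] output = new Object[n];
            // Stand-in for a gpu dispatch: each i is independent of the others.
            for (int i = 0; i < n; i++) {
                kernel(current, output, i, foo);
            }
            if (current.length % 2 == 1) {
                // The unpaired last element sits to the left of any earlier
                // leftovers, so it goes on the left of the carry.
                Object leftover = current[current.length - 1];
                carry = hasCarry ? foo.apply(leftover, carry) : leftover;
                hasCarry = true;
            }
            current = output;     // output of this round feeds the next round
        }

        // Finish on the cpu: fold what is left, then the carried odds and ends.
        Object result = current[0];
        for (int i = 1; i < current.length; i++) {
            result = foo.apply(result, current[i]);
        }
        return hasCarry ? foo.apply(result, carry) : result;
    }
}
```

For example, ReductionSketch.reduce(new Object[]{"a","b","c","d","e","f","g"}, (x, y) -> (String) x + (String) y, 2) goes 7 -> 3 -> 1 elements, carrying "g" and then "ef" along the way. Each round still needs the previous one to complete before it starts, which is where the sequencing question below comes in.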
Questions:
- does this sound sensible ?
- I've read about Kaveri h/w supporting heterogeneous (including
gpu->gpu) queueing. Is this available, or are there plans to surface it,
in clumatra/graal/okra ? I need this to sequence the steps of my
reduction efficiently.
- anything else that anyone feels is relevant :-)
looking forward to hearing from you,
Jules