Bulk operations, decomposition functions

Aleksey Shipilev aleksey.shipilev at oracle.com
Tue Jul 24 15:33:23 PDT 2012


(sorry for resending twice, but I suspect the first message got filtered
out because of the attached GPG signature)

Hi,

I wanted to check if someone has looked into the current decomposition
functions (from now on, "DF" for brevity) for bulk operations (i.e.
ParallelIterables.calculateDepth, BaseTask, etc.), or otherwise knows
the rationale behind them.

Trying to analyze the decomposition we have now, I coded the DF in
R [1], built some graphs [2], and have some preliminary questions:

I. The DF depends on both the guess for the task size and the available
parallelism. This follows naturally from the code, and is also backed up
by chart 1. What if the guess is wrong? It seems dangerous to let the
guess drive the decomposition decisions.

II. The DF always decomposes into 2^n leaf tasks. This seems technically
plausible, but it brings rounding issues into consideration: a small
change in problem size can drastically change the subtask size, and with
it, the bulk operation performance. You can see that on chart 2: the
only z-levels are the powers of two, and the transitions between levels
are abrupt, with significant jitter around each cliff.
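To make the cliff concrete, here is a toy run of the hypothetical depth
function sketched above, across a power-of-two boundary; the constants
are illustrative:

  public class RoundingCliff {
      public static void main(String[] args) {
          int parallelism = 100;
          for (long size : new long[] {100, 127, 128, 129}) {
              int depth = calculateDepth(size, parallelism);
              long leaves = 1L << depth;
              System.out.printf("size=%3d -> %3d leaves, avg %.2f elements/leaf%n",
                      size, leaves, (double) size / leaves);
          }
          // size=100 ->  64 leaves, avg 1.56 elements/leaf
          // size=127 ->  64 leaves, avg 1.98 elements/leaf
          // size=128 -> 128 leaves, avg 1.00 elements/leaf  <-- the cliff
          // size=129 -> 128 leaves, avg 1.01 elements/leaf
      }

      static int calculateDepth(long sizeGuess, int parallelism) {
          long targetLeaves = Math.min(sizeGuess, 4L * parallelism);
          int depth = 0;
          while ((2L << depth) <= targetLeaves) depth++;
          return depth;
      }
  }

One extra element at the boundary halves the average subtask size.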

III. The DF seems to favor very fast functions operating on the data. It
naturally prefers larger subtasks, boosting the performance of short
operations, but limiting the scalability of heavy ones. For example, the
current decomposition generates only 64 subtasks for a 100-element
external task and 100 workers in the FJP. This is partially demonstrated
by chart 3 at low-end external task sizes: the average number of tasks
per worker falls below one there.
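The 64 in that example is just power-of-two rounding at work: 2^6 = 64
is the largest power of two not exceeding 100. Assuming the round-down
behavior sketched above, that is:

  // largest power of two <= 100, i.e. the leaf count for a 100-element
  // task when the 2^n leaf count is rounded down
  int leaves = Integer.highestOneBit(100); // 64

With 100 workers, those 64 subtasks cannot even keep every worker busy
once.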

It is also interesting to see that the "load factor" for the FJP is
probably too low even in non-pathological cases: there are only 1-2
tasks per worker, which can inhibit load balancing for heterogeneous
tasks, although that still needs to be proven empirically.

-----

I would like to dive more deeply into this. I think the problem cannot
be solved without live feedback from both the FJP and the collection
being processed, which should guide the decomposition decisions.
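For illustration, one shape such feedback could take is the standard
ForkJoinTask.getSurplusQueuedTaskCount() heuristic: each task decides
whether to split based on live queue pressure rather than a precomputed
depth. This is only a sketch; the summing payload, the threshold, and
the class name are my illustrative assumptions:

  import java.util.ArrayDeque;
  import java.util.Deque;
  import java.util.concurrent.ForkJoinPool;
  import java.util.concurrent.RecursiveTask;

  class AdaptiveSum extends RecursiveTask<Long> {
      static final int THRESHOLD = 3; // illustrative surplus bound
      final long[] a; final int lo, hi;

      AdaptiveSum(long[] a, int lo, int hi) {
          this.a = a; this.lo = lo; this.hi = hi;
      }

      @Override protected Long compute() {
          int l = lo, h = hi;
          Deque<AdaptiveSum> forked = new ArrayDeque<>();
          // live feedback: keep splitting only while few of our queued
          // tasks remain unstolen, i.e. while workers are still hungry
          while (h - l > 1 && getSurplusQueuedTaskCount() <= THRESHOLD) {
              int mid = (l + h) >>> 1;
              AdaptiveSum right = new AdaptiveSum(a, mid, h);
              right.fork();
              forked.push(right);
              h = mid; // continue with the left half
          }
          long sum = 0;
          for (int i = l; i < h; i++) sum += a[i];
          // join forked halves in LIFO order and combine
          while (!forked.isEmpty()) sum += forked.pop().join();
          return sum;
      }

      public static void main(String[] args) {
          long[] data = new long[1_000_000];
          for (int i = 0; i < data.length; i++) data[i] = i;
          long total = new ForkJoinPool().invoke(
                  new AdaptiveSum(data, 0, data.length));
          System.out.println(total); // 499999500000
      }
  }

This sidesteps both the size guess and the 2^n quantization, at the
cost of a per-split probe of the pool state.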

Any thoughts, comments, suggestions, etc. are appreciated.

-Aleksey.

[1] http://shipilev.net/pub/jdk/lambda/decomp-size-1/plot.r
[2] http://shipilev.net/pub/jdk/lambda/decomp-size-1/plots.pdf

