Bulk operations, decomposition functions

Doug Lea dl at cs.oswego.edu
Wed Jul 25 04:20:48 PDT 2012


On 07/25/12 06:30, Aleksey Shipilev wrote:
> Aha. The big question I've been asking myself over and over these days
> if we should disregard N whatsoever, and instead rely on P only to
> decompose. I.e. have the number of tasks to be P*L, where L is load
> factor, having L=8..16. Naively, that gives us the depth of
> ceil(log2(P*L)).

I would be more in favor of doing this if I believed that
the value of any load-factor-reporting method available
was even remotely accurate! Even the "accurate" ones
tend not to reflect loads with a fine enough granularity
to help with suddenly generating a possibly large number
of tasks. It would be great if this could be improved,
but I gather that on Windows, especially, it's unlikely.
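For concreteness, Aleksey's P*L sizing above works out as in the sketch below. The load factor constant here (L = 8) is just an illustrative pick from the 8..16 range he mentions, not a proposed API:

```java
// Sketch of the P*L task-count heuristic: generate P*L leaf tasks
// for P worker threads, giving a split-tree depth of ceil(log2(P*L)).
public class TaskSizing {
    static final int LOAD_FACTOR = 8; // L, illustrative pick from 8..16

    // Number of leaf tasks to aim for with p worker threads.
    static int targetTasks(int p) {
        return p * LOAD_FACTOR;
    }

    // Depth of a binary split tree yielding at least targetTasks(p)
    // leaves: ceil(log2(p * L)), computed without floating point.
    static int splitDepth(int p) {
        return 32 - Integer.numberOfLeadingZeros(targetTasks(p) - 1);
    }

    public static void main(String[] args) {
        int p = Runtime.getRuntime().availableProcessors();
        System.out.println("tasks=" + targetTasks(p)
                + " depth=" + splitDepth(p));
    }
}
```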

>
> Yes, I was vaguely referring to these while saying this task can be
> solved by using the feedback from FJP. My several-months-old experiments
> [1] show these heuristics work as fast as the good static guess for
> threshold when FJP is saturated, and work reasonably well when FJP is
> under edge effects like going from idle to full-scale parallelism. No
> surprises there.
>
> Keeping in mind that most bulk operations would probably be ad-hoc, that
> would probably mean FJP is idle right before the operation starts, which
> makes the SQTC/QTC heuristics unreliable to act as the primary drivers
> for the decomposition.

Yes. This is a variation of the above issue --
the times you most need accurate information are the times
you are least likely to get it. Which leaves these methods
(possibly enhanced with system information?) as best suited for
dynamic fine-adjustment. The point at which you consult
the fine-tuners is itself intrinsically heuristic, though.

On the other hand, for some data structures -- e.g., those
for which you don't even have an O(1) size() method
but do have a fast way of splitting -- relying entirely on the
dynamic schemes looks like the best move.
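A minimal sketch of such a dynamic scheme, in the spirit of the SQTC heuristic mentioned above: split purely on the worker's surplus queued task count, needing no size() at all. The threshold of 3 is an illustrative value, and summing a range stands in for whatever the bulk operation does:

```java
import java.util.concurrent.RecursiveTask;

// Illustrative sketch only: decides whether to split from
// ForkJoinTask.getSurplusQueuedTaskCount(), i.e. from runtime
// feedback, rather than from any precomputed size-based threshold.
class DynamicSum extends RecursiveTask<Long> {
    final long[] a;
    final int lo, hi;

    DynamicSum(long[] a, int lo, int hi) {
        this.a = a; this.lo = lo; this.hi = hi;
    }

    protected Long compute() {
        // Fork while this worker has few queued tasks available for
        // other workers to steal; 3 is an illustrative threshold.
        if (hi - lo > 1 && getSurplusQueuedTaskCount() < 3) {
            int mid = (lo + hi) >>> 1;
            DynamicSum right = new DynamicSum(a, mid, hi);
            right.fork();
            long leftSum = new DynamicSum(a, lo, mid).compute();
            return leftSum + right.join();
        }
        long s = 0;
        for (int i = lo; i < hi; i++) s += a[i];
        return s;
    }
}
```

Invoked as, e.g., `new ForkJoinPool().invoke(new DynamicSum(data, 0, data.length))`; the split tree then adapts to how loaded the pool happens to be.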

...

One overall moral is that there is no perfect one-size-fits-all
way of decomposing data structures. I'm imagining that there
will be support for some common cases (like one for Collections that
can supply spliterators, one for RandomAccess Lists, and a "dumb"
one that just peels off tasks one-by-one from an intrinsically
linear iterator), but many classes will want to supply custom
versions.
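The "dumb" variant can be sketched as follows -- an illustration of the idea, not proposed library code (the names here are made up): walk the intrinsically linear iterator sequentially, peeling off one task per element, and join them all at the end.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.function.Consumer;

// Sketch of the "dumb" decomposition: no splitting at all; the
// submitting thread hands each element to the pool as its own task.
class PeelingForEach {
    static <T> void forEach(ForkJoinPool pool, Iterable<T> src,
                            Consumer<T> action) {
        List<ForkJoinTask<?>> tasks = new ArrayList<>();
        for (Iterator<T> it = src.iterator(); it.hasNext(); ) {
            T e = it.next();                                 // peel one element...
            tasks.add(pool.submit(() -> action.accept(e)));  // ...one task each
        }
        for (ForkJoinTask<?> t : tasks)                      // wait for all
            t.join();
    }
}
```

Per-element task overhead makes this a last resort, which is exactly why it is the fallback rather than the common case.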

-Doug




More information about the lambda-dev mailing list