Bulk operations: utilization trouble

Wed Aug 1 09:02:04 PDT 2012

Hi,

I'm following up on the experiment I did before, looking for peculiar
behaviors. Here is one of them. I have a really heavy operation done on
a rather small collection. In my previous notation, that means:
  N = 80 (collection size)
  P = 80 (#CPUs on target machine, FJP parallelism)
  C = 1  (only one external client)
  Q = 10^9 (operation cost; that is a lot, close to 6 sec per op)
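
For readers who want to poke at a similar shape of workload, here is a
minimal sketch of the setup above. It is not the actual benchmark: the
class name BulkUtilSketch, the heavyOp() loop, and the halving split
strategy are all assumptions made for illustration, and it uses a plain
RecursiveTask rather than the lambda bulk operations. Q is scaled down
in main() so the run finishes quickly.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class BulkUtilSketch {

    // Hypothetical compute-bound operation: q iterations of cheap
    // integer arithmetic, no shared state, no allocation.
    static long heavyOp(long seed, long q) {
        long x = seed;
        for (long i = 0; i < q; i++) {
            x = x * 6364136223846793005L + 1442695040888963407L;
        }
        return x;
    }

    // Splits [lo, hi) in halves down to single elements; forks the left
    // half, computes the right half in place, then joins the left half.
    static final class SumTask extends RecursiveTask<Long> {
        final int lo, hi;
        final long q;

        SumTask(int lo, int hi, long q) {
            this.lo = lo;
            this.hi = hi;
            this.q = q;
        }

        @Override
        protected Long compute() {
            if (hi - lo <= 1) {
                return lo < hi ? heavyOp(lo, q) : 0L;
            }
            int mid = (lo + hi) >>> 1;
            SumTask left = new SumTask(lo, mid, q);
            left.fork();
            long right = new SumTask(mid, hi, q).compute();
            return right + left.join();
        }
    }

    // Runs one bulk operation over n elements on a fresh pool with
    // parallelism p, mirroring the "fresh FJP" condition.
    static long run(int n, int p, long q) {
        ForkJoinPool pool = new ForkJoinPool(p);
        try {
            return pool.invoke(new SumTask(0, n, q));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int n = 80;            // N: collection size
        int p = 80;            // P: FJP parallelism
        long q = 1_000_000L;   // Q: scaled down from 10^9 for a quick run
        long start = System.nanoTime();
        long result = run(n, p, q);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("result=" + result + " elapsedMs=" + elapsedMs);
    }
}
```

With N equal to P, each leaf is a single element, so any worker that
starts late or blocks in join() directly shows up as lost utilization.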

When I run this test continuously with a fresh FJP, the CPU utilization
never grows beyond 50%, even though my tasks are completely
compute-bound and non-contended. The next thing to do in this case is to
trace the execution with fjp-trace [1]. The result is here [2]. The
trace render there [3] is consistent with 1/2 utilization: the red bars
are threads stuck waiting on join, not consuming cycles.

I am open to ideas on whether this makes sense, or what could be going
wrong. My first idea is that my tasks end up consuming rather different
amounts of time, which is partly backed up by the subtask execution time
chart [4]. It is also weird that some workers wake up late, and thus
stall the execution.

-Aleksey.

[1] https://github.com/shipilev/fjp-trace
[2] http://shipilev.net/pub/jdk/lambda/bulk-util-trouble-1/
[3] http://shipilev.net/pub/jdk/lambda/bulk-util-trouble-1/trace.png
[4] http://shipilev.net/pub/jdk/lambda/bulk-util-trouble-1/exectime-exclusive.png