C/P/N/Q par vs. seq break-even analysis with 10ms think time

Wed Oct 17 07:34:38 PDT 2012

On 10/16/12 11:27, Aleksey Shipilev wrote:

>   - threads are waking up rather slow (on this timescale), full-blown
> parallelism lasts for roughly 50us.
>
> So, here's what we got on the table. If I understand this data
> correctly, then the 500us execution divides as:
>     ~70us: handoff to FJP
>    ~200us: FJP rampup

It is hard to distinguish these two based on your current data
because handoff usually entails some signalling.
There are two aspects of ramp up -- the cost to signal
other threads, and the lost opportunity time between when
they are signalled and when they actually do something. One
way to better tease apart the costs is to compare these
against simple tasks like Fibonacci.
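
As a rough sketch of the kind of baseline meant here (the class
name, n = 20, the 10 rounds and the timing helper are all
illustrative assumptions, not taken from the benchmark under
discussion), a Fibonacci task does almost no work per node, so
the time of each invoke is mostly handoff, signalling and
wakeup cost:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class FibBaseline {
    // Classic fork/join Fibonacci: nearly zero work per task, so elapsed
    // time is dominated by pool overhead rather than by computation.
    static final class Fib extends RecursiveTask<Long> {
        final int n;
        Fib(int n) { this.n = n; }

        @Override
        protected Long compute() {
            if (n <= 1) return (long) n;
            Fib left = new Fib(n - 1);
            left.fork();                            // make one half stealable
            long right = new Fib(n - 2).compute();  // run the other half in place
            return right + left.join();
        }
    }

    // Times one external submission, from handoff to result.
    static long timeInvokeNanos(ForkJoinPool pool, int n) {
        long t0 = System.nanoTime();
        pool.invoke(new Fib(n));
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) throws InterruptedException {
        ForkJoinPool pool = new ForkJoinPool();
        int n = 20;                                 // assumed size, tune as needed
        for (int i = 0; i < 10; i++) {
            Thread.sleep(10);                       // think time, so workers go idle
            long ns = timeInvokeNanos(pool, n);
            System.out.printf("fib(%d) via invoke: %d us%n", n, ns / 1000);
        }
        pool.shutdown();
    }
}
```

Comparing these numbers with and without the sleep should give
some idea of how much of the total is wakeup cost as opposed to
signalling cost, though a proper harness would be needed for
anything conclusive.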

>     ~50us: FJP steady (even though lots of balancing)
>    ~100us: result handoff

100us seems much, much too high for result handoff. It
probably averages in some GC time? (GC is prone to happen
at this point.)
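
One possible way to check this, sketched under the assumption
that the standard GarbageCollectorMXBean counters are good
enough (the class and method names below are hypothetical):
sample the cumulative GC count around each measured run and set
aside the runs in which it changed. Reported collection time is
in milliseconds, so at this timescale only the count is really
useful.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcCheck {
    // Sum of collection counts over all collectors; a collector reports -1
    // when the count is undefined, which is skipped here.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount();
            if (c > 0) total += c;
        }
        return total;
    }

    // Runs the action and reports whether any collection was counted
    // between the start and the end of the run.
    static boolean gcHappenedDuring(Runnable action) {
        long before = totalGcCount();
        action.run();
        return totalGcCount() != before;
    }

    public static void main(String[] args) {
        boolean tainted = gcHappenedDuring(() -> {
            // placeholder for one measured operation
            long[] scratch = new long[1 << 16];
            scratch[0] = 1;
        });
        System.out.println("GC during sample: " + tainted);
    }
}
```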

> Another thing is the interface between submitter and the FJP. I vaguely
> recall the infrastructure for allowing submitters to run the tasks
> themselves in place, but how much effort that would take to get to at
> least experimental readiness?

It's there already (FJP.pollSubmission). But you probably have
in mind help-outs of subtasks rather than full submissions?
In which case the easiest way to check out the benefit is
to make sure the callers are themselves FJWorkerThreads,
which in turn means using ONLY FJWTs, not regular threads,
in the entire program. Then you can use fork+join instead of
FJP.invoke everywhere.
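
A minimal sketch of that arrangement, with hypothetical names
(WorkerCallers, SumTask, Driver) and an arbitrary threshold of
1024: the caller logic is itself written as a task, so after a
single entry into the pool it runs on a ForkJoinWorkerThread
and can use fork+join, helping with subtasks while it waits.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinWorkerThread;
import java.util.concurrent.RecursiveTask;

public class WorkerCallers {
    // Ordinary divide-and-conquer sum over a[lo, hi).
    static final class SumTask extends RecursiveTask<Long> {
        final long[] a; final int lo, hi;
        SumTask(long[] a, int lo, int hi) { this.a = a; this.lo = lo; this.hi = hi; }

        @Override
        protected Long compute() {
            if (hi - lo <= 1024) {                  // assumed sequential threshold
                long s = 0;
                for (int i = lo; i < hi; i++) s += a[i];
                return s;
            }
            int mid = (lo + hi) >>> 1;
            SumTask left = new SumTask(a, lo, mid);
            left.fork();
            long right = new SumTask(a, mid, hi).compute();
            return right + left.join();
        }
    }

    // The "caller", expressed as a task so that it runs on a worker thread.
    static final class Driver extends RecursiveTask<Long> {
        final long[] data; final int rounds;
        Driver(long[] data, int rounds) { this.data = data; this.rounds = rounds; }

        @Override
        protected Long compute() {
            if (!(Thread.currentThread() instanceof ForkJoinWorkerThread))
                throw new IllegalStateException("expected to run on a FJ worker");
            long total = 0;
            for (int i = 0; i < rounds; i++) {
                SumTask t = new SumTask(data, 0, data.length);
                t.fork();                           // instead of pool.invoke(t)
                total += t.join();                  // caller helps while joining
            }
            return total;
        }
    }

    public static void main(String[] args) {
        long[] data = new long[1 << 20];
        java.util.Arrays.fill(data, 1L);
        ForkJoinPool pool = new ForkJoinPool();
        // The only external submission: everything after this is worker-to-worker.
        System.out.println(pool.invoke(new Driver(data, 100)));
        pool.shutdown();
    }
}
```

Timing the loop inside Driver against the same loop calling
pool.invoke from a regular thread should show roughly what the
external handoff costs, assuming the rest of the setup is held
constant.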

-Doug