London Lambdas Hackday: Performance and Parallelism

Tue Jul 3 04:43:59 PDT 2012

During the hackday some people wrote a functional merge sort and then
tried to parallelise it.  My understanding is that the main motivation
behind the parallel collection operations is performance.  They hadn't
realised, until they tried it, that the distributed build didn't
support the parallelism features mentioned in the overview doc, but I
still think there are interesting points to be learned from their
methodology.

1. Optimisation before Analysis

They didn't profile; they guessed what the bottleneck in their program
was.  This will definitely happen in real-life development scenarios,
and I don't think many people even have sensible rules of thumb.  With
hindsight I should have surveyed the room to find out what percentage
of developers knew what Amdahl's Law was, but never mind.  The key
point is that when you make parallelism easy to obtain, people will
attempt it without considering the consequences of their actions.
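
For anyone who hasn't met Amdahl's Law: if a fraction p of the runtime
can be parallelised across n workers, the best possible speedup is
1 / ((1 - p) + p / n).  A minimal sketch of that arithmetic follows.
The class name and the 50% figure are made up for illustration; they
are not measurements from the hackday.

```java
public final class Amdahl {
    // Upper bound on speedup when `parallelFraction` of the runtime
    // can be spread across `workers` and the rest stays sequential.
    static double speedup(double parallelFraction, int workers) {
        return 1.0 / ((1.0 - parallelFraction) + parallelFraction / workers);
    }

    public static void main(String[] args) {
        // Hypothetical program in which only half the time is parallelisable.
        double p = 0.50;
        for (int workers : new int[] {2, 4, 8, 64}) {
            System.out.printf("p=%.2f workers=%d speedup<=%.2f%n",
                    p, workers, speedup(p, workers));
        }
    }
}
```

With p = 0.5 the bound never reaches 2x however many cores you add,
which is why guessing at the bottleneck is so costly.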

2. What is the decomposition model of parallel()?

This isn't obvious from the documentation.  Is there a cost model for
splitting things up?  Can people easily figure out a computational
cost budget between different components and work out at what point
there's a benefit to parallelism?
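
To make the cost budget question concrete, here is a rough sketch of a
merge sort written directly against fork/join, where the split point
is an explicit tunable.  This is not how parallel() decomposes work (I
don't know that), and it is an in-place array sort rather than the
functional one written at the hackday.  The class name and the
SEQUENTIAL_CUTOFF value are hypothetical; the right cutoff would have
to come from measurement.  It assumes ForkJoinPool.commonPool() is
available.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public final class CutoffMergeSort {
    // Hypothetical threshold: ranges at or below this size are sorted
    // sequentially because forking is assumed to cost more than it saves.
    static final int SEQUENTIAL_CUTOFF = 8_192;

    static final class SortTask extends RecursiveAction {
        private final int[] data;
        private final int[] scratch;
        private final int lo;
        private final int hi; // sorts the half-open range [lo, hi)

        SortTask(int[] data, int[] scratch, int lo, int hi) {
            this.data = data;
            this.scratch = scratch;
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected void compute() {
            if (hi - lo <= SEQUENTIAL_CUTOFF) {
                Arrays.sort(data, lo, hi); // small range: no further splitting
                return;
            }
            int mid = (lo + hi) >>> 1;
            // Sort both halves as separate tasks, then merge them here.
            invokeAll(new SortTask(data, scratch, lo, mid),
                      new SortTask(data, scratch, mid, hi));
            merge(mid);
        }

        private void merge(int mid) {
            // Copy this task's range to scratch, then merge back into data.
            System.arraycopy(data, lo, scratch, lo, hi - lo);
            int i = lo, j = mid, k = lo;
            while (i < mid && j < hi) {
                data[k++] = scratch[i] <= scratch[j] ? scratch[i++] : scratch[j++];
            }
            while (i < mid) data[k++] = scratch[i++];
            while (j < hi) data[k++] = scratch[j++];
        }
    }

    static void sort(int[] data) {
        ForkJoinPool.commonPool()
                .invoke(new SortTask(data, new int[data.length], 0, data.length));
    }
}
```

The point is that the cutoff is visible and can be varied while
timing; with parallel() the equivalent knob, if there is one, isn't
apparent from the documentation.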

3. Readability of Profiling

If you profile for execution time then you may hit a scenario where a
lot of your CPU time is being eaten by a lambda.  In the build 45
implementation these were being compiled as anonymous inner classes,
which won't be the final strategy but it meant I couldn't really
experiment with profiling.  What will a lambda method look like when
displayed in a profiler?  Have you figured out a method naming
convention?
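
One cheap way to see what name a given build assigns is to capture the
stack frame from inside a lambda body and print it.  A sketch, assuming
a build where lambda syntax and java.util.function.Supplier are
available; the class name is made up, and whatever it prints is
implementation-specific, so I haven't included expected output.

```java
import java.util.function.Supplier;

public final class LambdaFrames {
    // Returns the stack frame for the code that runs the lambda body.
    static StackTraceElement lambdaFrame() {
        // Element 0 of the trace is the frame where the Throwable was created,
        // which is inside the lambda body.
        Supplier<StackTraceElement> probe = () -> new Throwable().getStackTrace()[0];
        return probe.get();
    }

    public static void main(String[] args) {
        StackTraceElement frame = lambdaFrame();
        // This is roughly the name a stack-sampling profiler would report.
        System.out.println(frame.getClassName() + "." + frame.getMethodName());
    }
}
```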

I guess the point of all of the above is that while making it easy to
write data parallel algorithms is a nice feature to have, performance
analysis frequently dominates the time taken to find an effective
solution to performance problems.  At the hackday, the first thing
that people who were parallelising did was to guess rather than
measure, which really validated my worst fears on the situation.
Performance analysis is not something I've seen discussed on the
observers' mailing list, but it seems an issue that should be
discussed.