Performance implications of the lambda draft specification

Osvaldo Doederlein opinali at gmail.com
Fri Feb 19 10:56:25 PST 2010


Just to add my 2c... a good rule of thumb, IMHO, is to plan to _enable_
optimizations (i.e., after higher-priority design aspects are satisfied,
pick the remaining choices so as to make optimizations possible/easier), but
at the same time NOT to plan to _depend_ on any new optimization, unless
it's something we can really, really rely on.

Brian Goetz once [in]famously said "Go ahead, make a mess", wrt Java
memory management. I think the intention was good: preventing non-expert
developers from trying to out-smart modern garbage collectors with
garage-quality object pools and such. But taken literally, the advice to
make a mess, with blind faith in advanced compiler/runtime tech to undo
that mess, is a disaster.

We are always optimistic about cool new optimizations. But let's do a quick
reality check on scalar replacement / stack allocation:

- Has taken ages to be implemented in HotSpot. IIRC this was planned for 5.0
and maybe even earlier, and was always pushed to the next release. That
means this stuff is hard, and even though it's confirmed to ship with
JDK7-FCS, I won't trust it completely (it may ship with "Early Access"
quality, it may ship with some limitations, etc., so maybe we'll have to
wait another year of maintenance updates before we can really rely on it).
- Will ship only for HotSpot Server. There is no estimate of how long we
must wait to get this in HotSpot Client. I'd bet on the JDK8-FCS timeframe -
and that's because I'm an optimist.
- Will not be available on sub-desktop runtimes (JavaME, or whatever ME
evolves into) for an even longer time.
- Wherever and whenever it is available, this optimization may carry some
extra tradeoffs, at the very least extra JIT overhead.
- This is a particularly "fragile" optimization, one that may be lost after
trivial, hard-to-spot code changes. For example, m1() allocates an object
that only "escapes" as a parameter in a call to m2(), but we still get
scalar replacement because m2() is inlined; however, after some trivial
maintenance m2() becomes slightly bigger than the inlining budget at m1()'s
call site, so we lose the inlining and with it the scalar replacement, and
maybe get an order-of-magnitude slowdown. (Most specific optimizations, and
especially optimization chains, exhibit this kind of fragility --
optimization is brittle by nature -- but the problem is more severe if we
have entire language features that take such an optimization as a
precondition for not sucking perf-wise.)
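To make the fragility scenario above concrete, here is a minimal sketch
(class, method, and field names are hypothetical, invented for
illustration): m1() allocates a Point that escapes only into m2(). As long
as the JIT inlines m2() into m1(), escape analysis can prove the Point
never escapes and scalar-replace the allocation into plain int arithmetic;
if maintenance ever grows m2() past the inlining budget, the very same
source pattern allocates a real heap object on every call.

```java
// Hypothetical example of an allocation whose elimination depends
// entirely on inlining (names invented for illustration).
public class EscapeFragility {

    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // m2() only reads its argument and stores it nowhere, so the Point
    // does not escape -- but the JIT can only see that if m2() is
    // inlined into its caller. If later maintenance makes m2() exceed
    // the inlining budget, the allocation in m1() becomes real.
    static int m2(Point p) {
        return p.x + p.y;
    }

    // With inlining + scalar replacement, this whole method can compile
    // down to "return a + b" with no allocation at all.
    static int m1(int a, int b) {
        Point p = new Point(a, b);
        return m2(p);
    }

    public static void main(String[] args) {
        // A hot loop like this is fast when the Point is scalar-replaced,
        // and an allocation-per-iteration workload when it is not --
        // same source code, very different performance profile.
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += m1(i, i + 1);
        }
        System.out.println(sum);
    }
}
```

The point of the sketch is that nothing in the source marks the
dependency: whether the allocation disappears is decided by JIT
heuristics (inlining budgets, profile data) that trivial edits can tip.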

That said, I am enthusiastic about the upcoming EA-based scalar
replacement capability of JDK7, and I've been testing it for a while. I'm
just much less enthusiastic about designing a very important language
feature that would depend on that optimization.

A+
Osvaldo

2010/2/19 Neal Gafter <neal at gafter.com>

> On Fri, Feb 19, 2010 at 8:26 AM, Marcelo Fukushima <takeshi10 at gmail.com>
> wrote:
> > are these concerns problematic in practice, though? i was under the
> > impression that almost all the use cases for converting a function to a
> SAM
> > would be optimized by hotspot anyway (stack allocation and method
> inlining),
> > no?
> > on the other hand, conversion at compile time seems feasible in the
> current
> > (draft) spec, right?
>
> These concerns are real.  Optimizing two levels of nested object
> creation away is less likely than optimizing one level of object
> creation away.  Conversion at compile-time can only be done in very
> restricted circumstances (for example, debugging disabled, no
> reference to "this", etc) and would be the only nontrivial
> optimization in javac.  I think we're much better off not starting
> with a factor of two handicap.
>
> Cheers,
> Neal
>
>