That was the year that was.

Mon Jan 5 15:50:00 UTC 2015

Since it's now the new year I thought it was a good opportunity to look back on progress we've made in Magik on Java over the course of the last twelve months.

In my JVMLS talk I mentioned LF memory usage and startup time as areas of concern, as did Marcus and others. Over the last couple of months I and a couple of other team members have been given the time to seriously look at our startup time and performance and, along with the changes made in 8u40, have made substantial progress.

Startup time

Getting our system to boot on Linux, using Solaris Studio and other profiling tools, and producing piles and piles of flame graphs have all proved very useful in analysing startup time. They have shown up some areas of our own legacy infrastructure that were contributing substantially to it, but reducing the total number of classes generated has also greatly reduced our startup time.

Due to the nature of the language we do need to evaluate as we compile, so we have introduced a two-stage compilation process: we compile and evaluate files in small chunks but do not write out those class files, instead generating one large class file representing the whole source file at the end. On typical application code this has reduced the class count by 75% and substantially reduced the class loading time (also greatly reducing the time spent resolving method handle constants - partly why I haven't had version 2 of that patch higher on my priority queue - sorry John). linkCallSite and friends (especially setTarget) still show up significantly on flame graphs (almost 17% of samples). The time to create a mutable callsite appears to be almost completely dominated by the MethodHandleNatives.setCallSiteTargetNormal call (commonly made in the constructor of the callsite itself).

Some quick and dirty instrumentation shows that we create about 50% more constant call sites for symbols than we do mutable call sites for method calls, but the constant sites show up in about 1/60th of the traces compared to the mutable sites.

Another 12% of startup is taken up with resetting callsite targets after the fallback has been invoked.
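For readers less familiar with the pattern, here is a minimal sketch of the kind of site being discussed: a MutableCallSite subclass that calls setTarget in its own constructor to install a fallback bound to itself, and calls setTarget again once the fallback has run. This is not Magik's actual code. The class name SelfLinkingSite, the fallback method, and the use of Object.toString as a stand-in for a resolved method are all assumptions made for illustration, and a real site would normally install a guarded target rather than an unconditional one.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Hypothetical illustration only; all names are invented for this sketch.
class SelfLinkingSite extends MutableCallSite {
    private static final MethodHandle FALLBACK;   // (SelfLinkingSite, Object)Object
    private static final MethodHandle TO_STRING;  // (Object)String, stand-in for a resolved method

    static {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            FALLBACK = lookup.findVirtual(SelfLinkingSite.class, "fallback",
                    MethodType.methodType(Object.class, Object.class));
            TO_STRING = lookup.findVirtual(Object.class, "toString",
                    MethodType.methodType(String.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    int fallbackCalls = 0;

    SelfLinkingSite(MethodType type) {
        super(type);
        // First setTarget: the fallback has to be bound to this site, so it
        // can only be installed after the super constructor has run.
        setTarget(FALLBACK.bindTo(this).asType(type));
    }

    // Runs on the first invocation, relinks the site, then completes the call.
    Object fallback(Object receiver) throws Throwable {
        fallbackCalls++;
        // Second setTarget: reset the target after the fallback is invoked.
        setTarget(TO_STRING.asType(type()));
        return TO_STRING.invoke(receiver);
    }

    // Invokes the site three times and reports how often the fallback ran.
    static int demo() {
        SelfLinkingSite site =
                new SelfLinkingSite(MethodType.methodType(Object.class, Object.class));
        MethodHandle invoker = site.dynamicInvoker();
        try {
            for (int i = 0; i < 3; i++) {
                Object result = invoker.invoke((Object) ("value" + i));
            }
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
        return site.fallbackCalls;
    }
}
```

In this sketch every site pays for two setTarget calls even if it is only ever invoked once: one in the constructor and one in the fallback.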

I’m not sure how much more time we’ll get to work on this area, or whether startup time (or at least this portion of it) will be regarded as ‘good enough’, but there seem to be a couple of avenues we could explore to improve things:

  1.  We could look at refactoring our code so that setTarget does not need to be used when initialising our mutable call sites. Since most sites need a fallback method bound to themselves in some way this would require refactoring our code to create objects that hold a MutableCallSite, rather than subclass MutableCallSite. This might help to further our plans for decomposing call sites into their functional parts, but is something I’m not going to explore without doing some thorough benchmarking first.
  2.  It’s also worth digging into when it is worth resetting a callsite’s target. Mutable sites hit during bootstrap frequently only get used once, or at most a small number of times, so we might do better gathering some type information and only setting the target when it seems worth the cost.
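Both ideas can be sketched together, under stated assumptions. In the sketch below a hypothetical holder class, LazySite, owns a MutableCallSite instead of subclassing it, so the fallback can be bound to the holder and passed straight to the MutableCallSite(MethodHandle) constructor with no setTarget call at initialisation. The fallback then only relinks the site once it has been hit an assumed threshold number of times. The threshold value, the names, and the use of Object.toString as the resolved method are illustrative only, and the counter is not thread-safe. Whether either change actually reduces startup time would need benchmarking.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Hypothetical illustration only; all names and the threshold are invented.
final class LazySite {
    private static final int RELINK_THRESHOLD = 3; // assumed value, not measured
    private static final MethodHandle FALLBACK;    // (LazySite, Object)Object
    private static final MethodHandle TO_STRING;   // (Object)String, stand-in for a resolved method

    static {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            FALLBACK = lookup.findVirtual(LazySite.class, "fallback",
                    MethodType.methodType(Object.class, Object.class));
            TO_STRING = lookup.findVirtual(Object.class, "toString",
                    MethodType.methodType(String.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    final MutableCallSite site;
    int fallbackCalls = 0;

    LazySite(MethodType type) {
        // The fallback is bound to the holder, not to the call site, so the
        // initial target can be given to the constructor directly.
        this.site = new MutableCallSite(FALLBACK.bindTo(this).asType(type));
    }

    // Handles the call itself until the site looks worth relinking.
    Object fallback(Object receiver) throws Throwable {
        fallbackCalls++;
        if (fallbackCalls >= RELINK_THRESHOLD) {
            // Only now pay for setTarget; later calls bypass the fallback.
            site.setTarget(TO_STRING.asType(site.type()));
        }
        return TO_STRING.invoke(receiver);
    }

    // Invokes the site five times and reports how often the fallback ran.
    static int demo() {
        LazySite holder =
                new LazySite(MethodType.methodType(Object.class, Object.class));
        MethodHandle invoker = holder.site.dynamicInvoker();
        try {
            for (int i = 0; i < 5; i++) {
                Object result = invoker.invoke((Object) ("value" + i));
            }
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
        return holder.fallbackCalls;
    }
}
```

With the assumed threshold of 3, the first three calls go through the fallback and the remaining two go straight to the relinked target. A site that is only hit once or twice during bootstrap never calls setTarget at all, at the price of a slower path for those early calls.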

We’ve considered a couple of more radical approaches to reducing startup time, mostly around either implementing an interpreter to handle the bootstrap code (because it’s always fun to maintain an interpreter and a compiler) or some form of serialisation (tricky to get right and fit in with modularisation work) but I’m more than open to any other wacky ideas people want to throw in.

Memory

The LambdaForm changes have had an excellent effect on application memory usage. There's still plenty of room to reduce it, but it's probably more for us to optimise our core and application code rather than fundamental JVM issues now.

Anyway, happy new year to everyone on the mlvm list,

Duncan.