Truffle and mlvm

Charles Oliver Nutter headius at headius.com
Sat Aug 30 19:21:41 UTC 2014


Removing all context, so it's clear this is just my opinions and thoughts...

As most of you know, we've opened up our codebase and incorporated the
graciously-donated RubyTruffle directly into JRuby. It's available on
JRuby master and we are planning to ship Truffle support with JRuby
9000, our next major version (due out in the next couple months).

At the same time, we have been developing our own next-gen IR-based
compiler, which will run unmodified on any JVM (with or without
invokedynamic, though I still have to implement the "without" side).
Why are we doing this when Truffle shows such promise?

I'll try to enumerate the benefits and problems of Truffle here.

* Benefits of using Truffle

1. Simpler implementation.

From day 1, the most obvious benefit of Truffle is that you just have
to write an AST interpreter. Anyone who has implemented a programming
language can do this easily. This specific benefit doesn't help us
implement JRuby, since we already have an AST interpreter, but it did
make Chris Seaton's job easier building RubyTruffle initially. This
also means a Truffle-based language is more approachable than one with
a complicated compiler pipeline of its own.

2. Better communication with the JIT.

Truffle, via Graal, has potential to pass much more information on to
the JIT. Things like type shape, escaped references, frame access,
type specialization, and so on can be communicated directly, rather
than hoping and praying they'll be inferred by the shape of bytecodes.
This is probably the largest benefit; much of my time optimizing JRuby
has been spent trying to "trick" C2 into doing the right thing, since
I don't have a direct way to communicate intent.

The peak performance numbers for Truffle-based languages have been
extremely impressive. If it's possible to get those numbers reasonably
quickly and with predictable steady-state behavior in large,
heterogeneous codebases, this is definitely the quickest path (on any
runtime) to a high-performance language implementation.

3. OSS and pure Java

Truffle and Graal are just OpenJDK projects under OpenJDK licenses,
and anyone can build, hack, or distribute them. In addition, both
Truffle and Graal are 100% Java, so for the first time a plain old
Java developer can see (and manipulate) exactly how the JIT works
without getting lost in a sea of plus plus.

* Problems with Truffle

I want to emphasize that regardless of its warts, we love Truffle and
Graal and we see great potential here. But we need a dose of reality
once in a while, too.

1. AST is not enough.

In order to make that AST fly, you can't just implement a dumb generic
interpreter. You need to know about (and generously annotate your AST
for) many advanced compiler optimization techniques:

A. Type specialization plus guarded fallbacks: Truffle will NOT
specialize your code for you. You must provide every specialized path
in your AST nodes as well as annotating "slow path", "transfer to
interpreter", etc.

B. Frame access and reification: In order to have cross-call access to
frames or to squash frames created for multiple inlined calls, you
must use Truffle's representation of a frame. This means loads/stores
within your AST must be done against a Truffle object, not against an
arbitrary object of your own creation.

C. Method invocation and inlining: Up until fairly recently, if you
wanted to inline methods you had to essentially build your own call
site logic, profiling, and deopt paths within your Truffle AST. When I did
a little hacking on RubyTruffle around OSS time (December/January) it
did *no* inlining of Ruby-to-Ruby calls. I hacked in inlining using
existing classes and managed to get it to work, but I was doing all
the plumbing myself. I know this has improved in the Truffle codebase
since then, but I have my concerns about production readiness when the
inlining call site parts of Truffle were just recently added and are
still in flux.
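
To make item A above a bit more concrete, here is a minimal sketch of a
node that carries its own specialized path plus a guarded fallback. It
is plain Java and deliberately does not use the Truffle API: SketchExpr,
SketchLocal, SketchAdd, and the Object[] "frame" are hypothetical names
invented for illustration, and a real Truffle node would look different.

```java
/** Hypothetical expression node; NOT the Truffle API. Object[] stands in for a frame. */
abstract class SketchExpr {
    abstract Object execute(Object[] frame);
}

/** Reads a local variable from a fixed frame slot. */
final class SketchLocal extends SketchExpr {
    private final int slot;
    SketchLocal(int slot) { this.slot = slot; }
    @Override Object execute(Object[] frame) { return frame[slot]; }
}

/** Addition that picks a specialized path and falls back when its guard fails. */
final class SketchAdd extends SketchExpr {
    enum State { UNINITIALIZED, INT, GENERIC }

    private final SketchExpr left;
    private final SketchExpr right;
    State state = State.UNINITIALIZED;
    int rewrites = 0; // how many times this node changed strategy

    SketchAdd(SketchExpr left, SketchExpr right) {
        this.left = left;
        this.right = right;
    }

    @Override Object execute(Object[] frame) {
        Object l = left.execute(frame);
        Object r = right.execute(frame);
        boolean bothInts = l instanceof Integer && r instanceof Integer;
        if (state == State.UNINITIALIZED) {
            // First execution: specialize on the operand types actually seen.
            respecialize(bothInts ? State.INT : State.GENERIC);
        }
        if (state == State.INT) {
            if (bothInts) {
                // Fast path; int overflow wraps around in this sketch.
                return ((Integer) l) + ((Integer) r);
            }
            // Guard failed: give up the specialization for good.
            respecialize(State.GENERIC);
        }
        return genericAdd(l, r);
    }

    private void respecialize(State next) {
        state = next;
        rewrites++;
    }

    /** Slow path that the language implementer also has to write by hand. */
    private static Object genericAdd(Object l, Object r) {
        if (l instanceof Integer && r instanceof Integer) {
            return ((Integer) l) + ((Integer) r);
        }
        if (l instanceof Number && r instanceof Number) {
            return ((Number) l).doubleValue() + ((Number) r).doubleValue();
        }
        throw new IllegalArgumentException("unsupported operand types");
    }
}

final class SketchAddDemo {
    public static void main(String[] args) {
        SketchAdd add = new SketchAdd(new SketchLocal(0), new SketchLocal(1));
        System.out.println(add.execute(new Object[] {1, 2}) + " " + add.state);
        System.out.println(add.execute(new Object[] {1, 2.5}) + " " + add.state);
    }
}
```

The point of the sketch is only that every state, guard, and fallback
is written by the language implementer; nothing here is inferred
automatically.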

And there are plenty of other cases. Building a basic language for
Truffle is pretty easy (I did a micro-language in about two hours at
JVMLS last year), but building a high-performance language for Truffle
still takes a fair investment of effort and working knowledge of
dynamic compiler optimizations.

2. Long startup and warmup times.

As Thomas pointed out in the other thread, because Truffle and Graal
are normally run as plain Java libraries, they can actually aggravate
startup time issues. Now, not only would all of JRuby have to warm up,
but the eventual native code JIT has to warm up too. This is not
surprising, really. It is possible to mitigate this by doing some form
of AOT against Graal, but in every case I have seen, the Truffle/Graal
approach makes startup time much, much worse compared to just running
atop the JVM.

Warmup time is also worsened significantly.

The AST you create for Truffle must be heavily mutated while running
in order to produce a specialized version of that AST. This must
happen before the AST is eventually fed into Graal, which means you
have a self-modifying interpreter spinning AST objects like mad while
executing the early phases of your application. Compare to a dumb
interpreter as in JRuby's old AST, where interpreting the AST produces
no additional objects other than those necessary for execution of the
code.
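
A rough sketch of why a self-rewriting interpreter allocates more than
a dumb one: each respecialization below replaces the node in its parent
slot with a freshly allocated node. This is plain Java with made-up
names (RewriteSlot, RewriteNode, UninitNegate, IntNegate,
GenericNegate); it is not the Truffle API and measures nothing about
Truffle itself. It only counts node objects in a toy model.

```java
/** Holds the current node so that a node can replace itself. Hypothetical. */
final class RewriteSlot {
    RewriteNode node;
    RewriteSlot(RewriteNode node) { this.node = node; }
    Object execute(Object arg) { return node.execute(this, arg); }
}

/** Base class that counts every node object ever created (single-threaded demo). */
abstract class RewriteNode {
    static int allocations = 0;
    RewriteNode() { allocations++; }
    abstract Object execute(RewriteSlot self, Object arg);
}

/** Placeholder node: on first execution it installs a specialized replacement. */
final class UninitNegate extends RewriteNode {
    @Override Object execute(RewriteSlot self, Object arg) {
        self.node = (arg instanceof Integer) ? new IntNegate() : new GenericNegate();
        return self.node.execute(self, arg);
    }
}

/** Int-only negation with a guard; rewrites to the generic node when the guard fails. */
final class IntNegate extends RewriteNode {
    @Override Object execute(RewriteSlot self, Object arg) {
        if (arg instanceof Integer) {
            return -((Integer) arg);
        }
        self.node = new GenericNegate();
        return self.node.execute(self, arg);
    }
}

/** Handles any Number; never rewrites again. */
final class GenericNegate extends RewriteNode {
    @Override Object execute(RewriteSlot self, Object arg) {
        if (arg instanceof Integer) {
            return -((Integer) arg);
        }
        return -((Number) arg).doubleValue();
    }
}

final class RewriteDemo {
    public static void main(String[] args) {
        RewriteSlot slot = new RewriteSlot(new UninitNegate());
        slot.execute(5);    // installs IntNegate
        slot.execute(2.5);  // guard fails, installs GenericNegate
        slot.execute(7);    // steady state, no new nodes
        System.out.println("nodes allocated: " + RewriteNode.allocations);
    }
}
```

In this toy, one logical operation costs three node objects before it
settles. A non-rewriting interpreter would have allocated one node at
parse time and nothing more.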

The Truffle approach itself adds overhead too. Until optimized, the
fully-reified frame objects, specialization markup (which triggers AST
rewriting), deoptimization guards, and so on are all done manually
against heap-level data structures. This is in addition to the
JVM-level overhead of executing an AST (native frame-per-node, boxing
and type-widening, poor inlining profile).

Some amount of AOT *might* be applicable here, but the benefit of
Truffle and Graal is lost in the AOT case if we're not getting
real-world profile information. The Substrate VM has been brought up to
aid startup and warmup too...but that direction produces a
closed-world executable based on optimizing all code up front...not
exactly what we're looking for in a general-purpose language runtime.

3. Limited concurrency

The RubyTruffle runtime currently has to execute code under the
watchful eye of a global lock. Yes, you read that right...RubyTruffle
is single-threaded right now.

I would like to know if there's a deeper reason for this, but the
obvious shallow reason is that you can't have multiple threads
executing at the same time if they're making thread-unsafe mutations
to the executing code. This is similar to the major stumbling block
for e.g. Pypy, which rewrites currently-executing assembly
instructions at deopt/reopt safe points.

I believe once the code has transitioned to native, you can execute
that safely across threads...but this is opaque to your Truffle-based
language, and it's unclear how you'd manage re-acquiring some sort of
lock when transferring back to the interpreter.

The fact that concurrency has so far been hand-waved (or so it seems
to me from the outside) scares the living hell out of me, especially
when there's talk about rolling this stuff into Java 9.

Obviously some of this could be mitigated with an immutable AST
structure or other thread-friendly tree-transformation algorithm, but
making the Truffle AST thread-safe may also make it even more
object-heavy during interpretation, aggravating startup time further.
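
One possible shape for such a thread-friendly transformation, sketched
in plain Java: nodes are immutable, and a rewrite swaps the child slot
atomically with compare-and-set, so a racing thread either sees the old
node or the new one, never a half-updated tree. The names ImmutableOp,
AtomicSlot, and AtomicRewriteDemo are hypothetical, and this is not
how Truffle or RubyTruffle actually handle the problem; it is only an
illustration of the idea and of the extra indirection it costs.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Immutable operation node (hypothetical; not a Truffle class). */
interface ImmutableOp {
    long apply(long x);
}

/** A child slot whose contents are swapped atomically instead of mutated in place. */
final class AtomicSlot {
    private final AtomicReference<ImmutableOp> ref;

    AtomicSlot(ImmutableOp initial) { ref = new AtomicReference<>(initial); }

    ImmutableOp get() { return ref.get(); }

    /** Installs the replacement only if no other thread has rewritten the slot first. */
    boolean tryReplace(ImmutableOp seen, ImmutableOp replacement) {
        return ref.compareAndSet(seen, replacement);
    }

    long execute(long x) { return ref.get().apply(x); }
}

final class AtomicRewriteDemo {
    /** Races several threads to specialize one slot; returns how many rewrites won. */
    static int race(AtomicSlot slot, int threads) {
        ImmutableOp seen = slot.get();
        AtomicInteger wins = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                ImmutableOp candidate = x -> x * 2; // candidate "specialized" node
                if (slot.tryReplace(seen, candidate)) {
                    wins.incrementAndGet();
                }
                slot.execute(21); // always runs against a complete node
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return wins.get();
    }

    public static void main(String[] args) {
        AtomicSlot slot = new AtomicSlot(x -> x + x); // "generic" starting node
        int wins = race(slot, 8);
        System.out.println("successful rewrites: " + wins + ", result: " + slot.execute(21));
    }
}
```

Every slot now carries an AtomicReference and every execution goes
through an extra volatile read, which is exactly the kind of added
interpreter overhead described above.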

4. Limited availability

This is the chicken-and-egg issue. Truffle is just a library, so we
can ignore that for the moment (given any JVM, you can run a Truffle
language).

Graal is required for Truffle to perform well at all. The Truffle
interpreter is without a doubt the slowest interpreter we've ever had
for JRuby, and that's saying something (there could be startup/warmup
effects in play here too). In order for us to go 100% Truffle, we'd
need a Graal VM. That limits us to either pre-release or hand-made
builds of Graal/OpenJDK. Even if Graal somehow did get into Java 9,
we'd still have legions of users on 8, 7, ... even 6 in some cases,
though we're probably leaving them behind with JRuby 9000. Ignoring
other platforms (non-OpenJDK, Android) and assuming Graal in Java 9,
I'd conservatively estimate JRuby could still not go 100% Truffle
until 2017 or later.

And it gets worse. Graal will probably never exist on other JVMs.
Graal will probably never exist in an Android VM. Graal may not even
be available in other non-Oracle OpenJDK derivatives for a very long
time. We have users on dozens of different platform/JVM combinations,
so there's really no practical way for us to abandon our JVM bytecode
runtimes in the near future.

Now of course if Graal became essential to users, it would be
available in more places. We recognize the potential of Truffle and
Graal, which is why we've been thrilled to work with Oracle on a
RubyTruffle that's part of JRuby. We also recognize that the
Truffle/Graal approach has some very compelling features for our
users, and that our users may often be comfortable running custom
JVMs. We're allowing all flowers to bloom and our users will pick the
ones that work for them.

5. Unclear benefits for real-world applications

There have been many published microbenchmarks for Truffle-based
languages, but very few benchmarks of real-world applications
performing significantly better than custom-made VMs (e.g. Truffle JS
versus V8).
There have been practically no studies of a Truffle-based language
running a large application for a long period of time...and by long I
mean server-scale.

Chris Seaton has pushed this forward recently for Ruby, getting
general-purpose, numeric-heavy libraries to run and optimize very well
(a png library and a psd library). Going deeper requires more of the
language's standard libraries to be available, and I believe
this is where Chris has spent much of his time (RubyTruffle currently
requires mostly-custom versions of JRuby's core classes...versions
that Truffle can recognize, specialize, and escape-analyze away).

* Conclusion

I again want to emphasize that we think Truffle and Graal are really
awesome technology. I spent years with my nose smooshed against the
glass, watching the Pypy guys add optimizations I wanted and make good
on their promise of "just implement an interpreter...we'll do the
rest". Finally we have what I wanted: a Pypy for JVM (in Truffle) and
an LLVM for JVM (in Graal). These are exciting times indeed.

But reality steps in. There's a long road ahead.

I think we need to separate the questions about Truffle from questions
about Graal. Truffle is ultimately just a library that uses Graal.

Graal is promising JIT technology. Graal is simpler than C2 and may be
able to match or beat its performance. Graal provides a better way to
communicate intent to the JIT. These facts are not in question.

However, Graal is not (other than when used as the JVM's JIT) a JVM.
Targeting Graal directly acts against the promise of a standard,
platform-and-VM-agnostic bytecode -- and that's the promise that
brought most of us here. Graal is not yet ready to replace C2, which
would mean adding to the size and complexity of Java 9. And Graal is
almost completely untested in large production settings.

I personally would love to see Graal get into a Java release soon as
an experimental feature, but Java 9 seems ambitious by any standard.
It *might* be possible/reasonable to include Graal as experimental in
9. Java 10 is certainly feasible for experimental, and may be feasible
for product. But even if Graal got into mainstream OpenJDK and Java,
there's a very long adoption tail ahead.

I'd like to hear more from folks on the Graal and Truffle teams. Prove
me wrong :-)

- Charlie


More information about the mlvm-dev mailing list