How to understand the mechanics of Truffle

Tue Sep 18 18:53:01 UTC 2018

> On 18 Sep 2018, at 14:57, Timothy Baldridge <tbaldridge at gmail.com> wrote:
> 
> I'm evaluating Truffle for use in a language I'm developing. I've written
> several interpreters (with JITs) in RPython in the past, so I'm familiar
> with the concepts involved. I've also read quite a few papers on Truffle
> and the differences between it and RPython.
> 
> However, I have a few questions, these are things that are still vague in
> my mind even after reading the literature available on Truffle:
> 
> 1) When does Truffle start the merging of AST nodes into compiled code? Is
> this demarcation specified by the programmer, or by a profiler in Truffle?
> Is there a way for the programmer to influence these mechanics?

There is a call threshold for compiling Truffle methods (a method represented as Truffle AST nodes). When the threshold is hit, all nodes in the AST are compiled as one compilation unit.

Exceptions to this include on-stack-replacement, inlining, and splitting (creating multiple copies of ASTs).
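To make the threshold idea concrete, here is a toy model in plain Java. It is not the Truffle API: the class name, the counter, and the threshold value of 3 are all made up for illustration, and "compiling" is just flipping a flag.

```java
// Toy model only, not the Truffle API. All names and values here are hypothetical.
final class CallTargetSketch {
    static final int COMPILE_THRESHOLD = 3; // arbitrary value chosen for illustration

    private int callCount = 0;
    private boolean compiled = false;

    // Every call bumps the counter; reaching the threshold "compiles" the method once.
    int call(int argument) {
        if (!compiled && ++callCount >= COMPILE_THRESHOLD) {
            compiled = true; // stands in for compiling the whole AST as one unit
        }
        return argument + 1; // the method body; the result is the same either way
    }

    boolean isCompiled() {
        return compiled;
    }
}
```

In this sketch the first two calls run "interpreted" and the third call triggers compilation of the whole method.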

> 2) How much of a given AST is compiled into a single code unit by Truffle?
> How is that controlled?

All of the AST is compiled into a single unit.

Exceptions to this include branches that the profiler has found are never taken, and branches that can never execute because some value is constant, even though they are there in the AST.
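A rough sketch of the profiling case, again as a toy model in plain Java rather than the Truffle API. The node below records which branches it has seen; its hypothetical `compile()` keeps only those, and reaching an omitted branch afterwards counts as a deoptimisation.

```java
// Toy model only, not the Truffle API. All names here are hypothetical.
final class IfNodeSketch {
    private boolean seenTrue = false;   // profile: has the true branch ever run?
    private boolean seenFalse = false;  // profile: has the false branch ever run?
    private boolean compiledTrue = false;
    private boolean compiledFalse = false;
    boolean compiled = false;
    int deopts = 0;

    // "Compiles" the node, keeping only the branches the profile has seen so far.
    void compile() {
        compiledTrue = seenTrue;
        compiledFalse = seenFalse;
        compiled = true;
    }

    int execute(boolean condition) {
        boolean branchPresent = condition ? compiledTrue : compiledFalse;
        if (compiled && !branchPresent) {
            // The compiled code has no code for this branch: fall back to the interpreter.
            compiled = false;
            deopts++;
        }
        if (condition) {
            seenTrue = true;
            return 1;
        } else {
            seenFalse = true;
            return 2;
        }
    }
}
```

If only the true branch runs before `compile()`, a later `execute(false)` still returns the right value, but only after one deoptimisation.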

> 3) Does Truffle search an AST graph via partial execution, or by reflection
> (by walking all fields on nodes that are marked with @Node annotations?).
> What is the point of the annotations for Nodes and Node children, is it
> purely programmer convenience or does it tie into the JIT?

Truffle has a method `replace(otherNode)` to replace a node with another node. To do that automatically, it needs to know which fields of the parent node point to other nodes, so it can update those fields. Instead of manually specifying these, the annotations on child fields (`@Child` and `@Children`) allow Truffle to find them automatically via reflection.

The annotation also tells Truffle’s partial evaluator to treat the fields as final, even though they’re mutable for the purposes of the interpreter.
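Here is a minimal sketch of the reflection part, assuming nothing about how Truffle actually implements it. Everything below is a stand-in written in plain Java: `ChildSketch` plays the role of an annotation like Truffle's `@Child`, and `NodeSketch.replace` is a hypothetical imitation of `replace(otherNode)`.

```java
// Toy model only, not the Truffle API. All names here are hypothetical.
@java.lang.annotation.Retention(java.lang.annotation.RetentionPolicy.RUNTIME)
@java.lang.annotation.Target(java.lang.annotation.ElementType.FIELD)
@interface ChildSketch {}

abstract class NodeSketch {
    NodeSketch parent;

    abstract int execute();

    // Records this node as the child's parent and returns the child for assignment.
    <T extends NodeSketch> T adopt(T child) {
        child.parent = this;
        return child;
    }

    // Replaces this node by scanning the parent's annotated fields via reflection.
    // Assumes this node has a parent; only fields declared on the parent's own class are scanned.
    NodeSketch replace(NodeSketch newNode) {
        try {
            for (java.lang.reflect.Field field : parent.getClass().getDeclaredFields()) {
                if (!field.isAnnotationPresent(ChildSketch.class)) {
                    continue; // unannotated fields are invisible to replace()
                }
                field.setAccessible(true);
                if (field.get(parent) == this) {
                    field.set(parent, newNode);
                    newNode.parent = parent;
                }
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return newNode;
    }
}

final class ConstantSketch extends NodeSketch {
    final int value;

    ConstantSketch(int value) {
        this.value = value;
    }

    int execute() {
        return value;
    }
}

final class NegateSketch extends NodeSketch {
    @ChildSketch NodeSketch operand; // mutable, so that replace() can update it

    NegateSketch(NodeSketch operand) {
        this.operand = adopt(operand);
    }

    int execute() {
        return -operand.execute();
    }
}
```

Replacing the constant under a `NegateSketch` changes what the parent executes, without the parent having any hand-written code for updating its own field.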

> 4) The literature states that a requirement for a Truffle interpreter is
> that the AST should stabilize at some point in order to not require
> continual re-compilation. How does this work for code that uses generators
> (like ZipPy) that use the special Control-Flow exception? Is it possible to
> have a control-flow exception return an AST node that is executed by the
> exception handler?

Control-flow exceptions do not cause deoptimisation in most cases.

The execute method of nodes (really, any method that takes a `VirtualFrame` parameter) needs to be called on an object reference that is compilation final. If an exception is thrown that contains a node as a field, then that node is unlikely to be constant by the time you read it back out (it could be, if the exception is thrown and caught in the same compilation unit, and the catch-site can see the node is only ever one object, and probably more conditions). That’s not normally how you write a Truffle interpreter and I wouldn’t recommend that. I’m not sure what ZipPy is doing that you’re referring to.
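For contrast, the more usual shape is an exception that carries a plain value rather than a node, and is caught by a node in the same AST. The following is a toy sketch in plain Java with hypothetical names, not the Truffle API and not a description of what ZipPy does.

```java
// Toy model only, not the Truffle API. All names here are hypothetical.
final class ReturnSignal extends RuntimeException {
    final int result; // a plain value, deliberately not a node

    ReturnSignal(int result) {
        super(null, null, false, false); // no suppression and no stack trace: cheap to throw
        this.result = result;
    }
}

final class FunctionBodySketch {
    // Stands in for executing child statements, one of which is a "return".
    private static void runStatements(int x) {
        if (x > 10) {
            throw new ReturnSignal(x * 2);
        }
    }

    // The enclosing node catches the signal and turns it back into a normal result.
    static int call(int x) {
        try {
            runStatements(x);
            return 0; // fell off the end of the body without a "return"
        } catch (ReturnSignal signal) {
            return signal.result;
        }
    }
}
```

Because the throw and the catch sit in the same method body here, nothing has to be read back out of the exception except a value.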

> 5) What is the definition of "not changing" for a AST graph? Is it that the
> AST nodes should stop throwing de-optimization exceptions at some point? Is
> it that the node types must change? Or is it that the actual instances of
> the AST nodes must not change from one evaluation of the AST tree to the
> next?

The key thing is that you stop deoptimising.

Changing the AST (replacing one node with another) is one cause of deoptimisation. It doesn’t matter if the type changes - it’s the reference changing to refer to another object that is the problem.
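A toy illustration of that last point, in plain Java with made-up names. It simplifies heavily: the "compiled code" here just remembers the child reference it assumed was constant and checks it on each call, which is not how Truffle itself detects the change.

```java
// Toy model only, not the Truffle API. All names here are hypothetical.
interface ExprSketch {
    int execute();
}

final class LiteralSketch implements ExprSketch {
    private final int value;

    LiteralSketch(int value) {
        this.value = value;
    }

    public int execute() {
        return value;
    }
}

final class ParentSketch {
    ExprSketch child;           // mutable in the interpreter
    private ExprSketch assumed; // the reference the "compiled code" treated as constant
    int deopts = 0;

    ParentSketch(ExprSketch child) {
        this.child = child;
    }

    void compile() {
        assumed = child;
    }

    boolean isCompiled() {
        return assumed != null;
    }

    int execute() {
        if (assumed != null && assumed != child) {
            // Identity check: a different object of the same class still invalidates.
            assumed = null;
            deopts++;
        }
        return child.execute();
    }
}
```

Swapping the child for another `LiteralSketch` instance, which has exactly the same type, still costs one deoptimisation in this model, while repeated calls with an unchanged child cost none.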

> Thanks for the help, and for the great work on Truffle!
> 
> Timothy Baldridge