A Typical Truffle Specialization Pitfall

Wed Apr 18 12:21:13 UTC 2018

Hi everyone,

First of all, congratulations on the 1.0 release of GraalVM!
Here's a blog post by Stefan Marr and me on a common Truffle specialization
pitfall:

[This is also cross-posted on Stefan's blog, where it might be easier to
read: http://stefan-marr.de/2018/04/a-truffle-specialization-pitfall/]

The Truffle framework[0] allows us to write language interpreters in an easy
way. In combination with the Graal compiler[1] and its partial evaluator,
such Truffle interpreters can be as fast as custom VMs. Crucial to achieving
this performance are so-called specializations[sp], which define highly
optimized, speculative variants of the very basic operations of a language.

Writing such specializations is generally pretty straightforward, but there
is at least one common pitfall. When designing specializations, we need to
remind ourselves that the parameter types of specializations are technically
guards[gd]. This means that the activation semantics of specializations
depend not only on explicit guards, but also on the semantics of Java's type
system.

### Pitfall for Specializations

Let's have a look at the following code example. It sketches a Truffle node
that can be used to check whether an object is some kind of number.

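import com.oracle.truffle.api.dsl.Specialization;
import com.oracle.truffle.api.nodes.Node;

public abstract class IsNumberNode extends Node {

  // Entry point; the generated subclass dispatches to a specialization.
  public abstract int executeEvaluated(Object o);

  @Specialization
  protected final int doInteger(final int o) {
    return 1;
  }

  @Specialization
  protected final int doFloat(final float o) {
    return 2;
  }

  // The parameter type Object acts as a guard that accepts every value.
  @Specialization
  protected final int doObject(final Object o) {
    return 0;
  }
}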

Truffle generates a concrete implementation for this abstract class. To use
it, we call the `executeEvaluated(Object)` method, which automatically
selects one of the three specializations for `int`, `float`, and `Object`
based on the given argument.

Next, let's see this node in action:

IsNumberNode n = IsNumberNodeGen.create();

n.executeEvaluated(42);            // --> 1
n.executeEvaluated(44.3f);         // --> 2
n.executeEvaluated(new Object());  // --> 0

n.executeEvaluated(22.7f);         // --> 2

Great, so the node works as expected, right? Let's double check:

IsNumberNode n = IsNumberNodeGen.create();

n.executeEvaluated(new Object());  // --> 0
n.executeEvaluated(44.3f);         // --> 0
n.executeEvaluated(42);            // --> 0

This time, the node seems to always return `0`. But why?

The first time the node is invoked, it sees an `Object` and returns the
correct result. Additionally, and this is the important side effect, this
invocation also activates the `doObject(Object)` specialization inside the
node. When the node is invoked again, it will first check whether any of the
previously activated specializations match the given argument. In our
example, the `float` and `int` values are boxed to `Float` and `Integer`,
which are Java `Object`s, so the already active `doObject(Object)`
specialization accepts them and the node always returns `0`. This also
explains the behavior of the node in the previous series of invocations.
There, the node was first called with an `int`, then a `float`, and only then
an `Object`. Therefore, all specializations were activated and the node
returned the expected result for all invocations.
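
The activation behavior described above can be sketched in plain Java,
without Truffle. This is only a simplified model and not the code Truffle
actually generates: the class `IsNumberNodeModel` and its `Spec` helper are
made up for illustration, and the model assumes that active specializations
are tried in declaration order before any new one is activated.

```java
import java.util.function.Predicate;

/** Simplified, hypothetical model of the node; not Truffle's generated code. */
final class IsNumberNodeModel {

  /** One specialization: its implicit type guard, its result, and its state. */
  private static final class Spec {
    final Predicate<Object> typeGuard;
    final int result;
    boolean active = false;

    Spec(Predicate<Object> typeGuard, int result) {
      this.typeGuard = typeGuard;
      this.result = result;
    }
  }

  // Declaration order mirrors the node from the post.
  private final Spec[] declared = {
    new Spec(o -> o instanceof Integer, 1),  // doInteger(int)
    new Spec(o -> o instanceof Float, 2),    // doFloat(float)
    new Spec(o -> o instanceof Object, 0),   // doObject(Object)
  };

  int executeEvaluated(Object o) {
    // First, try only the specializations that are already active.
    for (Spec s : declared) {
      if (s.active && s.typeGuard.test(o)) {
        return s.result;
      }
    }
    // Otherwise, activate the first declared specialization that matches.
    for (Spec s : declared) {
      if (s.typeGuard.test(o)) {
        s.active = true;
        return s.result;
      }
    }
    throw new IllegalArgumentException("no specialization matches " + o);
  }

  public static void main(String[] args) {
    IsNumberNodeModel first = new IsNumberNodeModel();
    System.out.println(first.executeEvaluated(42));            // 1
    System.out.println(first.executeEvaluated(44.3f));         // 2
    System.out.println(first.executeEvaluated(new Object()));  // 0
    System.out.println(first.executeEvaluated(22.7f));         // 2

    IsNumberNodeModel second = new IsNumberNodeModel();
    System.out.println(second.executeEvaluated(new Object())); // 0
    System.out.println(second.executeEvaluated(44.3f));        // 0
    System.out.println(second.executeEvaluated(42));           // 0
  }
}
```

In the second sequence, the model never gets past the first loop after the
initial call, because the active `Object` guard accepts every boxed value.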

One reason for these specialization semantics is that we need to carefully
balance the benefits of specializations and the cost of falling back to a
more general version of an operation. This *falling back*, or more
technically *deoptimizing*, can have a high run-time overhead, because it
might require recompilation of methods by the just-in-time compiler. Thus,
once we have seen the need for a more general specialization, we try to keep
using it, and only activate another specialization when none of the
previously used ones is sufficient.

In case we do not actually want the Java semantics, as in our example, the
`doObject(Object)` specialization needs to be *guarded*. This means we need
to be sure that it cannot be called with and activated by `int`s and
`float`s. Here's what this could look like in our example:

public abstract class IsNumberNode extends Node {
  // ...

  protected final boolean isInteger(final Object o) {
    return o instanceof Integer;
  }

  protected final boolean isFloat(final Object o) {
    return o instanceof Float;
  }

  @Specialization(guards = {"!isInteger(o)", "!isFloat(o)"})
  protected final int doObject(final Object o) {
    return 0;
  }
}

These `guards` are parameters for the `@Specialization` annotation, and one
can use helper functions that perform `instanceof` checks to guard the
specialization accordingly.

For nodes with many specializations, this can become very tedious, because
we need to repeat all implicit and explicit guards for such specializations.
To avoid this in cases where there is only one such *fallback*
specialization, the Truffle framework provides the `@Fallback`
annotation[fb] as a shortcut. It implicitly uses all guards of the other
specializations and negates them. Thus, we can write the following for our
example:

public abstract class IsNumberNode extends Node {
  // ...

  @Fallback
  protected final int doObject(final Object o) {
    return 0;
  }
}
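
To illustrate what "use all guards and negate them" amounts to, here is a
small plain-Java sketch. It does not use Truffle; `FallbackGuardSketch`,
`OTHER_GUARDS`, and `fallbackApplies` are hypothetical names, and the two
predicates stand in for the implicit type guards of `doInteger` and
`doFloat`.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical model of the guard that @Fallback implies for doObject. */
final class FallbackGuardSketch {

  // Stand-ins for the implicit type guards of the other specializations.
  static final List<Predicate<Object>> OTHER_GUARDS = List.of(
      o -> o instanceof Integer,  // doInteger(int)
      o -> o instanceof Float);   // doFloat(float)

  /** The fallback applies only if every other guard rejects the value. */
  static boolean fallbackApplies(Object o) {
    return OTHER_GUARDS.stream().noneMatch(guard -> guard.test(o));
  }

  public static void main(String[] args) {
    System.out.println(fallbackApplies(42));            // false
    System.out.println(fallbackApplies(44.3f));         // false
    System.out.println(fallbackApplies(new Object()));  // true
  }
}
```

Under this model, adding another specialization only means adding one more
guard to the list; the fallback condition follows automatically, which is
the convenience the annotation is meant to provide.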

### How to Avoid Specialization Pitfalls?

As the example demonstrates, the described problem can occur when there are
specializations for types that are in the same class hierarchy, especially
when there is a specialization for the most general type `Object`.

At the moment, Truffle users can only manually check if they have nodes with
such specializations to avoid this issue. But perhaps we can do a little
better.

A testing tool that ensures coverage for all specializations as well as all
possible combinations would be very useful. This would allow us to find
erroneous or undesired generalization relationships between specializations,
and could also ensure that a node provides all required specializations.
Especially for beginners, it would also be nice to have a visual tool to
inspect specializations and their activation behavior. Perhaps it could be
made part of [IGV].
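
As a rough idea of what such a testing tool might check, the following
plain-Java sketch runs a fresh node for every ordering of a set of inputs and
reports whether each input always produces its expected result. No such tool
exists in Truffle as far as this post is concerned; `ActivationOrderCheck`
and all of its methods are hypothetical, and the two node factories are toy
models of the unguarded and the guarded `IsNumberNode`, not generated code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.function.ToIntFunction;

/** Hypothetical sketch of an activation-order test for specializations. */
final class ActivationOrderCheck {

  /** Returns every ordering of the indices 0..n-1. */
  static List<int[]> permutations(int n) {
    List<int[]> out = new ArrayList<>();
    permute(new int[n], new boolean[n], 0, out);
    return out;
  }

  private static void permute(int[] current, boolean[] used, int pos, List<int[]> out) {
    if (pos == current.length) {
      out.add(current.clone());
      return;
    }
    for (int i = 0; i < current.length; i++) {
      if (used[i]) continue;
      used[i] = true;
      current[pos] = i;
      permute(current, used, pos + 1, out);
      used[i] = false;
    }
  }

  /** True if inputs[i] yields expected[i] on a fresh node for every ordering. */
  static boolean isOrderIndependent(Supplier<ToIntFunction<Object>> nodeFactory,
                                    Object[] inputs, int[] expected) {
    for (int[] order : permutations(inputs.length)) {
      ToIntFunction<Object> node = nodeFactory.get();  // fresh activation state
      for (int i : order) {
        if (node.applyAsInt(inputs[i]) != expected[i]) return false;
      }
    }
    return true;
  }

  /** Toy model of the unguarded node: active specializations are tried first. */
  static ToIntFunction<Object> unguardedNode() {
    boolean[] active = new boolean[3];  // doInteger, doFloat, doObject
    return o -> {
      if (active[0] && o instanceof Integer) return 1;
      if (active[1] && o instanceof Float) return 2;
      if (active[2]) return 0;
      if (o instanceof Integer) { active[0] = true; return 1; }
      if (o instanceof Float) { active[1] = true; return 2; }
      active[2] = true;
      return 0;
    };
  }

  /** Toy model of the guarded node: doObject rejects Integer and Float. */
  static ToIntFunction<Object> guardedNode() {
    return o -> o instanceof Integer ? 1 : o instanceof Float ? 2 : 0;
  }

  public static void main(String[] args) {
    Object[] inputs = {42, 44.3f, new Object()};
    int[] expected = {1, 2, 0};
    System.out.println(isOrderIndependent(ActivationOrderCheck::unguardedNode, inputs, expected)); // false
    System.out.println(isOrderIndependent(ActivationOrderCheck::guardedNode, inputs, expected));   // true
  }
}
```

Enumerating all orderings grows factorially with the number of inputs, so a
real tool would probably need to sample orderings or restrict itself to one
representative input per specialization.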

Depending on how commonly one actually wants such generalization or
subsumption semantics of specializations, one could consider using Truffle's
annotation processors[ap] to perform extra checks. They already perform
various checks and trigger errors, for example, for syntax errors in guard
definitions. Perhaps they could also generate a warning or an info message
when they detect specializations for types that are part of the same class
hierarchy, to make users aware of this issue. Thus, if
generalization/subsumption is less common, one might simply indicate it
explicitly, perhaps in addition to the existing `replaces` parameter for the
`@Specialization` annotation.

[0]: https://github.com/oracle/graal/tree/master/truffle#readme
[1]: https://github.com/oracle/graal/tree/master/compiler#readme
[sp]: https://github.com/oracle/graal/blob/master/truffle/src/com.oracle.truffle.api.dsl/src/com/oracle/truffle/api/dsl/Specialization.java
[gd]: https://github.com/oracle/graal/blob/master/truffle/src/com.oracle.truffle.api.dsl/src/com/oracle/truffle/api/dsl/Specialization.java#L221
[fb]: https://github.com/oracle/graal/blob/da54d0f9bccb47c5c686aa1122a844fff9b6dab0/truffle/src/com.oracle.truffle.api.dsl/src/com/oracle/truffle/api/dsl/Fallback.java#L35-L43
[ap]: https://docs.oracle.com/javase/8/docs/api/javax/annotation/processing/Processor.html
[IGV]: http://ssw.jku.at/General/Staff/TW/igv.html

--
Fabio Niephaus
Software Architecture Group
Hasso Plattner Institute
https://www.hpi.uni-potsdam.de/swa/