Type Hierarchies and Guards in Truffle Languages

Mon Jan 4 11:20:00 UTC 2016

Hi:

In an effort to post a little more here on the list and discuss perhaps relevant questions for Truffle language implementers, I wanted to report on some changes in SOMns.

Over the last couple of days, I refactored the main message dispatch chain in SOMns.
As in Self and Newspeak, all interactions with objects are message sends.
Thus, field accesses as well as method invocation are essentially the same.
This means that message sending is a key to good performance.

In my previous design, I structured the dispatch chain in a way that, I thought, I’d reduce the necessary runtime checks.

My ’naive’ design essentially distinguished two different cases.
One case where when the receiver were standard Java objects, for instance boxed primitives such as longs and doubles, or other Java objects that are used directly. The second case were objects from my own hierarchy of Smalltalk objects.

The hierarchy is a little more involved, it includes an abstract class, a class for objects that have a Smalltalk class `SObjectWithClass`, a class for objects without fields, for objects with fields, and that one is then again subclassed by classes for mutable and immutable objects. There are still a few more details to it, but I think you get the idea.

So, with that, I thought, let’s structure the dispatch chain like this, starting with a message send node as its root:

MsgSend
 -> JavaRcvr -> JavaRcvr -> CheckIsSOMObject -> SOMRcvr -> SOMRcvr -> UninitializedSOMRcvr
                                           \-> UninitializedJavaRcvr

This represents a dispatch chain for a message send site that has seen four different receivers, two primitive types, and two Smalltalk types. This could be the case for instance for the polymorphic ‘+’ message.

The main idea was to split the chain in two parts so that I avoid checking for the SOM object more than once, and then can just cast the receiver to `SObjectWithClass` in the second part of the chain to be able to read the Smalltalk class from it.

Now it turns out, this is not the best idea.
The main problem is that `SObjectWithClass` is not a leaf class in my hierarchy. This means, at runtime, the check, i.e., the guard for `SObjectWithClass` is pretty expensive. When I looked at the compilation in IGV, I saw many `instanceof` checks that could not be removed and resulted in runtime traversal of the class hierarchy, to confirm that a specific concrete class was indeed a subclass of `SObjectWithClass`.

In order to avoid these expensive checks, I refactored the dispatch nodes to extract the guard into its own node [1] that does only the minimal amount of work for each specific case. And it only ever checks for the specific leaf class of my hierarchy, that is expected for a specific receiver.

This also means, the new dispatch chain is not separated in parts anymore as it was before. Instead, the nodes are simply added in the order in which the different receiver types are observed over time.

Overall the performance impact is rather large. I saw on the Richards benchmark a gain of 10% and on DeltaBlue about 20% [3]. Unfortunately my refactoring [3] also changed a few other details beside the changes related to `instanceof` and casts. It also made the guards for objects with fields depend on the object layout instead of the class, which avoids having multiple guards for essentially the same constraint further down the road.

So, the main take-away here is that the choice of guard types can have a major performance impact.
I also had a couple of other @Specialization nodes that were using non-leaf classes.
For instance like this: `@Specialization public Object doSOMObject(SObjectWithClass rcvr) {…}`

This looks inconspicuous at first, but fixing those and a few other things resulted in overall runtime reduction on multiple benchmarks between 20% and 30%.

A good way to find these issues is to see in IGV that `instanceof` or checked cast snippets are inlined and not completely removed. Often they are already visible in the list of phases when the snippets are resolved. Another way to identify them is the use of the Graal option `-Dgraal.option.TraceTrufflePerformanceWarnings=true` (I guess that would be `-G:+TraceTrufflePerformanceWarnings` when mx is used). The output names the specific non-leaf node checks that have been found in the graph. Not all of them are critical, because they can be removed by later phases. To check that, you can use the id of the node from the output and search for it in the corresponding IGV graph using for instance `id=3235` in the search field.

Hope such posts help.
Best regards
Stefan

[1] https://github.com/smarr/SOMns/blob/master/src/som/interpreter/nodes/dispatch/DispatchGuard.java
[2] https://github.com/smarr/SOMns/commit/a6d57fd1a4d7d8b2ce28927607ea41a52a171760
[3] http://somns-speed.stefan-marr.de/changes/?rev=bb54b1effe&exe=14&env=1

-- 
Stefan Marr
Johannes Kepler Universität Linz
http://stefan-marr.de/research/