Boxing

Sat Aug 3 10:53:12 PDT 2013

Hi all,

I've been playing around with Truffle for a few days now, and I think I'm starting to understand how basic types and node specialization work.  My biggest problem is that I have a fairly good understanding of how Java classes are mapped into memory and of the assembly the JVM will generate to operate on them.  When it comes to Truffle/Graal, I have changed my mental model: Java classes are simply structs, and I am responsible for associating "computation units" with them (like C code).  What I don't have a good feeling for is how Truffle/Graal break down the "leaf-most" structures of my language.  For example, in Java (without box removal and luck) I know that an Integer is a pointer to an int, and if I compute over it a lot, I need to remove the boxing as early as possible in the computation.  In Java this is manual, and in Truffle it happens through specialization.  This brings up a number of questions:

Are Java boxed types in Truffle special?  Can I add my own box types, and will they get the same treatment?  For example, say I have a Timestamp type which is just a boxed long, but I want a normal Java class for it so I can interact with user-provided functions written in Java.  Will Truffle automatically do the box removal when the Timestamp is not null?  How do I set this up with the DSL so I don't have to hand-code all of the specializations?
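
To make the question concrete, a minimal sketch of such a Timestamp type might look like the following. It is plain Java with no Truffle API involved. The class name follows the text above, but the field name, the millisecond unit, and both methods are hypothetical and chosen only for illustration. Whether Truffle/Graal would remove the allocation in the boxed method is exactly the open question; the sketch does not answer it.

```java
// Hypothetical boxed type: a nullable wrapper around a primitive long.
final class Timestamp {
    // Assumed unit (milliseconds); the original text does not specify one.
    private final long millis;

    Timestamp(long millis) {
        this.millis = millis;
    }

    long millis() {
        return millis;
    }

    // Boxed entry point: the shape a user-provided Java function would see.
    // A null argument yields a null result.
    static Timestamp plusMillis(Timestamp t, long delta) {
        if (t == null) {
            return null;
        }
        return new Timestamp(plusMillisUnboxed(t.millis, delta));
    }

    // Primitive path: the computation one would like to run once the
    // box has been removed and the value is known to be non-null.
    static long plusMillisUnboxed(long millis, long delta) {
        return millis + delta;
    }
}
```

The boxed method only unwraps, delegates, and rewraps, so all of the actual arithmetic lives in the primitive method.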

Now that I have a Timestamp type, I want to perform calculations over very large vectors (100m elements) of these.  In my existing engine, I represent this vector using two primitive vectors, one to hold the long value and one to hold the null flag.  If I don't have any null values, Truffle would specialize to a guard on the null flag and a primitive tree.  But say I do have some very rare nulls in the vector.  How do I structure the Truffle code so I don't end up with all computation always paying the slow cost?  For example, should I simply new up a Timestamp class and assume that Truffle will elide the boxing and magically handle the nulls?  Or should I make the Timestamp class have an "isNull" field, handle the "null" check manually in the nodes, and expect Truffle to remove the box entirely (leaving the boolean and long in registers)?
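
The two-vector layout described above can be sketched in plain Java as follows. This is only an illustration of the data layout and of a hand-written fast/slow split, not Truffle code; the class name TimestampVector, its fields, and the plusMillis operation are all hypothetical.

```java
// Hypothetical columnar vector: one primitive array for the values
// and one for the null flags, as described in the text.
final class TimestampVector {
    final long[] values;
    final boolean[] isNull;
    // Computed once so whole-vector operations can pick a path up front.
    final boolean hasNulls;

    TimestampVector(long[] values, boolean[] isNull) {
        if (values.length != isNull.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        this.values = values;
        this.isNull = isNull;
        boolean any = false;
        for (boolean b : isNull) {
            any |= b;
        }
        this.hasNulls = any;
    }

    // Adds delta to every non-null element and returns a new vector.
    TimestampVector plusMillis(long delta) {
        long[] out = new long[values.length];
        if (!hasNulls) {
            // Fast path: a pure primitive loop with no per-element null check.
            for (int i = 0; i < values.length; i++) {
                out[i] = values[i] + delta;
            }
        } else {
            // Slow path: per-element null check; null slots keep the value 0.
            for (int i = 0; i < values.length; i++) {
                if (!isNull[i]) {
                    out[i] = values[i] + delta;
                }
            }
        }
        return new TimestampVector(out, isNull.clone());
    }
}
```

With this shape, a single rare null moves the whole vector onto the slow loop, which is the cost the question is trying to avoid.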

My language also has "complex" primitives like TimestampWithTimezone, which is the tuple (long, int), but from a language perspective it is a single primitive value.  For this type, I need a real Java class to carry these values to user-provided functions written in Java, and for Java functions to actually return one of these.  As above, when I'm operating on a large vector of these values, I don't want to pay the cost of the "box", so how do I structure these types in Truffle to avoid this?
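
As a rough sketch of the tension described above, the (long, int) tuple can live in two primitive columns while the Java class is only materialized at the boundary with user code. This is plain Java, not Truffle; the field names millis and zoneKey, the vector class, and the countInZone operation are hypothetical.

```java
// Hypothetical carrier class handed to (and returned from) user Java functions.
final class TimestampWithTimezone {
    final long millis;   // assumed meaning of the long component
    final int zoneKey;   // assumed meaning of the int component

    TimestampWithTimezone(long millis, int zoneKey) {
        this.millis = millis;
        this.zoneKey = zoneKey;
    }
}

// Hypothetical columnar storage: one primitive array per tuple component.
final class TimestampWithTimezoneVector {
    private final long[] millis;
    private final int[] zoneKeys;

    TimestampWithTimezoneVector(long[] millis, int[] zoneKeys) {
        if (millis.length != zoneKeys.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        this.millis = millis;
        this.zoneKeys = zoneKeys;
    }

    int size() {
        return millis.length;
    }

    // Boxes a single element on demand, e.g. to pass to a user function.
    TimestampWithTimezone get(int i) {
        return new TimestampWithTimezone(millis[i], zoneKeys[i]);
    }

    // Operates directly on the int column, so no tuple object is allocated.
    int countInZone(int zoneKey) {
        int count = 0;
        for (int key : zoneKeys) {
            if (key == zoneKey) {
                count++;
            }
        }
        return count;
    }
}
```

Built-in operations can stay on the columns, as countInZone does; the allocation in get is the "box" that one would hope to see removed when a user function is inlined.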

Does the Truffle/Graal magic extend to Java code called from Truffle nodes?  Specifically, in my language my users can plug in functions written in Java which operate on types like TimestampWithTimezone.  Will Truffle/Graal inline these Java methods and remove the boxing?

And of course, my users can define new types like TimestampWithTimezone.  Is there a way to make all of the above generic and fast, or do I need to generate bytecode for Truffle nodes at runtime?

-dain