Boxing

Mon Aug 5 00:01:48 PDT 2013

Dain,

A few additional clarifications:

Graal's partial escape analysis tries to push allocations down as far as possible. If Graal sees both the allocation and all usages of an object in one compilation unit (i.e., an inlining tree of Java methods or a Truffle tree), it will not allocate the object at all and will instead keep its field values in registers. This representation is highly efficient. Because Graal knows in this case that the object is accessible only by the current thread, it can perform very aggressive optimizations. Therefore, you should not refrain from using nicely structured custom boxed types for your intermediate data structures.
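
As a rough illustration of the pattern described above, here is a minimal plain-Java sketch. The names Timestamp, EscapeSketch and sum are hypothetical and do not come from this thread or from any Truffle API. Every Timestamp is created and consumed inside one method, so once plus() is inlined the boxes are candidates for removal by escape analysis. Whether that actually happens depends on the compiler and on inlining decisions; the sketch only shows the shape of the code.

```java
// Hypothetical boxed type: a Timestamp that is "just a boxed long".
final class Timestamp {
    final long millis;

    Timestamp(long millis) {
        this.millis = millis;
    }

    // Returns a fresh box; nothing is mutated or published to other threads.
    Timestamp plus(Timestamp other) {
        return new Timestamp(millis + other.millis);
    }
}

final class EscapeSketch {
    // All Timestamp instances are allocated and used inside this method.
    // None is stored in a field or array, and none is returned, so a
    // compiler that inlines plus() can see every allocation and every use.
    static long sum(long[] raw) {
        Timestamp acc = new Timestamp(0);
        for (long r : raw) {
            acc = acc.plus(new Timestamp(r));
        }
        return acc.millis; // only the primitive leaves the method
    }

    public static void main(String[] args) {
        System.out.println(sum(new long[] {1, 2, 3})); // prints 6
    }
}
```

If sum() instead stored acc into a field or returned the Timestamp itself, the object would escape at that point and an allocation would be needed there.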

Regarding custom subtypes of your main types: You do not need to write specialisations for each of those types. As long as Graal sees the concrete type of the object in its compilation unit, every virtual call on the object becomes a direct, inlinable call. This means that when the object's pointer is loaded from a location whose static type is the base type, you could insert a Truffle node that performs dynamic profiling using a Class.isInstance check against the most specific type read. If Graal sees the object allocation, it knows the specific type anyway. In general, Truffle specialisations are mainly a way to give the compiler an additional hint or a simpler algorithm that it could not figure out by itself. Therefore it is indeed beneficial, as Chris says, to first take a look at the generated machine code or compiler graph before adding new specialisations. We have an IR viewer for Graal IR that you can start via "./mx.sh igv" and use via "-G:Dump=Truffle". There should never be a need to generate bytecodes.
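
The profiling idea above can be sketched in plain Java, without any Truffle dependency. All names here (Value, TimestampValue, DateValue, TypeProfile) are hypothetical, and this is not the Truffle API: it only mimics what such a node would do, namely remember the first concrete class it reads and guard later reads with Class.isInstance. In a real Truffle node the cached class would have to be a compilation constant and a failed guard would have to send execution back to the interpreter; those parts are left out here.

```java
// Hypothetical base type with two subtypes that user code might define.
abstract class Value {
    abstract long asLong();
}

final class TimestampValue extends Value {
    final long millis;
    TimestampValue(long millis) { this.millis = millis; }
    @Override long asLong() { return millis; }
}

final class DateValue extends Value {
    final long days;
    DateValue(long days) { this.days = days; }
    @Override long asLong() { return days * 86_400_000L; }
}

// Mimics a profiling node placed where a Value is loaded from a location
// whose static type is only the base type. Assumes non-null inputs.
final class TypeProfile {
    // null: nothing seen yet; Value.class: more than one type seen (generic).
    private Class<? extends Value> cachedType;

    Value profile(Value v) {
        if (cachedType == null) {
            cachedType = v.getClass(); // first read: remember the concrete class
        }
        if (cachedType.isInstance(v)) {
            // Guard passed: after this cast a compiler that treats
            // cachedType as a constant knows the type of the result.
            return cachedType.cast(v);
        }
        cachedType = Value.class; // a second type showed up: stop narrowing
        return v;
    }

    Class<? extends Value> cachedType() {
        return cachedType;
    }
}

final class ProfileSketch {
    public static void main(String[] args) {
        TypeProfile profile = new TypeProfile();
        Value v = profile.profile(new TimestampValue(42));
        System.out.println(v.asLong());                         // prints 42
        System.out.println(profile.cachedType().getSimpleName()); // prints TimestampValue
    }
}
```

The point of the sketch is that one such profile covers any number of user-defined subtypes, which is why no per-subtype specialisation is needed.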

- thomas

On Aug 4, 2013, at 10:45 AM, Chris Seaton <chris at chrisseaton.com> wrote:

> Hello Dain,
> 
> 
>> I've been playing around with truffle for a few days now, and I think I'm
>> starting to understand how basic types and node specialization work.  My
>> biggest problem is that I have a fairly good understanding of how Java
>> classes are mapped into memory and the assembly the JVM will generate to
>> operate on these.  When it comes to Truffle/Graal, I have changed my mental
>> model to Java Classes are simply structs and I am responsible for
>> associating "computation units" with these (like c code).  What I don't
>> have a good feeling for is how Truffle/Graal break down the "leaf most"
>> structures of my language.   For example, in Java (without box removal and
>> luck) I know that an Integer is a pointer to an int, and if I compute over
>> this a lot, I need to remove the boxing as early as possible in the
>> computation.  In Java this is manual and in Truffle it happens through
>> specialization.  This brings up a number of questions:
>> 
> 
> I know there's nothing more annoying than someone quoting Knuth at you when
> you're trying to get a good understanding of performance models of a new
> system, but I will say that the way that we encourage Truffle to be used is
> to not think about these things first.
> 
> If you are thinking about these things before you have a working
> interpreter for most of the language, you may be thinking about them too
> early. These sorts of questions probably shouldn't be a consideration when
> designing the architecture of your Truffle program. A well-written Truffle
> program should be a pretty natural Java program. In fact, you should not
> need to know anything about the internals of boxing to write a high
> performance Truffle program.
> 
> I have found that the kind of performance I am looking for generally comes
> quite easily with Truffle, and performance problems are generally nicely
> localised (a single node or a single interaction between nodes is the
> problem), and that the problem is a higher level algorithmic problem,
> rather than low level things.
> 
> 
>> Are Java boxed types in Truffle special?  Can I add my own box types and
>> will they get the same treatment?  For example, say I have a Timestamp type
>> which is just a boxed long, but I want a normal Java class for this so I
>> can interact with user provided functions written in Java.  Will Truffle
>> automatically do the box removal when the Timestamp is not null?  How do I
>> set this up with the DSL so I don't have to hand code all of the
>> specializations?
>> 
> 
> Java boxed types are not special to Truffle, but the difference is that
> they can be more successfully removed, because those optimisations which we
> already perform intra-method in HotSpot are in Truffle performed across the
> aggregated AST method code, as if it were a single method. So boxing
> elimination, for example, is more likely to be successful. Also, we
> normally remove the requirement for boxing in Truffle via static typing of
> the AST.
> 
> However I see what you're doing here - you want a basic long, which you
> want to sort of typedef to do something in particular, and you have to do
> that via boxing. This is interesting and not something I've done before. I
> would say in the first instance, use a box and see what happens. If you are
> using straight Truffle without the code generator, then you can have both
> 'long executeLong()' and 'long executeTimestamp()' - I don't think there is
> anything that will cause problems with that. The code generator may not
> expect this use case though, so be careful there.
> 
> 
>> Now that I have a Timestamp type, I want to perform calculations over very
>> large vectors (100m elements) of these.  In my existing engine, I represent
>> this vector using two primitive vectors, one to hold the long value and one
>> to hold the null flag.  If I don't have any null values, Truffle would
>> specialize to a guard on the null flag and a primitive tree.  But say, I do
>> have some very rare nulls in the vector.  How do I structure the Truffle
>> code, so I don't end up with all computation always paying the slow cost?
>> For example, should I simply new up a Timestamp class and assume that
>> Truffle will elide the boxing and magically handle the nulls?  Or should I
>> make the Timestamp class have an "isNull" field, handle the "null" check
>> manually in the nodes, and Truffle will remove the box entirely (leaving
>> the boolean and long in registers)?
>> 
> 
> Nulls and other special values in data structures are an open problem I
> believe - I know it's easy to have a single value mess up your otherwise
> nicely specialised data structure. Ask the FastR people - they have these
> kinds of problems.
> 
> 
>> My language also has "complex" primitives like TimestampWithTimezone which
>> is the tuple (long, int) but from a language perspective it is a single
>> primitive value.  For this type, I need to have a real Java class to carry
>> these values to user provided functions written in Java and for Java
>> functions to actually return one of these.  As above, when I'm operating on
>> a large vector of these values, I don't want to pay the cost of the "box",
>> so how do I structure these types in Truffle to avoid this?
>> 
> 
> What I said about very successful boxing elimination doesn't apply in
> larger data structures that persist outside of a method, so I see your
> problem here. But if you control the code of the collection, you can just
> do something like pack all the values - so you have
> a TimestampWithTimezoneList that has an array of longs. Each timestamp is
> two of those values, next to each other. When you get a value you create a
> proper object from it. That's boxing of course, but only at the point of
> use, and if that code is compiled as part of an AST method then we may be
> able to eliminate it.
> 
>> Does the Truffle/Graal magic extend to Java code called from Truffle nodes?
>> Specifically, in my language my users can plug in functions written in Java
>> which operate on types like TimestampWithTimezone.  Will Truffle/Graal
>> inline these Java methods and remove the boxing?
>> 
> 
> Yes - and actually this can be a problem. For example, in the past I was
> using HashMap methods and these were all getting inlined in a huge number
> of places and the code was exploding in size. I think Graal is more careful
> about that these days. The way to check of course is to use
> CompilerAsserts.neverPartOfCompilation() - if you want to know if Java code
> is being inlined, add that.
> 
> 
>> And of course, my users can define new types like TimestampWithTimezone.
>> Is there a way to make all of this stuff above generic and fast, or do I
>> need to byte code generate Truffle nodes at runtime?
> 
> 
> I think bytecode generating Truffle nodes, while perhaps technically
> possible, is surely a path to madness at this stage. Give it a go and let
> us know how it works out, please, but I think you may build a system where
> you're not seeing one of the great benefits of Truffle - simplicity.
> 
> Regards,
> 
> Chris
> 
> 
> On 3 August 2013 18:53, Dain Sundstrom <dain at iq80.com> wrote:
> 
>> Hi all,
>> 
>> I've been playing around with truffle for a few days now, and I think I'm
>> starting to understand how basic types and node specialization work.  My
>> biggest problem is that I have a fairly good understanding of how Java
>> classes are mapped into memory and the assembly the JVM will generate to
>> operate on these.  When it comes to Truffle/Graal, I have changed my mental
>> model to Java Classes are simply structs and I am responsible for
>> associating "computation units" with these (like c code).  What I don't
>> have a good feeling for is how Truffle/Graal break down the "leaf most"
>> structures of my language.   For example, in Java (without box removal and
>> luck) I know that an Integer is a pointer to an int, and if I compute over
>> this a lot, I need to remove the boxing as early as possible in the
>> computation.  In Java this is manual and in Truffle it happens through
>> specialization.  This brings up a number of questions:
>> 
>> Are Java boxed types in Truffle special?  Can I add my own box types and
>> will they get the same treatment?  For example, say I have a Timestamp type
>> which is just a boxed long, but I want a normal Java class for this so I
>> can interact with user provided functions written in Java.  Will Truffle
>> automatically do the box removal when the Timestamp is not null?  How do I
>> set this up with the DSL so I don't have to hand code all of the
>> specializations?
>> 
>> Now that I have a Timestamp type, I want to perform calculations over very
>> large vectors (100m elements) of these.  In my existing engine, I represent
>> this vector using two primitive vectors, one to hold the long value and one
>> to hold the null flag.  If I don't have any null values, Truffle would
>> specialize to a guard on the null flag and a primitive tree.  But say, I do
>> have some very rare nulls in the vector.  How do I structure the Truffle
>> code, so I don't end up with all computation always paying the slow cost?
>> For example, should I simply new up a Timestamp class and assume that
>> Truffle will elide the boxing and magically handle the nulls?  Or should I
>> make the Timestamp class have an "isNull" field, handle the "null" check
>> manually in the nodes, and Truffle will remove the box entirely (leaving
>> the boolean and long in registers)?
>> 
>> My language also has "complex" primitives like TimestampWithTimezone which
>> is the tuple (long, int) but from a language perspective it is a single
>> primitive value.  For this type, I need to have a real Java class to carry
>> these values to user provided functions written in Java and for Java
>> functions to actually return one of these.  As above, when I'm operating on
>> a large vector of these values, I don't want to pay the cost of the "box",
>> so how do I structure these types in Truffle to avoid this?
>> 
>> Does the Truffle/Graal magic extend to Java code called from Truffle
>> nodes?  Specifically, in my language my users can plug in functions written
>> in Java which operate on types like TimestampWithTimezone.  Will
>> Truffle/Graal inline these Java methods and remove the boxing?
>> 
>> And of course, my users can define new types like TimestampWithTimezone.
>> Is there a way to make all of this stuff above generic and fast, or do I
>> need to byte code generate Truffle nodes at runtime?
>> 
>> -dain