Use java.util.List in type system

Wed Apr 8 21:53:32 UTC 2015

Hi Renze,

I created a basic List interface so I can swap the current basic implementation for a more optimized one later on without changing the type system and execute functions. 
Is this considered good practice in Truffle or should I just use the implementation class? And you noted that Truffle inlines everything until it sees a virtual call, so if I understand correctly using the List interface in the type system creates virtual calls when methods on those List interface are called? For example, in “@Specialization concat(List a, List b) { a.add(b); }" a and b would actually be of class BasicList but specified with interface List, so the add method would be a virtual call and stop inlining even though it should be inlined? 
That's exactly where you should add a type profile in order to ensure that the class of a and b is known to the compiler. If you leave it as it is now it might also work because there is just one implementation of List loaded at runtime. Also there are automatic type profiles on CallTarget arguments but this are things you should normally not depend on too much.
For a quick solution you could add some type profiles for all specializations of the operation like this
private final ValueProfile aProfile = ValueProfile.createClassProfile();
private final ValueProfile bProfile = ValueProfile.createClassProfile();
@Specialization
List concat(List a, List b) {
return aProfile.profile(a).add(bProfile.profile(b));
}
Or using the @Cached annotation just for one specialization like this:
@Specialization
List concat(List a, List b, 
@Cached("createClassProfile()") ValueProfile aProfile, 
@Cached("createClassProfile()") ValueProfile bProfile) {
return aProfile.profile(a).add(bProfile.profile(b));
}
I can recommend you to have a look at the ExactClassValueProfile class which implements the ValueProfile you are using here. Its interesting to see that these profiles just use low-level Truffle API. It basically just injects the constant type information into the value using Class#cast.
You might also want to think of adding type profiles to your variable reads and global lookups. If this is a good idea depends on the guest language you are implementing and how many types and specialized types are flowing through the interpreter.
As you go further along with your array operations you might have several specialized data types that implement List. For instance one version that uses an int[] as backing store another that uses an Object[]. This way its likely that you hit the case where having just one type profile is not good enough because the operation is polymorphic. To be fast even in these cases you can implement a type cache that looks roughly like this:
       @Specialization(limit = "3", guards = {"a.getClass() == aClass", "b.getClass() == bClass"})
        public Object doCached(List a, List b, 
                        @Cached("a.getClass()") Class<List> aClass, 
                        @Cached("b.getClass()") Class<List> bClass) {
            return aClass.cast(a).add(bClass.cast(b));
        }
        @Specialization(contains = "doCached")
        public Object doGeneric(List a, List b) {
            // might produce a virtual call
            return a.add(b);
        }
This operation caches for 3 combinations of classes that might occur at this place in the guest language. To avoid generating code for all possible combinations of types the node rewrites to the doGeneric specialization as soon as the limit of 3 instantiations of the doCached specialization is reached (specified by contains="doCached" and limit ="3"). Please also see the javadoc of @Cached for further details.
I'd like to add that you can also use these kind of type caches in a generic way by passing in a node that implements your operation from the outside or alternatively create it dynamically using the @Cached annotation again. 
Also all logic of the list operations like concatenation and insertion is implemented in the BasicList class that implements the List interface, and the addition node simply calls 
the concat method of the list interface to do the work for it. But Chris Seaton advised me to not implement logic in the type classes like I did now but in the node classes 
where the operation is specialized (see http://mail.openjdk.java.net/pipermail/graal-dev/2015-April/003015.html ). If I apply his advise I think I should limit the List interface 
to getting and putting objects and use those methods to implement the concat and other operations in the node classes directly? 
The main problem with implementing everything in the BasicList class is that you can only store runtime feedback in nodes. Storing runtime feedback (like type profiles or caches) is an essential capability when dealing with complex operations. You have two options here. Either you always pass the runtime feedback into your list methods or you implement your operations inside of nodes. The first option has the advantage that you can implement your operation in the implementation class and hide the details from your nodes (although the runtime profile you have to pass could be seen as an implementation detail). I would recommend this option for the basic primitive operations of your list, those which are essential to hide the inner workings of your datastructure like random access set or get. The second option gives you the full power to express specializations with Truffle.  Be careful not to duplicate your logic over multiple nodes and reuse your Nodes as much as possible just like methods.
Hope this helps.

- Christian Humer
On 08.04.2015 22:05:14, Renze Torensma <renzetorensma at gmail.com> wrote:
Hi Christian,

I created a basic List interface so I can swap the current basic implementation for a more optimized one later on without changing the type system and execute functions.
Is this considered good practice in Truffle or should I just use the implementation class? And you noted that Truffle inlines everything until it sees a virtual call, so if I understand correctly using the List interface in the type system creates virtual calls when methods on those List interface are called? For example, in “@Specialization concat(List a, List b) { a.add(b); }" a and b would actually be of class BasicList but specified with interface List, so the add method would be a virtual call and stop inlining even though it should be inlined?

Also all logic of the list operations like concatenation and insertion is implemented in the BasicList class that implements the List interface, and the addition node simply calls
the concat method of the list interface to do the work for it. But Chris Seaton advised me to not implement logic in the type classes like I did now but in the node classes
where the operation is specialized (see http://mail.openjdk.java.net/pipermail/graal-dev/2015-April/003015.html ). If I apply his advise I think I should limit the List interface
to getting and putting objects and use those methods to implement the concat and other operations in the node classes directly?

Sorry for the long post,
Renze

> On 7 apr. 2015, at 23:46, Renze Torensma wrote:
>
> Thanks for your suggestions! Now I know why no current implementation uses the java.util.List subclasses :)
>
> I will have a look at the implementations of jRuby and ZipPy, and I will start with a minimal version of my own List class.
>
> Renze
>> On 7 apr. 2015, at 20:17, Stefan Marr wrote:
>>
>> Hi Renze:
>>
>>> On 07 Apr 2015, at 20:09, Christian Humer wrote:
>>>
>>> You can get some inspiration on the array implementation of JRuby:
>>> https://github.com/jruby/jruby/blob/master/truffle/src/main/java/org/jruby/truffle/runtime/core/RubyArray.java
>>
>> Also interesting might be ZipPy’s solution:
>>
>> https://bitbucket.org/ssllab/zippy/src/03c969e32efd9dd6f2949f71b5a8ecd79b962741/edu.uci.python.runtime/src/edu/uci/python/runtime/sequence/?at=default
>>
>> Best regards
>> Stefan
>>
>> --
>> Stefan Marr
>> INRIA Lille - Nord Europe
>> http://stefan-marr.de/research/
>>
>>
>>
>