Truffle DSL feature suggestions

Mon Nov 11 15:04:14 PST 2013

Stefan,

I've updated the SimpleLanguage [1]. It now contains a basic inline cache
similar to the one described in my previous mail.

[1] http://hg.openjdk.java.net/graal/graal/rev/71991b7a0f14

- Christian Humer

On Fri, Nov 8, 2013 at 5:48 PM, Christian Humer
<christian.humer at gmail.com>wrote:

> Hi Stefan,
>
> Seems that your making good progress!
>
> Support for inheriting @NodeChild
>> ————————————————————————————————-
>> The next thing I noticed is that @NodeChild annotation are not inherited
>> when the subclass defines additional @NodeChilds. While it is not a big
>> issue, it feel counter intuitive when I noticed it for the first time. My
>> expectation was that these annotations have an additive semantics.
>> As an example, I have a AbstractMonomorphicMessageNode [3] with the
>> @NodeChild(value = “receiver”, ExpN.class) annotation, and in the subclass
>> BinaryMonomorphicNode [4] I need to specify the receiver again, because I
>> want to add an argument NodeChild.
>
>
> @NodeChild elements should be inherited already. Unfortunately there is a
> bug when inheriting more than one @NodeChild defined int he base class and
> there is another @NodeChild annotation defined in the inheriting class, the
> second @NodeChild of the base class is not inherited. Seems that nobody
> used this annotation in this way so far ;-). Will fix this and keep you
> posted.
>
> However, what already works is using a variable number of arguments
> @NodeChild in the base class like this:
>
>     @NodeChildren({@NodeChild(value = "receiver", type = ValueNode.class),
> @NodeChild(value = "arguments", type = ValueNode[].class)})
>     abstract static class BaseNode extends ValueNode {
>         public abstract ValueNode getReceiver();
>         public abstract ValueNode[] getArguments();
>     }
>
>     abstract static class UnaryNode extends BaseNode {
>         @Specialization
>         public int doIt(Object receiver, int value0) {
>             return value;
>         }
>     }
>
>     abstract static class BinaryNode extends BaseNode {
>         @Specialization
>         public int doIt(Object receiver, int value0, int value1) {
>             return value0 + value1;
>         }
>     }
>
>     abstract static class TernaryNode extends BaseNode {
>         @Specialization
>         public int doIt(Object receiver, int value0, int value1, int
> value2) {
>             return value0 + value1 + value2;
>         }
>     }
>
> Also take a look at the generated code. There you can see that instead of
> accessing a field per child the generated source now accesses the children
> in an array using fixed indexes. So for the third value it accesses
> arguments[1].execute() to execute the value. Please note that all
> specializations in one operation must have the same number of arguments.
> Using the NodeFactory#getExecutionSignature() method you can find out the
> actual number of arguments that were used dynamically. So you would be able
> to write one generic rewrite mechanism.
>
>
> Allowing @Generic for Simple Nodes
>
> —————————————————————————————————-
>> In my new node hierarchy, I use UnaryMessageNode (..Binary/Ternary/KeywordMsgNode)
>> [2] to represent the initial state of message send as generated by the
>> parser. So, the main idea is to avoid specialization before execution
>> starts. The parser could do some simple specializations already purely
>> based on the static name of a message. However, I decided against doing
>> that, following what I understand to be one of the main suggestions for
>> Truffle.
>> As a result I have the UnaryMessageNode (Binary/Ternary/Keyword) class
>> just implementing a doGeneric(.) method, which replace the node by a
>> UnaryMonomorphicNode. For the doGeneric(.) method, I feel that @Generic
>> would be the better annotation, and putting @Specialization there, because
>> it otherwise wouldn’t have a specialization, feels a little unnatural. The
>> @Generic annotation also has another benefit, because it enforces certain
>> constraints on the method signature, which is useful to make the type
>> system happy.
>
>
> I see your point. Just @Generic should be possible. I've put it on my list.
>
> I do not recommend doing manual replaces inside of @Specialization or
> @Generic annotated methods at all. Think of the use-case of polymorphic
> chains where a call to replace would just replace one entry of the
> polymorphic chain. This would lead to a wrong tree. If you want do do that
> please turn off the polymorphic chaining by using @PolymorphicDepth(0). But
> I don't think you will need that if you do it the way I describe it below.
>
> Polymorphic Inline Caching
>> —————————————————————————-
>
> First thanks for your input on the new DSL feature.  I hope we will be
> able to implement it soon.
>
> Let me explain to you how polymorphic inline caches are currently
> implemented in most of the truffle languages by referencing what i think
> you want to achieve.
>
> final class Message {
>   String name;
>   ...
>   CallTarget target;
>   ...
> }
>
> Each available message is represented as such an instance. A CallTarget
> for the message can be created by a call to
> Truffle.getRuntime().createCallTarget(...). These Message instances are
> usually stored in a global map or a similar concept.
>
> For a call you start off with one uninitialized call/message node which
> has a child for receiving the function and one for the receiver.
>
> class UninitializedSendNode {
>  message = ... // returns a Message (may do a lookup in a local variable
> or global map)
> receiver = ... // returns the receiver object
>  uninitializedArguments[] = ... // the uninitialized arguments of the
> call as they are parsed
> }
>
> When the first call to the execute method in UninitializedMessageNode
> occurs the uninitialized node creates a cache like this:
>
> class CachedSendNode {
> final CallTarget cachedTarget; // the call target this node is specialized
> on
> receiver, message;
>  uninitializedArguments[];
> current = DispatchedSendNode // Implementation of the call (may get
> inlined later or immediatly)
>  arguments // copied from uninitializedArguments
> next = UninitializedMessageNode or again CachedSendNode
>  // how an implementation for that node could look like:
> Object execute(VirtualFrame frame) {
>  Object receiver = receiver.execute(frame);   // we need to ensure that
> the receiver is just evaluated once per call
>  Message m = message.execute(frame);
>  if (this.cachedTarget == m.getCallTargetTarget()) {
> return current.execute(frame, receiver);
>  }
> return next.execute(frame, receiver);
> }
> }
>
> This CachedSendNode can be chained as a cache using the next field. After
> reaching some depth the chain should rewrite itself to a
> GenericSendNode which then does not store the callTarget as a final field
> but always dispatches the call to the callTarget (no inlining possible
> anymore).
>
> Each DispatchedSendNode knows about their callTarget as a final field and
> just calls the constant/final CallTarget in the default case.
> After some inlining policy (which could be immediately, later or not at
> all) the DispatchedSendNode gets replaced by a compatible node which
> represents the CallTarget. This can usually be achieved by casting the
> CallTarget to DefaultCallTarget and looking at its RootNode to copy it.
>
> I think the described pattern should work for you and it has been proven
> to work for our implementations.
> I will try to put together an example using the SimpleLanguage next week.
>
> Have a nice weekend!
>
> Christian Humer
>
>
> On Thu, Nov 7, 2013 at 5:30 PM, Stefan Marr <java at stefan-marr.de> wrote:
>
>> Hi Christian,
>> Hi all:
>>
>> On 07 Nov 2013, at 14:27, Christian Humer <christian.humer at gmail.com>
>> wrote:
>>
>> > Unfortunately it is not and I also failed myself to abuse it in that
>> way. There are some features (eg. caching of values) missing to declare
>> polymorphic inline caches using the DSL. At the moment all polymorphic
>> inline caches are implemented manually using Truffle API only. We are still
>> in phase of collecting required use-cases to compile a list of required DSL
>> features to support it. There are some concepts like resorting of cache
>> entries or generic cache entry promotion (not completely go to generic but
>> keep some hot cache entries) which would also be quite difficult to
>> support. But we would be happy if we could use your implementation as an
>> additional use-case for DSL support.
>>
>> I am just about to finish a rather large rewrite of TruffleSOM [1].
>> And for that it would be very useful if I could coerce the DSL into
>> generating multiple cases out of a single specialization in order to handle
>> polymorphic message sends.
>>
>> Overall, I think, I have three suggestions based on the TruffleSOM
>> rewrite.
>> I’ll try to detail them below to outline the use case and my
>> understanding. Perhaps somethings are by design and should be done
>> differently.
>>
>> The three things I would like to suggest are:
>>
>>  - Allowing @Generic for simple non-specializing nodes, as a base case
>> (uninitialized state) for custom specialization hierarchies
>>  - Support for inheriting @NodeChilds
>>  - Support for polymorphic inline caching, perhaps based on chaining
>> multiple instances of the same specialization.
>>
>>
>> Allowing @Generic for Simple Nodes
>> —————————————————————————————————-
>>
>> In my new node hierarchy, I use UnaryMessageNode
>> (..Binary/Ternary/KeywordMsgNode) [2] to represent the initial state of
>> message send as generated by the parser. So, the main idea is to avoid
>> specialization before execution starts. The parser could do some simple
>> specializations already purely based on the static name of a message.
>> However, I decided against doing that, following what I understand to be
>> one of the main suggestions for Truffle.
>>
>> As a result I have the UnaryMessageNode (Binary/Ternary/Keyword) class
>> just implementing a doGeneric(.) method, which replace the node by a
>> UnaryMonomorphicNode. For the doGeneric(.) method, I feel that @Generic
>> would be the better annotation, and putting @Specialization there, because
>> it otherwise wouldn’t have a specialization, feels a little unnatural. The
>> @Generic annotation also has another benefit, because it enforces certain
>> constraints on the method signature, which is useful to make the type
>> system happy.
>>
>>
>> Support for inheriting @NodeChild
>> ————————————————————————————————-
>>
>> The next thing I noticed is that @NodeChild annotation are not inherited
>> when the subclass defines additional @NodeChilds. While it is not a big
>> issue, it feel counter intuitive when I noticed it for the first time. My
>> expectation was that these annotations have an additive semantics.
>>
>> As an example, I have a AbstractMonomorphicMessageNode [3] with the
>> @NodeChild(value = “receiver”, ExpN.class) annotation, and in the subclass
>> BinaryMonomorphicNode [4] I need to specify the receiver again, because I
>> want to add an argument NodeChild.
>>
>>
>> Polymorphic Inline Caching
>> —————————————————————————-
>>
>> I assume that you got already a number of specific use case form other
>> Truffle languages, so I will try to outline how I would imagine potential
>> support for PICs in the DSL based on what I see in TruffleSOM.
>>
>> One thing upfront: I decided that it want to try to conflate the notion
>> of monomorphic message sends and primitive implementations.
>> Thus, you will currently find the following hierarchy of nodes in
>> TruffleSOM (still need to figure out whether that’s actually going to work):
>>
>> ExpressionNode
>> +- AbstractMessageNode
>> |  +- AbstractMonomorphicMessageNode
>> |     +- BinaryMonomorphicNode
>> |     |  +- AdditionPrimitive
>> |     |  +- MultiplicationPrimitive
>> |     +- UnaryMonomorphicNode
>> |        +- PrintPrimitive
>> |        +- ...
>> +- BinaryMessageNode
>> +- UnaryMessageNode
>> +- ...
>>
>> The reason why I thought that might be a good idea is to essential see
>> every specialization for the primitive nodes as one variant of a
>> potentially polymorphic message send.
>>
>> Let’s assume we have a #print primitive implemented for strings, and
>> normal #print messages for other types of object, implementing something
>> custom to first convert the value into a string and then printing it.
>>
>> So, a nice polymorphic example would be something like the following loop
>> over an array: `{1. ’two’. #three. 4.0. FiveClass} do: [:elem | elem
>> print]`.
>>
>> The print primitive could than be implemented like this:
>>
>>     public abstract class PrintStringPrim extends UnaryMonomorphicNode {
>>       public PrintStringPrim(final SSymbol selector, final Universe
>> universe, final SClass rcvrClass, final SMethod invokable) {
>>                              super(selector, universe, rcvrClass,
>> invokable); }
>>       public PrintStringPrim(final PrintStringPrim prim) {
>> this(prim.selector, prim.universe, prim.rcvrClass, prim.invokable); }
>>
>>       @Specialization
>>       public Object doSString(final SString receiver) {
>>         Universe.print(receiver.getEmbeddedString());
>>         return receiver;
>>       }
>>     }
>>
>> And the UnaryMonomorphicNode implements a simple monomorphic method cache:
>>
>>     public abstract class UnaryMonomorphicNode extends
>> AbstractMonomorphicMessageNode {
>>
>>       @Specialization(guards = "isCachedReceiverClass", order =
>> PriorityMonomorphicCase)
>>       public SAbstractObject doMonomorphic(final VirtualFrame frame,
>> final SAbstractObject receiver) {
>>         return invokable.invoke(frame.pack(), receiver, noArgs);
>>       }
>>     }
>>
>> So, my understanding is now that a normal node that has multiple
>> specializations, perhaps based on types, can become ’node-polymorphic’.
>> Thus, multiple specializations get instantiated and instead of replacing
>> a previous specialization, they get chained one after another.
>> And then, during execution, you walk the chain to find the one where the
>> guards do not fail.
>> I understand now that this chain can build up because there are multiple
>> cases that can be easily identified based on different guards/tests.
>> However, for the doMonomorphic(.) method, it is not multiple tests, it is
>> not multiple specializations.
>> Instead, the value on which a test is based differs.
>>
>> So, what I would find useful is if I could tell the DSL to multiply my
>> specialization doMonomorphic(.) up to a given number of times, and then, I
>> think for the instantiation, I would also need to give it a function that
>> allows it to derive the necessary state for such ‘value-based’
>> specializations.
>>
>> Perhaps something like this:
>>
>>     @Specialization(guards = “isCachedReceiverClass”, order =
>> PriorityMonomorphicCase,
>>                     multiplicity = 8, specialization = {rcvrClass =
>> “classOfReceiver”, invokable = “lookUp” } )
>>     public SAbstractObject doMonomorphic(final VirtualFrame frame, final
>> SAbstractObject receiver) { /* ... /* }
>>
>>     public SClass classOfReceiver(final SAbstractObject receiver) {
>>       return classOfReceiver(receiver, getReceiver());
>>     }
>>
>>     public SMethod lookUp(final SAbstractObject receiver) {
>>       return classOfReceiver(receiver).lookupInvokable(selector);
>>     }
>>
>> I think, this might give me enough flexibility to express value-based
>> specialization for polymorphic messages.
>>
>> When it comes to cleaning up or optimizing such polymorphic-node chains,
>> well, I don’t know enough how it would behave yet. So, no ideas there so
>> far.
>>
>> Would be interested to hear whether any of that sounds like it fits into
>> how you see Truffle being used.
>>
>> Thanks
>> Stefan
>>
>>
>>
>> [1]
>> https://github.com/smarr/TruffleSOM/commit/482ccaa88306a1e7a989afdfd91e4121c5be163a
>> [2]
>> https://github.com/smarr/TruffleSOM/blob/482ccaa88306a1e7a989afdfd91e4121c5be163a/src/som/interpreter/nodes/UnaryMessageNode.java#L30
>> [3]
>> https://github.com/smarr/TruffleSOM/blob/482ccaa88306a1e7a989afdfd91e4121c5be163a/src/som/interpreter/nodes/messages/AbstractMonomorphicMessageNode.java#L18
>> [4]
>> https://github.com/smarr/TruffleSOM/blob/482ccaa88306a1e7a989afdfd91e4121c5be163a/src/som/interpreter/nodes/messages/BinaryMonomorphicNode.java#L21
>> --
>> Stefan Marr
>> Software Languages Lab
>> Vrije Universiteit Brussel
>> Pleinlaan 2 / B-1050 Brussels / Belgium
>> http://soft.vub.ac.be/~smarr
>> Phone: +32 2 629 2974
>> Fax:   +32 2 629 3525
>>
>>
>