Tradeoffs around Node::Opcode()?

Tue Mar 10 17:53:59 PDT 2009

Peter,

if you wish to add an additional field into Node you may instead
extend _class_id field to 4 bytes to cover all classes and use it

(n->class_id() == Node::Class_SafePoint)

instead of

(n->Opcode() == Op_SafePoint)

Sorry, I forgot to mention that class_id can work as Opcode() also.
The problems with this are, first, maintenance - you need to add
several lines in node.hpp instead of one line in classes.hpp for
each node, and, second, 32 bits could be not enough for all classes.

I agree that it would be nice to remove virtual Opcode() method.

Peter B. Kessler wrote:
> Thanks for the answers.
> 
> I didn't understand why you say that a subclass would overwrite an 
> _opcode field.  If each AddI constructor initializes my proposed 
> (const?) _opcode field in Node to Op_AddI, and if AddF initializes the 
> _opcode field to Op_AddF, then who would overwrite it?  Obviously node 
> instances can't change their opcodes, or we wouldn't be able to use a 
> virtual function for Opcode() now, unless we change the vtable for 
> instances.

I meant super class and subclass, for example, Op_ConI will overwrite Op_Con:

class ConNode : public TypeNode {
public:
   ConNode( const Type *t ) : TypeNode(t,1) {
     init_req(0, (Node*)Compile::current()->root());
     init_flags(Flag_is_Con);
     _opcode = Op_Con;
   }
   virtual int  Opcode() const;

...

class ConINode : public ConNode {
public:
   ConINode( const TypeInt *t ) : ConNode(t) {
     _opcode = Op_ConI;
   }
   virtual int Opcode() const;

> 
> Then there could be a map from Opcodes to what's now in _class_id and 
> _flags.  E.g., there would be a _class_id table someplace whose Op_AddI 
> entry would contain Class_Add.  As would the Op_AddF entry.  But the 
> Op_AddP index would would contain Class_AddP.  Do the _flags fields 

This approach requires 2 loads instead of current one to read _class_id.

> change over the life of a Node instance?  Or could those also be in a 

No, _flags field does not change over time.

Vladimir

> map indexed by Opcode?  (Having separate maps from Opcode to other 
> properties of the Node means that information about the Node subclass 
> isn't visible in text of the subclass constructor.  Sigh.)
> 
> I hear from Chris Vick that making the edges between the nodes 
> bi-directional, which increased the size of the Node instances, didn't 
> have any noticeable effect on the space used collectively by the nodes 
> during a compilation, and didn't have any effect on the speed of the 
> compiler, except maybe to speed it up because traversals that used to be 
> hard with one-way edges became easier with bi-directional edges.
> 
>             ... peter
> 
> Vladimir Kozlov wrote:
>> Yes, it is history :)
>>
>> The _opcode field will not work since subclasses will overwrite it.
>>
>> The _class_id field is not direct map to Opcode, it is bitmap:
>> n->is_SafePoint() will be true for all ideal Call and SafePoint nodes,
>> but n->Opcode() == Op_SafePoint only true for SafePointNode.
>> So we need both.
>>
>> And yes, I want to fit in a jushort :-)
>> When I did implementation of _class_id I was also concern about
>> increasing Node size significantly so I decided to add only 4 bytes:
>> 2 for class_id and 2 for flags.
>>
>> I also did profiling of Opcode() usage. So only hottest Opcode()
>> calls were replaced with class_id code. Adding other nodes did not
>> improve performance.
>>
>> Vladimir
>>
>> Peter B. Kessler wrote:
>>> Cool!  Thanks.  So much for engineering: we already have *both* the 
>>> virtual dispatch *and* the data slot per instance.
>>>
>>> Oh, but those are only the class id's (e.g., Add), rather than the 
>>> opcode (e.g., AddI).  It's confusing that they are both referred to 
>>> as Node "classes", e.g., in classes.hpp, opcodes.cpp, etc.
>>>
>>> So I guess that begs the question: why *isn't* there an _opcode 
>>> field, initialized in the constructors, like (or instead of) 
>>> _class_id?  Why aren't the leaf classes (AddI, etc.) in the _class_id 
>>> maps?  (Other than that the _class_id maps want to fit in a jushort. 
>>> :-)  Couldn't one map (e.g., with a table) from an opcode to a 
>>> _class_id, if one wanted to spend the time as opposed to the space?  
>>> Would an inline method to access a _class_id table be faster than a 
>>> virtual dispatch to Opcode()?
>>>
>>> A possible answer as to why it is the way it is: "History".
>>>
>>> I'm still trying to understand the tradeoffs between space and time.
>>>
>>>             ... peter
>>>
>>> Vladimir Kozlov wrote:
>>>> Peter,
>>>>
>>>> There is already such functionality exactly for such purpose.
>>>> The field is jushort _class_id; which use enum NodeClasses
>>>> and initializer in the constructors init_class_id().
>>>>
>>>> Vladimir
>>>>
>>>> Peter B. Kessler wrote:
>>>>> A lot of things in class Node are organized around having a dense 
>>>>> integer range to identify the Node.  That's Node::Opcode().
>>>>>
>>>>> I'm wondering about the engineering around that method.  
>>>>> Node::Opcode() is a virtual call, to get from a Node instance to a 
>>>>> piece of data on the Node subclass of the instance.  An alternative 
>>>>> would be to add a data field in Node to hold the opcode, and 
>>>>> replace the virtual Node::Opcode() calls with an inline method to 
>>>>> return the contents of that slot.  That trades the virtual dispatch 
>>>>> cost for each call with a space cost in each instance.
>>>>>
>>>>> Does anyone have any feeling about whether that's a good tradeoff?  
>>>>> How often is Node::Opcode() called?  What effect would it have to 
>>>>> make it run faster?  How important is it to minimize the size of 
>>>>> Node instances?  (There are other places where effort is expended 
>>>>> to keep Node's small.)
>>>>>
>>>>> Thanks for any opinions on this.
>>>>>
>>>>>             ... peter
>>>
>