RFR(L): 8003854: PPC64 (part 115): Introduce lateExpand that expands nodes after register allocation.
Lindenmaier, Goetz
goetz.lindenmaier at sap.com
Thu Nov 14 00:15:45 PST 2013
Hi Vladimir,
sorry for that, I fixed it. Looks like I didn't pop all my ppc changes.
Best regards,
Goetz.
-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
Sent: Donnerstag, 14. November 2013 06:27
To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net
Subject: Re: RFR(L): 8003854: PPC64 (part 115): Introduce lateExpand that expands nodes after register allocation.
I looked on files from previous webrev and you addressed all my comments. But I need clean webrev to approve it.
Thanks,
Vladimir
On 11/13/13 7:39 PM, Vladimir Kozlov wrote:
> This webrev contains other changes in src/cpu/ppc/vm. Was it intentional?
>
> Thanks,
> Vladimir
>
> On 11/13/13 3:37 PM, Lindenmaier, Goetz wrote:
>> Hi Vladimir,
>>
>> Thanks for the thorough review!!
>> I made a new webrev:
>> http://cr.openjdk.java.net/~goetz/webrevs/8003854-lateExp-2/
>>
>>> It is good feature.
>> Thanks :)
>>> Can you name it postaloc_expand to be clear when we do it?
>> Fixed.
>>
>>> Names. I know it is picking but for methods and .ad statements names we
>>> use low case and underscore. Camel style is used for class names.
>> No matter, changing names isn't a big deal. I hope I've got all.
>>
>>> I know this was written long time ago but we are trying now to not use
>>> separate enc_class unless it is used by several instructions. Can you
>>> also implement ability to write postaloc_expand inside mach instruction
>>> (As we do now for ins_encode)? You can do it as separate changes later
>>> if it a lot of work.
>> Done. Was really simple ;) I'll adapt our ad file eventually, but that contains
>> 370 ins_encode and 460 instruct declarations, so it will take some time.
>>
>>> Are you allow branches in late expand? I assume not because you don't
>>> have separate graph's blocks for that. It would be nice to have some
>>> verification code which enforces restrictions.
>> No, no branches. That would need a very different approach, as the cfg
>> would have to be changed. We have some conditional moves with branches
>> that would be much better represented if blocks were possible. But we did not implement
>> that as it seems far too much effort. Anyways, on ia64 we have predicated
>> instructions, so there we don't need branches. On Power, starting with
>> Power7, there is a conditional move.
>>
>>> Also it would be nice if in lateExpand rule you need only define nodes
>>> and connect them and the rest (opnds clonning, RA info patching and
>>> pushing result nodes) be done by adlc. But it will require more code in
>>> adlc so it may be for the future. I am worried that current code
>>> requires very careful coding: which operand to clone and to which assign
>>> it, which RA methods to use for resetting (set1(), set2(), set_pair()).
>>> It is very bugs prone.
>> Yes, it is a bit complicated. Basically, adlc could generate constructors
>> for nodes that generate the operands. At least if the node clearly specifies
>> it's operands. If there are operand classes that's ambiguous.
>> I would at least like some checker code like machNode->check_operands
>> that checks that all operands are there and that they have the proper
>> type.
>> The advantage is that it's more flexible than the expand mechanism.
>> For example there are no C-if's possible in expands. Many other things
>> are restricted, like generating nodes that require TEMPs. All these
>> features could be added to expand, but that's always a lot of overhead
>> if you just want to generate some instructions.
>>
>>> Will it be difficult to totally separate lateExpands from ins_encode?
>>> Your current code use _insencode + _is_lateExpand. Can you do one
>>> InsEncode* _lateExpand? Or it will be too intrusive?
>> I will just have to copy lots of code. Also, for the postaloc_expand %{ %}
>> support I just had to call the proper ins_encode function. I don't think
>> replicating all that code makes sense.
>>
>>> block.cpp:
>>> Node_List already has method contains(Node* n), use it in
>>> Block::contains() (you can define it in header file then).
>> Done. Also added const to Node_List::contains().
>>
>> PhaseCFG::LateExpand():
>>
>>> TraceLateExpand is develop flag so it will be true only for #ifdef
>>> ASSERT. It seems all #ifndef PRODUCT should be replaced by #ifdef ASSERT
>>> in this method.
>> Done.
>>
>>> "These kills are not required any more after expanding" - what about
>>> flags kill Proj nodes? Disconnecting input nodes worries me. Do you
>>> clean graph after that to remove unused nodes?
>> Yes, the graph is clean afterwards. Many kills/temps are explicit register
>> deps after expanding. Those that should be kept have to be added in the
>> postaloc_expand method of the node. If that method wants to reuse a Proj, it can
>> add it to the node list returned. Then it will be first removed, and
>> then added again. This is necessary, as the position in the blocks
>> node list must be correct.
>>
>>> I need to look more on LateExpand() in next round.
>>
>>> compile.cpp: Can you add TracePhase/TraceTime to get time spent in
>>> LateExpand.
>> Done. I used TracePhase.
>>
>>> Why you need lateExpand() method in Node class? Generated mach nodes are
>>> based on MachNode so this method could be declared only in MachNode.
>>> (Putting new virtual method into top Node class will increase virtual
>>> tables for all ideal and mach nodes).
>> We put it just right next to format() and emit(). Moved to machnode.
>>
>> Best regards,
>> Goetz.
>>
>>
>> Thanks,
>> Vladimir
>>
>> On 11/12/13 8:32 AM, Lindenmaier, Goetz wrote:
>>> Hi,
>>>
>>> I also updated this webrev to work with the latest staging repository.
>>> http://cr.openjdk.java.net/~goetz/webrevs/8003854-lateExp/
>>>
>>> Best regards,
>>> Goetz.
>>>
>>> From: goetz.lindenmaier at sap.com
>>> Sent: Mittwoch, 2. Oktober 2013 15:49
>>> To: hotspot-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net; 'Vladimir Kozlov'
>>> Subject: RFR(L): 8003854: PPC64 (part 115): Introduce lateExpand that expands nodes after register allocation.
>>>
>>> Hi,
>>>
>>> we extended C2 by what we call lateExpand. LateExpand expands nodes
>>> after register allocation.
>>> http://cr.openjdk.java.net/~goetz/webrevs/8003854-lateExp/
>>>
>>> Some nodes can not be expanded during matching. E.g., register
>>> allocation might not be able to deal with the resulting pattern. To
>>> allow better scheduling in such cases, we introduce lateExpand which
>>> runs after register allocation. Whether and how nodes are expanded is
>>> specified in the ad-file. See block.cpp for a detailed
>>> documentation. We use this for some nodes on ppc, and extensively on
>>> ia64.
>>> For an example, see the webrev.
>>>
>>> We added some utility functions in node.hpp and block.hpp used by
>>> PhaseCFG::LateExpand() or the Node::lateExpand functions. Also,
>>> MachConstantBaseNode gets the two new functions, as we need to
>>> late expand it.
>>>
>>> To skip the walk over the IR if no lateExpand is needed, we use
>>> Matcher::require_late_expand set in the ad file. This ends up in
>>> ad_<cpu>.cpp, thus can not be evaluated by the C++ compiler. If you
>>> agree, I would use a "const bool EnableLateExpand" in
>>> globalDefinitions_<cpu>.hpp. (There is no suitable c2_<xxx>_<cpu>.hpp
>>> file).
>>>
>>> We allready mailed a webrev with this change after last year's
>>> JavaOne, but there was no discussion on it.
>>> Again, this change is 'L', but the code is not used on other
>>> platforms than PPC64 (so far). So the impact on those platforms
>>> should be minimal.
>>>
>>> Please review and test this change.
>>>
>>> Thanks and best regards,
>>> Goetz.
>>>
More information about the ppc-aix-port-dev
mailing list