RFR(L): 8003854: PPC64 (part 115): Introduce lateExpand that expands nodes after register allocation.
Vladimir Kozlov
vladimir.kozlov at oracle.com
Wed Nov 13 19:39:46 PST 2013
This webrev contains other changes in src/cpu/ppc/vm. Was it intentional?
Thanks,
Vladimir
On 11/13/13 3:37 PM, Lindenmaier, Goetz wrote:
> Hi Vladimir,
>
> Thanks for the thorough review!!
> I made a new webrev:
> http://cr.openjdk.java.net/~goetz/webrevs/8003854-lateExp-2/
>
>> It is good feature.
> Thanks :)
>> Can you name it postaloc_expand to be clear when we do it?
> Fixed.
>
>> Names. I know it is picking but for methods and .ad statements names we
>> use low case and underscore. Camel style is used for class names.
> No matter, changing names isn't a big deal. I hope I've got all.
>
>> I know this was written long time ago but we are trying now to not use
>> separate enc_class unless it is used by several instructions. Can you
>> also implement ability to write postaloc_expand inside mach instruction
>> (As we do now for ins_encode)? You can do it as separate changes later
>> if it a lot of work.
> Done. Was really simple ;) I'll adapt our ad file eventually, but that contains
> 370 ins_encode and 460 instruct declarations, so it will take some time.
>
>> Are you allow branches in late expand? I assume not because you don't
>> have separate graph's blocks for that. It would be nice to have some
>> verification code which enforces restrictions.
> No, no branches. That would need a very different approach, as the cfg
> would have to be changed. We have some conditional moves with branches
> that would be much better represented if blocks were possible. But we did not implement
> that as it seems far too much effort. Anyways, on ia64 we have predicated
> instructions, so there we don't need branches. On Power, starting with
> Power7, there is a conditional move.
>
>> Also it would be nice if in lateExpand rule you need only define nodes
>> and connect them and the rest (opnds clonning, RA info patching and
>> pushing result nodes) be done by adlc. But it will require more code in
>> adlc so it may be for the future. I am worried that current code
>> requires very careful coding: which operand to clone and to which assign
>> it, which RA methods to use for resetting (set1(), set2(), set_pair()).
>> It is very bugs prone.
> Yes, it is a bit complicated. Basically, adlc could generate constructors
> for nodes that generate the operands. At least if the node clearly specifies
> it's operands. If there are operand classes that's ambiguous.
> I would at least like some checker code like machNode->check_operands
> that checks that all operands are there and that they have the proper
> type.
> The advantage is that it's more flexible than the expand mechanism.
> For example there are no C-if's possible in expands. Many other things
> are restricted, like generating nodes that require TEMPs. All these
> features could be added to expand, but that's always a lot of overhead
> if you just want to generate some instructions.
>
>> Will it be difficult to totally separate lateExpands from ins_encode?
>> Your current code use _insencode + _is_lateExpand. Can you do one
>> InsEncode* _lateExpand? Or it will be too intrusive?
> I will just have to copy lots of code. Also, for the postaloc_expand %{ %}
> support I just had to call the proper ins_encode function. I don't think
> replicating all that code makes sense.
>
>> block.cpp:
>> Node_List already has method contains(Node* n), use it in
>> Block::contains() (you can define it in header file then).
> Done. Also added const to Node_List::contains().
>
> PhaseCFG::LateExpand():
>
>> TraceLateExpand is develop flag so it will be true only for #ifdef
>> ASSERT. It seems all #ifndef PRODUCT should be replaced by #ifdef ASSERT
>> in this method.
> Done.
>
>> "These kills are not required any more after expanding" - what about
>> flags kill Proj nodes? Disconnecting input nodes worries me. Do you
>> clean graph after that to remove unused nodes?
> Yes, the graph is clean afterwards. Many kills/temps are explicit register
> deps after expanding. Those that should be kept have to be added in the
> postaloc_expand method of the node. If that method wants to reuse a Proj, it can
> add it to the node list returned. Then it will be first removed, and
> then added again. This is necessary, as the position in the blocks
> node list must be correct.
>
>> I need to look more on LateExpand() in next round.
>
>> compile.cpp: Can you add TracePhase/TraceTime to get time spent in
>> LateExpand.
> Done. I used TracePhase.
>
>> Why you need lateExpand() method in Node class? Generated mach nodes are
>> based on MachNode so this method could be declared only in MachNode.
>> (Putting new virtual method into top Node class will increase virtual
>> tables for all ideal and mach nodes).
> We put it just right next to format() and emit(). Moved to machnode.
>
> Best regards,
> Goetz.
>
>
> Thanks,
> Vladimir
>
> On 11/12/13 8:32 AM, Lindenmaier, Goetz wrote:
>> Hi,
>>
>> I also updated this webrev to work with the latest staging repository.
>> http://cr.openjdk.java.net/~goetz/webrevs/8003854-lateExp/
>>
>> Best regards,
>> Goetz.
>>
>> From: goetz.lindenmaier at sap.com
>> Sent: Mittwoch, 2. Oktober 2013 15:49
>> To: hotspot-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net; 'Vladimir Kozlov'
>> Subject: RFR(L): 8003854: PPC64 (part 115): Introduce lateExpand that expands nodes after register allocation.
>>
>> Hi,
>>
>> we extended C2 by what we call lateExpand. LateExpand expands nodes
>> after register allocation.
>> http://cr.openjdk.java.net/~goetz/webrevs/8003854-lateExp/
>>
>> Some nodes can not be expanded during matching. E.g., register
>> allocation might not be able to deal with the resulting pattern. To
>> allow better scheduling in such cases, we introduce lateExpand which
>> runs after register allocation. Whether and how nodes are expanded is
>> specified in the ad-file. See block.cpp for a detailed
>> documentation. We use this for some nodes on ppc, and extensively on
>> ia64.
>> For an example, see the webrev.
>>
>> We added some utility functions in node.hpp and block.hpp used by
>> PhaseCFG::LateExpand() or the Node::lateExpand functions. Also,
>> MachConstantBaseNode gets the two new functions, as we need to
>> late expand it.
>>
>> To skip the walk over the IR if no lateExpand is needed, we use
>> Matcher::require_late_expand set in the ad file. This ends up in
>> ad_<cpu>.cpp, thus can not be evaluated by the C++ compiler. If you
>> agree, I would use a "const bool EnableLateExpand" in
>> globalDefinitions_<cpu>.hpp. (There is no suitable c2_<xxx>_<cpu>.hpp
>> file).
>>
>> We allready mailed a webrev with this change after last year's
>> JavaOne, but there was no discussion on it.
>> Again, this change is 'L', but the code is not used on other
>> platforms than PPC64 (so far). So the impact on those platforms
>> should be minimal.
>>
>> Please review and test this change.
>>
>> Thanks and best regards,
>> Goetz.
>>
More information about the ppc-aix-port-dev
mailing list