RFR(M): 7197327: 40% regression on 8 b41 comp 8 b40 on specjvm2008.mpegaudio on oob
Vladimir Kozlov
vladimir.kozlov at oracle.com
Mon Jan 21 15:33:24 PST 2013
On 1/21/13 2:53 PM, Vladimir Kozlov wrote:
> Roland,
>
> What is effect on refworkload?
>
> I would also use it for trigonometric and math (sqrt, log) intrinsics
> (as separate RFE after checking performance effect).
>
> I like you factoring of sorting code in sources.
>
> Why ciConstantPoolCache::_keys is array of intptr_t and not int type
> elements? We'r wasting space in 64bit VM and do conversion each time we
> access elements.
>
> In growableArray.hpp call f(value, key) only once.
>
> In loopnode.cpp you miss 'return;' in second "if (stop_early)".
>
> It is very expensive to do build_loop_* when stop_early is true. Could
> you check before that if you have Expensive nodes with the same data
> inputs instead of just number of such nodes? Your list sorting should be
> cheaper. May be instead of flat list you should create list of pairs
> which have same data inputs, it will simplify processing code.
And you should remove control edge if there are no pairs after
incremental inlining finished (no new Expensive node added).
>
> Anyway, I think new process_expensive_nodes() method is too complex for
> cases which are very very rare. I think you should narrow cases for this
> optimization:
Also finding best control (move from loop) for Expensive nodes could be
done in build_loop_early() method if data inputs dominate loop and the
loop head is the control edge of Expensive node.
Thanks,
Vladimir
>
> c1 = get_control(n1);
> c2 = get_control(n2);
>
> if (is_dominator(c1, c2)) {
> c2 = c1;
> } else if (is_dominator(c2, c1)) {
> c1 = c2;
> } else if (c1->is_Proj() && c1->in(0)->is_If() &&
> is_dominator(c1->in(0), c2)) {
> c1 = c2 = c1->in(0);
> } else if (c2->is_Proj() && c2->in(0)->is_If() &&
> is_dominator(c2->in(0), c1)) {
> c1 = c2 = c2->in(0);
> }
> if (n1->in(0) != c1) {
> n1->set_req(0, c1);
> }
> if (n2->in(0) != c2) {
> n2->set_req(0, c2);
> }
>
> Note, when you skip UNC you don't check min_dom_depth but some data
> inputs may depend on this UNC. So it may be not safe.
>
> Thanks,
> Vladimir
>
> On 1/21/13 2:59 AM, Roland Westrelin wrote:
>> One of the method has the following structure:
>>
>> for ( .. ) {
>> ..
>> if ( ..) {
>> ..
>> } else {
>> if ( .. ) {
>> if ( .. ) {
>> ..
>> } else {
>> if ( .. ) {
>> ..
>> } else {
>> Math.pow(x, y);
>> }
>> }
>> } else {
>> if ( .. ) {
>> ..
>> } else {
>> Math.pow(x, y);
>> }
>> }
>> ..
>> }
>>
>> Both Math.pow(x,y) have the same inputs and so a single PowDNode is
>> kept and it's scheduled in the else of the outer most if. So the pow
>> computation is executed independently of the other if conditions, more
>> frequently and because the computation is expensive there's a
>> noticeable performance regression.
>>
>> The fix consists in:
>>
>> 1) setting the control input of the expensive nodes (PowDNode,
>> ExpDNode) to prevent IGVN to freely common nodes
>> 2) During the loop optimization pass, consider each expensive node
>> and, when possible, modify the control input to allow optimization by
>> the IGVN while making sure it's not executed more frequently
>>
>> http://cr.openjdk.java.net/~roland/7197327/
>>
>> To test it, given it's quite rare to have 2 PowDNodes with the same
>> inputs, I applied the same technique to a bunch of other nodes:
>>
>> http://cr.openjdk.java.net/~roland/7197327/webrev.test/
>>
>> Roland.
>>
More information about the hotspot-dev
mailing list