SIGSEGV in C2 compiler

Denis Lila dlila at redhat.com
Mon Feb 7 09:06:33 PST 2011


> Ping me directly if you'd like help restructuring the loop to avoid the
> problem.

Thank you, but I already know how to do that. The reason I worked on this
is because I am interested in compilers, and I would like to work on
hotspot. This seemed like a good place to start.

> I'm sorry I didn't follow up on this with you, I lost track of it over
> the Christmas holiday. This appears to be the same issue as 6675699
> where a ConvI2L with a constrained type on it has it's input replaced
> with a constant outside that range. The identity transformation
> converts that to top which causes a phi collapse and the control flow
> to be killed. We've kind of band aided this in a few other places but
> it appears that loop unswitching can create similar issues.

I've spent a lot of time thinking about this and staring at graphs, and
here's what I think: (now, before this, I had never even read anything about
compilers, let alone worked on one, so please, if anything doesn't make
sense, don't go easy on me ;-) )

(Before I begin, I refer to a lot of node indexes below. Here's the graphs
they were taken from:
http://icedtea.classpath.org/~dlila/ecjGraph.xml. I used
IdealGraphVisualizer from src/share/tools)

That bug's evaluation says "We need to remove the band aid fixes for
4781451 and 4799512 and disallow ConvI2L from being initialized with
a narrower type than bottom". I'm not sure I understand why. I mean,
in this case it made sense to me that node 147 should be initialized
with type 1..maxint because, like the comment where this is done says,
we can simply assume that our input is a valid array index because if
it's not a runtime check will take care of that case and the index won't
be used.

What I think is going wrong is that when we do subsume_node(k, i) (in
phaseX.cpp:1108 - in this case that would be subsume_node(357, 296)) we
are losing information about one of the uses of 296. Before the subsume
the uses of 296 are 357 (the phi) and 295. After the subsume they are
259 and 295. When compute_lca_of_uses(296, 314) is called, Node *c in the
loop becomes 259 in the 2nd iteration, and when we call get_ctrl(259) we
get 226. 226 is dominated by 350, and the idom of 350 is 286 (the if that
is executed before the loops - the one generated by the loop splitting)
<i>because the controls from both split loops to 350 are still alive</i>.
286 dominates the early control of 296 (which is 314), so it thinks this
is a bad graph.

If 357 had not been removed, when c.idx==357 in compute_lca_of_uses(296...),
because 357 is a phi node, we know to "skip" over the region and set the
"use" variable to the control input to the region that corresponds to the
data input in question to the phi. In this case 296 is the data input, the phi
is 357, and the corresponding control input to the region is 349, which is
the control from the split loop where 296 is computed. So "use" would be set
to node 349, and nothing would go wrong because 314 dominates both 349 and 303.

So, what I meant when I said "we are losing information about one of the uses
of 296" is that we are losing the information that, if the value produced by
node 296 is used by node 259, the control path taken _must_ have gone through
349, and it _cannot_ have gone through 217 (which is control path from the i!=8
unswitched loop (the one that never runs)).

It seems to me like this problem isn't specific to ConvI2L, but can happen
any time that a data input to a phi node dies, but the control input to that
phi's region that corresponds to the dead data doesn't die. I'm not sure if
this state of affairs can be caused by anything other than a ConvI2L, but even
so, I think this should get its own fix, because in PhiNode::unique_input, when
we do
    if (un == NULL || un == this || phase->type(un) == Type::TOP) {
      continue; // ignore if top, or in(i) and "this" are in a data cycle
    }
we're not stating an invariant such as "a phi's data input dies if and only if
that data's corresponding control dies". So, as long as it's possible for the
data to die, but that data's control to stay alive, I think this will be a
problem, regardless of ConvI2L's behaviour.

I have two potential fixes in mind:
1. Don't subsume a phi node with its unique input if it has dead inputs but
the dead inputs' controls are not dead.
2. For every node N that is used by a phi node P that is killed, save a pointer
to P->in(0) and a number j such that P->in(j) == N. That will allow us to correctly
compute "use" in compute_lca_of_uses even if c isn't a phi.

Like I said above, I'm not sure if any of this makes any sense at all so I
would very much appreciate any criticism.

Thank you,
Denis.

----- Original Message -----

> 
> Pop 147 ConvI2L === _ 20 [[ 153 ]] Type:long:1..maxint:www !jvms:
> Test::main @ bci:32
> &lt; long:1..maxint:www &lt; 147 ConvI2L === _ _ [[]] [3000147]
...


More information about the hotspot-compiler-dev mailing list