SIGSEGV in C2 compiler

Denis Lila dlila at redhat.com
Wed Feb 2 08:16:32 PST 2011


Hi Tom.

I tried to simplify the reproducer for this. I managed to turn
it into a 20 line file that I've attached. However it must be run
with -XX:CompileOnly=Test.main. -Xbatch and
-XX:OnStackReplacePercentage=60 are no longer needed.

I printed a bunch of graphs which you can find here:
http://icedtea.classpath.org/~dlila/ecjGraph.xml

The graphs were generated using the command
~/src/jdk7/build/linux-amd64-debug/bin/java -Xbatch -XX:-DoEscapeAnalysis -XX:-SubsumeLoads -XX:-UseLoopPredicate -XX:-PartialPeelLoop -XX:-PartialPeelAtUnsignedTests -XX:+LoopUnswitching -XX:+VerifyGraphEdges -XX:+VerifyIterativeGVN -XX:+TraceIterativeGVN -XX:+TraceOptoParse -XX:+TraceLoopPredicate -XX:+TracePartialPeeling -XX:+TraceLoopUnswitching -XX:+PrintCompilation -XX:PrintIdealGraphFile=./ecjGraph.xml -XX:PrintIdealGraphLevel=3 -XX:+PrintIdeal -XX:+PrintOpto -XX:+Verbose -XX:CompileOnly=Test.h Test

I tried to turn off as many optimizations as possible.
With the above command opto/compile.cpp:1673,1689 end up
executing (they are PhaseIdealLoop constructors). All the xml files
are from the execution of the first PhaseIdealLoop constructor. By
its second call the graph seems to already be broken. In
build_loop_late, build_loop_late_post is called on node 296. The
control for 296 is determined to be 314. This seems correct, and it
is the variable Node *early. LCA, however, is computed as 39. 39 dominates
314, so in the while( early != legal ) loop we end up bubbling legal up to
the root, then we call idom(root), and that causes a segfault because the
root isn't dominated by anything.
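
To make the failure mode above concrete, here is a minimal sketch of that walk in Java. It is a toy model, not the HotSpot code: the class and method names, the root id 0 and the exact idom edges are assumptions for illustration. Only the facts that 39 dominates 314 and that 333 and 341 are dominated by 314 come from the analysis in this mail. Where C2 segfaults, the sketch throws an exception.

```java
import java.util.HashMap;
import java.util.Map;

class DomWalkSketch {
    // Hypothetical id for the root node; the real id is not given in the mail.
    static final int ROOT = 0;

    // Immediate-dominator map (node -> idom). The exact edges are assumed;
    // they are only chosen so that 39 dominates 314 and 314 dominates 341 and 333.
    static final Map<Integer, Integer> IDOM = new HashMap<>();
    static {
        IDOM.put(39, ROOT);
        IDOM.put(314, 39);
        IDOM.put(341, 314);
        IDOM.put(333, 341);
    }

    // The root has no entry, so asking for its idom fails. This stands in
    // for the segfault in idom(root).
    static int idom(int n) {
        Integer d = IDOM.get(n);
        if (d == null) {
            throw new IllegalStateException("idom(" + n + ") is undefined");
        }
        return d;
    }

    // Mirrors the shape of the while( early != legal ) loop: climb from
    // legal towards the root until early is reached.
    static int stepsToEarly(int early, int legal) {
        int steps = 0;
        while (legal != early) {
            legal = idom(legal);
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        // LCA below early: the walk reaches 314 and stops.
        System.out.println(stepsToEarly(314, 333));
        // LCA above early: the walk passes the root and fails.
        try {
            stepsToEarly(314, 39);
        } catch (IllegalStateException e) {
            System.out.println("walk ran past the root: " + e.getMessage());
        }
    }
}
```

The loop only terminates if early dominates the starting value of legal. With legal == 39 that invariant is already violated before the loop starts, so the crash is a symptom rather than the bug itself.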

Now, there doesn't seem to be anything wrong with the dominator computations.
The problem seems to be that in the _igvn.optimize() call at the end of the
first PhaseIdealLoop constructor, node 357 is replaced by its parent 296 because
357 is a phi node and one of its two inputs (node 153) becomes dead. So 296's
children become 259 and 295. When compute_lca_of_uses is called the control of
296 is found to be 303 (this is correct). Then, in the next iteration, we get
the control of 259 which is 227. So we call dom_lca_for_get_late_ctrl(303, 227, 296)
which correctly returns 39.
Now, if 357 hadn't been replaced by 296, in that second iteration the
if( c->is_Phi() ) path would have been executed, so "use" would have been
computed to be 357->in(0)->in(j) == 349, instead of 227. Then
dom_lca_for_get_late_ctrl(303, 349, 296) would have been called, which would
have returned either 333 or 341 (because 333 may be a split ctrl) both of
which are dominated by 314, so no crash would result.
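
The two LCA computations compared above can be sketched on a toy dominator tree. Again this is only an illustration in Java under stated assumptions: the tree shape, the root id 0 and all names are hypothetical, chosen to agree with the results quoted in this mail (LCA of 303 and 227 is 39, LCA of 303 and 349 is 333, 333 and 341 sit below 314). The sketch also drops the third argument of dom_lca_for_get_late_ctrl.

```java
import java.util.HashMap;
import java.util.Map;

class LateCtrlSketch {
    // Hypothetical id for the root node.
    static final int ROOT = 0;

    // Assumed immediate-dominator edges (node -> idom), consistent with the
    // LCA results described in the mail but not taken from the real graph.
    static final Map<Integer, Integer> IDOM = new HashMap<>();
    static {
        IDOM.put(39, ROOT);
        IDOM.put(227, 39);
        IDOM.put(314, 39);
        IDOM.put(341, 314);
        IDOM.put(333, 341);
        IDOM.put(303, 333);
        IDOM.put(349, 333);
    }

    // Distance from n to the root in the dominator tree.
    static int depth(int n) {
        int d = 0;
        while (n != ROOT) {
            n = IDOM.get(n);
            d++;
        }
        return d;
    }

    // Least common ancestor in the dominator tree: level the deeper node,
    // then climb both nodes in lockstep until they meet.
    static int domLca(int a, int b) {
        int da = depth(a);
        int db = depth(b);
        while (da > db) { a = IDOM.get(a); da--; }
        while (db > da) { b = IDOM.get(b); db--; }
        while (a != b) { a = IDOM.get(a); b = IDOM.get(b); }
        return a;
    }

    // True if d is n itself or an ancestor of n in the dominator tree.
    static boolean dominates(int d, int n) {
        while (n != d) {
            if (n == ROOT) return false;
            n = IDOM.get(n);
        }
        return true;
    }

    public static void main(String[] args) {
        int early = 314;
        int broken = domLca(303, 227);   // use block taken from 259's control
        int expected = domLca(303, 349); // use block taken from the phi's region input
        System.out.println(broken + " below early? " + dominates(early, broken));
        System.out.println(expected + " below early? " + dominates(early, expected));
    }
}
```

In this model the only difference between the crashing and the non-crashing case is which block the second use contributes: 227 pulls the LCA above early, while 349 keeps it below.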

So, the problem seems to be either that the phi's input is killed or that that
input's corresponding control is not dead. This happens in _igvn.optimize, but
I can't see any errors there, so I'm thinking the real problem is in the loop
iteration_split call that precedes it. I haven't found the exact problem yet, but
I'm working on it.

Anyway, I hope this helps (or at least makes sense).

Regards,
Denis.

PS: just to clarify, if the program is run with -XX:-LoopUnswitching, there is no
crash, which supports my beliefs from above.

----- Original Message -----
> I was able to reproduce your crash from the class files. I filed
> 7004570 for it. Running java -d64 -cp pisces.jar -XX:+PrintCompilation
> -Xbatch -XX:OnStackReplacePercentage=60 pisces.Test reproduces it
> reliably for me on Solaris. I'm looking into it now.
> 
> tom
> 
> On Nov 22, 2010, at 12:36 PM, Denis Lila wrote:
> 
> >> What about the latest hotspot which is hs20-b02?
> >
> > That also crashes.
> >
> >> I think we recently fixed a bad graph bug related to EA. You can
> >> try
> >> -XX:-DoEscapeAnalysis. Actually if it reproduces with hs17 then
> >> it's
> >> probably not the same EA bug.
> >
> > Yes, it also crashes with -XX:-DoEscapeAnalysis.
> >
> >> Can you provide the ecj compiled class files too?
> >
> > Certainly.
> >
> > Regards,
> > Denis.
> >
> > ----- "Tom Rodriguez" <tom.rodriguez at oracle.com> wrote:
> >
> >> On Nov 22, 2010, at 11:27 AM, Denis Lila wrote:
> >>
> >>> I'm sorry, I accidentally sent this without finishing it.
> >>>
> >>> I meant to say that gdbdump.txt, hotspot.log, and the hs_err_*.log
> >>> files were obtained using a fastdebug build of hotspot 19-b06.
> >>>
> >>> I can reproduce the crash with hotspot 17 too.
> >>
> >>>>
> >>>> I did not submit a bug report at sun.bugs.com because I couldn't
> >> find
> >>>> a way
> >>>> to attach the 4 files.
> >>
> >> Yes that's kind of lame. You can just include a note in the
> >> description that says to contact you directly for the files.
> >> Directly
> >> including the hs_err as text is a good idea though.
> >>
> >> tom
> > <piscesEcjClasses.tar.gz>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Test.java
Type: text/x-java
Size: 464 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20110202/7016c6a9/attachment.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Test.class
Type: application/x-java
Size: 498 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20110202/7016c6a9/attachment-0001.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.javap
Type: application/octet-stream
Size: 2496 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20110202/7016c6a9/attachment.obj 

