[15] RFR(M): 8237859: C2: Crash when loads float above range check

Christian Hagedorn christian.hagedorn at oracle.com
Wed Mar 11 17:14:17 UTC 2020


Hi

Please review the following patch that is an incremental patch to 
JDK-8240227 [1] which is a prerequisite for this fix:
https://bugs.openjdk.java.net/browse/JDK-8237859
http://cr.openjdk.java.net/~chagedorn/8237859/webrev.incremental.00/ 
(incremental changes to [1])
http://cr.openjdk.java.net/~chagedorn/8237859/webrev.01/ (combined 
changes, JDK-8237859 with [1])

In test(), the loop is unswitched and the slow and fast loop are 
unrolled once. For both loop versions, pre/main/post loops are created. 
The pre-loop is executed once and the main-loop represents 2 iterations. 
Depending on the value of flag, wrappers_array can have length 2 or 3. 
But the unrolled loop bodies contain an access to wrappers_array[1] and 
wrappers_array[2]. This means that in some cases (when flag is true and 
wrappers_array has length 2), the main-loop is not entered and the 
out-of-bounds access of wrappers_array[2] should not be exeucted. 
Therefore, the corresponding LoadP node should not be scheduled before 
the If node that decides if the main-loop is executed or not. However, 
this is exactly what happens: The LoadP node is scheduled before the If 
node which decides if the main-loop should be executed resulting in a 
segfault.

Background:
A first observation was that the loop is additionally unswitched. We do 
not process and rewire control edges of range check predicates to data 
nodes belonging to the loop (for example, 366 IfTrue to 247 LoadP in [2] 
for test()) when unswitching the loop (before doing pre/main/post and 
unrolling). As a result, the slow and fast loop do not have any 
predicates before the loop header node. Some loads could now even be 
performed before the "loop-selection-if" itself (which either selects 
the slow or fast loop). This can be seen in the mach graph [3] where 145 
jmpCon is the loop-selection-if that selects between the fast and slow 
loop and 161 zLoadP reads from index 2 (which is out of bounds when the 
wrappers_array has only length 2). This load can now be scheduled way 
too early before deciding if the main-loop is executed (and in fact is 
scheduled too early resulting in a segfault). The fix for JDK-8240227 
[1] addresses this problem for loop unswitching and clones range check 
predicates to both unswitched loops and updates the control edges to the 
data nodes to the new predicates accordingly. This first seemed to fix 
this bug.

However, it is not enough when only ensuring that the data nodes are not 
scheduled before the loop-selection-if. They can still be scheduled 
before the If that selects if the main-loop should be entered when 
enabling -XX:+StressGCM. This can even be observed when running test3() 
with -XX:+StressGCM which does not do any loop unswitching. A segfault 
occurs after a few runs.

The problem is when creating pre/main/post loops, we do not rewire the 
control inputs from the range check predicate projections to data nodes 
of the main- and post-loop body to the main- and post-loop. For example, 
in [4], 247 LoadP belongs to the main-loop, 377 LoadP belongs to the 
post-loop and 418 LoadP to the pre-loop. But the control edges of the 
LoadP nodes belonging to the main- and post-loop (247, 377) are not 
updated. As a result, all new loads created at unrolling also have a 
control input from 366 IfTrue. These loads belonging to the main-loop 
can then end up being scheduled before the If that decides if the 
main-loop should be entered resulting in a segfault.

This fix addresses this issue and rewires the control input of data 
nodes belonging to the main-loop to a range check predicate before the 
pre-loop to the copied main-loop range check predicates (copied in 
PhaseIdealLoop::copy_skeleton_predicates_to_main_loop_helper). When 
unrolling, the range check predicates of the main-loop are updated 
together with the control edges (done in 
PhaseIdealLoop::update_main_loop_skeleton_predicates). The data nodes 
belonging to the post-loop are rewired to the zero-trip guard If node 
right before the CountedLoopNode (there are no predicates copied down to 
the post-loop). For example, in [5] (in contrast to [4]), we rewired the 
control of 247 LoadP to the main-loop and the control of 377 LoadP to 
the post-loop.

I included some renaming for predicate related code that should make the 
difference between the various predicate types and how they are used 
clearer (skeleton, copies, updated copies...).

Thanks to Stefan Karlsson for finding a reproducer and discussing this bug!

Best regards,
Christian


[1] http://cr.openjdk.java.net/~chagedorn/8240227/webrev.00/
     https://bugs.openjdk.java.net/browse/JDK-8240227
[2] 
https://bugs.openjdk.java.net/secure/attachment/87123/LoadP_control_Input.png
[3] 
https://bugs.openjdk.java.net/secure/attachment/87125/wrong_controls_in_mach_graph.png
[4] 
https://bugs.openjdk.java.net/secure/attachment/87261/wrong_control_edges_pre_main_post.png
[5] 
https://bugs.openjdk.java.net/secure/attachment/87262/fixed_control_edges_pre_main_post.png


More information about the hotspot-compiler-dev mailing list