[9] RFR (S): 8168926: C2: Bytecode escape analyzer crashes due to stack overflow

Zoltán Majó zoltan.majo at oracle.com
Wed Jan 11 08:48:52 UTC 2017


Hi Vladimir,


On 01/10/2017 07:22 PM, Vladimir Kozlov wrote:
> Thank you, Zoltan
>
> Very nice investigation and evaluation.

Thank you, Vladimir!

> I agree that EA Analyzer should use only information in ciMethod 
> otherwise it will be inconsistent.
>
> Changes look good to me.

Thank you for the review (and, for the record, also to Tobias for his 
review in private)!

Best regards,


Zoltan

>
> Thanks
> Vladimir
>
> On 1/10/17 5:25 AM, Zoltán Majó wrote:
>> Hi,
>>
>>
>> please review the fix for 8168926.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8168926
>> http://cr.openjdk.java.net/~zmajo/8168926/webrev.00/
>>
>> This is a bug in C2's escape analyzer (EA) I've been chasing for more
>> than a year now.
>>
>> The bug reproduces very rarely (<10 appearances since Sep '15) and in
>> different forms/with different tests (see JDK-8135159 for a set of
>> different manifestations of the same bug).
>>
>> I tried to reproduce the crash on at least five different occasions and
>> with different tests, but did not succeed, unfortunately. So my findings
>> (and the fix) rely only on source-code/test inspection and post-mortem
>> analysis of crashes I've seen.
>>
>> The bug is caused by the EA having an inconsistent view of the number of
>> parameters taken by a call site 'c'. If call site 'c' in a method 'm' is
>> dynamic (i.e., 'c' is targeted by an invokehandle or invokedynamic
>> instruction), the number of parameters taken by 'c' is different before
>> and after 'c' is resolved. That is, after 'c' is resolved, 'c' takes one
>> more argument than the number of arguments pushed onto the stack by 'm'
>> (as 'c' is dynamic, it needs an extra appendix argument after 
>> resolution).
>>
>> In its current state, EA can have two views of 'c' for the analysis of
>> 'c'. I.e., EA can use both a "before-resolution" and an
>> "after-resolution" view of 'c'. As a result, EA can pop fewer elements
>> from the stack than were pushed onto it, which results in a
>> stack overflow.
>>
>> Here is a detailed scenario to illustrate the problem. Let's assume the
>> following sequence of operations to take place while EA is analyzing
>> method 'm'.
>>
>>
>> Step (1): EA obtains the method targeted by call site 'c' in 'm'. The
>> result is saved into ciMethod 'target':
>>
>> http://hg.openjdk.java.net/jdk9/hs/hotspot/file/026ff073b5ad/src/share/vm/ci/bcEscapeAnalyzer.cpp#l895 
>>
>>
>>
>> Let's assume that 'c' is not yet resolved at this point of time, i.e.,
>> the number of arguments N of 'target' does not include the appendix
>> argument (i.e., N is equal to the number of items pushed onto the stack
>> by the bytecodes of method 'm').
>>
>>
>> Step (2): A thread different than the compiler thread performing EA of
>> 'm' reaches call site 'c' and executes it. As a result, 'c' is resolved
>> (and bootstrapped) and it now points to a method taking N+1 parameters
>> (one more parameter than before, because the parameters also include the
>> appendix argument).
>>
>>
>> Step (3): EA checks if call site 'c' has an appendix argument.
>>
>> http://hg.openjdk.java.net/jdk9/hs/hotspot/file/026ff073b5ad/src/share/vm/ci/bcEscapeAnalyzer.cpp#l899 
>>
>>
>>
>> As there is an appendix argument, an extra (unknown) argument is pushed
>> onto the stack. I.e., there are N+1 elements on the stack at this point
>> of time.
>>
>>
>> Step (4): EA continues with analyzing the call site
>>
>> http://hg.openjdk.java.net/jdk9/hs/hotspot/file/026ff073b5ad/src/share/vm/ci/bcEscapeAnalyzer.cpp#l903 
>>
>>
>>
>> After being done with the analysis, EA removes 'arg_size' number of
>> arguments from the stack. For example, here:
>>
>> http://hg.openjdk.java.net/jdk9/hs/hotspot/file/026ff073b5ad/src/share/vm/ci/bcEscapeAnalyzer.cpp#l294 
>>
>>
>>
>> The number 'arg_size' of arguments is, however, only N. The reason is
>> that 'arg_size' is obtained from ciMethod 'target' constructed back at
>> Step (1), i.e., from the unresolved call site, and does not include the
>> appendix argument.
>>
>>
>> Summary: If the sequence of operations is executed as outlined by Steps
>> (1)-(4), the stack can overflow after the call site is analyzed, because
>> some arguments pushed onto it are not popped when EA is done with analyzing
>> call site 'c'. For the problem to appear, the resolution of call site 'c'
>> has to happen concurrently with EA and exactly after Step (1) and before
>> Step (3). That explains why the problem reproduces so rarely. For more
>> information on the investigation please see [1].
>>
>> The fix I propose determines if a call site 'c' needs an appendix
>> argument solely by looking at the ciMethod 'target' and the current
>> bytecode instruction. By that, EA has only one (consistent) view of call
>> site 'c' (which is either resolved or not).
>>
>> I tested the fix with
>> - RBT (all hotspot tests both with -Xmixed and -Xcomp);
>> - JPRT;
>> - locally executed all jdk/test/java/lang/invoke tests (both with
>> -Xmixed and -Xcomp).
>>
>> No (new) failures appeared.
>>
>> Thank you!
>>
>> Best regards,
>>
>>
>> Zoltan
>>
>> [1]
>> https://bugs.openjdk.java.net/browse/JDK-8168926?focusedCommentId=14035888&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14035888 
>>
>>
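The push/pop imbalance described in Steps (1)-(4) of the quoted message, and the idea behind the fix, can be illustrated with a small toy model. This is only a sketch under stated assumptions: TargetSnapshot, CallSite, analyze_two_views and analyze_one_view are hypothetical names invented for the illustration and are not HotSpot code; the concurrent resolution in Step (2) is modeled by a plain flag rather than a second thread; and the real fix consults the ciMethod 'target' together with the current bytecode, which the toy collapses into a single has_appendix field.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the ciMethod 'target' obtained in Step (1).
struct TargetSnapshot {
    int  arg_size;      // number of arguments this view of the call site takes
    bool has_appendix;  // true if the snapshot was taken after resolution
};

// Hypothetical stand-in for the live call site 'c'.
struct CallSite {
    int  declared_args;  // N: items pushed by the bytecodes of 'm'
    bool resolved;       // may be flipped by another thread in the real VM

    TargetSnapshot snapshot() const {
        return { declared_args + (resolved ? 1 : 0), resolved };
    }
};

// Two views: the appendix check reads the live call site, while the pop
// count comes from the earlier snapshot. Returns the number of entries
// left behind on the abstract stack.
int analyze_two_views(CallSite& c, bool resolve_between) {
    std::vector<int> stack(c.declared_args, 0);   // N arguments pushed by 'm'
    TargetSnapshot target = c.snapshot();         // Step (1)
    if (resolve_between) c.resolved = true;       // Step (2), simulated
    if (c.resolved) stack.push_back(-1);          // Step (3): live view
    assert(stack.size() >= static_cast<std::size_t>(target.arg_size));
    stack.resize(stack.size() - target.arg_size); // Step (4): pop arg_size
    return static_cast<int>(stack.size());
}

// One view: both the appendix check and the pop count use the snapshot,
// so pushes and pops always balance, resolved or not.
int analyze_one_view(CallSite& c, bool resolve_between) {
    std::vector<int> stack(c.declared_args, 0);
    TargetSnapshot target = c.snapshot();
    if (resolve_between) c.resolved = true;       // no longer affects the analysis
    if (target.has_appendix) stack.push_back(-1); // decided from the snapshot only
    assert(stack.size() >= static_cast<std::size_t>(target.arg_size));
    stack.resize(stack.size() - target.arg_size);
    return static_cast<int>(stack.size());
}
```

In this model, calling analyze_two_views on an unresolved call site with resolve_between set to true leaves one stale entry behind per analyzed call site, while analyze_one_view leaves none in either case. Stale entries accumulating on a bounded analysis stack are what the quoted message describes as the stack overflow.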


