RFR JDK-8195639: [Graal] nsk/jvmti/PopFrame/popframe009 fail with Graal in -Xcomp

Alex Menkov alexey.menkov at oracle.com
Wed Nov 28 20:36:48 UTC 2018


Hi guys,

I was able to reproduce the issue only with Graal + -Xcomp and only with 
debug build (I tried Graal + -Xint, Graal without Xcomp/Xint, just 
-Xcomp (i.e. C1 is used)).

--alex

On 11/28/2018 01:27, serguei.spitsyn at oracle.com wrote:
> On 11/28/18 01:07, dean.long at oracle.com wrote:
>> OK, reconsidering solution 1, I have a couple more questions:
>>
>> The bug report says the problem is reproducible with -Xcomp, but I 
>> don't see the policy()->event() path taken with -Xcomp. Instead, 
>> -Xcomp uses CompilationPolicy::compile_if_required, so I don't see how 
>> Solution 1 fixes the problem with -Xcomp.
>>
>> > CompilerRuntime always calls policy()->event with CompLevel == 
>> CompLevel_aot.
>>
>> As far as I can tell, CompLevel_aot is only used when there is active 
>> AOT code.  Is this an AOT-specific problem?
> 
> Tracing showed that the CompLevel_aot is always used for JVMCI which is 
> pretty confusing.
> Most likely, it is because now the AOT is enabled by default (is it 
> correct?).
> I'm not sure it actually has anything to do with the AOT mode.
> 
> 
>>   I would expect there to be a problem with c1 as well.
> 
> Not sure about the c1.
> I hope, Alex could comment on this tomorrow.
> 
> 
> Thanks,
> Serguei
> 
>> dl
>>
>> On 11/27/18 3:38 PM, serguei.spitsyn at oracle.com wrote:
>>> Good summary with a list of solutions below!
>>>
>>> On 11/27/18 2:50 PM, dean.long at oracle.com wrote:
>>>> Let me list the proposed solutions:
>>>>
>>>> 1. short-circuit compiles in CompilationPolicy::event in 
>>>> is_interp_only_mode
>>>> 2. detect is_interp_only_mode in resolve stubs and force a c2i call
>>>> 3. don't offer a JVMTI suspend point in resolve stubs (or 
>>>> JavaCallWrapper)
>>>>
>>>> Solution 1 penalizes other threads, and don't prevent other threads 
>>>> from compiling the method, so while it may help a little, it's 
>>>> incomplete.
>>>
>>> I doubt, the Solution 1 penalizes other threads.
>>> First, it is an important pretty frequent/common case that allows to 
>>> save on compilations.
>>> Second, there is no point to compile methods on a thread executed in 
>>> interp_only_mode.
>>> Also, there is no big advantage to compile methods even for other 
>>> threads
>>> when an active debugging (e.g. single stepping) is in progress.
>>> Third, this would make it consistent with what the the interpreter is 
>>> doing.
>>>
>>> So, I think, the Solution 1 is needed independently of other 
>>> solutions we take.
>>> My suggestion is to apply it in the JDK-8195639 bug fix.
>>>
>>>> Solutions 1 and 2 allow the thread to resume in a different method 
>>>> (the callee method), causing other problems.
>>>
>>> Agreed.
>>>
>>>> With Solution 3, the frame we suspend is the same as the frame we 
>>>> end up in after the resume, so I believe it solves all the problems.
>>>> IMHO this is the correct solution to pursue.
>>>
>>> I agree in general.
>>> This solution has some risks involved as it impacts a lot of code 
>>> including platform-independent.
>>> It is also going to fix the other bugs: JDK-8195635, JDK-8214093, 
>>> JDK-8207013 (not sure about all of them yet).
>>>
>>> My suggestion is to pursue it as the JDK-8195635 fix.
>>> Could you, please, confirm if works for you?
>>>
>>> Also, we have no consensus yet on the Solution 1.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>> dl
>>>>
>>>> On 11/27/18 2:19 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Dean,
>>>>>
>>>>> You also asked the questions:
>>>>>
>>>>> > Doesn't is_interp_only_mode() only apply to the current thread?
>>>>>
>>>>> Yes it does.
>>>>>
>>>>>
>>>>> > I don't think it's safe to assume no compiles will happen in 
>>>>> other threads,
>>>>> > or that a method call target is not already compiled, because as far
>>>>> > as I can tell, JVMTI only deoptimizes the active frames.
>>>>>
>>>>> I agree with it.
>>>>> However, compilations on the thread with the interp_only_mode 
>>>>> enabled is the most important case.
>>>>> Disabling compilations such thread improves testing stability.
>>>>>
>>>>> > The bug report describes a case where the caller has been 
>>>>> deoptimized
>>>>> > while resolving a call.  I don't see how this fix prevents the 
>>>>> target
>>>>> > from being a previously compiled method.
>>>>> > But we do need to make sure not to call into compiled code, so I 
>>>>> think
>>>>> > the fix needs to be in code like 
>>>>> SharedRuntime::resolve_static_call_C(),
>>>>> > where it returns get_c2i_entry() if is_interp_only_mode() is true.
>>>>>
>>>>> Thank you for making this point.
>>>>>
>>>>> It would be nice to attack this as well.
>>>>> We can try to investigate this approach further.
>>>>> One problem is that there are more cases like resolve_static_call_C,
>>>>> for instance, resolve_virtual_call_C and resolve_opt_virtual_call_C.
>>>>> We may need to fix the same in other places, like 
>>>>> JavaCallWrapper::JavaCallWrapper.
>>>>>
>>>>> We can start from fixing it in the resolve_static_call_C.
>>>>> If it is successful then continue this effort for other cases as well.
>>>>>
>>>>> What do you think?
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 11/27/18 1:01 PM, Alex Menkov wrote:
>>>>>> Hi Dean,
>>>>>>
>>>>>> Thank you for the feedback and for the comments in Jira.
>>>>>>
>>>>>> >> How does this solve the problem of
>>>>>> >> HotSpotJVMCIRuntime.adjustCompilationLevel being called?
>>>>>>
>>>>>> It doesn't. It fixes another issue.
>>>>>> The test suspend test thread and ensures that top frame is a 
>>>>>> "fibonachi" method. Then it turns SingleStep on, does PopFrame and 
>>>>>> resumes the thread.
>>>>>> The problem is the thread continues execution of compiled code 
>>>>>> (ignoring SingleStep) and we get SindleStep event only when 
>>>>>> adjustCompilationLevel method is called (it's interpreted code).
>>>>>>
>>>>>> There are several other bugs which are (most likely) caused by 
>>>>>> suspend during call setup (JDK-8195635, JDK-8214093, JDK-8207013)
>>>>>>
>>>>>> --alex
>>>>>>
>>>>>> On 11/27/2018 01:39, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Dean,
>>>>>>>
>>>>>>> Thank you a lot for looking at this!
>>>>>>> Just a couple of points from me (it is up to Alex to provide a 
>>>>>>> full answer).
>>>>>>>
>>>>>>>
>>>>>>> I think, Alex in this RFR missed to tell that we knew about this 
>>>>>>> issue that an incorrect frame will be popped.
>>>>>>> But after some discussion we decided to file a separate issue on 
>>>>>>> this.
>>>>>>> Alex wanted to create a good stand-alone test case demonstrating 
>>>>>>> this problem before filing it.
>>>>>>>
>>>>>>> Now, as I can see, the JDK-8195635 
>>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8195635> looks very 
>>>>>>> close to a bug that we wanted to file.
>>>>>>> The only difference is that our scenario includes the 
>>>>>>> SharedRuntime::resolve_static_call_C instead of the 
>>>>>>> JavaCallWrapper::JavaCallWrapper as a helper for Java method 
>>>>>>> invocation.
>>>>>>> If I understand corrctly (Alex will fix me if needed), the jtreg 
>>>>>>> test we used to chase this issue did not catch a problem with the 
>>>>>>> HotSpotJVMCIRuntime.adjustCompilationLevel.
>>>>>>>
>>>>>>>
>>>>>>> The suggested fix was to fix the mismatch in the 
>>>>>>> TieredThresholdPolicy::even with the comment in the interpreter 
>>>>>>> code:
>>>>>>> nmethod* 
>>>>>>> InterpreterRuntime::frequency_counter_overflow(JavaThread* 
>>>>>>> thread, address branch_bcp) {
>>>>>>>      . . .
>>>>>>>    if (nm != NULL && thread->is_interp_only_mode()) {
>>>>>>>      // Normally we never get an nm if is_interp_only_mode() is 
>>>>>>> true, because
>>>>>>>      // policy()->event has a check for this and won't compile 
>>>>>>> the method when
>>>>>>>      // true. However, it's possible for is_interp_only_mode() to 
>>>>>>> become true
>>>>>>>      // during the compilation. We don't want to return the nm in 
>>>>>>> that case
>>>>>>>      // because we want to continue to execute interpreted.
>>>>>>>      nm = NULL;
>>>>>>>    }
>>>>>>>
>>>>>>>  > So I think the fix needs to be in code like 
>>>>>>> SharedRuntime::resolve_static_call_C(),
>>>>>>>  > where it returns get_c2i_entry() if is_interp_only_mode() is 
>>>>>>> true.
>>>>>>>
>>>>>>> I'm not sure, the adding this condition and returning the 
>>>>>>> get_c2i_entry() is always correct.
>>>>>>> We need some help from the Compiler team to make it right.
>>>>>>>
>>>>>>> BTW, the interp_only_mode has been enabled only when some 
>>>>>>> interp_only events are enabled.
>>>>>>> It is not set by either PopFrame or ForceEarlyReturn.
>>>>>>> But the popframe009 test enables single stepping, so we wanted to 
>>>>>>> make this part right.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 11/26/18 16:05, dean.long at oracle.com wrote:
>>>>>>>> How does this solve the problem of 
>>>>>>>> HotSpotJVMCIRuntime.adjustCompilationLevel being called?
>>>>>>>>
>>>>>>>> I don't think this fix is the right approach.  Doesn't 
>>>>>>>> is_interp_only_mode() only apply to the current thread? I don't 
>>>>>>>> think it's safe to assume no compiles will happen in other 
>>>>>>>> threads, or that a method call target is not already compiled, 
>>>>>>>> because as far as I can tell, JVMTI only deoptimizes the active 
>>>>>>>> frames. The bug report describes a case where the caller has 
>>>>>>>> been deoptimized while resolving a call.  I don't see how this 
>>>>>>>> fix prevents the target from being a previously compiled 
>>>>>>>> method.  But we do need to make sure not to call into compiled 
>>>>>>>> code, so I think the fix needs to be in code like 
>>>>>>>> SharedRuntime::resolve_static_call_C(), where it returns 
>>>>>>>> get_c2i_entry() if is_interp_only_mode() is true. However, there 
>>>>>>>> is still another problem.  By allowing JVMTI to suspend the 
>>>>>>>> thread during call setup, but reporting the frame as still in 
>>>>>>>> the caller instead of the callee, we confuse JVMTI into thinking 
>>>>>>>> that execution will resume in the caller instead of the callee. 
>>>>>>>> We may want to restrict where we offer JVMTI suspend points, and 
>>>>>>>> not offer a JVMTI suspend point in 
>>>>>>>> SharedRuntime::resolve_static_call_C and friends at all.
>>>>>>>>
>>>>>>>> dl
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/26/18 11:14 AM, Alex Menkov wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Please review the fix for
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8195639
>>>>>>>>> webrev:
>>>>>>>>> http://cr.openjdk.java.net/~amenkov/popframe009/webrev.01/
>>>>>>>>>
>>>>>>>>> Description:
>>>>>>>>> The test suspends a thread, turns on single stepping and then 
>>>>>>>>> calls PopFrame. SingleStep event is expected as soon as the 
>>>>>>>>> thread is resumed and PopFrame is processed (so we have call 
>>>>>>>>> stack with the depth 1 less than it was before PopFrame). 
>>>>>>>>> Instead SingleStep event is received with much deeper call 
>>>>>>>>> stack (and PopFrame processes wrong frame).
>>>>>>>>> Research shown that this is caused by missed deoptimization of 
>>>>>>>>> the current frame.
>>>>>>>>> As far as I understand CompilationPolicy::event should handle 
>>>>>>>>> the case when the thread has is_interp_only_mode() set, but 
>>>>>>>>> TieredThresholdPolicy::event checks this only then CompLevel is 
>>>>>>>>> CompLevel_none.
>>>>>>>>> CompilerRuntime always calls policy()->event with CompLevel == 
>>>>>>>>> CompLevel_aot.
>>>>>>>>>
>>>>>>>>> The fix looks quite simple, but I'd appreciate feedback from 
>>>>>>>>> runtime and compiler teams as I'm not sure I completely 
>>>>>>>>> understand all the details of the "PopFrame dance".
>>>>>>>>>
>>>>>>>>> --alex
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>
> 


More information about the serviceability-dev mailing list