RFR: 8222841: Incorrect static call stub interactions with class unloading

Thu May 9 23:57:54 UTC 2019

On 5/9/19 4:56 PM, Vladimir Kozlov wrote:
> Hi Erik,
> 
> I general looks good to me. One comment:

'In general' ;)

> 
> rtmp register seems unused in load_method_holder_cld()
> 
> What testing you did? There are no link to testing results in RFE.
> 
> Thanks,
> Vladimir
> 
> On 5/9/19 3:33 PM, Erik Osterlund wrote:
>> Any more takers? I know you want this stuff.
>>
>> /Erik
>>
>>> On 8 May 2019, at 14:20, Erik Österlund <erik.osterlund at oracle.com> wrote:
>>>
>>> Hi Coleen,
>>>
>>> Thanks for the review!
>>>
>>> /Erik
>>>
>>>> On 2019-05-08 14:16, coleen.phillimore at oracle.com wrote:
>>>>
>>>>
>>>>> On 5/2/19 4:55 AM, Erik Österlund wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> Thank you for the review.
>>>>>
>>>>>> On 2019-04-23 23:32, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~eosterlund/8222841/webrev.00/src/hotspot/cpu/x86/gc/shared/barrierSetAssembler_x86.cpp.frames.html 
>>>>>>
>>>>>>
>>>>>> Can you mention in a comment that incoming rbx is the method?
>>>>>
>>>>> Fixed.
>>>>
>>>> Thank you!
>>>>>
>>>>>> You can't do a JRT_LEAF function from here, can you to some Method::is_alive() function?  Or 
>>>>>> even a straight up call ?  This doesn't seem that it should care about the extra overhead of 
>>>>>> the call instruction.  The assembly code looks correct but yuck not more assembly code.
>>>>>
>>>>> I'm not sure I know how I feel about that. I would have to save and restore all registers then 
>>>>> to go into the runtime just to check if the method is alive. Would you be okay with keeping the 
>>>>> assembly variant instead until I find a better solution to this problem? (I'm on it)
>>>>
>>>> Yes, this is fine with me.  Had to register not liking more assembly.
>>>>>
>>>>>> http://cr.openjdk.java.net/~eosterlund/8222841/webrev.00/src/hotspot/share/code/compiledMethod.cpp.frames.html 
>>>>>>
>>>>>>
>>>>>> 578 case relocInfo::static_stub_type: {
>>>>>> 579 is_in_static_stub = true;
>>>>>> 580 break;
>>>>>> 581 }
>>>>>> 582
>>>>>> 583 case relocInfo::metadata_type: {
>>>>>> 584 if (!is_in_static_stub) {
>>>>>> 585 continue;
>>>>>> 586 }
>>>>>>
>>>>>>
>>>>>> This was mystifying to me.  Can you put a comment that after static_stub_type relocations are 
>>>>>> generated, the next metadata relocation contains the address that is patched at runtime, so 
>>>>>> this is the metadata that can be stale and should be cleared by unloading.
>>>>>
>>>>> Sure, I updated the comments, and added a better description about what is going on here.
>>>>>
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~eosterlund/8222841/webrev.01/
>>>>>
>>>>> Incremental:
>>>>> http://cr.openjdk.java.net/~eosterlund/8222841/webrev.00_01/
>>>>
>>>> Looks good!
>>>> Coleen
>>>>>
>>>>> Thanks,
>>>>> /Erik
>>>>>
>>>>>> The clean_ic_stubs() name is my fault. I didn't know what they were and just wanted this code 
>>>>>> out of my way.
>>>>>>
>>>>>> Thank you for explaining how redefinition is broken by this. I didn't know these 
>>>>>> metadata_reloc guys were patched at runtime.
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On 4/23/19 4:41 AM, Erik Österlund wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Static call stubs are cleared in debug builds before an assert checking for dead metadata. 
>>>>>>> This is done in the function "clean_ic_stubs()" (which does not clean ICStubs... but 
>>>>>>> static/optimized virtual call stubs for calling c2i adapters from compiled code...) The 
>>>>>>> reason is that the static call stub has metadata, and it complains about dead metadata being 
>>>>>>> left after inline cache cleaning of is_alive nmethods.
>>>>>>>
>>>>>>> ...except class redefinition looks at this metadata to determine if there is old metadata in 
>>>>>>> subsequent safepoints, and they could have been unloaded then. Ouch. So maybe we shouldn't 
>>>>>>> squelch that assert in debug builds, because it actually shows there is a real problem.
>>>>>>>
>>>>>>> ...and with concurrent class unloading (ZGC and seemingly soon Shenandoah), we can't just 
>>>>>>> clear the call stub of a concurrently running JavaThread; they can racingly call in to them 
>>>>>>> before they get cleared, and then get stuck in an infinite loop, because clearing also 
>>>>>>> involves updating the branch target of the static call stub to a self-loop. Or call a dead 
>>>>>>> method, which also seems like a scary scenario.
>>>>>>>
>>>>>>> All things considered, clearing static call stubs when unloading the code cache seems like a 
>>>>>>> bad idea.
>>>>>>>
>>>>>>> My proposal is the following:
>>>>>>> 1) Make sure that code cache unloading *always* clears out the metadata part (but not the 
>>>>>>> branch part) of static call stubs, so that subsequent operations such as class redefinition 
>>>>>>> will not blow up when looking at the embedded metadata, but it won't get stuck in infinite 
>>>>>>> loops with concurrent code cache unloading.
>>>>>>> 2) Make a c2i entry barrier for concurrently unloading GCs that will reresolve the call when 
>>>>>>> calling into a stale static call stub (iff method is NULL or dead).
>>>>>>> 3) Relax MT-safety asserts to allow the previous medatada to be concurrently unloading when 
>>>>>>> setting the new target.
>>>>>>>
>>>>>>> Bug:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222841
>>>>>>>
>>>>>>> Webrev:
>>>>>>> http://cr.openjdk.java.net/~eosterlund/8222841/webrev.00/
>>>>>>>
>>>>>>> Thanks,
>>>>>>> /Erik
>>>>>>
>>>>>
>>>>
>>>
>>