RFR: 8244416: Remove incorrect assert during inline cache cleaning

Erik Österlund erik.osterlund at oracle.com
Fri May 15 08:10:31 UTC 2020


Hi Vladimir,

On 2020-05-15 01:20, Vladimir Kozlov wrote:
> Do we need to clean up IC in zombie nmethod 'from' here? Sweeper will 
> do that later anyway as you said.

You are 100% right. We do not need to clean up IC in zombie nmethods. In 
fact we try
to actively avoid it. The concurrent GC thread only initiates IC 
cleaning on is_alive()
nmethods. The issue is that from checking if it is_alive() and then 
starting the IC cleaning,
the state might flip over to zombie in a race with make_zombie from the 
sweeper. That is
fine as I described, because the make_zombie() function will in this 
race wait for the GC to
finish and then clear the ICs, releasing spurious IC stubs created by 
the GC mistakenly thinking
it is_alive(), before IC finalization can run in a subsequent safepoint. 
The only issue is that we
can't assert that from must not be zombie. It almost never is, but it is 
sometimes, and
that is fine.

Thanks,
/Erik

> Thanks,
> Vladimir
>
> On 5/14/20 6:07 AM, Erik Österlund wrote:
>> Hi,
>>
>> Months of stress testing triggered the following assert in inline 
>> cache cleaning:
>>
>> assert(!from->is_zombie(), "should not clean inline caches on zombies");
>>
>> It used to be assert(!from->is_alive()), but after finding a race 
>> that resulted in
>> false positives for unloaded nmethods, the assert was relaxed to 
>> check only that
>> we shouldn't have zombies here, with a fat comment explaining why.
>>
>> Unfortunately, the relaxed assert also gets false positives, so I 
>> have chosen to
>> remove it, and further increase the size of that fat comment, 
>> explaning why we can't
>> assert that it isn't a zombie either.
>>
>> What it boils down to is a race where inline cache cleaning is 
>> performed from
>> a concurrent GC thread, concurrent to the sweeper making an nmethod 
>> zombie.
>>
>> The order of the race ingredients making the nmethod zombie from the 
>> sweeper is
>> the following:
>>
>> 1. Transition state to zombie
>> 2. Unregister nmethod form GC (importantly triggers a wait for 
>> concurrent code cache unloading to finish)
>> 3. clear_ic_callsites() importantly releasing all IC stubs >
>> The race happens when the GC finds an is_alive && !is_unloading 
>> nmethod and
>> triggers inline cache cleaning on it before 1, but then when racingly 
>> checking
>> again in the assert, we are past 1.
>>
>> Since the assert will spuriously fail even though there is no actual 
>> problem, and
>> we are at a point where there isn't much else it could assert, I 
>> chose to remove it.
>>
>> Other than triggering the assert, it is worth noting that this 
>> racingly zombiefied
>> nmethod will have its inline caches cleaned *with* IC stubs from the 
>> concurrent
>> GC thread. Subsequent IC finalization in safepoint cleanup requires 
>> live IC stubs
>> to belong to is_alive() nmethods, so the IC stubs must be released 
>> before the next
>> safepoint for this race to not be dangerous. That is ensured, because 
>> the racing
>> sweeper will get stuck waiting for concurrent code cache unloading to 
>> finish,
>> and when it wakes up, it will clear_ic_callsites(), releasing the 
>> spurious ICStubs,
>> before finishing the make_zombie() function. Then it will eventually 
>> hit a safepoint
>> poll, when the nmethod has been purged from IC stubs.
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8244416
>>
>> Webrev:
>> http://cr.openjdk.java.net/~eosterlund/8244416/webrev.00/
>>
>> Thanks,
>> /Erik



More information about the hotspot-dev mailing list