RFR: 8244416: Remove incorrect assert during inline cache cleaning
Erik Österlund
erik.osterlund at oracle.com
Fri May 15 17:58:46 UTC 2020
Hi Vladimir,
> On 15 May 2020, at 19:48, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>
> On 5/15/20 1:10 AM, Erik Österlund wrote:
>> Hi Vladimir,
>>> On 2020-05-15 01:20, Vladimir Kozlov wrote:
>>> Do we need to clean up IC in zombie nmethod 'from' here? Sweeper will do that later anyway as you said.
>> You are 100% right. We do not need to clean up IC in zombie nmethods. In fact we try
>> to actively avoid it. The concurrent GC thread only initiates IC cleaning on is_alive()
>> nmethods. The issue is that from checking if it is_alive() and then starting the IC cleaning,
>> the state might flip over to zombie in a race with make_zombie from the sweeper. That is
>> fine as I described, because the make_zombie() function will in this race wait for the GC to
>> finish and then clear the ICs, releasing spurious IC stubs created by the GC mistakenly thinking
>> it is_alive(), before IC finalization can run in a subsequent safepoint. The only issue is that we
>> can't assert that from must not be zombie. It almost never is, but it is sometimes, and
>> that is fine.
>
> So you are saying that changing assert to from->is_zombie() check (to skip cleaning) will not help but only reduce concurrency window?
Exactly.
/Erik
> Vladimir
>
>> Thanks,
>> /Erik
>>> Thanks,
>>> Vladimir
>>>
>>> On 5/14/20 6:07 AM, Erik Österlund wrote:
>>>> Hi,
>>>>
>>>> Months of stress testing triggered the following assert in inline cache cleaning:
>>>>
>>>> assert(!from->is_zombie(), "should not clean inline caches on zombies");
>>>>
>>>> It used to be assert(!from->is_alive()), but after finding a race that resulted in
>>>> false positives for unloaded nmethods, the assert was relaxed to check only that
>>>> we shouldn't have zombies here, with a fat comment explaining why.
>>>>
>>>> Unfortunately, the relaxed assert also gets false positives, so I have chosen to
>>>> remove it, and further increase the size of that fat comment, explaning why we can't
>>>> assert that it isn't a zombie either.
>>>>
>>>> What it boils down to is a race where inline cache cleaning is performed from
>>>> a concurrent GC thread, concurrent to the sweeper making an nmethod zombie.
>>>>
>>>> The order of the race ingredients making the nmethod zombie from the sweeper is
>>>> the following:
>>>>
>>>> 1. Transition state to zombie
>>>> 2. Unregister nmethod form GC (importantly triggers a wait for concurrent code cache unloading to finish)
>>>> 3. clear_ic_callsites() importantly releasing all IC stubs >
>>>> The race happens when the GC finds an is_alive && !is_unloading nmethod and
>>>> triggers inline cache cleaning on it before 1, but then when racingly checking
>>>> again in the assert, we are past 1.
>>>>
>>>> Since the assert will spuriously fail even though there is no actual problem, and
>>>> we are at a point where there isn't much else it could assert, I chose to remove it.
>>>>
>>>> Other than triggering the assert, it is worth noting that this racingly zombiefied
>>>> nmethod will have its inline caches cleaned *with* IC stubs from the concurrent
>>>> GC thread. Subsequent IC finalization in safepoint cleanup requires live IC stubs
>>>> to belong to is_alive() nmethods, so the IC stubs must be released before the next
>>>> safepoint for this race to not be dangerous. That is ensured, because the racing
>>>> sweeper will get stuck waiting for concurrent code cache unloading to finish,
>>>> and when it wakes up, it will clear_ic_callsites(), releasing the spurious ICStubs,
>>>> before finishing the make_zombie() function. Then it will eventually hit a safepoint
>>>> poll, when the nmethod has been purged from IC stubs.
>>>>
>>>> Bug:
>>>> https://bugs.openjdk.java.net/browse/JDK-8244416
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~eosterlund/8244416/webrev.00/
>>>>
>>>> Thanks,
>>>> /Erik
More information about the hotspot-dev
mailing list