aarch64: Concurrent class unloading, nmethod barriers, ZGC

Thu Jan 9 22:35:12 UTC 2020

Hi Stuart and Andrew,

Right, when it comes to native wrappers, we do inject entry barriers for
them on x86. The main reason is that I am allergic to "special"
nmethods that you have to remember work differently all the time. We
have too many of them. The only nmethod that regrettably does not have
entry barriers is the method handle intrinsic. That seems fine, but I'm
not quite happy about it.

Other than that, we also do need the barriers for correctness. Last time 
I thought about that, I recall there were a few problematic hypothetical 
situations I wanted to avoid. For example, consider the following 
obscure race condition (suitable beverage while reading advised):

1. Load abstract class A with non-static method foo.
2. Load class B, inheriting from A, from a separate class loader, 
overriding foo with a native method (that gets a native wrapper).
3. JIT-compile an nmethod with a virtual call to A.foo. Using CHA, the
compiler will decide that there is only a single concrete foo
implementation in the system (B.foo), because A has a single
implementor, and its foo turns out to be our native wrapper. When this
happens, an optimized virtual call is generated with a direct call
emitted (originally pointing at a resolution stub for the very first
call), but the holder oop of B (its class loader) is not inserted into
the oop section. Instead, an entry is added to the dependency context to
keep track of this nmethod, so that the caller nmethod (calling the
native wrapper) can get deoptimized if the assumption of a unique callee
for A.foo changes.
4. Invoke the JIT-compiled call to A.foo with an instance of B, resolve
it, and patch the direct call to point at the native wrapper (the B.foo
*verified* entry, because this is an optimized virtual call).
5. Release the reference to the class loader of B, and wait until the 
class loader dies, and hence B dies.
6. Before concurrent class unloading kicks in and walks the dependency
contexts of dead things to invalidate them (which would invalidate the
caller nmethod), load a class C that also inherits from A and overrides
foo with a concrete implementation. When loading that class, the
dependency context walk that invalidates e.g. CHA inconsistencies skips
over the is_unloading() nmethods (including the native wrapper), because
race conditions ended up giving that responsibility to the concurrent
GC thread (which has not gotten to it yet).
7. Reuse the same JIT-compiled virtual call of A.foo but pass in a new 
instance of C. The state of the callsite is now a direct call to B.foo, 
and it's about to get deoptimized, but isn't yet. But B.foo 
is_unloading() because B is dead, making the one oop of the native 
wrapper (the holder oop of B) dead, and hence the native wrapper 
is_unloading().
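
For reference, the class shapes in steps 1, 2 and 6 and the call site in
steps 3, 4 and 7 could look roughly like the sketch below. All names
(ChaRaceShapes, callFoo) are made up for illustration. The sketch keeps
everything in one file, so it does not reproduce the separate class
loader for B, and it supplies no native library, so it never calls B.foo;
it only shows the shapes, not the race itself.

```java
public class ChaRaceShapes {
    // Step 1: abstract class A with a non-static method foo.
    public static abstract class A {
        public abstract int foo();
    }

    // Step 2: B overrides foo with a native method (which gets a native wrapper).
    // In the scenario B comes from a separate class loader; it is nested here
    // only to keep the sketch in one file. No native library is provided,
    // so this sketch must not call B.foo.
    public static final class B extends A {
        @Override
        public native int foo();
    }

    // Step 6: C is loaded later and adds a second concrete implementation of foo.
    public static final class C extends A {
        @Override
        public int foo() {
            return 42;
        }
    }

    // Steps 3, 4 and 7: the one virtual call to A.foo that the JIT may turn
    // into a direct call while B is the only implementor of A.
    public static int callFoo(A receiver) {
        return receiver.foo();
    }

    public static void main(String[] args) {
        // Step 7: the same call site is reused with an instance of C.
        System.out.println(callFoo(new C()));
    }
}
```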

Now in this scenario, without an nmethod entry barrier, we can end up
calling a dead method. The nmethod entry barrier guards against that by
enforcing the invariant that we can't enter dead nmethods.
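
As a toy model of that invariant (plain Java, not HotSpot code), the
sketch below uses a hypothetical ToyNMethod with a flag standing in for
is_unloading(). What happens after entry is refused (here the caller is
simply told to "re-resolve") is an assumption of the model, not something
stated above.

```java
public class EntryBarrierModel {
    /** Toy stand-in for an nmethod; the fields are illustrative, not HotSpot's. */
    public static final class ToyNMethod {
        private final String name;
        private volatile boolean unloading; // models is_unloading()

        public ToyNMethod(String name) {
            this.name = name;
        }

        /** Models the holder of this nmethod dying (step 5 of the scenario). */
        public void markUnloading() {
            unloading = true;
        }

        /**
         * Models a call arriving at the nmethod's entry point.
         * With the barrier enabled, a dead nmethod refuses entry;
         * without it, the dead code is entered.
         */
        public String enter(boolean entryBarrier) {
            if (entryBarrier && unloading) {
                return "re-resolve"; // assumed outcome: caller must look up a new target
            }
            return unloading ? "ran dead " + name : "ran " + name;
        }
    }

    public static void main(String[] args) {
        ToyNMethod wrapper = new ToyNMethod("B.foo native wrapper");
        System.out.println(wrapper.enter(true));  // live nmethod: entered normally
        wrapper.markUnloading();                  // B's class loader has died
        System.out.println(wrapper.enter(false)); // no barrier: dead code is entered
        System.out.println(wrapper.enter(true));  // barrier: entry is refused
    }
}
```

The point of the model is only that the check sits in the callee, so it
holds no matter which stale call site still points at the wrapper.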

Hope this makes sense and helps explain why the native wrapper
ought to have an entry barrier.

Thanks,
/Erik

On 2020-01-09 17:02, Stuart Monteith wrote:
> Thank you Andrew, that compiles and runs without error - the
> deoptimize method is definitely being provoked, and execution
> continues without apparent problems.
>
> I've been trying to insert constants, and the issue you mention is
> tripped when we enter a native method wrapper. Erik can perhaps
> correct me, but I presume we might have to deoptimise a native method
> if it was overriding a JIT-compiled method and it has subsequently been
> unloaded.
>
> On x86 it is inserted here:
>    http://hg.openjdk.java.net/jdk/jdk/file/6d23020e3da0/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp#l2204
> On aarch64 I added the change in our generate_native_wrapper.
>
>
> BR,
>     Stuart
>
>
> On Wed, 8 Jan 2020 at 15:37, Andrew Haley <aph at redhat.com> wrote:
>> On 1/8/20 2:23 PM, Stuart Monteith wrote:
>>>   I see there is LIR_Assembler::int_constant, which is only for C1, the
>>> equivalent is MacroAssembler::ldr_constant, which uses an
>>> InternalAddress.
>> There is MacroAssembler::int_constant(n). It is there, and it returns an
>> address that you can use with ADR and/or LDR. It won't work with a native
>> method because they have no constant pool (int_constant() will return NULL)
>> but I don't think you need barriers for native methods.
>>
>> (Um, perhaps you do, for synchronized ones? They have a reference to a class.)
>>
>> Anyway, this is your patch with a working (probably) deoptimize handler:
>>
>> http://cr.openjdk.java.net/~aph/aarch64-jdk-nmethod-barriers-3.patch
>>
>> --
>> Andrew Haley  (he/him)
>> Java Platform Lead Engineer
>> Red Hat UK Ltd. <https://www.redhat.com>
>> https://keybase.io/andrewhaley
>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>>