RFR: 8303512: Race condition when computing is_loaded property of TypePtr::InterfaceSet
Quan Anh Mai
qamai at openjdk.org
Mon May 8 15:55:43 UTC 2023
On Mon, 8 May 2023 14:52:21 GMT, Tobias Hartmann <thartmann at openjdk.org> wrote:
> [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933) introduced a race condition among compiler threads computing `TypePtr::InterfaceSet::_is_loaded` for shared types (for example, for [TypeInstPtr::NOTNULL](https://github.com/openjdk/jdk/blob/5726d31e56530bbe7dee61ae04b126e20cb3611d/src/hotspot/share/opto/type.cpp#L545)). One thread can set `_is_loaded_computed` before setting `_is_loaded`:
>
> https://github.com/openjdk/jdk/blob/5726d31e56530bbe7dee61ae04b126e20cb3611d/src/hotspot/share/opto/type.cpp#L3473-L3482
>
> while another thread can already access `is_loaded()` and wrongly observe `_is_loaded = false`:
>
> https://github.com/openjdk/jdk/blob/5726d31e56530bbe7dee61ae04b126e20cb3611d/src/hotspot/share/opto/type.cpp#L3464-L3468
>
> As a result, the klass of a `TypePtr` can be loaded while the interfaces it implements appear to be not loaded.
>
> Another problem is that `TypePtr::InterfaceSet::eq` does not take the `_is_loaded` / `_is_loaded_computed` fields into account:
>
> https://github.com/openjdk/jdk/blob/5726d31e56530bbe7dee61ae04b126e20cb3611d/src/hotspot/share/opto/type.cpp#L3290-L3301
>
> A loaded type can therefore be replaced by an unloaded type during GVN.
>
> In the case of the failure reported by this bug, `LibraryCallKit::inline_native_hashcode` first null checks the receiver and updates the type. Due to the issues described above, the null-free type is GVN'ed with it's unloaded counterpart and propagated to another, redundant null check emitted by `LibraryCallKit::generate_method_call` (I filed [JDK-8307625](https://bugs.openjdk.org/browse/JDK-8307625) to remove it). Since the type is now unloaded, an uncommon trap is emitted and parsing is stopped(). We crash when trying to de-reference `GraphKit::_map->_jvms` which is NULL (in debug we would hit an assert).
>
> Another failure mode is reported by [JDK-8305339](https://bugs.openjdk.org/browse/JDK-8305339). Both failures are extremely intermittent and I was never able to reproduce them.
>
> The fix I propose is to completely remove the `_is_loaded` logic because if a klass is loaded, the interface that it implements should be loaded as well. I added an assert to verify this. Higher tier testing is still running.
>
> In addition, I noticed some `#ifdef DEBUG` checks where `DEBUG` is never defined. I replaced them by `#ifdef ASSERT` and fixed the verification code that failed to build.
>
> Thanks,
> Tobias
It seems to me that if 2 threads access a variable racily without using atomic, it results in undefined behaviours. This patch removes the logic and therefore removes the race condition. Is there any risk with that in the surrounding code, too? I see you reordered `_hash_computed` and `_exact_klass_computed` assignment, if these reorders matter, should they be at least releasing stores instead? Thanks a lot.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/13868#issuecomment-1538632666
More information about the hotspot-compiler-dev
mailing list