RFR: 8313248: C2: setScopedValueCache intrinsic exposes nullptr pre-values to store barriers [v2]
Tobias Hartmann
thartmann at openjdk.org
Wed Aug 2 08:00:13 UTC 2023
On Tue, 1 Aug 2023 16:10:27 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:
>> See the bug for investigation breadcrumbs. The root cause for failures seen with Shenandoah seem to be as follows.
>>
>> The setter (`setScopedValueCache`) intrinsic passes `val_type` of `_gvn.type(arr)`, which is `narrowoop: java/lang/Object *[int:32] (java/lang/Cloneable,java/io/Serializable):NotNull:exact *`, derived from the `argument(0)`, and thus implies non-nullity.
>>
>> So when Shenandoah's SATB barrier loads the `pre_val`, it folds the null-check, assuming the `pre_val` is not null, due to `val_type`. This passes `nullptr` to SATB queues or slowpath, and we crash in either queue filtering or barrier code that does not expect nullptrs on SATB paths. The getter (`scopedValueCache`) constructs the `objects_type` explicitly to imply the value can be null. I think we should do the same for setter, since it can hide the "getter" from SATB barrier inside of it.
>>
>> Arguably, it is a landmine that GC barriers assume the `val_type` is the type of both stored value and the pre-value read from memory. So the non-null-ness derived for stored value gets used to reason for non-null-ness for pre-value. We can explore the solutions to that generic problem after we plug this leak. Other `access_store_at` uses in C2 intrinsics seem to only operate on thread fields that are not null, so the are not susceptible to this problem. `scopedValueCache` is a notable exception of lazily initialized thread OopHandle accessed from C2.
>>
>> I think G1 SATB barriers have the same problem, but I have not tried to reproduce the failure very hard there. (It would, AFAIU, require writing the test which does G1 concurrent marks, not just young GCs.)
>>
>> Attn @theRealAph ;)
>>
>> Additional testing:
>> - [x] Linux x86_64 fastdebug, 10+ iterations of `java/lang/ScopedValue/StressStackOverflow.java` with Shenandoah
>> - [x] Linux x86_64 fastdebug, `hotspot_loom jdk_loom` with Shenandoah
>> - [x] Linux x86_64 fastdebug, `hotspot_loom jdk_loom` with G1
>> - [ ] Linux AArch64 fastdebug, `tier1 tier2 tier3`
>
> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision:
>
> - Merge branch 'master' into JDK-8313248-shenandoah-nullcheck
> - Proper fix
> - Trying to pin more
> - Reverts
> - Debugging
Looks good to me.
src/hotspot/share/opto/library_call.cpp line 3591:
> 3589: const Type* LibraryCallKit::scopedValueCache_type() {
> 3590: ciKlass *objects_klass = ciObjArrayKlass::make(env()->Object_klass());
> 3591: const TypeOopPtr *etype = TypeOopPtr::make_from_klass(env()->Object_klass());
Suggestion:
ciKlass* objects_klass = ciObjArrayKlass::make(env()->Object_klass());
const TypeOopPtr* etype = TypeOopPtr::make_from_klass(env()->Object_klass());
-------------
Marked as reviewed by thartmann (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/15105#pullrequestreview-1558405243
PR Review Comment: https://git.openjdk.org/jdk/pull/15105#discussion_r1281526194
More information about the hotspot-compiler-dev
mailing list