RFR: 8313248: C2: setScopedValueCache intrinsic exposes nullptr pre-values to store barriers
Aleksey Shipilev
shade at openjdk.org
Tue Aug 1 12:26:56 UTC 2023
See the bug for investigation breadcrumbs. The root cause for failures seen with Shenandoah seems to be as follows.
The setter (`setScopedValueCache`) intrinsic passes `val_type` of `_gvn.type(arr)`, which is `narrowoop: java/lang/Object *[int:32] (java/lang/Cloneable,java/io/Serializable):NotNull:exact *`, derived from the `argument(0)`, and thus implies non-nullity.
So when Shenandoah's SATB barrier loads the `pre_val`, it folds the null-check, assuming the `pre_val` is not null, due to `val_type`. This passes `nullptr` to the SATB queues or the slowpath, and we crash in either queue filtering or barrier code that does not expect nullptrs on SATB paths. The getter (`scopedValueCache`) constructs the `objects_type` explicitly to imply the value can be null. I think we should do the same for the setter, since the setter effectively hides a "getter" inside of it: the SATB barrier's load of the previous value.
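To make the failure mode concrete, here is a toy C++ model of it. It is only a sketch: `Obj`, `ValType`, `SatbQueue` and `store_with_pre_barrier` are made-up names for illustration, not HotSpot or Shenandoah code, and the "folding" is simulated with a runtime flag rather than by a compiler.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins; none of these are HotSpot types.
struct Obj { int payload; };

// Hypothetical model of the type knowledge attached to a store.
struct ValType { bool maybe_null; };

// Hypothetical SATB queue that, like the real one, expects non-null entries.
struct SatbQueue {
    std::vector<const Obj*> entries;
    void enqueue(const Obj* o) { entries.push_back(o); }
    bool has_null() const {
        for (const Obj* e : entries) {
            if (e == nullptr) return true;
        }
        return false;
    }
};

// Pre-write barrier: record the previous field value before overwriting it.
// The null check survives only when the type admits null, which mimics the
// compiler folding the check away for a NotNull type.
void store_with_pre_barrier(const Obj** field, const Obj* new_val,
                            ValType type, SatbQueue& q) {
    const Obj* pre_val = *field;            // the hidden "getter"
    bool check_folded = !type.maybe_null;   // NotNull type => check removed
    if (check_folded || pre_val != nullptr) {
        q.enqueue(pre_val);                 // may enqueue nullptr if folded
    }
    *field = new_val;
}
```

With a field that is still null (as a lazily initialized cache would be), a store typed as not-null enqueues `nullptr`, while the same store typed as maybe-null skips the enqueue.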
Arguably, it is a landmine that GC barriers assume the `val_type` is the type of both the stored value and the pre-value read from memory. So the non-null-ness derived for the stored value gets used to reason about non-null-ness of the pre-value. We can explore the solutions to that generic problem after we plug this leak. Other `access_store_at` uses in C2 intrinsics seem to only operate on thread fields that are not null, so they are not susceptible to this problem. `scopedValueCache` is a notable exception: a lazily initialized thread OopHandle accessed from C2.
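As a purely illustrative sketch of what decoupling the two types could look like (this is not something the PR proposes, and every name below is hypothetical), a store could carry separate nullness facts for the stored value and for the previous value, with the previous value conservatively defaulting to maybe-null:

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins; none of these are HotSpot types.
struct Node { int id; };

enum class Nullness { NotNull, MaybeNull };

// Hypothetical: nullness of the new value and of the old value kept apart.
struct StoreTypes {
    Nullness stored;    // what is known about the value being written
    Nullness previous;  // what is known about the value being overwritten
    explicit StoreTypes(Nullness s, Nullness p = Nullness::MaybeNull)
        : stored(s), previous(p) {}
};

struct MarkQueue { std::vector<const Node*> entries; };

// The pre-barrier consults only `previous`; `stored` never affects it.
void barriered_store(const Node** slot, const Node* new_val,
                     StoreTypes types, MarkQueue& q) {
    const Node* pre_val = *slot;
    bool need_check = (types.previous == Nullness::MaybeNull);
    if (!need_check || pre_val != nullptr) {
        q.entries.push_back(pre_val);
    }
    *slot = new_val;
}
```

In this model, a caller that knows only that the new value is non-null cannot accidentally remove the null check on the old value; it would have to assert `previous` as not-null explicitly.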
I think G1 SATB barriers have the same problem, but I have not tried to reproduce the failure very hard there. (It would, AFAIU, require writing the test which does G1 concurrent marks, not just young GCs.)
Attn @theRealAph ;)
Additional testing:
- [x] Linux x86_64 fastdebug, 10+ iterations of `java/lang/ScopedValue/StressStackOverflow.java` with Shenandoah
- [x] Linux x86_64 fastdebug, `hotspot_loom jdk_loom` with Shenandoah
- [x] Linux x86_64 fastdebug, `hotspot_loom jdk_loom` with G1
- [ ] Linux AArch64 fastdebug, `tier1 tier2 tier3`
-------------
Commit messages:
- Proper fix
- Trying to pin more
- Reverts
- Debugging
Changes: https://git.openjdk.org/jdk/pull/15105/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15105&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8313248
Stats: 20 lines in 2 files changed: 10 ins; 8 del; 2 mod
Patch: https://git.openjdk.org/jdk/pull/15105.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/15105/head:pull/15105
PR: https://git.openjdk.org/jdk/pull/15105
More information about the hotspot-compiler-dev mailing list