[lworld] RFR: 8361352: [lworld] Weird interaction between OSR and value class instances [v2]
Marc Chevalier
mchevalier at openjdk.org
Mon Aug 25 09:11:06 UTC 2025
> First, credit to @TobiHartmann for the diagnostic, and a lot of the solution.
>
> # Diagnostic
>
> According to [Strict Field Initialization JEP](https://openjdk.org/jeps/8350458), when a strict field is being initialized, it is not quite immutable, but observally immutable: at first, the field can be only set (during the early larval phase), then it can be only read (late larval or initialized phase), so the last set is the actual value one can ever observe.
>
> The interesting part is that in early larval phase, a field can be subject to some side effects. When applied to a value object, that means that until it reaches the unrestricted state, it is not yet immutable. While being not theoretically necessary, avoiding scalarization and keeping the value object behind a reference is a convenient way to make sure that side effects are correctly applied. This strategy means that we shouldn't scalarized before reaching the unrestricted state. Normally, in C2, finding out what is early larval or not is the job of bytecode parsing, but in OSR compilation, everything about the StartOSR is not parsed, and thus some objects are soundly assumed that they might be larval, when they actually aren't. In the reported example, that leads to drastic performance difference between OSR and non OSR compilation: the second one is able to eliminate allocations since it knows more precisely when the value object can be scalarized.
>
> In the original example:
>
> public value class MyNumber {
> private long d0;
> private MyNumber(long d0) { this.d0 = d0; }
> public MyNumber add(long v) { return new MyNumber(d0 + v); }
>
> private static void loop() {
> MyNumber dec = new MyNumber(123);
> for (int i = 0; i < 1_000_000_000; ++i) {
> dec = dec.add(i);
> }
> }
>
> public static void main(String[] args) {
> for (int i = 0; i < 10; ++i) {
> loop();
> }
> }
> }
>
> OSR happens in the loop in `loop`, but here `dec` is not detected to be unrestricted (so immutable, so scalarizable), so the allocation in inlined `add` still needs to happen because we need the buffer for the new `dec`. The first iteration traps at the exit of the loop (unstable if), OSR happens again, followed by a non-OSR compilation, finding correctly that `dec` can be scalarized in the loop, making the third iteration fast.
>
> # Solution
>
> Overall, the solution requires to improve our detection of early larval values. Since we keep parsing as-is, let's do that in `ciTypeFlow`. Th...
Marc Chevalier has updated the pull request incrementally with two additional commits since the last revision:
- Rephrase summary
- +copyright
-------------
Changes:
- all: https://git.openjdk.org/valhalla/pull/1531/files
- new: https://git.openjdk.org/valhalla/pull/1531/files/acce992f..183dfe97
Webrevs:
- full: https://webrevs.openjdk.org/?repo=valhalla&pr=1531&range=01
- incr: https://webrevs.openjdk.org/?repo=valhalla&pr=1531&range=00-01
Stats: 28 lines in 2 files changed: 25 ins; 0 del; 3 mod
Patch: https://git.openjdk.org/valhalla/pull/1531.diff
Fetch: git fetch https://git.openjdk.org/valhalla.git pull/1531/head:pull/1531
PR: https://git.openjdk.org/valhalla/pull/1531
More information about the valhalla-dev
mailing list