The last miles

Mon Aug 21 17:39:03 UTC 2023

> On Aug 18, 2023, at 9:15 PM, John Rose <john.r.rose at oracle.com> wrote:
> 
> I’ve written up in detail how I think Remi’s suggestion can work.
> 
> https://cr.openjdk.org/~jrose/values/larval-values.html
> 
> While this is a rough note, I think all the details are present.

The compatibility wins of this strategy do seem nice. But let me scrutinize a few details, because I think there are some trade-offs:

1) A larval value object is an identity object. This means, in the hand-off between the <init> method and the caller, the object must be heap allocated: the caller and the <init> method need an agreed-upon memory location where state will be set up.

I can see this being optimized away if the <init> method can be inlined. But if not (e.g., the constructor logic is sufficiently complex/large), that's a new cost for value object creation: every 'new' needs a heap allocation. (For <vnew>, we are able to optimize the return value calling convention without needing inlining.)

Am I understanding this correctly? How concerned should we be about the extra allocation cost? (Our working principle to this point has been that optimal compiled code should have zero heap allocations.)
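
To make the hand-off concrete, here is a minimal sketch using an ordinary identity class in today's Java. The names (HandOff, Point, make) are hypothetical, and the bytecode in the comments is only the usual shape javac emits for 'new', with constant pool indices elided. As I understand the proposal, the reference pushed by 'new' is the agreed-upon location that the <init> method writes through:

```java
public class HandOff {
    // Hypothetical class for illustration; an ordinary identity class today.
    public static final class Point {
        public final int x;
        public final int y;

        public Point(int x, int y) {
            // <init> receives the not-yet-initialized object as 'this'
            // and stores the field values into it.
            this.x = x;
            this.y = y;
        }
    }

    public static Point make() {
        // Caller side, roughly:
        //   new Point                        // allocate, push uninitialized ref
        //   dup
        //   iconst_1
        //   iconst_2
        //   invokespecial Point.<init>(II)V  // callee fills in the fields
        //   areturn
        return new Point(1, 2);
    }
}
```

If <init> is not inlined, the caller and callee share only that reference, which is why I expect an allocation in that case.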

2) If we *do* inline the <init> call, then at the call site, there can be any number of references from locals/stack to the larval value, and at the end of the call, there's this unusual operation where all of those locals/stack get transformed into the value object. I *think* this all just falls out cleanly (locals become compiler metadata that bottoms out at the same registers, no matter how many references there are), but it's something to think carefully about.

3) The <vnew> approach doesn't have any constraints about leaking 'this'. In particular, the javac rule we were envisioning is that the constructor can't leak 'this' until all fields are provably set, but afterwards it's fair game. This <init> strategy is stricter: the verifier disallows leaking 'this' at all, from any point in the constructor.

Are we okay with these restrictions? In practice, this is most likely to trip up people trying to do instance method calls, plus those who are doing things like keeping track of constructed objects. (Even printf logging seems tricky, since 'toString' is off limits.)
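
Here is a sketch of the patterns I have in mind, written as an ordinary identity class so that it compiles and runs today. The names (LeakingThis, Range, CREATED, check) are hypothetical. My assumption is that if Range were a value class under the <init> strategy, each of the three marked lines would be rejected, even though every field is already set at that point:

```java
import java.util.ArrayList;
import java.util.List;

public class LeakingThis {
    // Hypothetical registry that keeps track of constructed objects.
    public static final List<Range> CREATED = new ArrayList<>();

    public static final class Range {
        public final int lo;
        public final int hi;

        public Range(int lo, int hi) {
            this.lo = lo;
            this.hi = hi;
            // All fields are set from here on, so the envisioned javac rule
            // for <vnew> would allow everything below.
            check();                               // leak 1: instance method call
            CREATED.add(this);                     // leak 2: tracking constructed objects
            System.out.printf("made %s%n", this);  // leak 3: logging via toString
        }

        void check() {
            if (lo > hi) {
                throw new IllegalArgumentException("lo > hi");
            }
        }

        @Override
        public String toString() {
            return "[" + lo + ", " + hi + "]";
        }
    }
}
```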

4) I'm not sure the prohibition on 'super' calls is actually necessary. What if, instead, all non-'identity' <init> methods are understood to be working on larval objects, and prohibited from any leaking of 'this'? Instead of disallowing 'super' calls, the verifier would only transition from 'uninitializedThis' to 'LFoo;' in an identity class constructor. Does that make sense or am I missing something? (If it works, does this mean we get support for super fields "for free"?)

5) Do we really need a header state for larval objects? We don't do anything like that to distinguish between uninitialized identity objects (post-'new') and valid identity objects (post-'super()'). We just let the verifier handle it. Same principle here perhaps?
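
For reference, this is the existing behavior I mean, sketched with hypothetical classes (VerifierStates, Base, Derived). The verifier types 'this' as 'uninitializedThis' until the 'super(...)' call returns, and only then as the class type. Nothing in the object itself records which state it is in:

```java
public class VerifierStates {
    public static class Base {
        public final String tag;

        public Base(String tag) {
            this.tag = tag;
        }
    }

    public static final class Derived extends Base {
        public Derived(int n) {
            // Before super(...): 'this' has verifier type uninitializedThis.
            // javac rejects uses such as super(this.toString()) here.
            super("n=" + n);
            // After super(...): 'this' has verifier type Derived and may be
            // used freely. The transition is tracked statically by the
            // verifier, not by a bit in the object header.
        }
    }
}
```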