Idea how to implement VT/VO compatibility in JVM

Mon Jan 12 15:57:10 UTC 2015

On Fri, Jan 09, 2015 at 02:12:10PM -0500, Brian Goetz wrote:
> 
> The hard part is: the entire VM field-access model is built around
> translating field accesses like
> 
>   getfield Foo.x [ receiver ]
> 
> to native code that looks like:
> 
>   (ideally elided) verify receiver is a Foo
>   result = *(receiver + sizeof(HEADER) + offsetof(x))
> 
> based on the assumption that all instances of Foo have Foo's fields
> at the same offsets.  (For methods we achieve polymorphism with
> vtables; for fields, we achieve it by adding subtype fields to the
> end of base type fields.)
> 
> But, if I have a class Box<any T> with a single T-valued field, I
> need different layouts; one for erased refs, one for ints, a longer
> one for longs, etc.  (And of course this perturbs the offset of
> other fields.)
> 
> Code that is generic in T that wants to access Box<T>.t now has to
> deal with different layouts; this is a pretty big change to our
> compilation story (with consequences for performance.)

Is this really true? I thought that because there's no instantiation of the
any type, every any-generic method has to be specialised anyway, so there
isn't really generic code that abstracts over both Object-generic and
value-generic.

To me, the compromise that we can't write code that iterates over, say, a
list without knowing at compile time what type of value-generics it may
hold is one that will haunt Java forever. I understand how you came to
that compromise, and I understand that you can push the issue to the
caller by making the iteration code Any-generic, but that only works
(like in C++) for compile-time instantiations: if you write a framework
that needs to traverse lists of any type, including any sort of value
type, then you can't. Unless I got this wrong, in which case I
apologise, but please explain how ;)

This goes back to the fact that the division between List<Integer> and
List<int> is a terrible compromise. It will be even worse when you have
a List<@ValueType Date> and you want to iterate over it as if it were a
List<Object>, which it isn't. This is especially confusing since there is
autoboxing for both Java primitives and value types.

The idea of specialisation is good, clearly, and it doesn't work for fields, OK.
But why not say that List<@ValueType Date> is also a List<Date> (boxed values)?
Not as subtyping, just as type equivalence. I know you can't implement the same
interface twice with different type arguments in Java, but AFAICT the VM doesn't
actually care that much (I say this with experimental evidence). You could get
an instance of the List class that has bridge methods for all the boxed methods
that would get called when using the List<Date> or List<Object> type (boxed calls)
which would forward to the specialised methods for List<@ValueType Date> (unboxed
calls). And call-sites that know we're dealing with a List<@ValueType Date> would
call the specialised methods directly and not box.
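
A minimal sketch of the bridge idea in today's Java, for illustration only.
Since value types don't exist yet, `long` stands in for the unboxed value
type and `Long` for its boxed form. `BoxedList`, `LongList`, `getLong` and
`addLong` are hypothetical names, not part of any prototype, and a real
implementation would have the VM or compiler generate the bridges rather
than have them written by hand.

```java
import java.util.Arrays;

// Hypothetical boxed view: what a caller typed against List<Date> or
// List<Object> would see.
interface BoxedList<T> {
    T get(int index);
    void add(T element);
    int size();
}

// Hypothetical specialised class: long[] storage, unboxed methods,
// plus boxed bridge methods that forward to the unboxed ones.
final class LongList implements BoxedList<Long> {
    private long[] elements = new long[4]; // private: layout never leaks to callers
    private int size;

    // Specialised (unboxed) methods, for call sites that know the exact type.
    long getLong(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return elements[index];
    }

    void addLong(long element) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2);
        }
        elements[size++] = element;
    }

    // Bridge methods: box or unbox, then forward to the specialised methods.
    @Override public Long get(int index) { return getLong(index); }
    @Override public void add(Long element) { addLong(element); }
    @Override public int size() { return size; }
}

class BridgeDemo {
    // Framework-style code that only knows the boxed view.
    static long sumBoxed(BoxedList<? extends Number> list) {
        long total = 0;
        for (int i = 0; i < list.size(); i++) {
            total += list.get(i).longValue();
        }
        return total;
    }

    public static void main(String[] args) {
        LongList list = new LongList();
        list.addLong(40L);   // unboxed call, no allocation
        list.add(2L);        // boxed call through the bridge
        System.out.println(sumBoxed(list));
    }
}
```

Here `sumBoxed` never learns that the storage is a `long[]`, while code that
holds a `LongList` can skip boxing entirely by calling `getLong` directly.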

This clearly doesn't work with fields, but so what? Add a language limitation that
fields of Any-generic types must be private. Fields are not virtual so this would
work with specialised types. IMO this restriction would be acceptable for collections
and much less of a compromise than the split between Object-generics and Value-generics.
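
To illustrate what "fields are not virtual" means in practice, here is a
small example in plain Java (the class names are made up for the
illustration). A field access is resolved against the static type of the
receiver, while a method call is dispatched on the runtime class, which is
why methods can be bridged but fields cannot.

```java
class Base {
    String field = "Base.field";
    String method() { return "Base.method"; }
}

class Derived extends Base {
    String field = "Derived.field";   // hides Base.field, does not override it
    @Override String method() { return "Derived.method"; }
}

class FieldDispatchDemo {
    public static void main(String[] args) {
        Base b = new Derived();
        System.out.println(b.field);     // prints "Base.field": chosen by static type
        System.out.println(b.method());  // prints "Derived.method": chosen at runtime
    }
}
```

If the fields of an Any-generic class are private, the only code that ever
touches them is the class's own code, which is specialised together with
the layout, so no outside caller depends on a particular field offset.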

OK, this may not be retrofittable to JVM primitives and List<int>/List<Integer>, depending
on how feasible it is to substitute value types for JVM boxed primitives (in particular,
I know that not being able to lock boxed value types may exclude JVM boxed primitives from
that evolution), but at least it would be possible to make it work for other value types.

Now, perhaps I'm out of my depth here. I'm not a VM guy: I understand
compilation, bytecode, memory and assembly, but I'm clearly not as much of
an expert as you guys. Still, I'd like to understand why my proposal would not
work.

Thanks!
-- 
Stéphane Epardaud