Owner specialization at callsite ?

Sat Feb 18 10:09:37 UTC 2023

I'm trying to implement owner specialization at the callsite, but I struggle to see the benefits; worse, I see a lot of drawbacks.

Currently, at the callsite, I've implemented specialized generics either when instantiating a parametric generic class
  new ArrayList<String>()

or when calling a parametric method
  List.<String>of()

But the Parametric VM spec goes a step further and also asks that the owner of a method call can be specialized,
for example, with
  ArrayList<String> list = new ArrayList<String>();
  list.add("foo");

In the bytecode, list.add("foo") should be typed as ArrayList<String>.add(String) and not ArrayList.add(Object).

I have several concerns:
- I do not see how to implement that without the VM knowing the exact semantics of Java generics, making Kotlin or Scala second-class citizens,
- it's not a backward compatible change, a lot of code will start to throw ClassCastException at runtime,
- this will introduce performance regressions.

Implementing owner specialization at the callsite means that we must be able to check at runtime the specialization of a generic class with instanceof/checkcast,
something like

  Object object = ...
  List<String> list = (List<String>) object;

It means being able to compare the specialization of the instance referenced by "object" with List<String>, so it's a class check between two specialized classes.
But those classes may be produced by different compilers, with different ways of storing the specialized parameters.
So either all languages share a kind of common semantics (behave like Java) or there is no way to answer that question.
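
For comparison, here is a minimal sketch of what the same cast does today under erasure. The class and method names (ErasedCastDemo, sameRuntimeClass, uncheckedCastSucceeds) are illustrative only, not part of any spec.

```java
import java.util.ArrayList;
import java.util.List;

public class ErasedCastDemo {
    // Under erasure both lists share one runtime class,
    // so there is no specialization to compare.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> ints = new ArrayList<Integer>();
        return strings.getClass() == ints.getClass();
    }

    // The unchecked cast compiles to a checkcast against List only;
    // the type argument is never verified at runtime.
    @SuppressWarnings("unchecked")
    static boolean uncheckedCastSucceeds() {
        Object object = new ArrayList<Integer>(List.of(1, 2));
        List<String> list = (List<String>) object; // no ClassCastException today
        return list.size() == 2;
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(uncheckedCastSucceeds());
    }
}
```

Both methods return true on a current JVM. With owner specialization, the second one is exactly the kind of code that would have to start failing.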

Worse, a lot of code will start to fail because it relies on erasure;
a good example is the code behind List.copyOf() in the JDK

    static <E> List<E> listCopy(Collection<? extends E> coll) {
        if (coll instanceof List12 || (coll instanceof ListN<?> c && !c.allowNulls)) {
            return (List<E>)coll;
        } else if (coll.isEmpty()) { // implicit nullcheck of coll
            return List.of();
        } else {
            return (List<E>)List.of(coll.toArray());
        }
    }

The first cast is safe because List12 and ListN are covariant (the implementation is unmodifiable), something you currently cannot express in Java.
If this method is specialized and the cast does a check at runtime, List.<Object>copyOf(List.<String>of("foo")) will throw a CCE.
The second cast will never work: List.of(coll.toArray()) creates a List<Object> at runtime, so the cast (List<E>) will always fail unless E is Object.
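
A small sketch of how the first cast behaves today. The class name CopyOfDemo and its method are made up for illustration, and the identity result is an implementation behavior of current OpenJDK builds, not a guarantee of the List.copyOf specification.

```java
import java.util.List;

public class CopyOfDemo {
    // Returns true when List.copyOf hands back the very same instance,
    // i.e. when the (List<E>) cast on the argument went through unchecked.
    static boolean returnsSameInstance() {
        List<String> strings = List.<String>of("foo");
        List<Object> objects = List.<Object>copyOf(strings);
        // Compare through Object: List<Object> and List<String> are not
        // directly comparable with == at compile time.
        return (Object) objects == (Object) strings;
    }

    public static void main(String[] args) {
        System.out.println(returnsSameInstance());
    }
}
```

On current OpenJDK this is expected to print true: a List<String> instance is returned as a List<Object>, which only works because the cast is erased.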

Whatever exact semantics is chosen for the cast, the VM now has to verify those casts, which will introduce performance regressions in existing code,
even if in the end the underlying array of ArrayList<String> and ArrayList<Object> is an array of pointers.
I'm fine with a perf regression if the underlying array is an array of a zero-default value class, because there we are introducing a new feature,
but not with a perf regression on existing code.

The more I think about it, the more it seems an intractable problem; I think we should embrace erasure instead of trying to fight it!
And we can have code specialization for value classes without specialization of the owner. It requires a little more tracking, because some method calls that were not virtual are now virtual, but this tracking can be gated behind the fact that a specialization with something other than Object has been created (a kind of CHA).
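
The gating idea can be mimicked at the library level with java.lang.invoke.SwitchPoint. This is only an analogy for what the VM would do internally, under the assumption that "no specialization other than Object exists" is a global, one-way invalidatable fact; SpecializationGate, erasedPath, trackedPath and onFirstSpecialization are hypothetical names.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.SwitchPoint;

public class SpecializationGate {
    // Hypothetical assumption: only the erased (Object) specialization exists so far.
    private final SwitchPoint onlyErased = new SwitchPoint();
    private final MethodHandle call;

    // Stand-ins for the untracked fast path and the path with extra tracking.
    static String erasedPath(Object o) { return "erased"; }
    static String trackedPath(Object o) { return "tracked"; }

    public SpecializationGate() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class, Object.class);
            MethodHandle fast = lookup.findStatic(SpecializationGate.class, "erasedPath", type);
            MethodHandle slow = lookup.findStatic(SpecializationGate.class, "trackedPath", type);
            // Use the fast path until the switch point is invalidated.
            call = onlyErased.guardWithTest(fast, slow);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical hook: called when the first non-Object specialization is created.
    public void onFirstSpecialization() {
        SwitchPoint.invalidateAll(new SwitchPoint[] { onlyErased });
    }

    public String invoke(Object o) {
        try {
            return (String) call.invokeExact(o);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        SpecializationGate gate = new SpecializationGate();
        System.out.println(gate.invoke("foo")); // fast path
        gate.onFirstSpecialization();
        System.out.println(gate.invoke("foo")); // fallback path
    }
}
```

Existing code that never creates a non-Object specialization stays on the first path and pays nothing for the tracking.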

Rémi