Valhalla EG meeting notes Sep 12 2018

Wed Sep 12 22:22:09 UTC 2018

Attendees: Remi, Brian, Tobi, Dan H, Dan S, Frederic, Karen

AI: Karen - future topic: lazy static field initialization

Corrections welcome.

Value Type Nullability offsite next week. Agenda:
Brian
  anti-agenda: let’s focus on the near term 
   e.g. not cover primitives or value-based-class migration
  goal: what can we provide in LW10?
     worst case for VT:
        support arrays, support erased generics with equivalent of boxing
        this would be useful
        what is stopping us
Remi: do we need value types for pattern matching?
Brian: value types would make pattern matching much easier
    simulate method returns with multiple values
    currently boxing on heap
    if we could provide value types, even if not fully optimized, that would be ok
    other Amber features - would like to try as value types
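
ed. note - a minimal sketch of the "multiple return values" point above, assuming a hypothetical DivMod holder class (the name and the method are illustrative, not from the meeting). Today the holder is an ordinary heap object with identity; the hope expressed above is that a value type could carry the same pair without that cost, even if not fully optimized at first.

```java
// Hypothetical example; DivMod and divMod are illustrative names only.
final class DivMod {
    final int quotient;
    final int remainder;

    DivMod(int quotient, int remainder) {
        this.quotient = quotient;
        this.remainder = remainder;
    }

    // Returns two results at once by wrapping them in a heap-allocated holder.
    static DivMod divMod(int a, int b) {
        return new DivMod(a / b, a % b);
    }

    public static void main(String[] args) {
        DivMod r = divMod(17, 5);
        System.out.println(r.quotient + " rem " + r.remainder);
    }
}
```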
Karen: if erased generics performance was the same as for Objects, would that be good enough long term?
Brian: Not good enough long term

Brian: We need:
  1. nullable values
  2. user story
     user abstraction for primitives was boxing, what do we tell the users about the model?
     to channel Kevin - two part type system was bad enough, do we have to have a three part type system?
     consider box: value class -> nullable value class
       believe this gives us maneuvering room for VBC migration later
Karen: Would it make sense to consider a primitive proposal that converts primitives to null-free value types
   (rather than the current nullable Object subtype?)
Brian: Worth considering. Need to resolve issues with auto boxing today: Object o = 3 auto-boxes to j.l.Integer.
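
ed. note - a small sketch of the auto boxing behavior mentioned above, using only today's Java. The class name AutoBoxDemo is made up for illustration.

```java
final class AutoBoxDemo {
    // The int literal is boxed before it is stored in the Object variable.
    static Object boxed() {
        Object o = 3;
        return o;
    }

    public static void main(String[] args) {
        // The runtime class of the stored value is java.lang.Integer.
        System.out.println(boxed().getClass().getName());
    }
}
```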

Remi: Glad to have the language level discussion, we have been avoiding those and we need this.
Brian: there are couplings between the language and vm
    e.g. MethodHandle.asType conversions - match the language behavior
   knows about boxing int->Integer
   Does Point also need a box?
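
ed. note - a sketch of the coupling described above, assuming nothing beyond the existing java.lang.invoke API. It adapts Integer.sum(int, int) to a handle that takes and returns Integer, so asType is the place where the int <-> Integer conversions are applied. The class and method names (AsTypeDemo, sumViaBoxedHandle) are illustrative.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class AsTypeDemo {
    static Object sumViaBoxedHandle(Integer a, Integer b) {
        try {
            // Original handle type: (int, int) -> int
            MethodHandle sum = MethodHandles.lookup().findStatic(
                    Integer.class, "sum",
                    MethodType.methodType(int.class, int.class, int.class));
            // Adapted handle type: (Integer, Integer) -> Integer.
            // asType unboxes the arguments and boxes the result.
            MethodHandle boxedSum = sum.asType(
                    MethodType.methodType(Integer.class, Integer.class, Integer.class));
            return boxedSum.invoke(a, b);
        } catch (Throwable t) {
            // Wrapped only to keep the sketch free of checked exceptions.
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumViaBoxedHandle(2, 3));
    }
}
```

If Point had a box, asType would presumably need an analogous rule for Point; that is the open question recorded above, not something this sketch answers.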
Frederic: what does the box provide today?
Brian: For primitives: the box supports
   1. store as a Object subtype
   2. nullability (works with erased generics)
   3. use a* bytecodes, e.g. if_acmpeq/if_acmpne
   4. sync/wait - note: this is “accidental identity”, may be ok to give up
   boxes currently provide stable identities
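
ed. note - a sketch of the four properties listed above, shown with java.lang.Integer as the box. All names (BoxPropertiesDemo and its methods) are illustrative; newer javac versions may warn about synchronizing on a value-based class, which is the "accidental identity" in question.

```java
import java.util.ArrayList;
import java.util.List;

final class BoxPropertiesDemo {
    // 1. A box is an Object subtype, so it can be stored as an Object.
    static Object asObject(int i) {
        Integer box = i;
        return box;
    }

    // 2. A box reference can be null, which erased generics rely on.
    static List<Integer> listWithNull() {
        List<Integer> list = new ArrayList<>();
        list.add(null);
        return list;
    }

    // 3. Two box references can be compared with ==, a reference comparison
    //    (the if_acmpeq/if_acmpne family of bytecodes).
    static boolean sameReference(Integer a, Integer b) {
        return a == b;
    }

    // 4. A box has a monitor, so code can synchronize on it.
    static int lockOn(Integer box) {
        synchronized (box) {
            return box;
        }
    }
}
```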

Frederic: In LW1 a value type provides everything here except identity
Brian: including null enforcement
Frederic: Clarify: in LW1 value types are nullable. Period.
    Containers can declare that they can contain null or not.
    value arrays are all non-nullable
    value types in LW1 are 1. subtypes of object, 2. support nullability and 3. support a* bytecodes
Brian: brought up a VBC example
Frederic: We want the language to support the distinction between nullable and null-free types.

Remi: Only need this for type arguments, not for all types, only for erased generics
Brian: also need this for arrays
Frederic: yes for arrays, it would be nice to have for all reference types, sufficient to support
   for value types now
Brian: Sufficient yes. Is this the minimal solution, i.e. least invasive? There is a huge follow-on cost.
  consider the box model
Remi: nullable value type is box of null free, provides vm semantics
Brian: can teach “asType” the transform
Remi: user level syntax - TBD. Brian: agreed.

Dan H: How much can we shrink the scope of the problem?
   e.g. VBC migration - can we give up? forever?
   e.g. erased generics
  can we constrain the problem?

Remi: remove VBC from equation.
Dan H: Brian said earlier ok to not solve VBC, but continues to bring it up.
    Erased generics issues: 1. null and 2. backing arrays
Brian: could decide values not interoperate with generics at all, but that would not give users the full benefit of value types
    Believe that users understand concept of a box as a related class, and it could help with VBC migration
Remi: Primitives are in the same class of problems, eventually. If we do NOT see primitive types as value types, reified generics will be complex.
Karen: primitives with a conversion to value types to use with reified generics. Not always value types - I would expect to get the performance we have today.

Frederic: LW1 uses NO boxes today. JIT could not optimize boxes, can we consider a model without boxes?
Brian: MVT, L&Q signatures were messy with boxes. With LWorld all those problems go away.
Frederic: Question of the number of types in the vm.
Remi: Two types in the language level and 1 in VM.
Karen: Goal of this exercise is:
   1) user model requirements to support erased generics - requires null and null-free references to value types
   2) JIT optimizations in general for value types (not for erased generics, but in general code and in future in reified generics) - depend on null-free guarantees.
Our goal: minimal user model disruption for maximal JIT optimization.
The way I read your email Remi - I thought you had user model disruption, but no information passed to the JIT, so no optimization benefit.
Remi: If inlining, the JIT has enough information.
Frederic: Actually with field and method signatures it makes a huge difference in potential JIT optimizations.
Remi: Erased generics should have ok performance
Frederic: We are talking about performance without generics - we want full optimization there.
Remi: what happens in LW1 if you send null to a value type generic parameter?
Frederic: LW1 supports nullable value types. We want to guarantee/enforce null-free vs. nullable distinction in the vm.
Remi: there are two kinds of entry points in the JIT’d code
Frederic: 2: i2c which does null checks and if null calls the interpreter, c2c - disallows nulls.
editor’s note: did you get the implication - if we see a null, you are stuck in the interpreter today because we all rely on dynamic checks.
Remi: For the vm, if you have a real nullable VT JIT can optimize/deopt/reopt
Frederic: this is brittle, lots of work, and uncertain
Remi: VM does not want nullable value types
Frederic: VM wants to be able to distinguish null-free vs. nullable value types, so 
for null-free we can optimize like qtypes and for nullable, we can get back to Object performance.

Brian: Original MVT. Got optimizations, cost was too high in terms of boxing for Object.
  What do we need to add back?
Remi: Not convinced we need to do something. Some programs pass null, some collections never see null.
Frederic: HashMap, List, Vectors, ArrayList - APIs - null has semantic meaning
   tried to generate code to handle both nullable and null-free types with single source, without continual null checks - couldn’t do it
Remi: only have to enforce at public APIs, not every line

Karen: Since we are speaking about Generic Specialization, I need to ask Brian for help with requirements.
In order to clarify what we need to do for Erased Generics, we do not need to have all the details of a Specialized Generics solution. However, we do need to have a model of direction that is viable.
We need to understand the transition backward compatibility requirements from Erased Generics to Specialized Generics so that we don’t make this harder for ourselves.

Brian: 1. Migration VBC->VT e.g. LocalDateTime
       2. EG -> SG migration
In a perfect world, we would provide source and binary compatibility.
  e.g. ArrayList: EG->SG
   1. author: opt-in: change code
   2. client: what do they have to do
   3. subclasses: what do they have to do
It is imperative that there be no flag day. We will see a mix and match of the timetable in practice.
We would like the migrations to not pull the carpet out from under clients and subclasses.
e.g. ArrayList<Optional> should still have the same meaning, still allow nulls.

Karen: Can we take two different examples please?
1. Source code that currently has no built-in nullability assumptions
2. Source code that currently has APIs in which null has semantic meaning.

Brian: During Model 1 exploration, explored JDK generic class author changes:
e.g. Conditional code - tried with #ifref for reference vs. value - got out of control quickly, e.g. with multiple type variables
For nullability: specializer could constant fold if (T==null) to false if we knew non-null
This allows writing generic code in a null friendly style.
e.g. ArrayList<T> if T is a String, include source logic: if (T != null)
specialize for ArrayList<int>, T never null: constant fold to false in the class rewriter/instantiator
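
ed. note - a sketch of what "null friendly style" generic code can look like, written in today's Java. NullFriendlyList is a made-up class, and the specializer itself is not shown; the comment only marks the branch that a hypothetical specialization over a never-null type could fold away.

```java
final class NullFriendlyList<T> {
    private final Object[] elements;

    @SafeVarargs
    NullFriendlyList(T... elements) {
        this.elements = elements.clone();
    }

    // One branch handles a null target, the other a non-null target.
    int indexOf(T target) {
        if (target == null) {
            // Dead code in a hypothetical specialization where T is never null.
            for (int i = 0; i < elements.length; i++) {
                if (elements[i] == null) return i;
            }
        } else {
            for (int i = 0; i < elements.length; i++) {
                if (target.equals(elements[i])) return i;
            }
        }
        return -1;
    }
}
```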

For examples with semantic nulls, e.g. Map.get()/put()
“Peeling”: peel into erased and generic layers
   default - all members in all layers. Add source annotations - which only make sense based on the type system, e.g. for references
   Put Map.get() “in jail”: only for an erased HashMap, not for HashMap over int
   add other methods

Karen: What happens if an existing erased client tries to use a jailed method?
Brian: Generated a matching signature in the erased client that throws a NSME or other exception,
  so all species have all methods
(ed. note - I think this means all species have all methods from the erased client, some species may have additional optional species-specific methods).
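
ed. note - a rough sketch of the "jailed method" idea as I understood it, hand-written in today's Java. Everything here is hypothetical: SpecializedIntMap stands in for a species of a map over int, and UnsupportedOperationException stands in for "NSME or other exception". It is not how the Model 1 prototype generated code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a map specialized over int.
final class SpecializedIntMap {
    private final Map<Integer, Integer> backing = new HashMap<>();

    void put(int key, int value) {
        backing.put(key, value);
    }

    // Species-specific method that does not need null as a sentinel.
    int getOrDefault(int key, int fallback) {
        return backing.getOrDefault(key, fallback);
    }

    // "Jailed": the erased-style signature still exists so that old
    // call sites resolve, but calling it on this species throws.
    Object get(Object key) {
        throw new UnsupportedOperationException(
                "get(Object) is only available on the erased species");
    }

    // Helper for the sketch: reports whether the jailed method throws.
    static boolean jailedGetThrows() {
        try {
            new SpecializedIntMap().get(1);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }
}
```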
Brian: Today TreeSet<T extends Comparable>. Javac only lets you instantiate a TreeSet over a Comparable, otherwise generates an exception.

Frederic: Map.put() - returns null if entry did not already exist - how handle?
Remi: Alternative - could return default value of the type, if you have reified generics
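
ed. note - a sketch of the null semantics under discussion, using today's HashMap. MapNullDemo and its method names are illustrative. getOrZero is only a stand-in for Remi's "return the default value" alternative, built on the existing getOrDefault; it assumes no key is mapped to null.

```java
import java.util.HashMap;
import java.util.Map;

final class MapNullDemo {
    // put returns the previous value, or null when there was no mapping.
    static Integer firstPut() {
        Map<String, Integer> m = new HashMap<>();
        return m.put("a", 1);
    }

    // With HashMap, a null result from get is ambiguous:
    // the key may be absent, or it may be mapped to null.
    static boolean nullIsAmbiguous() {
        Map<String, Integer> m = new HashMap<>();
        m.put("present", null);
        return m.get("present") == null && m.get("absent") == null
                && m.containsKey("present") && !m.containsKey("absent");
    }

    // Caller-supplied default instead of null (assumes no null values in m).
    static int getOrZero(Map<String, Integer> m, String key) {
        return m.getOrDefault(key, 0);
    }
}
```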
Brian: Known problems with migrating existing libraries
Karen: just to double-check - if you have an existing library that does not have null assumptions, do we expect the clients to continue to work if we move from EG->SG and have a newly created Class<null-free VT>?
Brian: This would be a good assumption.
Frederic: will the EG -> SG transition be explicit?
Brian: All transitions will be explicit
  e.g. ArrayList<> -> ArrayList<Any T>
  e.g. instantiation - goal is not to change
  e.g. client - goal - change code if you want a new behavior
goal: Existing instantiation should continue to work with an existing client - both unchanged.
EG->SG have to opt-in
VBC->VC have to opt-in

Brian: Back to boxes. Need to handle in source
Remi: I don’t think we need this, we can use class-dynamic
Brian: let’s not deep-dive right now

Karen: My concern: EG->SG, with a new instantiator, existing clients may not work.
They may get a compilation time exception. Existing binaries may get a runtime exception.
I think this will be worse if we support EG over null-free types.

Brian: Do not want LW10 to support Generic<!Point>, only nullable Point. We all agree!
That is why I asked Srikanth to reject value types for generic parameters today.
Remi: Need to provide EG with nullable VT.
Frederic: In LW1, there are lots of hacks that let you get null

Remi: Summarize
   we want an easy plan to get null-free/nullable VT in language/vm
   expensive to add general null-free/nullable types to language
Frederic: want the same name
Brian: We will discuss the details of “straightening the wire” as John says - working all the details end-to-end next week
   user model/language/class file/runtime

Remi: Need a backup plan: just for value types which is less invasive
  My earlier email was describing the least benefit plan if we did not need nullable value types in the language and not in the class file, always erase in vm

Brian: Actually we have a range of choices on how to do this.
Remi: MVT worked if you generated your own bytecode
Remi: Is a nullable VT the same as the box?
Brian: Box is a user model concept, they know properties.
Karen: Help me with how boxes might work for primitives.
   One of my concerns with boxes is the assumed hard-coded duality.
   Today we have int -> boxed to Integer.
   If we want int -> box to a value type
        is this a null-free value type?
        do we then box again to get to a nullable value type?
        or do we initially box to a nullable value type? Would unbox then unbox to a null-free value type like other value types or unbox to an int?

Remi: Vm point of view and Language point of view. Big mistake auto boxing primitives to Object.
  If you know the target type is a specific type of box, then boxing is easy.
  What if an int for the VM was a Value Type?
Karen: I do not expect current performance if you were to replace all primitives today with Value Types.
   Let’s not start our design with a different model for the language and vm and vm magic.
Remi: already magic for primitives.
Frederic: not magic, there are well-defined behaviors for primitives.
Remi: Is the only place we need primitives as Object subtypes for specialized generics?
Brian: generics is the big pain point. Want to avoid needing 8 (or 9 if you count void) manual specializations.
Remi: Plan is to specialize the class for any value, not for a given type
ed. note - did I hear you correctly? I am expecting possible specialization by type - e.g. ValInt vs. ValLong

Dan S: Boxing question for Brian:
  Does conversion allow identity conversion? i.e. instance of a different class when you go from value -> value.box?

Brian: User’s perception of type system
  distinct types value and box, can freely convert except null in one direction
  explicit conversions, or conditions in the language
Frederic: Problem:
    null-free VT v1;
    nullable VT v2;
    v1 = v2; does getClass() return the same answer or not?
For the vm today, these point to the same thing. There is no copying on conversion.
If you have two distinct types, you must copy.
Remi: you could change getClass behavior
Dan H: Issue with arrays.
Karen: With arrays if you have an array with null-free/flattened elements -> convert to nullable elements,
     you have to copy.
Frederic: actually if each type is a different element type, then you have to create all new elements, not just copy.
Dan H: getClass may leak
Brian: int[] vs. Integer[] today you must copy
  No precedent for arrays in terms of expecting no copy
  with an explicit box, getComponentType returns a different type
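
ed. note - a sketch of the existing int[] vs. Integer[] precedent Brian refers to, in today's Java. ArrayBoxDemo is an illustrative name. There is no conversion between the two array types, so a new array is allocated and each element is boxed, and the two arrays report different component types.

```java
final class ArrayBoxDemo {
    // Builds a new Integer[]; the int[] itself cannot be viewed as an Integer[].
    static Integer[] box(int[] src) {
        Integer[] out = new Integer[src.length];
        for (int i = 0; i < src.length; i++) {
            out[i] = src[i]; // each element is boxed individually
        }
        return out;
    }

    public static void main(String[] args) {
        int[] prims = {1, 2, 3};
        Integer[] boxes = box(prims);
        System.out.println(prims.getClass().getComponentType());
        System.out.println(boxes.getClass().getComponentType());
    }
}
```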

Frederic: field analogy

Dan S: high level suggestion:
  more natural to think of nullable/null-free using a widening/narrowing analogy rather than box/unbox

Remi: widening/narrowing requires subtype
Dan S: it is an identity with checks, does not assume copy
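
ed. note - a sketch of the analogy as I understood it, using the reference widening/narrowing that Java already has. The names are illustrative. Widening needs no check, narrowing is a checked cast, and neither copies the object. Whether nullable/null-free can be modelled this way without a subtype relation is exactly Remi's objection and is not settled by this sketch.

```java
final class WidenNarrowDemo {
    // Widening reference conversion: no check, no copy.
    static Object widen(String s) {
        return s;
    }

    // Narrowing reference conversion: checked at run time, still no copy.
    static String narrow(Object o) {
        return (String) o;
    }

    // The same reference comes back after widening and narrowing.
    static boolean roundTripKeepsIdentity(String s) {
        return narrow(widen(s)) == s;
    }

    // Reports whether the narrowing check rejects the given object.
    static boolean narrowingFails(Object o) {
        try {
            narrow(o);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```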
Brian: Adjoint pair
  next week: new model
  go through cases in John’s original email of “works like an int/long” - see if the widen/narrow model works

Remi: Really in the VM we have two different types: VT and nullable VT. Can we do this? What is the cost for the vm check?
Dan H: How do we see this in the vm? Descriptors? A bit? other?
Remi: everywhere.
Dan H: Killer is the vtable - what are the inheritance/overriding rules
Frederic: if we had two separate class files and bytecodes - sure we could do this
  If you want one source and 2 types in the vm, there are many complexities
  e.g. JVMTI - if you add a breakpoint, where do you want the breakpoint?
Karen:
  Please read JDK-8204937 - Type Operators in the JVM

  We have three approaches for vm representation:
    1. 2 different types
    2. 2 different species
    3. 2 different references to one type

Brian: Need to think about getClass (class file) vs. getCrass

Many thanks for a lively discussion!
Karen