Refined type checking for GADTs (was: Pattern matching: next steps after JEP 405)

Tue May 24 18:56:31 UTC 2022

>
>  - *Refined type checking for GADTs. *Given a hierarchy like:
>
>     sealed interface Node<T> { }
>     record IntNode(int i) implements Node<Integer> { }
>     record FloatNode(float f) implements Node<Float> { }
>
> we currently cannot type-check programs like:
>
>     <T> Node<T> twice(Node<T> n) {
>         return switch (n) {
>             case IntNode(int x) -> new IntNode(x*2);
>             case FloatNode(float x) -> new FloatNode(x*2);
>        }
>    }
>
> because, while the match constraints the instantiation of T in each 
> arm of the switch, the compiler doesn't know this yet. 

Much of this problem has already been explored by "Generalized Algebraic 
Data Types and Object Oriented Programming" (Kennedy and Russo, 2005); 
there's a subset of the formalism from that paper which I think can 
apply somewhat cleanly to Java.

The essence of the approach is that in certain scopes (which coincide 
exactly with the scope of pattern binding variables), additional _type 
variable equality constraints_ are injected.  For a switch like that 
above, we inject a T=Integer constraint into the first arm, and a 
T=Float into the second arm, and do our type checking with these 
additional constraints.  (The paper uses equational constraints only 
(T=Integer), but we may want additional upper bounds as well (T <: 
Comprable<T>)).

The way it works in this example is: we gather the constraint Node<T> = 
Node<Integer> from the switch (by walking up the hierarchy and doing 
substitution), and unifying, which gives us the new equational 
constraint T=Integer.  We then type-check the RHS using the additional 
constraints.

The type checking adds some new rules to reflect equational constraints, 
FJ-style:

    \Gamma |- T=U   \Gamma |- C<T> OK
    --------------------------------- abstraction
        \Gamma |- C<T> = C<U>

    \Gamma |- C<T> = C<U>
    --------------------- reduction
        \Gamma |- T=U

    \Gamma |- X OK
    --------------  reflexivity
    \Gamma |- X=X

    \Gamma |- U=T
    -------------  symmetry
    \Gamma |- T=U

    \Gamma |- T=U  \Gamma |- U=V
    ----------------------------  transitivity
    \Gamma |= T=V

     \Gamma |- T=U
    ---------------- subtyping
    \Gamma |- T <: U

The key is that this only affects type checking; it doesn't rewrite any 
types.  Since in the first arm we are trying to assign a IntNode to a 
Node<T>, and IntNode <: Node<Integer>, by symmetry + subtyping, we get 
IntNode <: Node<T>, and yay it type-checks.

The main moving parts of this sub-feature are:

  - Defining scopes for additional constraints/bounds.  This can 
piggyback on the existing language of the form "if v is introduced when 
P is true, then v is definitely matched at X"; we can trivially extend 
this to say "a constraint is definitely matched at X".  This is almost 
purely mechanical.
  - Defining additional type-checking rules to support scope-specific 
constraints, along the lines above, in 4.10 (Subtyping).
  - In the description of type and records patterns (14.30.x), appeal to 
inference to gather equational constraints, and which patterns introduce 
an equational constraint.

This is obviously only a sketch; more details to follow.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/amber-spec-experts/attachments/20220524/0bf05ce2/attachment.htm>