Proposal: Improved Type Inference for Generic Instance Creation

Fri Feb 27 23:33:39 PST 2009

Adding conversions doesn't make type inference work.  Each conversion
that you want to affect type inference has to be accommodated
specifically in the type inference algorithm.

I disagree with your suggestion that this should fail:

     List<?> list = new ArrayList<>(); // should fail

The equivalent code using static factories works just fine.  It isn't
clear why this feature should behave differently, or what part of your
specification makes it fail.

I don't think supporting casts adds value to the feature.  You can
already assign, for example, from ArrayList to List<String> without a
cast.

If you want to hear about the approach that Joe is working on, I'll be
happy to describe it in person at your convenience (with Joe, if he
wants).

On Fri, Feb 27, 2009 at 10:56 PM, Jeremy Manson <jeremy.manson at gmail.com> wrote:
> Neal,
>
> I thought you had disappeared into the wilderness!  I might not have
> written this if I thought that you were on the job.
>
> Anyway, it sounds as if you have an example in mind where G1 and G2
> can have different type parameters.  I have to say that the only ones
> that come to mind are the wildcard examples I give at the end of the
> proposal (where you have to infer the type's LUB).  I was only going
> to pursue that if people thought it was a good idea.  I imagine you
> have some others in mind?
>
> I did look at making modifications to the type inference algorithm,
> but I thought I could get away without it if you basically treat the
> entire operation as a conversion.  It makes the proposal simpler not
> to have to tackle that particular issue.  It seems less likely that
> this would be possible if G1 and G2 have to be able to have different
> type parameters.
>
> What do you think of the idea in the proposal for using this for
> casts?  Do you think it makes sense to be able to say:
>
> List<String> list = (List<>) new ArrayList();
>
> Jeremy
>
> On Fri, Feb 27, 2009 at 10:28 PM, Neal Gafter <neal at gafter.com> wrote:
>> Jeremy-
>>
>> Where your spec says "the type parameters of G1 in the conversion
>> context are deemed to be identical to those of type G2" seems to
>> assume that the types G1 and G2 are related in a way that their type
>> parameters align exactly.  That is not the case in general.  G1 and G2
>> might have different numbers of type parameters or they might have
>> "corresponding" parameters in a different order.
>>
>> The most significant problem with this proposal is that it completely
>> ignores the most difficult aspect of this feature - the required
>> modifications to the type inference algorithm specified in the JLS
>> (15.12.2.7 and 15.12.2.8).  Joe Darcy and I discussed how to specify
>> that on Friday, and I believe he is preparing a proposal.
>>
>> Regards,
>> Neal
>>
>> On Fri, Feb 27, 2009 at 10:10 PM, Jeremy Manson <jeremy.manson at gmail.com> wrote:
>>> Improved Type Inference for Generic Instance Creation
>>>
>>> (Sorry about the length; I probably got a little carried away)
>>>
>>> Author: Jeremy Manson
>>>
>>> Overview
>>>
>>> Feature Summary
>>>
>>> This proposal addresses the addition of limited type inference for
>>> class instance creation expressions to the Java programming language.
>>> In cases where parametrized types need to be explicitly declared for a
>>> constructor, and the full parametrized type < T1 , T2 , ...Tn > of
>>> that constructor is obvious from the context, then the  parameterized
>>> type of the constructor can be replaced with an empty set of type
>>> parameters: <>. The <>  construct is legal to use when constructing an
>>> ob ject and either assigning it to a variable, or when passing  it as
>>> a parameter.
>>>
>>> For example, consider the following assignment statement:
>>>
>>> Map<String, List<String>> anagrams = new HashMap<String, List<String>>();
>>>
>>> This is rather lengthy, so it can be replaced with this:
>>>
>>> Map<String, List<String>> anagrams = new HashMap<>();
>>>
>>> Major Advantages
>>> Generics have had a tremendously positive impact on type safety in the
>>> Java programming language. They  have made it possible to provide
>>> static guarantees about the type of instances used by other classes,
>>> preventing entire classes of runtime errors from occurring. However,
>>> generics have added complexity to Java, making the code far more
>>> verbose and occasionally diﬃcult to read. Although solving this
>>> problem is well outside the scope of this proposal, a limited form of
>>> type inference would remove unnecessary redundancy from the language.
>>>
>>> The requirement that type parameters be duplicated unnecessarily like
>>> this encourages an unfortunate
>>> overabundance of static factory methods, simply because type inference
>>> works on method invocations. For
>>> example, the Google collections library [2] contains a class that
>>> wraps every possible constructor of every
>>> subclass of java.util.List in the JDK:
>>>
>>> public class Lists {
>>>  public static <E> List<E> newArrayList() { return new ArrayList<E>(); }
>>>  public static <E> List<E> newArrayList(int i) { return new ArrayList<E>(i); }
>>>  // ...
>>> }
>>> List<String> list = Lists.newArrayList();
>>>
>>> This approach avoids the redundancy in the declaration, but requires
>>> two declarations for every possible
>>> constructor in every possible class. This is ugly: every time a
>>> constructor or a new subtype of List is created,
>>> this class must be updated. Furthermore, the names of these methods
>>> does not have to be consistent; there
>>> is no reason to favor newArrayList over createArrayList. The proposed
>>> feature would make this approach
>>> obsolete.
>>>
>>> Major Disadvantages
>>>
>>> As a minor change, the impact of this feature would have few
>>> disadvantages. It does have beneﬁts and
>>> drawbacks when compared with other implementations, though. For
>>> example, it could be argued that
>>> inference would look cleaner without the <> token:
>>>
>>> Map<String, List<String>> list = new Map();
>>>
>>> Unfortunately, for the purposes of backwards compatibility, new Map()
>>> indicates a raw type, and therefore
>>> cannot be used for type inference.
>>>
>>> An argument can also be made for more full type inference:
>>>
>>> auto map = new HashMap<String, String>();
>>> map.get("Foo"); // map is understood to be a Map<String, String>
>>>
>>> or, alternatively, for a typedef like feature, where type identiﬁers
>>> are assigned to longer typenames:
>>>
>>> typedef MyMap HashMap<String, List<String>>;
>>> MyMap m = new MyMap();
>>>
>>> Both of these approaches, which are present in the forthcoming C++
>>> standard, would eliminate a lot of
>>> verbosity, at the expense of (potentially) obscuring the type of the
>>> variable. Neither of these proposals are
>>> precluded by this proposal; one, both or neither can be done at a later date.
>>>
>>> 2 Examples
>>>
>>> Here is a simple example, showing a use case for type inference for
>>> blank ﬁnals. An ImmutableMap is a Map
>>> implementation that throws an exception when mutation is attempted. In
>>> this example, the programmer
>>> writes <String, List<String>> twice instead of four times, and
>>> <String> once instead of twice:
>>>
>>> class AllAnagrams {
>>>  final Map<String, List<String>> anagrams;
>>>  AllAnagrams(List<String> originals) {
>>>    Map<String, List<String>> map = new HashMap<>(); // saves typing
>>>    for (String s : originals) {
>>>      String alphagram = alphagram(word);
>>>      List<String> group = anagrams.get(alphagram);
>>>      if (group == null)
>>>        anagrams.put(alphagram, group = new ArrayList<>()); // saves typing
>>>      group.add(word);
>>>    }
>>>   anagrams = new ImmutableMap<>(map); // saves typing
>>>  }
>>>  private static String alphagram(String s) {
>>>    char[] chars = s.toCharArray();
>>>    Arrays.sort(chars);
>>>    return String.valueOf(chars);
>>>  }
>>> }
>>>
>>> Here is another example, showing a use case for type inference in
>>> parameter passing:
>>> class TreeNode<T> {
>>>  T value;
>>>  TreeNode<T> left, right;
>>>  private List<List<T>> buildPathsInternal(
>>>    List<List<T>> list, List<T> currPath, TreeNode<T> node) {
>>>    if (node == null) {
>>>      // type inference does not save typing here; see section on Wildcards.
>>>      list.add(new ArrayList<T>(currPath));
>>>      return list;
>>>    }
>>>    currPath.add(node);
>>>    buildPathsInternal(list, currPath, node.left);
>>>    buildPathsInternal(list, currPath, node.right);
>>>    currPath.remove(node);
>>>    return list;
>>>  }
>>>  public List<List<T>> getAllPaths(TreeNode<T> head) {
>>>    // Type inference saves typing here:
>>>    return buildPathsInternal(new ArrayList<>(), new LinkedList<>(), head);
>>>  }
>>> }
>>>
>>> Details
>>>
>>> Speciﬁcation
>>> Informally, the speciﬁcation adds syntax to replace the type arguments
>>> < T1 ...Tn > to a class being instanti-
>>> ated by a call to new with the simple token <>. It also provides a new
>>> conversion that infers what the type
>>> arguments would have been from the context, and treats the instance
>>> creation as if those type arguments
>>> were present.
>>>
>>> Syntax
>>> The syntax of the new statement is currently given in two diﬀerent
>>> ways in the Java Language Speciﬁcation
>>> (JLS) [1] §15.9 and JLS§18.1 For simplicity, we restrict ourselves to
>>> the grammar given in §15.9.
>>>
>>> ClassInstanceCreationExpression ::=
>>>  new TypeArguments_opt ClassOrInterfaceCreationType (
>>> ArgumentList_opt ) ClassBody_opt
>>>  Primary . new TypeArguments_opt Identifier
>>> PossiblyInferredTypeArguments ( ArgumentList_opt ) ClassBodyopt
>>>
>>> ClassOrInterfaceCreationType ::=
>>>  TypeDeclSpecifier PossiblyInferredTypeArguments_opt
>>>
>>> PossiblyInferredTypeArguments ::=
>>>  <>
>>>  TypeArguments_opt
>>>
>>> Semantics and Compilation
>>> Consider G1 , a generic type that is declared with 0 type arguments, and G2
>>> ⟨T1 , T2 , ..., Tn ⟩, a generic type
>>> that is declared with n type arguments. There is a Type Inference
>>> Conversion from G1 to G2 if and only if
>>>  • G2 ’s erased type is a supertype of G1 ’s erased type and
>>>  • all type arguments T1 , T2 ...Tn uniquely identify reference types
>>> (i.e., they have no wildcard type arguments, although see below for
>>> more discussion of this).
>>>
>>> A Type Inference Conversion can take place in an assignment or method
>>> invocation conversion context
>>> (JLS §5). It might be possible to allow the inference to take place in
>>> a casting conversion context; see
>>> Section 6.1 for more discussion of this.
>>>
>>> Type Inference Conversions are unique in that they are not actually
>>> performed. Instead, they are used
>>> to replace the type they would ordinarily be converting: if a Type
>>> Inference Conversion can take place, then
>>> the type parameters of G1 in the conversion context are deemed to be
>>> identical to those of type G2 .
>>>
>>> Compilation
>>> The resulting bytecode would be identical to bytecode generated by the
>>> same code with the types inferred.
>>>
>>> Testing
>>> The proposed construct can be tested by writing a number of statements
>>> that construct instances using
>>> inferred types. These instances can then be assigned to variables and
>>> ﬁelds of speciﬁc types, and passed to
>>> methods that expect speciﬁc types. These tests should compile and run
>>> as expected:
>>> Set<String> set = new SortedSet<>();
>>> set.add("A");
>>> set.add("B");
>>> set.add("C");
>>> // verify that set contains "A", "B", "C".
>>>
>>> Other tests can be performed by attempted to place new instances of
>>> inferred types in contexts that
>>> contain wildcards; these tests should fail:
>>>
>>> List<?> list = new ArrayList<>(); // should fail
>>> List<String> list = new ArrayList<>();
>>> list.add("A");
>>> list.addAll(new ArrayList<>()); // should fail, since addAll expects
>>>    // Collection<? extends String>
>>>
>>> Library Support
>>> No library support is required, although libraries can be rewritten to
>>> take advantage of the new feature.
>>>
>>> Reﬂective APIs
>>> No reﬂective APIs need to be changed. The only related API is
>>> Class.newInstance; however, because
>>> generics are not reiﬁed in Java, this API does not take type
>>> parameters. A compiler cannot infer non-
>>> existent type parameters.
>>>
>>> Other Changes
>>> No other parts of the platform need to be updated.
>>>
>>> Migration
>>> Libraries can be rewritten to take advantage of this feature, but it
>>> is not imperative that they do so.
>>>
>>> Compatibility
>>> This change is backwards compatible. It will not make any changes to
>>> break existing compilation strategies
>>> or existing code.
>>>
>>> References
>>> Existing Bugs: 4879776, 6242254, 5092426, 6220689, 4983159, 6368076
>>>
>>> [1] James Gosling, Bill Joy, Guy Steele, and Gilad Bracha. The Java
>>> Language Speciﬁcation, 3rd Edition.
>>> Addison Wesley, 2004.
>>> [2] Google Inc. Google Collections Classes.
>>>
>>> Additional Features
>>>
>>> Casting Conversion
>>> It would be possible to specify Type Inference Conversion so that it
>>> takes place in a Cast Context:
>>>
>>> List<String> emptyList() {
>>>  return (List<String>) new ArrayList<>();
>>> }
>>> This would keep the casting context consistent with the other
>>> conversion contexts. However, the utility of
>>> this is questionable, so an argument could be made that this is a
>>> needless change.
>>>
>>> Wildcards
>>>
>>> In the method buildPathsInternal in Section 2, we need to use
>>> ArrayList’s copy constructor to create a copy of a List<T>, but
>>> cannot:
>>>
>>> if (node == null) {
>>>  list.add(new ArrayList<T>(currPath));
>>>  return list;
>>> }
>>>
>>> The inference does not work here because the ArrayList constructor
>>> expects an argument of type Collection<?
>>> extends E>. The approach to inference described above needs to be able
>>> to determine the exact type of
>>> the type parameters being inferred. Since we don’t have an exact type
>>> here, the inference will fail.
>>> One possible tweak to our approach would be the ability to infer an
>>> actual type parameter from a
>>> wildcard. For example, if the least upper bound of a wildcard
>>> expression, when evaluated, would be a legal
>>> type parameter for the type being instantiated, we might consider
>>> inferring that type. This would make it
>>> possible to use type inference in the example above:
>>>
>>> if (node == null) {
>>>  // <List<T>> is inferred from Collection<? extends E>
>>>  list.add(new ArrayList<>(currPath));
>>>  return list;
>>> }
>>>
>>> Further inference for method parameters
>>> As pointed out above, the result of a generic method:
>>>
>>> public static <E> List<E> newArrayList() { return new ArrayList<E>(); }
>>>
>>> can be used for assignment:
>>>
>>> List<String> list = Lists.newArrayList();
>>>
>>> Although this construct can be used for assignment, it cannot be used
>>> in parameter passing when a type argument is required:
>>>
>>> public void expectStringList(List<String> list) {
>>> ...
>>> }
>>>
>>> expectStringList(Lists.newArrayList()); // doesn’t compile
>>> expectStringList(Lists.<String>newArrayList()); // does compile
>>>
>>> The fact that inference works for assignment but not for parameter
>>> passing is very confusing and arbitrary.
>>> Changing this would be consistent with the way type inference is
>>> presented in this proposal.
>>>
>>>
>>
>