Proposal: Improved Type Inference for Generic Instance Creation

Jeremy Manson jeremy.manson at gmail.com
Fri Feb 27 22:10:25 PST 2009


Improved Type Inference for Generic Instance Creation

(Sorry about the length; I probably got a little carried away)

Author: Jeremy Manson

Overview

Feature Summary

This proposal addresses the addition of limited type inference for
class instance creation expressions to the Java programming language.
In cases where parametrized types need to be explicitly declared for a
constructor, and the full parametrized type < T1 , T2 , ...Tn > of
that constructor is obvious from the context, then the  parameterized
type of the constructor can be replaced with an empty set of type
parameters: <>. The <>  construct is legal to use when constructing an
ob ject and either assigning it to a variable, or when passing  it as
a parameter.

For example, consider the following assignment statement:

Map<String, List<String>> anagrams = new HashMap<String, List<String>>();

This is rather lengthy, so it can be replaced with this:

Map<String, List<String>> anagrams = new HashMap<>();

Major Advantages
Generics have had a tremendously positive impact on type safety in the
Java programming language. They  have made it possible to provide
static guarantees about the type of instances used by other classes,
preventing entire classes of runtime errors from occurring. However,
generics have added complexity to Java, making the code far more
verbose and occasionally difficult to read. Although solving this
problem is well outside the scope of this proposal, a limited form of
type inference would remove unnecessary redundancy from the language.

The requirement that type parameters be duplicated unnecessarily like
this encourages an unfortunate
overabundance of static factory methods, simply because type inference
works on method invocations. For
example, the Google collections library [2] contains a class that
wraps every possible constructor of every
subclass of java.util.List in the JDK:

public class Lists {
  public static <E> List<E> newArrayList() { return new ArrayList<E>(); }
  public static <E> List<E> newArrayList(int i) { return new ArrayList<E>(i); }
  // ...
}
List<String> list = Lists.newArrayList();

This approach avoids the redundancy in the declaration, but requires
two declarations for every possible
constructor in every possible class. This is ugly: every time a
constructor or a new subtype of List is created,
this class must be updated. Furthermore, the names of these methods
does not have to be consistent; there
is no reason to favor newArrayList over createArrayList. The proposed
feature would make this approach
obsolete.

Major Disadvantages

As a minor change, the impact of this feature would have few
disadvantages. It does have benefits and
drawbacks when compared with other implementations, though. For
example, it could be argued that
inference would look cleaner without the <> token:

Map<String, List<String>> list = new Map();

Unfortunately, for the purposes of backwards compatibility, new Map()
indicates a raw type, and therefore
cannot be used for type inference.

An argument can also be made for more full type inference:

auto map = new HashMap<String, String>();
map.get("Foo"); // map is understood to be a Map<String, String>

or, alternatively, for a typedef like feature, where type identifiers
are assigned to longer typenames:

typedef MyMap HashMap<String, List<String>>;
MyMap m = new MyMap();

Both of these approaches, which are present in the forthcoming C++
standard, would eliminate a lot of
verbosity, at the expense of (potentially) obscuring the type of the
variable. Neither of these proposals are
precluded by this proposal; one, both or neither can be done at a later date.

2 Examples

Here is a simple example, showing a use case for type inference for
blank finals. An ImmutableMap is a Map
implementation that throws an exception when mutation is attempted. In
this example, the programmer
writes <String, List<String>> twice instead of four times, and
<String> once instead of twice:

class AllAnagrams {
  final Map<String, List<String>> anagrams;
  AllAnagrams(List<String> originals) {
    Map<String, List<String>> map = new HashMap<>(); // saves typing
    for (String s : originals) {
      String alphagram = alphagram(word);
      List<String> group = anagrams.get(alphagram);
      if (group == null)
        anagrams.put(alphagram, group = new ArrayList<>()); // saves typing
      group.add(word);
    }
   anagrams = new ImmutableMap<>(map); // saves typing
 }
  private static String alphagram(String s) {
    char[] chars = s.toCharArray();
    Arrays.sort(chars);
    return String.valueOf(chars);
  }
}

Here is another example, showing a use case for type inference in
parameter passing:
class TreeNode<T> {
  T value;
  TreeNode<T> left, right;
  private List<List<T>> buildPathsInternal(
    List<List<T>> list, List<T> currPath, TreeNode<T> node) {
    if (node == null) {
      // type inference does not save typing here; see section on Wildcards.
      list.add(new ArrayList<T>(currPath));
      return list;
    }
    currPath.add(node);
    buildPathsInternal(list, currPath, node.left);
    buildPathsInternal(list, currPath, node.right);
    currPath.remove(node);
    return list;
  }
  public List<List<T>> getAllPaths(TreeNode<T> head) {
    // Type inference saves typing here:
    return buildPathsInternal(new ArrayList<>(), new LinkedList<>(), head);
  }
}

Details

Specification
Informally, the specification adds syntax to replace the type arguments
< T1 ...Tn > to a class being instanti-
ated by a call to new with the simple token <>. It also provides a new
conversion that infers what the type
arguments would have been from the context, and treats the instance
creation as if those type arguments
were present.

Syntax
The syntax of the new statement is currently given in two different
ways in the Java Language Specification
(JLS) [1] §15.9 and JLS§18.1 For simplicity, we restrict ourselves to
the grammar given in §15.9.

ClassInstanceCreationExpression ::=
  new TypeArguments_opt ClassOrInterfaceCreationType (
ArgumentList_opt ) ClassBody_opt
  Primary . new TypeArguments_opt Identifier
PossiblyInferredTypeArguments ( ArgumentList_opt ) ClassBodyopt

ClassOrInterfaceCreationType ::=
  TypeDeclSpecifier PossiblyInferredTypeArguments_opt

PossiblyInferredTypeArguments ::=
  <>
  TypeArguments_opt

Semantics and Compilation
Consider G1 , a generic type that is declared with 0 type arguments, and G2
⟨T1 , T2 , ..., Tn ⟩, a generic type
that is declared with n type arguments. There is a Type Inference
Conversion from G1 to G2 if and only if
  • G2 ’s erased type is a supertype of G1 ’s erased type and
  • all type arguments T1 , T2 ...Tn uniquely identify reference types
(i.e., they have no wildcard type arguments, although see below for
more discussion of this).

A Type Inference Conversion can take place in an assignment or method
invocation conversion context
(JLS §5). It might be possible to allow the inference to take place in
a casting conversion context; see
Section 6.1 for more discussion of this.

Type Inference Conversions are unique in that they are not actually
performed. Instead, they are used
to replace the type they would ordinarily be converting: if a Type
Inference Conversion can take place, then
the type parameters of G1 in the conversion context are deemed to be
identical to those of type G2 .

Compilation
The resulting bytecode would be identical to bytecode generated by the
same code with the types inferred.

Testing
The proposed construct can be tested by writing a number of statements
that construct instances using
inferred types. These instances can then be assigned to variables and
fields of specific types, and passed to
methods that expect specific types. These tests should compile and run
as expected:
Set<String> set = new SortedSet<>();
set.add("A");
set.add("B");
set.add("C");
// verify that set contains "A", "B", "C".

Other tests can be performed by attempted to place new instances of
inferred types in contexts that
contain wildcards; these tests should fail:

List<?> list = new ArrayList<>(); // should fail
List<String> list = new ArrayList<>();
list.add("A");
list.addAll(new ArrayList<>()); // should fail, since addAll expects
    // Collection<? extends String>

Library Support
No library support is required, although libraries can be rewritten to
take advantage of the new feature.

Reflective APIs
No reflective APIs need to be changed. The only related API is
Class.newInstance; however, because
generics are not reified in Java, this API does not take type
parameters. A compiler cannot infer non-
existent type parameters.

Other Changes
No other parts of the platform need to be updated.

Migration
Libraries can be rewritten to take advantage of this feature, but it
is not imperative that they do so.

Compatibility
This change is backwards compatible. It will not make any changes to
break existing compilation strategies
or existing code.

References
Existing Bugs: 4879776, 6242254, 5092426, 6220689, 4983159, 6368076

[1] James Gosling, Bill Joy, Guy Steele, and Gilad Bracha. The Java
Language Specification, 3rd Edition.
Addison Wesley, 2004.
[2] Google Inc. Google Collections Classes.

Additional Features

Casting Conversion
It would be possible to specify Type Inference Conversion so that it
takes place in a Cast Context:

List<String> emptyList() {
  return (List<String>) new ArrayList<>();
}
This would keep the casting context consistent with the other
conversion contexts. However, the utility of
this is questionable, so an argument could be made that this is a
needless change.

Wildcards

In the method buildPathsInternal in Section 2, we need to use
ArrayList’s copy constructor to create a copy of a List<T>, but
cannot:

if (node == null) {
  list.add(new ArrayList<T>(currPath));
  return list;
}

The inference does not work here because the ArrayList constructor
expects an argument of type Collection<?
extends E>. The approach to inference described above needs to be able
to determine the exact type of
the type parameters being inferred. Since we don’t have an exact type
here, the inference will fail.
One possible tweak to our approach would be the ability to infer an
actual type parameter from a
wildcard. For example, if the least upper bound of a wildcard
expression, when evaluated, would be a legal
type parameter for the type being instantiated, we might consider
inferring that type. This would make it
possible to use type inference in the example above:

if (node == null) {
  // <List<T>> is inferred from Collection<? extends E>
  list.add(new ArrayList<>(currPath));
  return list;
}

Further inference for method parameters
As pointed out above, the result of a generic method:

public static <E> List<E> newArrayList() { return new ArrayList<E>(); }

can be used for assignment:

List<String> list = Lists.newArrayList();

Although this construct can be used for assignment, it cannot be used
in parameter passing when a type argument is required:

public void expectStringList(List<String> list) {
...
}

expectStringList(Lists.newArrayList()); // doesn’t compile
expectStringList(Lists.<String>newArrayList()); // does compile

The fact that inference works for assignment but not for parameter
passing is very confusing and arbitrary.
Changing this would be consistent with the way type inference is
presented in this proposal.



More information about the coin-dev mailing list