Migrating methods in Collections
Brian Goetz
brian.goetz at oracle.com
Wed Dec 16 19:43:04 UTC 2015
The previous memo outlined a tactic for effectively "migrating" a method
in a current generic class to a related but different signature in an
any-generic class, while retaining source and binary compatibility with
existing clients and subclasses, and outlined the list of possibly
problematic methods in the Collections framework.
The effectiveness of this tactic on this set of methods ranges from
"slam-dunk" to "seems like it could work" to "doesn't help." It would
be nice to have one hammer that pounds down all the nails, but I'm not
sure there is one. This memo outlines a range of complementary tactics
that might, in combination with the above approach, enable us to cover
the waterfront.
I'll start with the method Collection.contains(Object). It might at
first seem that we want the signature here to be contains(E), but this
gets in the way of cases like:
dogs.contains(animal)
or of converting dogs::contains to a Predicate<? super Animal>(which is
what filter/removeIf would want.)
Note that this has little to do directly with value types; value types
are invariant. But if we want a single contains() method that ranges
over any E, it needs to accomodate variance for reference instantiations
while not falling back on Object as a top type.
One technique for doing so would be to introduce contravariant inference
variables. Then we could write contains as:
<U super E> boolean contains(U u)
This has three main downsides:
- Even though this works, its still not that obvious.
- It's a *lot* of work in the spec and compiler; it pushes on all the
fragilebits.
- If that weren't enough, it is a theoretical minefield. Papers like
http://www.cis.upenn.edu/~bcpierce/papers/variance.pdf show that adding
contravariance to certain type systems result in subtyping becoming
undecidable.
On the other hand, I'm sure many library writers would jump for joy to
have this in the toolbox; the lack of contravariant tvars seems a
notable inconsistency in the language. (But let's not kid ourselves
about the costs.)
It just so happens that this construct works out; it is binary- and
source- compatible to make a non-generic method generic, as long as the
erasure of the signature remains the same. So changing contains() or
remove() as above would not cause subclasses or clients to fail to
either link or recompile (in the subclass recompilation case, it would
be reinterpreted as a raw override, which is allowed and compatible.)
And we'd end up with a total method that does the right thing both for
refs and values.
Ignoring the costs and risks, this technique applies to a number of the
methods in our rogue's gallery, including toArray(), for which we didn't
yet have a solution:
<U super E> U[] toArray()
This is compatible with existing clients who are expecting an Object[]
to come back from toArray() on a collection of reference types, and
collapses to V[] for any value type, so Collection<int>.toArray()
returns int[]. (We might still want an unchecked warning if the
compiler infers U != Object for reference E, but that's a separate and
easily handled consideration.)
Let's call this technique "superation" (yes, its an intentional
(disgusting) pun. See
http://beta.merriam-webster.com/dictionary/suppurate. And think about
that the next time you pass a "Super 8" motel on the highway.) With
this in our toolbox, the strategy matrix becomes:
*Class**
* *Method**
* *Possible Approaches**
*
Collection
contains(Object)
Superateto <U super E> contains(U)
remove(Object)
removeAll(Collection<?>)
Abandon in favor of existing removeIf(Predicate<? super E>).
retainAll(Collection<?>)
containsAll(Collection<?>)
Migrate to containsElements(Collection<? extends E>), or abandon.
toArray()
Superate to <U super E> U[] toArray()
toArray(T[])
Leave as is, superate, or abandon.
List
remove(int)
Migrate to removeAt(int).
indexOf(Object)
Migrate to Optional-bearing findFirst(Predicate<? super E>)
lastIndexOf(Object)
Map
containsKey(Object)
Superate
containsValue(Object)
Superate
remove(Object)
Superate
put(K,V)
Leave as is
get(K)
Migrate to one (or all) of:
Optional<V> map(K)
mapOrElse(K, V)
tryMap(K, Consumer<? super V>)
getOrDefault(Object, V)
Migrate to mapOrElse, or superate
Queue
poll(), peek() Migrate to tryPoll(Consumer) *or* optional-bearing method
Deque
poll(), peek()
pollFirst(), pollLast()
peekFirst(), peekLast()
removeFirstOccurrence(), removeLastOccurrence()
Migrate to predicate-accepting method, or superate
This is amore satisfying matrix; not only does everything have an
acceptable strategy, but some have more than one, and the user impact of
superating a method is lower (users might just not notice), so the
perception is that fewer methods are affected. Still, super-bounded
tvars are a big hammer for such a small foe. Maybe there's an alternate
approach that has the effect of superation but doesn't need such a big
hammer.
Here's one possibility. We already have a notion of partial methods.
We could have a pair of methods
<where ref E>
Object[] toArray()
<where val E>
E[] toArray()
both of which are reasonable signatures for their restricted domains.
Unfortunately, the natural interpretation of this pair of methods is
that the first is a member of Collection<ref T>, and the second is a
member of Collection<val T>, but there is *no* toArray() method that is
a member of Collection<any T>! This means that code that is generic in
any-T would not see a toArray() method at all. That's a problem (though
not as enormous as it initially sounds, there are possible mitigating
techniques.)
However, it is not unreasonable for the compiler to recognize this
situation and deal. Suppose I have some code generic in any-T:
<any T> void foo(Collection<T> c) {
T[] arr = c.toArray();
}
Now, the compiler doesn't know whether c is a collection of refs or
values, but it knows it's one or the other (ref T and val T form a
partition of any T). So it could (and in some cases, has to anyway) do
type checking by parts -- it can typecheck the above assignment under
the assumption of ref T, and do it again under the assumption of val T,
and if both succeed (and something else, see below), accept the method
invocation as valid. (In this case, the ref-T fork should result in an
unchecked warning, meaning that the merged checking also yields an
unchecked warning.)
The "something else" part is: when doing overload resolution by parts,
both branches must resolve to overloads that are erasure-equivalent to
each other. Which is true for toArray() (and for all the cases for
which superation would work.)
Now, this is a lot of handwaving, and it doesn't even really describe
how we think partial methods should actually work (I'd like to get rid
of the where-val-T slices entirely, this is a separate discussion.) But
its a sketch of an option that achieves the positive result of
superation without engaging the complexity of superation.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/valhalla-spec-experts/attachments/20151216/27037b17/attachment.html>
More information about the valhalla-spec-experts
mailing list