[vector] Vector API -- alignment with value types
Brian Goetz
brian.goetz at oracle.com
Fri Jan 25 21:00:14 UTC 2019
Coming back to this after a long break...
As I mentioned in the past, I think the high-order bit of where we want
to rotate the API is to ensure that when we have value types, the key
abstractions -- Species and Vector -- can be values, because then we'll
get most of the optimizations we want from the general properties of
values, rather than ad-hoc tricks surrounding the vector types.
Part I
------
Here's an idea for simplifying Species, which is: let's drive Species
down to be a simple class that is really just a constant holder for
(element type, shape), and move all the behavior to static methods on
XxxVector. (At that point, Species can just be an enum, or a group of
enums with a common interface parent, if we like.) I think we can
greatly reduce the importance of Species in the API, making XxxVector
the star player.
(The cost here is it becomes harder to write code that is agnostic to
_both_ element type and size -- but I am not convinced this is an
important use case?)
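
To make the "constant holder" idea concrete, here is a minimal sketch of what a species might shrink down to. All of the names (Shape, Species, IntSpecies, the constants) are made up for illustration and are not the current API; the only point is that nothing but the simple state methods is left, and that they all derive from (element type, shape).

```java
// Hypothetical sketch: a species reduced to a constant holder for
// (element type, shape). None of these names are the real API.
enum Shape {
    S_64_BIT(64), S_128_BIT(128), S_256_BIT(256), S_512_BIT(512);

    private final int bitSize;
    Shape(int bitSize) { this.bitSize = bitSize; }
    int bitSize() { return bitSize; }
}

// Common interface parent, so a group of per-element-type enums can share it.
interface Species {
    Class<?> elementType();
    Shape shape();
    int elementSize();   // bits per element

    // The remaining state methods follow from the fields above.
    default int bitSize() { return shape().bitSize(); }
    default int length()  { return bitSize() / elementSize(); }
}

enum IntSpecies implements Species {
    I64(Shape.S_64_BIT), I128(Shape.S_128_BIT),
    I256(Shape.S_256_BIT), I512(Shape.S_512_BIT);

    private final Shape shape;
    IntSpecies(Shape shape) { this.shape = shape; }

    @Override public Class<?> elementType() { return int.class; }
    @Override public Shape shape()          { return shape; }
    @Override public int elementSize()      { return Integer.SIZE; }
}
```

With this shape of things, a ByteSpecies or FloatSpecies enum would be one more implementation of the same interface, and no behavior beyond the state queries lives on the species.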
Here are the methods on Species currently:
 - Simple state methods: elementType, elementSize, shape, length, bitSize
 - Generic factories: zero(), fromByteArray(), fromByteBuffer(),
   maskXxx(), shuffleXxx()
 - Transforms: reshape(), rebracket(), resize(), cast()
 - Specialized factories: broadcast(), single(), random(), scalars(),
   fromArray()
My somewhat radical suggestion is: let's get all of these, except the
first line, off of Species, and onto XxxVector, with versions that take
an explicit species argument, and versions that take no species argument
(defaulting to the preferred species for that element type).
One positive result of this is that code that just wants to multiply int
vectors can remain _entirely ignorant of species_; you just use the
defaults:
    IntVector.fromArray(...).add(...).intoArray(...)
Only if users want to have finer control over the vector width do they
need to use species at all.
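
As a rough illustration of the two flavors of factory, here is a toy sketch. DemoSpecies, DemoIntVector, and the PREFERRED constant are stand-ins invented for this example (the lanes are just an int[] here), and a real implementation would presumably choose the preferred species from the hardware rather than hard-coding it.

```java
import java.util.Arrays;

// Hypothetical stand-ins for illustration only; not the real Vector API.
enum DemoSpecies {
    I64(2), I128(4), I256(8);

    final int length;   // number of int lanes
    DemoSpecies(int length) { this.length = length; }
}

final class DemoIntVector {
    // Assumption: hard-coded here; a real API would query the platform.
    static final DemoSpecies PREFERRED = DemoSpecies.I128;

    private final DemoSpecies species;
    private final int[] lanes;

    private DemoIntVector(DemoSpecies species, int[] lanes) {
        this.species = species;
        this.lanes = lanes;
    }

    // Version that takes an explicit species argument.
    static DemoIntVector fromArray(DemoSpecies s, int[] a, int offset) {
        return new DemoIntVector(s, Arrays.copyOfRange(a, offset, offset + s.length));
    }

    // Version that takes no species argument and uses the default.
    static DemoIntVector fromArray(int[] a, int offset) {
        return fromArray(PREFERRED, a, offset);
    }

    DemoSpecies species() { return species; }

    // Lane-wise addition; both operands must have the same species.
    DemoIntVector add(DemoIntVector other) {
        if (other.species != species) {
            throw new IllegalArgumentException("species mismatch");
        }
        int[] r = new int[lanes.length];
        for (int i = 0; i < r.length; i++) {
            r[i] = lanes[i] + other.lanes[i];
        }
        return new DemoIntVector(species, r);
    }

    void intoArray(int[] a, int offset) {
        System.arraycopy(lanes, 0, a, offset, lanes.length);
    }
}

final class SpeciesFreeDemo {
    // User code that never mentions a species when building vectors.
    static int[] addArrays(int[] a, int[] b) {
        int[] out = new int[DemoIntVector.PREFERRED.length];
        DemoIntVector.fromArray(a, 0).add(DemoIntVector.fromArray(b, 0)).intoArray(out, 0);
        return out;
    }
}
```

Code that cares about width calls the first overload, e.g. DemoIntVector.fromArray(DemoSpecies.I256, a, 0); everything else stays on the species-free path.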
A slightly negative result is that one loses the ability to write code
that is agnostic across both element type and width.
Another slightly negative result is that some of these methods will have
to dispatch on the species argument, which means we will need strong
constant propagation so that these dispatches fold away. However, I
think those fold away in exactly the same cases that the virtual methods
on species do today, so I don't think this is necessarily a change in
reality.
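
For what "dispatch on the species argument" might look like, here is a small sketch. The types (WidthSpecies, SketchVector, Sketch64, Sketch128, SketchVectors) are hypothetical, and whether a given JIT actually folds the switch is an assumption about the compiler, not something this code demonstrates.

```java
// Hypothetical sketch of a static factory that dispatches on its species
// argument instead of relying on a virtual method of the species.
enum WidthSpecies { I64, I128 }

interface SketchVector {
    int length();
}

final class Sketch64 implements SketchVector {
    @Override public int length() { return 2; }
}

final class Sketch128 implements SketchVector {
    @Override public int length() { return 4; }
}

final class SketchVectors {
    private SketchVectors() {}

    // The dispatch: pick the concrete implementation from the species.
    static SketchVector zero(WidthSpecies s) {
        switch (s) {
            case I64:  return new Sketch64();
            case I128: return new Sketch128();
            default:   throw new AssertionError(s);
        }
    }

    // Typical usage: the species is a static final constant. After
    // inlining zero(), the switch operand is a known constant, which is
    // the situation where the dispatch could be folded away.
    static final WidthSpecies SPECIES = WidthSpecies.I128;

    static int lanesForConstantSpecies() {
        return zero(SPECIES).length();
    }
}
```

If the species reaches zero() as a genuinely variable value, the switch stays, which is the same situation in which a virtual call on the species would stay megamorphic.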
At this point, we're ready for species to become values, which will only
help our constant propagation story.
Part II
-------
To make the Vector types value-ready, we have to flatten out the
inheritance hierarchy. This is an easy enough game; we make the
concrete vector types into values, and the abstract vector types into
interfaces. So we have:
    public interface Vector<T> { .. }
    public interface IntVector <: Vector<Integer> { .. }
    private value class Int64Vector <: IntVector { .. }
This is easy enough, but I'm sure if we just did this, the carefully
crafted optimizations we've done for vectors would fail (at first) when
the abstract vector types become interfaces. We'll also need to ensure
that type sharpening / nullity analysis is up to the task, so we get
scalarization. But this seems the straightest path to get from the API
we have to one that uses value types.
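
Here is a compilable approximation of that hierarchy in today's Java. Since there is no value class yet, a final class with final fields and state-based equals/hashCode stands in for one. The Flat* names, the of64 factory, and the two-field lane layout are assumptions made for this sketch.

```java
// Hypothetical sketch of the flattened hierarchy. "final class" stands in
// for "value class"; all names are illustrative.
interface FlatVector<T> {
    int length();
    T lane(int i);   // boxed accessor, for element-type-agnostic code
}

interface FlatIntVector extends FlatVector<Integer> {
    int intLane(int i);
    FlatIntVector add(FlatIntVector other);

    @Override
    default Integer lane(int i) { return intLane(i); }

    // Factory on the interface, so the concrete type need not be named.
    static FlatIntVector of64(int e0, int e1) {
        return new FlatInt64Vector(e0, e1);
    }
}

// Value-like: immutable state, no reliance on identity.
final class FlatInt64Vector implements FlatIntVector {
    private final int e0;
    private final int e1;

    FlatInt64Vector(int e0, int e1) {
        this.e0 = e0;
        this.e1 = e1;
    }

    @Override public int length() { return 2; }

    @Override public int intLane(int i) {
        switch (i) {
            case 0:  return e0;
            case 1:  return e1;
            default: throw new IndexOutOfBoundsException("lane " + i);
        }
    }

    @Override public FlatIntVector add(FlatIntVector other) {
        if (other.length() != length()) {
            throw new IllegalArgumentException("shape mismatch");
        }
        return new FlatInt64Vector(e0 + other.intLane(0), e1 + other.intLane(1));
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof FlatInt64Vector)) return false;
        FlatInt64Vector v = (FlatInt64Vector) o;
        return e0 == v.e0 && e1 == v.e1;
    }

    @Override public int hashCode() { return 31 * e0 + e1; }
}
```

Note that add() is now an interface call on FlatIntVector; that is exactly the spot where type sharpening has to recover the concrete type for the existing optimizations to keep applying.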
Another path, slightly more circuitous, is to collapse the XxxNnVector
types to a single XxxVector value type, which looks something like:
    IntVector<T extends SuperLong> {
        IntSpecies species;
        T vector;
    }
where there are SuperLong types for 64, 128, 256, and 512. This gets us
away from using arrays, which is eventually where we'll want to go, but
it's a longer road.
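
A very rough sketch of the collapsed form, only to show the moving parts. SuperLong is the placeholder name used above; the word() accessor, the Long64/Long128 payload classes, the PackedIntSpecies enum, and the lane layout (two int lanes per 64-bit word, low half first) are all assumptions invented for this example.

```java
// Hypothetical sketch: one vector type per element type, parameterized
// by a fixed-width payload instead of backed by an array.
interface SuperLong {
    int bitSize();
    long word(int index);   // hypothetical: the index-th 64-bit word
}

final class Long64 implements SuperLong {
    private final long w0;
    Long64(long w0) { this.w0 = w0; }

    @Override public int bitSize() { return 64; }
    @Override public long word(int index) {
        if (index != 0) throw new IndexOutOfBoundsException("word " + index);
        return w0;
    }
}

final class Long128 implements SuperLong {
    private final long w0;
    private final long w1;
    Long128(long w0, long w1) { this.w0 = w0; this.w1 = w1; }

    @Override public int bitSize() { return 128; }
    @Override public long word(int index) {
        switch (index) {
            case 0:  return w0;
            case 1:  return w1;
            default: throw new IndexOutOfBoundsException("word " + index);
        }
    }
}

enum PackedIntSpecies {
    I64(64), I128(128);

    final int bitSize;
    PackedIntSpecies(int bitSize) { this.bitSize = bitSize; }
    int length() { return bitSize / Integer.SIZE; }
}

final class PackedIntVector<T extends SuperLong> {
    final PackedIntSpecies species;
    final T vector;

    PackedIntVector(PackedIntSpecies species, T vector) {
        if (species.bitSize != vector.bitSize()) {
            throw new IllegalArgumentException("species and payload widths differ");
        }
        this.species = species;
        this.vector = vector;
    }

    // Assumed layout: lane i lives in word i/2, low 32 bits first.
    int lane(int i) {
        if (i < 0 || i >= species.length()) {
            throw new IndexOutOfBoundsException("lane " + i);
        }
        return (int) (vector.word(i / 2) >>> (32 * (i % 2)));
    }

    // Packs two int lanes into one 64-bit word under the same layout.
    static long pack(int lo, int hi) {
        return (lo & 0xFFFFFFFFL) | ((long) hi << 32);
    }
}
```

The species field carries the shape at runtime while T carries it statically, so the constructor check above is the kind of invariant this design would have to maintain.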
Thoughts on these directions?