valhalla-dev Digest, Vol 17, Issue 3

Fri Dec 25 00:31:06 UTC 2015

Hi Vitaly,

> I don't see why types/wrappers over one other type are special.  Optional
> is special in that when it abstracts over a ref type the semantic of the
> type can be fulfilled naturally by virtue of ref types already having an
> absence value: null.  The main point is you can exploit type info when
> it's provided.  That's kind of the point of specialization.

Well, when you're talking about specializing ArrayList<something> I keep
feeling that ArrayList<Optional<T>> is much more useful -- in terms of
anybody using it -- than ArrayList<boolean>. Am I wrong in this?

Monads/wrappers such as Optional are probably an "obvious" case to
optimize, since they're at the same cardinality & logical level as the
underlying data, but currently they add significant instance, indirection,
wrapping & storage cost.
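
To make that wrapping cost concrete, here is a rough sketch of what a
hand-flattened list of Optional<T> could look like in today's Java, assuming
T is a reference type so that null can stand in for "absent". The class name
FlatOptionalList and its methods are hypothetical; they are not part of the
JDK or of any Valhalla proposal.

```java
import java.util.Arrays;
import java.util.Optional;

// Hypothetical illustration: stores Optional<T> elements without keeping
// one Optional instance per slot. Absence is encoded as null in the array.
final class FlatOptionalList<T> {
    private Object[] elements = new Object[8];
    private int size;

    void add(Optional<T> value) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2);
        }
        // Unwrap on store: the Optional wrapper itself is not retained.
        elements[size++] = value.orElse(null);
    }

    @SuppressWarnings("unchecked")
    Optional<T> get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        // Rewrap on load: null becomes Optional.empty().
        return Optional.ofNullable((T) elements[index]);
    }

    int size() {
        return size;
    }
}
```

The point is only that the wrapper carries no information beyond what the
slot already holds, so the per-element instance is pure overhead here.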

If we want to go to lengths & consider tackling storage size & efficiency,
why not tackle it generically at the storage level -- with a system for
"unpacking" value types into arrays -- rather than by specializing at the
Collection implementation (eg. ArrayList) level?

If we want to address storage, array[] is where storage of pluralities
mostly happens. Why not target that directly?

I could propose a "packedarray" type, which would be implemented (either at
specialization or at VM level) as one array per component field. Stores &
retrieves would automatically pack & unpack elements into it.

We would then have:

public class ArrayList<any T> {
    packedarray T[] elements;
    int size;
}

Store & retrieve from 'elements' would inline the packing & unpacking of
elements. Functionality (at least some) needed for operations within the
collection would also be inlined -- eg. inlined versions of the equals() and
hashCode() methods would be available, operating directly on the packed
layout.
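
As a rough illustration of the layout I have in mind, here is what a
"packedarray" of a two-field value might expand to if written by hand in
today's Java. Point, PackedPointArray, elementEquals and elementHashCode are
all made-up names for this sketch, and Point is an ordinary final class
standing in for a value type.

```java
// Hypothetical two-field "value type", written as an ordinary final class
// because value types are not available in current Java.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

// Hand-written stand-in for what "packedarray Point[]" might expand to:
// one primitive array per component field.
final class PackedPointArray {
    private final int[] xs;
    private final int[] ys;

    PackedPointArray(int length) {
        xs = new int[length];
        ys = new int[length];
    }

    // Store: unpack the fields into their per-field arrays.
    void set(int i, Point p) {
        xs[i] = p.x;
        ys[i] = p.y;
    }

    // Retrieve: repack the fields into a Point.
    Point get(int i) {
        return new Point(xs[i], ys[i]);
    }

    // Equality evaluated directly on the packed layout, without
    // materializing a Point for slot i.
    boolean elementEquals(int i, Point p) {
        return xs[i] == p.x && ys[i] == p.y;
    }

    // Hash evaluated directly on the packed layout.
    int elementHashCode(int i) {
        return 31 * xs[i] + ys[i];
    }

    int length() {
        return xs.length;
    }
}
```

In the proposal this expansion would be generated automatically rather than
written out per type.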

There'd be some heuristic to decide whether a value-type is small & simple
enough to benefit from this -- I'd suggest this would apply for value-types
with no more than 4 fields for now. Larger value-types would just get stored
flat in a single array.
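
A minimal sketch of such a heuristic, assuming field count is the only
criterion (the "simple enough" part is not modelled). PackingHeuristic,
Layout, SmallValue and WideValue are hypothetical names, and reflection is
used purely for illustration; a real implementation would presumably make
this decision inside the specializer or the VM.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical example types: one under the threshold, one over it.
final class SmallValue {
    int a;
    int b;
}

final class WideValue {
    int a;
    int b;
    int c;
    int d;
    int e;
}

final class PackingHeuristic {
    // Threshold suggested above: at most 4 fields get per-field arrays.
    static final int MAX_PACKED_FIELDS = 4;

    enum Layout { PER_FIELD_ARRAYS, SINGLE_FLAT_ARRAY }

    static Layout choose(Class<?> type) {
        int instanceFields = 0;
        for (Field f : type.getDeclaredFields()) {
            // Count only real instance fields; skip statics and
            // compiler-generated (synthetic) fields.
            if (!Modifier.isStatic(f.getModifiers()) && !f.isSynthetic()) {
                instanceFields++;
            }
        }
        return instanceFields <= MAX_PACKED_FIELDS
                ? Layout.PER_FIELD_ARRAYS
                : Layout.SINGLE_FLAT_ARRAY;
    }
}
```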

If we are trying to target storage, I think general approaches are worth
considering vs. code-level specialization.

I'd far prefer to consider boolean -> single-bit storage in a generic
"packed array" that's implemented once, and which everyone -- including all
user types written after the library -- gets to use automatically, rather
than needing to code that stuff M*N times (M per Collection, N per element
type) in the library.
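
For the boolean case, the storage I mean is roughly the following: one bit
per element, kept in a long[]. PackedBooleanArray is a hypothetical name for
this sketch; the JDK's existing java.util.BitSet uses a similar
representation, but is a separate class rather than something a generic
collection picks up automatically.

```java
// Hypothetical fixed-length boolean array that stores one bit per element.
final class PackedBooleanArray {
    private final long[] words;
    private final int length;

    PackedBooleanArray(int length) {
        if (length < 0) {
            throw new NegativeArraySizeException("length " + length);
        }
        this.length = length;
        // 64 elements per long, rounded up.
        this.words = new long[(length + 63) >>> 6];
    }

    boolean get(int i) {
        checkIndex(i);
        return (words[i >>> 6] & (1L << (i & 63))) != 0;
    }

    void set(int i, boolean value) {
        checkIndex(i);
        long mask = 1L << (i & 63);
        if (value) {
            words[i >>> 6] |= mask;   // set the bit
        } else {
            words[i >>> 6] &= ~mask;  // clear the bit
        }
    }

    int length() {
        return length;
    }

    private void checkIndex(int i) {
        if (i < 0 || i >= length) {
            throw new IndexOutOfBoundsException("index " + i + ", length " + length);
        }
    }
}
```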

> How is this different from creating named types for all those
> specializations? It's always preferable to put more burden on lib
> author than its users, when given the choice.

Nice intent, but placing support in the library means it's not extensible &
it's far less useful. The type instantiation site is outside the library --
if somebody needs a special collection for their type, that can be
instantiated out there too.

If we require specialized sections of code in the library, it means only
Java core types can get that special support -- and probably not many of
them. Essentially other people's small types & wrappers become second-class
citizens -- ArrayList will offer best support only for JDK value types, etc.

Many great libraries such as Guava, Apache, Spring, Hibernate etc. are
leaders in innovation. They implemented features & types the Java platform
needed, long before Java was aware it needed them :) They have also led
with great OO design and innovation, often when the core platform sorely
lacked it. Numerous times Java has followed the third-party & gained from
this innovation.

The Java platform & ecosystem need to support third-party libraries --
including their types -- as first-class citizens. M*N implementation costs
and having efficient support only for "core Java" stuff makes it all seem a
bit limited, in my view.

Regards,
Thomas