Transparency

Thu Jul 15 10:47:43 PDT 2010

What happens if some class has multiple type parameters? Should we create a
Map$Short$Byte? That's going to explode into a very very large number of
types as you add on more type parameters. What seems to be required for your
scheme is a mix of complete reification combined with a lot of magic
bytecode rewriting, which you haven't fully elaborated on.

A few questions that I'm not sure your proposal answers:

 - When should the JVM stop specializing new classes? With for example 4
type parameters, there would be 8*8*8*8 = 4096 different classes required,
which seems quite infeasible.

 - Exactly how does Magic.reify() work? The backing array field in the
ArrayList implementation is not of type T[]; it's of type Object[]. How
could any library know that it should rewrite this Object[] to int[] in the
generated specialization without hardcoding a ruleset for all known classes?

 - Even if ArrayList is rewritten to use T[] instead of Object[], that's no
help for third party classes that used Object[] - Oracle can't very well
change every library ever written for java. In a class where T[] cannot be
rewritten to int[], exactly what does ArrayList$Int even do? Presumably it
would have to have an (I)Z method auto-generated by the classloader as
calling code will expect it to exist.

 - What happens when heap corruption occurs? i.e: List<int> list = new
ArrayList() /* Raw type! */; list.add(10) /* Generates call to non-existing
(I)Z add method */; - Normally this results in a MethodNotFoundError, which
is rarely caught as errors aren't usually supposed to be caught. That seems
entirely inappropriate for heap corruption.

 - If a dummy reify method is allowable, then how is a call to add(I)Z ever
going to work?

 - If the syntax is to remain ArrayList<Integer> and not ArrayList<int>,
then what happens if one attempts to add null?

 - How does the auto-generator know that isEmpty() is not dependent on
reification, but e.g. add() is?

NB: At this point in time, the Oracle JVM does *NOT* eliminate autoboxing
and unboxing, though the JRockit VM does.

 --Reinier Zwitserloot

On Wed, Jul 14, 2010 at 6:16 PM, Osvaldo Doederlein <opinali at gmail.com>wrote:

> A concrete suggestion: we could support this via bytecode transformation,
> e.g.
>
> ArrayList<Integer> list = Magic.reify(ArrayList.class, int.class);
>
> ...where reify() will basically change the type of ArrayList's internal
> Object[] backing store to int[]. This would probably benefit some help from
> the original collection, e.g. annotations that list which
> fields/parameters/returns/locals must be changed (or not); but this mapping
> could be provided on a configuration-by-exception manner. The bytecode
> transformation is mostly simple, it just needs also to translate some API
> calls (e.g. to Arrays.copyOf(T[], int, int)) to equivalent calls for the
> required primitive-array type (or fail the whole thing, if it hits some call
> that is unknown to the transformer or has no primitive-friendly overloads).
>
> Several reifications of the same origin class could be arranged in a
> hierarchy that's designed to both reflect the original hierarchy and to
> share methods that don't directly depend on the element type (e.g. isEmpty()
> { return size()==0; }). So we get synthetic classes like:
>
> AbstractCollection$ extends AbstractCollection
> AbstractCollection$Int extends AbstractCollection$
> AbstractList$Int extends AbstractCollection$Int
> ArrayList$Int extends AbstractList$Int
>
> The new classes can even inherit the top-level class from the original
> hierarchy, as long as that class has no fields or at least no "container"
> Object[] fields.
>
> Returning a type like ArrayList<Integer> means no changes in syntax, method
> signatures, classfile format, reflection etc. to support ArrayList<int>. So
> the proposal is a simple library without any exceptional demands - which is
> the only kind of proposal that we could hope adding to JDK 7, at this late
> time. The tradeoff is that we're stuck with signatures like add(I)Z, so we
> depend on HotSpot to optimize out the boxing/unboxing that happens in each
> method call. At least for inlined calls I guess this optimization is already
> done; the remaining work should be feasible, and it could wait for some 7uXX
> maintenance update if there's no time do do that until FCS.
>
> There is no Object[] field to worry about... but still, some Object[]s in
> APIs like toArray(). In that case there is no significant extra conversion
> cost (toArray() already does a defensive copy so we just do a different copy
> from int[] to Object[]). The reified class could implement some extra,
> type-specific helper interface that adds methods like int[] toArrayInt();
> this would probably make the transformer more tightly-bound to the Java SE
> Collections API, or require extra annotations for custom additions; not
> pretty, but not hard to implement. And while we're at it, I would love to
> add equivalent methods that DON'T do a defensive copy, just return the
> internal array, or assign an external array to the collection's field.
> Perf-critical code often benefits from that; I have personally had this
> problem. When your code is already optimized enough, redundant data copies
> always bubble up to the top of the performance profile. This is why
> estimating the initial size is everybody's favorite collections
> optimization, as it avoids lots of allocation and copy from growing.
>
> Finally, it's also easy to implement as the bytecode transformation, full
> reification for any non-public method. So we can't touch add(Object), but we
> certainly can do that for all private and even protected methods, and also
> entire non-public helper classes like AbstractList's Itr, ListItr, SubList,
> and RandomAccessSubList.
>
> I don't like ideas like new IntList(), or some Lists.of() that's just a
> factory to hide types like IntList. This would add enormous bloat to the
> core, and/or would be very limited - there's a hefty number of collection
> classes in the core (not to mention third-party code) and I may need
> primitive-optimized versions of ANY of these classes. If I need a List of
> int, I may also need the specific features or O(N) traits of particular
> implementations like ArrayList, LinkedList, or even CopyOnWriteArrayList.
> Don't make me choose between optimal element storage (primitive reification)
> and optimal API/structure (compact array, linked nodes, concurrency etc.). A
> partial solution is sometimes worse than no solution at all - it avoids a
> much bigger mess, e.g. using a core collection for ArrayList-of-int, and GNU
> Trove for LinkedList-of-int or whatever.
>
> As a final remark, this implementation would be valid:
>
> <T> T reify (T collection, Class elementType) {
>   return collection;
> }
>
> ..it just wouldn't gain any performance. But it's good enough, for example,
> to add (as a compatibility feature) to platforms where code footprint is a
> higher concern, like the lowest Java ME profiles. Or even to the first
> release of JDK 7, if there is no time for a full implementation of this API.
>
> A+
> Osvaldo
>
>
> 2010/7/14 Reinier Zwitserloot <reinier at zwitserloot.com>
>
> Indeed, having: new ArrayList<int>(); be as fast as an int array requires
>> reification; once ArrayList creates a backing array of type Object[], the
>> performance game is lost.
>>
>> Nothing short of reification can address this issue. To be specific,
>> specialization won't solve it any more than primitives-in-generics does.
>> However, this:
>>
>> List<int> list = Lists.of(int.class);
>> list.add(10);
>>
>> or:
>>
>> List<int> list = new IntList();
>> list.add(10);
>>
>>
>> *CAN* be made fast. Specialization, interestingly enough, cannot do this*,
>> but my primitives in generics proposal can. Which partly makes me wonder
>> why
>> we're talking about the far more complicated specialization concept in the
>> first place. I know scala uses this, but they either introduced their
>> collections API with specialization from day one, or they created a
>> backwards incompatible retrofit later.
>>
>> *) list.add(10), is, as far as the compiler knows, a call to add(T) on a
>> List<int>. If this is to be fast under specialization, it would therefore
>> have to generate a call to an add method with signature (I)Z and not with
>> signature (Ljava/lang/Object;)Z like it would today (and autobox to make
>> the
>> int fit as an object). However, if it is to generate such a call, then
>> *ANY*
>> list, no matter how it was obtained, must have an add (I)Z method, but
>> lists
>> written today do not because that's not part of the List.java interface.
>> We
>> can't add methods to interfaces or we'd break backwards compatibility, and
>> breaking backwards compatibility is not acceptable. Therefore, in list
>> implementations that lack add(I)Z, the JVM or the classloader has to
>> generate these. There's precedence of course: Brian Goetz' proposal for
>> defender methods does something quite similar. Still, this is far from
>> trivial, and I'd like to see a proposal on how one could update List.java
>> so
>> that the JVM / ClassLoader considers the specialization methods as
>> defenderish, but the object-based one itself not. Primitives-in-generics
>> keeps things simple by letting the call to (Ljava/lang/Object;)Z stand,
>> and
>> letting the JVM hotspot compiler eliminate the inefficiency instead.
>>
>>  --Reinier Zwitserloot
>>
>>
>>
>> On Tue, Jul 13, 2010 at 11:08 PM, Nathan Bryant <
>> nathan.bryant at linkshare.com
>> > wrote:
>>
>> > John, it's just array creation
>> >
>> > class Foo<T> {
>> >        T[] arr = new T[SIZE]; // Can't write this without reification
>> > }
>> >
>> > class Foo<T> {
>> >        Object[] arr = new Object[SIZE];
>> >
>> >        void set(int i, T t) { arr[i] = t; } // Either causes autoboxing
>> > unbeknownst to the sorry fool who wrote new Foo<int>, or becomes illegal
>> > }
>> >
>> >
>> > John Nilsson wrote:
>> >
>> > > What exactly is it that requires reification to work?
>> >
>> > BR,
>> > John
>> >
>> >
>>
>>
>