Fwd: Fwd: Proposal for generics over primitives needs a rethink

Remi Forax forax at univ-mlv.fr
Mon Jan 5 11:06:08 UTC 2015


On 01/05/2015 02:54 AM, Gavin King wrote:
> Sorry, Remi, I missed this email, it went straight to my spam folder
> or something for some reason.
>
> On Sat, Jan 3, 2015 at 5:10 PM, Remi Forax <forax at univ-mlv.fr> wrote:
>> On 01/02/2015 07:13 PM, Gavin King wrote:
>> The first major problem of Any is that it introduce a new primitive kind,
>> a lot of codes relies on the fact that you can only have boolean, byte,
>> short,
>> char, int, long, double and Object in Java. Introducing a new type will
>> make a lot of code to behave weirdly.
> You mean code in Java programs?
>
> I'm not seeing this. By nature it's impossible to switch over the
> primitive types with instanceof or whatever, so I'm having a really
> hard time visualizing what kind of Java code would break.
>
> Examples?

switch(type.getName()) { ...

>
>> The second major problem is how to implement it.
>> It can be implemented as a compiler fiction only like in C# or in Scala,
>> but in that case apart that one can write ArrayList<int>, it will
>> rely on a form of boxing.
> Well, that's not quite right. That the concrete instantiation
> ArrayList<Any> works using boxing doesn't say much about whether
> ArrayList<int> does. You can still use specialization, AFAICT.

yes, but you don't need Any to do primitive type specialization,
you just need Any as a bound of a type parameter that's
the whole point of the discussion now ?

>
>> It can be implemented in the VM. It seems to be the best solution
>> but the engineering cost is insane and in fact Any will not solve
>> the specialization problem.
>>
>> Why the engineering cost is very high ?
>> Currently most of the Java VM implementations rely on the fact that
>> you know upfront (without executing the code) if a field or a local variable
>> is a reference or an object. In term of GC, there is a very clear
>> distinction
>> between the precise GC algorithms used in Java VM and the conservative
>> GC algorithms used by example in C. Technically, with Any, you can
>> implement something in the middle between these two categories of GC,
>> nevertheless, it changes the category of GCs, Java will be able to use.
> But you're introducing a new kind of field/variable here. Pre-existing
> code should keep behaving the same. It's only for new usages of this
> new thing that the new behavior is required.
>
> I understand that the VM would need to change to accommodate this. But
> I think it makes the VM more useful,

Yes, it makes the VM more powerful, I think we can all agree with that.
The problem is that its a pervasive change that as far as I know
nobody has even tried.

I will be very happy to see Redhat to found a project like this,
taking the source of hotspot and trying to add Any.

About the separation of behaviors, it's exactly like saying,
if I use a JVM but only with primitive types, no objects,
I will not pay the price of having to deal with objects.
That's not true, the VM will still prepare internal data structures
to be able to crawle the stack during a GC even if a GC
will never occur in that case.

> and I strongly speculate that all the dynamic language folks would thank you for it.

:)

Having all values stored as Any require the VM to check the runtime type 
of all operands
for each operation, this is what CPython does, but not what more advanced
dynamic language runtimes do.
They do optimistic speculative type analysis, i.e. generate a code by 
examples for ints
and de-optimise if a type is a String and then re-generate a new code.

So sure Any will help, but not as much as you think.

>
>> Any will not solve the specialization problem.
>> Let say we now we have introduce Any, so we can create an array of Any.
>> But an array of Any can be at runtime by subtyping either
>> an array of doubles or an array of objects.
>> But by doing this we introduce a subtyping relationship between things that
>> are not structurally equivalent. A possible solution is to align all cells
>> of all arrays
>> to 64 bits which is a very bad property because even if you don't use Any,
>> you pay the cost of that you may use it. Another solution, the one used by
>> V8,
>> is to box all doubles (and long in Java) when you store them in an array*.
>> So an array of Any will not be magically an array of ints, an array of
>> doubles
>> and an array of objects without some trade-offs.
> Hrm, you're right, I had not considered that byte[] is an Any[], due
> to the totally broken "covariance" of arrays in Java.
>
> That sucks, and it's worse that it sucks because of the type system
> being broken :-(
>
> Couldn't we just say that a byte[] actually isn't an Any[]? I mean,
> that's actually *correct*. Why double down on stupid?

how to implement ArrayList in that case ?

class ArrayList<T extends Any> {
   T[] array;   // <-- this is an array of Any !
}

cheers,
Rémi





More information about the valhalla-dev mailing list