null checks vs. class resolution, and translation strategy for casts

Sun Apr 12 12:09:55 UTC 2020

> From: "John Rose" <john.r.rose at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "Brian Goetz" <brian.goetz at oracle.com>, "valhalla-spec-experts"
> <valhalla-spec-experts at openjdk.java.net>
> Sent: Saturday, April 11, 2020 07:43:28
> Subject: Re: null checks vs. class resolution, and translation strategy for casts

> On Apr 10, 2020, at 4:19 AM, forax at univ-mlv.fr wrote:

>>> So, here’s a recommendation: Use indy, and use a clunkier
>>> fallback in the same places that today use a clunkier fallback
>>> for string concatenation. And, record a line item of technical
>>> debt that we should further explore indy intrinsics, after we
>>> figure out what javac intrinsics look like.

>> What is not clear to me is how javac can replace unbox with a null check. For
>> the VM, the input is an interface and the output is an inline type; given that
>> interfaces are not checked until runtime, how can the VM validate that a null
>> check alone is enough?

> It can’t; that’s why I’m saying javac needs to ask for a null check,
> *and* somehow affirm the inline type (subtype of interface).
> This is two bytecodes, invokestatic Objects.requireNN, plus
> checkcast C.

OK.
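
To make sure we are talking about the same thing, here is roughly what that two-instruction sequence corresponds to at the source level. This is only a sketch under assumptions: `Shape` and `Point` are hypothetical names, and `Point` is an ordinary final class standing in for an inline class, since a stock JDK has no inline classes.

```java
import java.util.Objects;

// Hypothetical interface playing the role of the "input" type.
interface Shape { }

// Hypothetical stand-in for an inline class C (a plain final class here).
final class Point implements Shape {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

final class TwoBytecodeUnbox {
    // Roughly: invokestatic Objects.requireNonNull, then checkcast Point.
    // A null argument fails with NullPointerException, and a Shape that is
    // not a Point fails with ClassCastException.
    static Point unbox(Shape s) {
        return (Point) Objects.requireNonNull(s);
    }

    public static void main(String[] args) {
        Point p = unbox(new Point(1, 2));
        System.out.println(p.x + "," + p.y);
    }
}
```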

>> Also, it's still not clear to me what indy provides in this case.

> It provides both of the above effects in one bytecode. The bytecode,
> in turn, can expand to some internal JVM intrinsic which the runtime
> will optimize better than a back-to-back combo of the two standard
> instructions. That intrinsic never has to be admitted to by any spec.

>> So I still think that doing a checkcast (reusing checkcast being a trick to
>> avoid introducing a new bytecode) or having a special unbox opcode is a better
>> idea.

> Changing opcode behaviors and/or adding new opcodes is always
> more expensive than appealing to indy, even if we have to add secret
> optimizations to indy. Specs are almost always harder to change than
> optimizations.

Why do we have the new opcodes defaultvalue and withfield in that case?
In both cases, the semantics ("new inline type" and "unbox inline type") can be expressed with an indy, but for the former we chose to go with two new bytecodes and for the latter you want to use indy; that doesn't seem consistent.
I understand why you want to use indy, but from my armchair it looks like a choice between paying the cost upfront (with a new bytecode) or later (when optimizing).
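
For concreteness, the indy variant could look something like the sketch below. Everything in it is an assumption for illustration: the class and bootstrap method names are hypothetical, `String` stands in for the inline class C, and `Object` for the interface. Since Java source cannot emit invokedynamic directly, the sketch runs the bootstrap by hand and calls through the resulting call site.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Objects;

final class IndyUnboxSketch {
    // Hypothetical bootstrap method: links a call site to
    // Objects.requireNonNull followed by a cast to the call site's return type.
    static CallSite bootstrap(MethodHandles.Lookup lookup, String name, MethodType type)
            throws ReflectiveOperationException {
        MethodHandle nonNull = lookup.findStatic(Objects.class, "requireNonNull",
                MethodType.methodType(Object.class, Object.class));
        // asType adapts (Object)Object to the call-site type; the return-value
        // conversion is a checked cast.
        return new ConstantCallSite(nonNull.asType(type));
    }

    // Stand-in for the linked invokedynamic call site, of type (Object)String.
    private static final MethodHandle UNBOX;

    static {
        try {
            UNBOX = bootstrap(MethodHandles.lookup(), "unbox",
                    MethodType.methodType(String.class, Object.class)).dynamicInvoker();
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Null check and type check happen in a single call through the call site.
    static String unbox(Object value) {
        try {
            return (String) UNBOX.invokeExact(value);
        } catch (RuntimeException | Error e) {
            throw e;
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(unbox("hello"));
    }
}
```

Note that the bootstrap has to run once per call site before the first call, which is exactly the linkage cost I am worried about below.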

Indy is good when the linking is complex: for lambdas, where you need to create a proxy class out of thin air, or for string concatenation and pattern matching, where you have a lot of code shapes to link together.
Indy has three major drawbacks: calling the BSM (bootstrap method) is slow, it is only fully inlined by C2, and you cannot use it before it has been bootstrapped.
Those issues are all severe in our case; I don't see how we can use an inline type to express the entry (the K,V pair) of a HashMap without being stopped by these issues.

I hear you about the cost, but here indy is not a silver bullet; it's a shiny tool with its own weaknesses.
And yes, adding a new opcode has a higher upfront cost.

> — John

Rémi