null checks vs. class resolution, and translation strategy for casts

Remi Forax forax at univ-mlv.fr
Mon Apr 13 19:46:57 UTC 2020


----- Original Message -----
> From: "Frederic Parain" <frederic.parain at oracle.com>
> To: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
> Sent: Monday, April 13, 2020 15:24:18
> Subject: Re: null checks vs. class resolution, and translation strategy for casts

>> On Apr 8, 2020, at 14:43, John Rose <john.r.rose at oracle.com> wrote:
>> 
>> I have a proposal for a translation strategy:
>> 
>> 1. Translate casts to inline classes differently from “classic”
>> casts.  Add an extra step of null hostility.  For very low-level
>> reasons, I suggest using “ldc X” followed by Class::cast.
>> 
>> Generally speaking, it’s a reasonable move to use reflective
>> API points (like Class::cast) on constant metadata (like X.class)
>> to implement language semantics.
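
As a rough source-level picture of the "ldc X" plus Class::cast idea, the sketch below loads a class constant and routes the cast through Class::cast, with an explicit null check standing in for the extra null-hostility step. The names NullHostileCast and castToInline are hypothetical, and the choice of NullPointerException is an assumption: the proposal does not say which exception is thrown. Today's Class::cast lets null through, which is why the check is written out by hand here.

```java
import java.util.Objects;

final class NullHostileCast {
    // Hypothetical helper: models a cast to an inline class X as
    // "ldc X" (the Class<T> constant) followed by Class::cast,
    // with an added null-hostility step.
    static <T> T castToInline(Class<T> type, Object ref) {
        // Assumption: null is rejected with NullPointerException.
        Objects.requireNonNull(ref, "null cannot be cast to an inline class");
        // Ordinary Class::cast; throws ClassCastException on a type mismatch.
        return type.cast(ref);
    }
}
```

For example, NullHostileCast.castToInline(Integer.class, obj) behaves like (Integer) obj for non-null values but rejects null, with Integer only standing in for an inline class.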
> 
> 
> There’s an alternative way to implement this:
> 
> Casts to inline classes C can be translated to
>    #ldc C
>    #checkcast C
> 
> with this new definition of checkcast:
> 
> "If objectref is null then:
>   - if type C has not been loaded yet, the operand stack is unchanged,
>   - if type C has already been loaded:
>       - if type C is not an inline type, the operand stack is unchanged
>       - otherwise the checkcast instruction throws a ClassCastException

so it's more:
   #ldc C
   #checkcast 0    <---

> 
> Otherwise, the named class, array, or interface type is resolved (§5.4.3.1). If
> objectref can be cast to the resolved class, array, or interface type, the
> operand stack is unchanged; otherwise, the checkcast instruction throws a
> ClassCastException."
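
The decision procedure in the quoted definition can be modelled in plain Java. This is only a sketch under stated assumptions: TypeInfo is a hypothetical stand-in for what the VM knows about the class named by the instruction (whether it is loaded, whether it is an inline type), and Class::cast stands in for the usual resolve-and-check path for non-null references.

```java
final class CheckcastModel {
    // Hypothetical stand-in for the VM's knowledge about type C.
    static final class TypeInfo {
        final Class<?> klass;   // the class C names once resolved
        final boolean loaded;   // has C already been loaded?
        final boolean inline;   // is C an inline type? (meaningful only if loaded)

        TypeInfo(Class<?> klass, boolean loaded, boolean inline) {
            this.klass = klass;
            this.loaded = loaded;
            this.inline = inline;
        }
    }

    // Returns objectref unchanged ("the operand stack is unchanged")
    // or throws ClassCastException, following the quoted rules.
    static Object checkcast(Object objectref, TypeInfo c) {
        if (objectref == null) {
            if (c.loaded && c.inline) {
                // null can never be a value of a loaded inline type
                throw new ClassCastException(
                        "null cannot be cast to inline type " + c.klass.getName());
            }
            // C not loaded yet, or loaded but not inline: null passes
            return null;
        }
        // Non-null case: resolve C and perform the ordinary check.
        return c.klass.cast(objectref);
    }
}
```

In this model a null reference is rejected only when C is both loaded and inline, which is why a standalone checkcast against a not-yet-loaded type still lets null through.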
> 
> 
> This new definition doesn’t change the behavior of checkcast for old class
> files, and doesn’t change the behavior or the translation strategy for casts
> to non-inline types.
> 
> In new class files, javac will use the ldc/checkcast sequence whenever a cast
> to an inline type is required. Note that in many cases, type C would already
> have been loaded before ldc is executed (by pre-loading or eager loading).
> 
> With migrated types, an old class file can still have a standalone checkcast
> (without ldc) referencing a type which is now an inline type, causing the
> null reference to pass the checkcast successfully. This is not a new issue.
> The same situation can be created by reading a field declared as ‘LC;’:
> getfield would simply read the field (the only possible value is null) and
> push it on the stack without checking whether C is an inline type or not.
> This ‘null’ reference to an invalid type can only be used by code that has
> the wrong information about type C, and this is in fact the only possible
> value for this type (any attempt to create a real instance of ‘LC;’ would
> fail). In order to use this reference as a legitimate reference to the real
> type ‘QC;’, another checkcast, using the proper sequence above, would be
> required and would throw an exception.
> 
> Ill-formed or malicious class files could be aware that C is an inline type,
> but use a single checkcast instruction (without a preceding ldc) anyway. This
> is part of a bigger problem that has not been discussed yet: L/Q consistency
> inside a class file. The SoV document stipulates that the value projection
> only exists in the Q-form and the reference projection only exists in the
> L-form. As of today, no verification of this kind is performed on class
> files. Nothing prevents a class file from declaring a field of type ‘LC;’
> and another of type ‘QC;’, and the same remark applies to method arguments.
> 
> 
> Going back to the modified specification of checkcast, the new tests are easy
> to implement in the interpreter and can easily be optimized by JIT compilers
> (most types would be loaded at compilation time), and there are no
> bootstrapping issues.
>

How is it better than a new opcode "unbox" with exactly the same semantics?
 

> Fred

Rémi

