null checks vs. class resolution, and translation strategy for casts

Frederic Parain frederic.parain at oracle.com
Mon Apr 13 13:24:18 UTC 2020



> On Apr 8, 2020, at 14:43, John Rose <john.r.rose at oracle.com> wrote:
> 
> I have a proposal for a translation strategy:
> 
> 1. Translate casts to inline classes differently from “classic”
> casts.  Add an extra step of null hostility.  For very low-level
> reasons, I suggest using “ldc X” followed by Class::cast.
> 
> Generally speaking, it’s a reasonable move to use reflective
> API points (like Class::cast) on constant metadata (like X.class)
> to implement language semantics.


There’s an alternative way to implement this:

Casts to inline classes C can be translated to
    #ldc C
    #checkcast C

with this new definition of checkcast:

"If objectref is null then:
   - if type C has not been loaded yet, the operand stack is unchanged,
   - if type C has already been loaded:
       - if type C is not an inline type, the operand stack is unchanged
       - otherwise the checkcast instruction throws a ClassCastException

Otherwise, the named class, array, or interface type is resolved (§5.4.3.1). If objectref can be cast to the resolved class, array, or interface type, the operand stack is unchanged; otherwise, the checkcast instruction throws a ClassCastException."
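
To make the rule above concrete, here is a minimal Java sketch that models the
proposed checkcast behavior outside the VM. It is only an illustration under
stated assumptions: CheckcastModel, TypeInfo, declare, ldc and checkcast are
hypothetical names, the "loaded" and "inline" properties are plain flags rather
than real VM state, and an ordinary Class object stands in for the resolved type.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the proposed checkcast rule; not VM code.
class CheckcastModel {
    // Assumed stand-in for what the VM knows about a type.
    static final class TypeInfo {
        final Class<?> mirror;  // used only for the non-null subtype check
        final boolean inline;   // true if the type is an inline type
        boolean loaded;         // true once the type has been loaded

        TypeInfo(Class<?> mirror, boolean inline) {
            this.mirror = mirror;
            this.inline = inline;
        }
    }

    private final Map<String, TypeInfo> types = new HashMap<>();

    // Registers a type that is known but not yet loaded.
    void declare(String name, Class<?> mirror, boolean inline) {
        types.put(name, new TypeInfo(mirror, inline));
    }

    // Models "ldc C": resolving the class constant loads C.
    void ldc(String name) {
        types.get(name).loaded = true;
    }

    // Models the proposed checkcast: returns ref unchanged or throws.
    Object checkcast(Object ref, String name) {
        TypeInfo t = types.get(name);
        if (ref == null) {
            // Null is rejected only if C is already loaded and is an inline type.
            if (t.loaded && t.inline) {
                throw new ClassCastException("null cannot be cast to inline type " + name);
            }
            return null;
        }
        // Non-null case: the type is resolved (hence loaded), then checked.
        t.loaded = true;
        if (!t.mirror.isInstance(ref)) {
            throw new ClassCastException(ref.getClass().getName() + " cannot be cast to " + name);
        }
        return ref;
    }

    public static void main(String[] args) {
        CheckcastModel vm = new CheckcastModel();
        // Integer is only a placeholder mirror for a hypothetical inline type C.
        vm.declare("C", Integer.class, true);

        // Standalone checkcast on null before C is loaded: null passes.
        System.out.println("before ldc: " + vm.checkcast(null, "C"));

        // ldc C followed by checkcast C: null is now rejected.
        vm.ldc("C");
        try {
            vm.checkcast(null, "C");
        } catch (ClassCastException e) {
            System.out.println("after ldc: " + e.getMessage());
        }
    }
}
```

In this model the ldc step matters only because it guarantees that C is loaded
by the time checkcast sees the null reference, which is the role it plays in
the proposed translation.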


This new definition doesn’t change the behavior of checkcast for old class files,
and changes neither the behavior nor the translation strategy for casts to non-inline types.

In new class files, javac will use the ldc/checkcast sequence whenever a cast to an
inline type is required. Note that in many cases, type C would have already been loaded
before ldc is executed (by pre-loading or eager loading).

With migrated types, an old class file can still have a standalone checkcast (without ldc)
referencing a type which is now an inline type, causing the null reference to pass the checkcast
successfully. This is not a new issue. The same situation can be created by reading a field
declared as ‘LC;’: getfield would simply read the field (the only possible value is null) and
push it on the stack without checking whether C is an inline type. This ‘null’ reference to an
invalid type can only be used by code that has the wrong information about type C, and null
is in fact the only possible value for this type (any attempt to create a real instance of
‘LC;’ would fail). In order to use this reference as a legitimate reference to the real
type ‘QC;’, another checkcast, using the proper sequence above, would be required and would
throw an exception.

Ill-formed or malicious class files could be aware that C is an inline type, but use a
single checkcast instruction (without a preceding ldc) anyway. This is part of a bigger
problem that has not been discussed yet: L/Q consistency inside a class file. The SoV
document stipulates that the value projection only exists in the Q-form and the reference
projection only exists in the L-form. As of today, no verification of this kind is
performed on class files. Nothing prevents a class file from declaring a field of type
‘LC;’ and another of type ‘QC;’, and the same remark applies to method arguments.
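
As an illustration of what such an L/Q consistency check could look like, here
is a minimal Java sketch that scans a list of field descriptors and reports the
class names used in both L-form and Q-form. It is an assumption-laden sketch,
not an existing verifier pass: LQConsistency and mixedForms are hypothetical
names, only field descriptors are handled (method descriptors would need to be
split into their parameter and return types first), and the Q-descriptor syntax
is taken from the ‘QC;’ notation used in this thread.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical L/Q consistency check over field descriptors.
class LQConsistency {
    // Returns the class names that appear both as 'L<name>;' and 'Q<name>;'.
    static Set<String> mixedForms(List<String> fieldDescriptors) {
        Set<String> lForms = new TreeSet<>();
        Set<String> qForms = new TreeSet<>();
        for (String descriptor : fieldDescriptors) {
            // Skip array dimensions to reach the element type.
            int i = 0;
            while (i < descriptor.length() && descriptor.charAt(i) == '[') {
                i++;
            }
            String element = descriptor.substring(i);
            // Only 'L...;' and 'Q...;' carry a class name; primitives are ignored.
            if (element.length() >= 3 && element.endsWith(";")) {
                String name = element.substring(1, element.length() - 1);
                if (element.charAt(0) == 'L') {
                    lForms.add(name);
                } else if (element.charAt(0) == 'Q') {
                    qForms.add(name);
                }
            }
        }
        // Keep only the names seen in both forms.
        lForms.retainAll(qForms);
        return lForms;
    }

    public static void main(String[] args) {
        List<String> descriptors = List.of("LC;", "QC;", "I", "Ljava/lang/String;");
        System.out.println("mixed L/Q usage: " + mixedForms(descriptors));
    }
}
```

A real check would also have to decide which form is the correct one for each
type, which requires knowing whether the type is inline; this sketch only
detects the inconsistency within a single class file.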


Going back to the modified specification of checkcast, the new tests are easy to implement
in the interpreter and can easily be optimized by JIT compilers (most types would be
loaded at compilation time), and there are no bootstrapping issues.

Fred





More information about the valhalla-spec-observers mailing list