Questions about type annotation on type casts.

Fri Dec 11 00:25:36 UTC 2015

On 12/9/2015 8:19 PM, Srikanth wrote:
> While working on https://bugs.openjdk.java.net/browse/JDK-8144168,
> a few questions have surfaced regarding code generation for a type
> annotated cast.
>
> (1) If the expression that is being type-cast and the target type of the
> cast are determined to be the same at compile time, javac does not emit a
> checkcast instruction at all. This looks lossy to me. ECJ always emits a
> cast if there is a type annotation involved. Which approach is also
> problematic since ...
>
> (2) ... Since checkcast necessarily works only on reference operands
> on the stack. So ECJ ends up generating bad code for
>
> import java.lang.annotation.*;
>
> public class X {
>      @Target(ElementType.TYPE_USE)
>      public @interface T {
>      }
>
>      public static void main(String[] args) {
>          int i = (@T int) 10;
>
>      }
> }
>
> attempting to emit a checkcast due to the policy of an annotated cast
> always being emitted into the classfile.
>
> (3) JVMS 4.7.20.1 states:
>
>   ...
>
> type_argument_target {
>      u2 offset;
>      u1 type_argument_index;
> }
>
> The value of the offset item specifies the code array offset of either the
> bytecode instruction corresponding to the cast expression, the new bytecode
> instruction corresponding to the new expression, ...

How prescient of JVMS8 4.7.20.1 not to name 'checkcast' as the bytecode 
instruction corresponding to the cast expression :-)

> So what exactly is intended to be the value of the offset item for a cast?
> ATM, javac seems to interpret this as the offset of the first
> instruction of the chunk of code generated to evaluate the expression
> being typecast, while ECJ makes the offset point to the checkcast
> instruction itself.

If javac implements a cast operator via checkcast (for a reference 
target type) or {i,l,f,d}2{b,s,i,l,f,d} (for a primitive target type), 
then that's the answer.
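
As a rough illustration of this first case, the sketch below shows annotated casts that a compiler would normally have to implement with a real instruction. The class name CastWithInstruction, its method names, and the nested annotation T are hypothetical, and the instructions named in the comments are only what one would expect to see from "javap -c", not something guaranteed by any spec.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;

// Hypothetical example class; names are illustrative only.
final class CastWithInstruction {
    @Target(ElementType.TYPE_USE)
    @interface T {}

    // Reference target type: a checkcast is expected for this cast.
    static String asString(Object o) {
        return (@T String) o;
    }

    // Primitive target type: an i2b is expected for this cast.
    static byte narrowToByte(int i) {
        return (@T byte) i;
    }

    // Primitive target type: a d2i is expected for this cast.
    static int truncate(double d) {
        return (@T int) d;
    }
}
```

In each method the operand is a parameter rather than a constant, so the compiler cannot fold the conversion away and the cast has an obvious instruction for 'offset' to point at.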

If javac does not implement a cast operator at all, then I recommend that 
'offset' indicates the instruction which is responsible for physically 
pushing on to the stack the value whose type is the target type of the 
cast operator. This usually happens because the target type is identical 
to the type of the expression being cast, but there are other reasons: 
(float)1 doesn't engender a cast from int to float because javac simply 
emits fconst_1, and a cast to a primitive type is redundant when 
auto-unboxing occurs.

For the "(@T int)10" expression above, it would be the bipush that pushes 
10. For "(float)1", it would be the fconst_1. For an auto-unbox like 
"int x = (int)new Integer(1);", it would be the invokevirtual on 
Integer.intValue().
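
The three cases just described can be collected into one small class for inspection with "javap -c -v". This is only a sketch: the class name CastWithoutInstruction and its method names are hypothetical, and Integer.valueOf(1) is substituted for new Integer(1) because that constructor is deprecated in recent JDKs (so the instruction sequence before the unboxing call differs slightly). The comments state what one would expect a compiler to emit, not what it must emit.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;

// Hypothetical example class; names are illustrative only.
final class CastWithoutInstruction {
    @Target(ElementType.TYPE_USE)
    @interface T {}

    // Target type equals the expression type: no conversion instruction is
    // needed, only the single instruction that pushes the constant 10.
    static int sameType() {
        int i = (@T int) 10;
        return i;
    }

    // Compile-time constant: the int-to-float conversion is expected to be
    // folded, leaving only fconst_1.
    static float foldedConstant() {
        float f = (@T float) 1;
        return f;
    }

    // Auto-unboxing: the value of type int is expected to come from the
    // invokevirtual on Integer.intValue(), with no separate cast instruction.
    static int unboxed() {
        int x = (@T int) Integer.valueOf(1);
        return x;
    }
}
```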

In effect, I'm recommending the offset of the last instruction of the 
chunk of code generated to evaluate the expression being cast. However, 
if javac prefers to record the offset of first instruction, I would not 
complain.
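
To make the two choices concrete, here is a toy model of the chunk of code for "(int)new Integer(1)". Everything in it is an assumption for illustration: the class CastOffsetChoice and its Insn type are hypothetical, the instruction sequence is the one a compiler would typically emit, and the offsets assume the chunk starts at code array offset 0.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a chunk of bytecode; not a real classfile API.
final class CastOffsetChoice {
    static final class Insn {
        final int offset;       // position in the code array
        final String mnemonic;  // human-readable instruction name

        Insn(int offset, String mnemonic) {
            this.offset = offset;
            this.mnemonic = mnemonic;
        }
    }

    // Assumed instruction sequence for "(int)new Integer(1)", starting at 0.
    // new, invokespecial and invokevirtual occupy 3 bytes each; dup and
    // iconst_1 occupy 1 byte each.
    static List<Insn> unboxingCastChunk() {
        return Arrays.asList(
            new Insn(0, "new java/lang/Integer"),
            new Insn(3, "dup"),
            new Insn(4, "iconst_1"),
            new Insn(5, "invokespecial Integer.<init>(I)V"),
            new Insn(8, "invokevirtual Integer.intValue()I"));
    }

    // The recommended choice: the instruction that leaves the int on the stack.
    static int lastInsnOffset(List<Insn> chunk) {
        return chunk.get(chunk.size() - 1).offset;
    }

    // The alternative: the start of the chunk.
    static int firstInsnOffset(List<Insn> chunk) {
        return chunk.get(0).offset;
    }
}
```

Under these assumptions the recommended 'offset' would be 8 (the invokevirtual), while recording the first instruction would give 0.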

Since there is no compiler spec, this is a quality-of-implementation 
issue. An implementation which generates bad code for an annotated cast 
operator is of questionable quality. An implementation which records an 
annotated cast operator at roughly the right offset is not of 
questionable quality.

Alex