instanceof code and OptimizedTypeCheckViolated
Doug Simon
doug.simon at oracle.com
Mon Jan 27 13:45:17 PST 2014
On Jan 27, 2014, at 10:37 PM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
> It’s probably some minor difference in the profile. You can use -Dgraal.traceMethodDataFilter=class.method to dump the method data used for that method. In general I would think for GPU we’d want to avoid any overly optimistic profile-based optimization, since deopt is likely to be hugely expensive for GPU or at least a waste of compute resources. TypeCheckMinProfileHitProbability appears to control this code, so you might try overriding it with a higher value.
>
> I’m actually unclear under what conditions instanceofWithProfile will win over a straight instanceof test. Maybe interface types or deep hierarchies?
Definitely in these latter cases: not only will the generated code be tighter, but the types known after the instanceof node in the graph could potentially be much more precise. There may be idiomatic Truffle code where this is particularly true?
-Doug
> On Jan 27, 2014, at 12:59 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
>
>> Question on some instanceof codegen in one of our hsail junits
>>
>> Java code is:
>>
>> float[] outArray;
>> Shape[] inShapeArray;
>> (gid)->{
>> outArray[gid] = (inShapeArray[gid] instanceof Circle ? 1.0f : 2.0f);
>> });
>>
>> The inShapeArray is 50/50 Circle and non-Circle (and we have executed it as Java first, so we have some profiling).
>>
>> Yet the generated code is taking a deopt path for reason OptimizedTypeCheckViolated when it is not a circle.
>> So the code to store 2.0 is never generated.
>>
>> In igv, I can see that this happens when we inline InstanceOfSnippets.instanceofWithProfile
>>
>> This doesn't happen on every codegen; sometimes the 2.0 store block is generated nicely.
>>
>> This used to work fine until I think a few weeks ago...
>>
>> Any suggestions?
>>
>> -- Tom
>>
>>
>> @L7:
>> ld_global_u32 $d0, [$d0 + 8];
>> shl_u64 $d0, $d0, 3;
>> mov_b64 $d2, 0x1000fcd88; // check if class == Circle
>> cmp_ne_b1_s64 $c0, $d2, $d0;
>> cbr $c0, @L11;
>> @L8:
>> ld_global_s32 $s1, [$d1 + 12];
>> cmp_ge_b1_u32 $c0, $s0, $s1;
>> cbr $c0, @L10;
>> @L9:
>> cvt_s64_s32 $d0, $s0; // code that stores 1.0f for Circle case
>> mul_s64 $d0, $d0, 4;
>> add_u64 $d1, $d1, $d0;
>> mov_b32 $s0, 1.0f;
>> st_global_f32 $s0, [$d1 + 16];
>> ret;
>> @L1:
>> mov_b32 $s0, -21259;
>> @L13:
>> ret;
>> @L11:
>> // not a Circle, deoptimize with OptimizedTypeCheckViolated reason
>>
>
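The behaviour discussed in this thread can be modelled roughly in plain Java. This is only a sketch under stated assumptions, not Graal's implementation: the class and method names (ProfiledInstanceOfSketch, Hint, DeoptSignal, profiledIsCircle, kernel) are hypothetical, and the deopt is simulated with an exception. It assumes a profile-based check compares the object's exact class against the types recorded in the profile and deoptimizes when none of them match. With both Circle and a non-Circle type in the profile, the 2.0f path exists; if the profile happened to record only Circle, a non-Circle input falls through to the deopt.

```java
final class ProfiledInstanceOfSketch {
    interface Shape {}
    static final class Circle implements Shape {}
    static final class Square implements Shape {}

    /** Hypothetical stand-in for a deoptimization; not a Graal class. */
    static final class DeoptSignal extends RuntimeException {
        DeoptSignal(String reason) {
            super(reason);
        }
    }

    /** One profiled type plus whether it passed the instanceof test (assumed model). */
    static final class Hint {
        final Class<?> type;
        final boolean positive;

        Hint(Class<?> type, boolean positive) {
            this.type = type;
            this.positive = positive;
        }
    }

    // Compare the exact class against each hint; if nothing matches,
    // give up (simulated deopt) instead of running a full subtype test.
    static boolean profiledIsCircle(Object o, Hint[] hints) {
        if (o == null) {
            return false;
        }
        for (Hint h : hints) {
            if (o.getClass() == h.type) {
                return h.positive;
            }
        }
        throw new DeoptSignal("OptimizedTypeCheckViolated");
    }

    // Mirrors the kernel body from the original question.
    static float kernel(Object shape, Hint[] hints) {
        return profiledIsCircle(shape, hints) ? 1.0f : 2.0f;
    }

    public static void main(String[] args) {
        Hint[] full = {new Hint(Circle.class, true), new Hint(Square.class, false)};
        Hint[] circleOnly = {new Hint(Circle.class, true)};

        System.out.println(kernel(new Circle(), full));
        System.out.println(kernel(new Square(), full));
        try {
            kernel(new Square(), circleOnly);
        } catch (DeoptSignal d) {
            System.out.println("deopt: " + d.getMessage());
        }
    }
}
```

In this model, whether the non-Circle store is ever emitted depends entirely on which types the profile recorded, which would be consistent with the result varying from one compilation to the next.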