instanceof code and OptimizedTypeCheckViolated
Tom Rodriguez
tom.rodriguez at oracle.com
Mon Jan 27 17:31:47 PST 2014
On Jan 27, 2014, at 2:07 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
> When I use the -Dgraal.traceMethodDataFilter=class.method
> flag, I see that the times it is failing, the branchData is
> 50-50, but when it is succeeding, it is like there is no profile info at all
> (unless I am reading this incorrectly). Why would there be no profile data?
> For example…
Isn’t that how the test is constructed? It appears to run N = 20 iterations alternating types and then forces a compile.
A count of 10 (well, 20) is too low to be used as trustworthy profile information. I’m pushing some changes that make Graal use the same is_mature test that C2 uses to filter out profile data with relatively low execution counts, so it’s likely this problem will disappear for you.
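
The idea of ignoring profiles with too few samples can be sketched roughly as below. This is only an illustration, not the Graal or HotSpot code: the class name, the method names, and the threshold of 40 samples are all assumptions made for the example, and the real maturity test is more involved.

```java
// Hypothetical sketch: treat branch profiles with too few samples as "no profile".
public final class BranchProfileFilter {
    // Assumed threshold, chosen only for illustration.
    public static final int MIN_SAMPLES = 40;

    /** Returns true when enough executions were recorded to trust the ratio. */
    public static boolean isMature(int taken, int notTaken) {
        // Widen to long so very large counters cannot overflow the sum.
        return (long) taken + notTaken >= MIN_SAMPLES;
    }

    /** Returns the taken probability, or -1.0 when the profile is immature. */
    public static double takenProbability(int taken, int notTaken) {
        if (!isMature(taken, notTaken)) {
            return -1.0; // caller should compile as if there were no profile
        }
        return (double) taken / ((long) taken + notTaken);
    }

    public static void main(String[] args) {
        // 10/10 as in the dump quoted below: too few samples under this threshold.
        System.out.println(takenProbability(10, 10));
        // The same 50/50 ratio with many more samples is accepted.
        System.out.println(takenProbability(500, 500));
    }
}
```

With a filter like this, the 10/10 branch data from the test would be discarded instead of being treated as a reliable 50-50 split.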
>
> Raw method data for com.oracle.graal.compiler.hsail.test.lambda.InstanceOfTest.lambda$runTest$4(int):
> 0 bci: 11 TypeCheckData count(0) null_seen(FALSE) exception_seen(FALSE) nonprofiled_count(0) entries(0)
> 56 bci: 14 BranchData taken(0, -1.00) not_taken(0, 2.00) displacement(56)
> 88 bci: 18 JumpData taken(0) displacement(24)
>
>
> vs. other times seeing
>
> 0 bci: 11 TypeCheckData count(2147483647) null_seen(FALSE) exception_seen(FALSE) nonprofiled_count(0) entries(1)
That big count is fairly odd looking. I wonder if that’s a printing bug.
tom
> com.oracle.graal.compiler.hsail.test.lambda.VirtualCallTest$Circle (10, 1.00)
> 56 bci: 14 BranchData taken(10, 0.50) not_taken(10, 0.50) displacement(56)
> 88 bci: 18 JumpData taken(10) displacement(24)
>
> -- Tom
>
>
>> -----Original Message-----
>> From: Doug Simon [mailto:doug.simon at oracle.com]
>> Sent: Monday, January 27, 2014 3:45 PM
>> To: Tom Rodriguez
>> Cc: Deneau, Tom; graal-dev at openjdk.java.net
>> Subject: Re: instanceof code and OptimizedTypeCheckViolated
>>
>>
>> On Jan 27, 2014, at 10:37 PM, Tom Rodriguez <tom.rodriguez at oracle.com>
>> wrote:
>>
>>> It's probably some minor difference in the profile. You can use
>>> -Dgraal.traceMethodDataFilter=class.method to dump the method data used
>>> for that method. In general I would think for GPU we'd want to avoid
>>> any overly optimistic profile-based optimization since deopt is likely
>>> to be hugely expensive for GPU or at least a waste of compute resources.
>>> TypeCheckMinProfileHitProbability appears to control this code, so you
>>> might try overriding it with a higher value.
>>>
>>> I'm actually unclear under what conditions instanceofWithProfile will
>>> win over a straight instanceof test. Maybe interface types or deep
>>> hierarchies?
>>
>> Definitely over these latter cases as not only will the generated code
>> be tighter, but types after the instanceof node in the graph could
>> potentially be much tighter. There may be idiomatic Truffle code where
>> this is particularly true?
>>
>> -Doug
>>
>>> On Jan 27, 2014, at 12:59 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
>>>
>>>> Question on some instanceof codegen in one of our hsail junits
>>>>
>>>> Java code is:
>>>>
>>>> float[] outArray;
>>>> Shape[] inShapeArray;
>>>> (gid)->{
>>>> outArray[gid] = (inShapeArray[gid] instanceof Circle ? 1.0f : 2.0f);
>>>> });
>>>>
>>>> The inShapeArray is 50/50 Circle and non-Circle (and we have executed
>>>> as java first so have some profiling)
>>>>
>>>> Yet the generated code is taking a deopt path for reason
>>>> OptimizedTypeCheckViolated when it is not a circle.
>>>> So the code to store 2.0 is never generated.
>>>>
>>>> In igv, I can see that this happens when we inline
>>>> InstanceOfSnippets.instanceofWithProfile
>>>>
>>>> This doesn't happen on every codegen, sometimes the 2.0 store block
>>>> is generated nicely.
>>>>
>>>> This used to work fine until I think a few weeks ago...
>>>>
>>>> Any suggestions?
>>>>
>>>> -- Tom
>>>>
>>>>
>>>> @L7:
>>>> ld_global_u32 $d0, [$d0 + 8];
>>>> shl_u64 $d0, $d0, 3;
>>>> mov_b64 $d2, 0x1000fcd88; // check if class == Circle
>>>> cmp_ne_b1_s64 $c0, $d2, $d0;
>>>> cbr $c0, @L11;
>>>> @L8:
>>>> ld_global_s32 $s1, [$d1 + 12];
>>>> cmp_ge_b1_u32 $c0, $s0, $s1;
>>>> cbr $c0, @L10;
>>>> @L9:
>>>> cvt_s64_s32 $d0, $s0; // code that stores 1.0f for Circle case
>>>> mul_s64 $d0, $d0, 4;
>>>> add_u64 $d1, $d1, $d0;
>>>> mov_b32 $s0, 1.0f;
>>>> st_global_f32 $s0, [$d1 + 16];
>>>> ret;
>>>> @L1:
>>>> mov_b32 $s0, -21259;
>>>> @L13:
>>>> ret;
>>>> @L11:
>>>> // not a Circle, deoptimize with OptimizedTypeCheckViolated reason
>>>>
>>>
>>
>
>