instanceof code and OptimizedTypeCheckViolated

Tue Jan 28 09:06:15 PST 2014

On Jan 28, 2014, at 5:32 AM, Deneau, Tom <tom.deneau at amd.com> wrote:

> Yes, the test runs N = 20 iterations alternating types
> and then forces a compile.
> 
> I understand the 50-50 results, but I didn't understand
> the taken=0, not-taken=0 results.  Shouldn't the profile
> data be the same for every run?

Actually it’s odd that you get any profile at all.  Normally HotSpot doesn’t create a method data object (MDO) until the method has run quite a few times, usually 1/3 of CompileThreshold.  Some of the tests call HotSpotJavaMethod.reprofile to force the creation of an MDO earlier but I don’t think you want to do that here.
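
To make the numbers in this thread concrete, here is a minimal Java sketch of why 20 alternating iterations give a 50-50 branch profile that is still too small to trust.  Everything in it is hypothetical and for illustration only: ProfileSketch, BranchCounts and isTrustworthy are made-up names, not HotSpot or Graal code, and the threshold of 3333 simply assumes a CompileThreshold of 10000 divided by 3.

```java
public class ProfileSketch {
    interface Shape {}
    static final class Circle implements Shape {}
    static final class Square implements Shape {}

    // Hypothetical stand-in for the two counters of a branch profile entry.
    static final class BranchCounts {
        int taken;
        int notTaken;

        int total() {
            return taken + notTaken;
        }
    }

    // Builds an array that alternates Circle, Square, Circle, ...
    static Shape[] alternating(int n) {
        Shape[] shapes = new Shape[n];
        for (int i = 0; i < n; i++) {
            shapes[i] = (i % 2 == 0) ? new Circle() : new Square();
        }
        return shapes;
    }

    // Counts how often the instanceof test succeeds and fails.
    static BranchCounts profile(Shape[] shapes) {
        BranchCounts counts = new BranchCounts();
        for (Shape s : shapes) {
            if (s instanceof Circle) {
                counts.taken++;
            } else {
                counts.notTaken++;
            }
        }
        return counts;
    }

    // Assumed maturity rule: ignore profiles with fewer than minCount samples.
    static boolean isTrustworthy(BranchCounts counts, int minCount) {
        return counts.total() >= minCount;
    }

    public static void main(String[] args) {
        BranchCounts counts = profile(alternating(20));
        int assumedMinCount = 10000 / 3; // assumption: CompileThreshold of 10000
        System.out.println("taken=" + counts.taken + " notTaken=" + counts.notTaken
                + " trustworthy=" + isTrustworthy(counts, assumedMinCount));
    }
}
```

With a filter of this kind, the 10/10 profile from the test would be treated as if no profile existed, rather than being used to pick an optimistic type check.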

tom

> 
> -- Tom
> 
>> -----Original Message-----
>> From: Tom Rodriguez [mailto:tom.rodriguez at oracle.com]
>> Sent: Monday, January 27, 2014 7:32 PM
>> To: Deneau, Tom
>> Cc: Douglas Simon; graal-dev at openjdk.java.net
>> Subject: Re: instanceof code and OptimizedTypeCheckViolated
>> 
>> 
>> On Jan 27, 2014, at 2:07 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
>> 
>>> When I use the -Dgraal.traceMethodDataFilter=class.method
>>> flag, I see that the times it is failing, the branchData is
>>> 50-50, but when it is succeeding, it is like there is no profile info at all
>>> (unless I am reading this incorrectly).  Why would there be no profile data?
>>> For example...
>> 
>> Isn't that how the test is constructed?  It appears to run N = 20
>> iterations alternating types and then forces a compile.
>> 
>> A count of 10 (well, 20) is very low to be used as trustworthy profile
>> information.  I'm pushing some changes that make Graal use the is_mature
>> test that C2 does to filter profile data that has relatively low
>> execution counts, so it's likely this problem will disappear for you.
>> 
>>> 
>>> Raw method data for com.oracle.graal.compiler.hsail.test.lambda.InstanceOfTest.lambda$runTest$4(int):
>>> 0      bci: 11    TypeCheckData       count(0) null_seen(FALSE) exception_seen(FALSE) nonprofiled_count(0) entries(0)
>>> 56     bci: 14    BranchData          taken(0, -1.00) not_taken(0, 2.00) displacement(56)
>>> 88     bci: 18    JumpData            taken(0) displacement(24)
>>> 
>>> 
>>> vs. other times seeing
>>> 
>>> 0      bci: 11    TypeCheckData       count(2147483647) null_seen(FALSE) exception_seen(FALSE) nonprofiled_count(0) entries(1)
>> 
>> That big count is fairly odd looking.  I wonder if that's a printing
>> bug.
>> 
>> tom
>> 
>>>     com.oracle.graal.compiler.hsail.test.lambda.VirtualCallTest$Circle (10, 1.00)
>>> 56     bci: 14    BranchData          taken(10, 0.50) not_taken(10, 0.50) displacement(56)
>>> 88     bci: 18    JumpData            taken(10) displacement(24)
>>> 
>>> -- Tom
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Doug Simon [mailto:doug.simon at oracle.com]
>>>> Sent: Monday, January 27, 2014 3:45 PM
>>>> To: Tom Rodriguez
>>>> Cc: Deneau, Tom; graal-dev at openjdk.java.net
>>>> Subject: Re: instanceof code and OptimizedTypeCheckViolated
>>>> 
>>>> 
>>>> On Jan 27, 2014, at 10:37 PM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>>>> 
>>>>> It's probably some minor difference in the profile.  You can use
>>>>> -Dgraal.traceMethodDataFilter=class.method to dump the method data used
>>>>> for that method.  In general I would think for GPU we'd want to avoid
>>>>> any overly optimistic profile-based optimization since deopt is likely
>>>>> to be hugely expensive for GPU or at least a waste of compute resources.
>>>>> TypeCheckMinProfileHitProbability appears to control this code, so you
>>>>> might try overriding it with a higher value.
>>>>> 
>>>>> I'm actually unclear under what conditions instanceofWithProfile will
>>>>> win over a straight instanceof test.  Maybe interface types or deep
>>>>> hierarchies?
>>>> 
>>>> Definitely over these latter cases as not only will the generated code
>>>> be tighter, but types after the instanceof node in the graph could
>>>> potentially be much tighter. There may be idiomatic Truffle code where
>>>> this is particularly true?
>>>> 
>>>> -Doug
>>>> 
>>>>> On Jan 27, 2014, at 12:59 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
>>>>> 
>>>>>> Question on some instanceof codegen in one of our hsail junits
>>>>>> 
>>>>>> Java code is:
>>>>>> 
>>>>>> float[] outArray;
>>>>>> Shape[] inShapeArray;
>>>>>> (gid)->{
>>>>>>   outArray[gid] = (inShapeArray[gid] instanceof Circle ? 1.0f :
>>>>>> 2.0f);  });
>>>>>> 
>>>>>> The inShapeArray is 50/50 Circle and non-Circle (and we have executed
>>>>>> as Java first so have some profiling)
>>>>>> 
>>>>>> Yet the generated code is taking a deopt path for reason
>>>>>> OptimizedTypeCheckViolated when it is not a Circle.
>>>>>> So the code to store 2.0 is never generated.
>>>>>> 
>>>>>> In igv, I can see that this happens when we inline
>>>>>> InstanceOfSnippets.instanceofWithProfile
>>>>>> 
>>>>>> This doesn't happen on every codegen, sometimes the 2.0 store block
>>>>>> is generated nicely.
>>>>>> 
>>>>>> This used to work fine until I think a few weeks ago...
>>>>>> 
>>>>>> Any suggestions?
>>>>>> 
>>>>>> -- Tom
>>>>>> 
>>>>>> 
>>>>>> @L7:
>>>>>> 	ld_global_u32 $d0, [$d0 + 8];
>>>>>> 	shl_u64 $d0, $d0, 3;
>>>>>> 	mov_b64 $d2, 0x1000fcd88;                  // check if class == Circle
>>>>>> 	cmp_ne_b1_s64 $c0, $d2, $d0;
>>>>>> 	cbr $c0, @L11;
>>>>>> @L8:
>>>>>> 	ld_global_s32 $s1, [$d1 + 12];
>>>>>> 	cmp_ge_b1_u32 $c0, $s0, $s1;
>>>>>> 	cbr $c0, @L10;
>>>>>> @L9:
>>>>>> 	cvt_s64_s32 $d0, $s0;                      // code that stores 1.0f for Circle case
>>>>>> 	mul_s64 $d0, $d0, 4;
>>>>>> 	add_u64 $d1, $d1, $d0;
>>>>>> 	mov_b32 $s0, 1.0f;
>>>>>> 	st_global_f32 $s0, [$d1 + 16];
>>>>>> 	ret;
>>>>>> @L1:
>>>>>> 	mov_b32 $s0, -21259;
>>>>>> @L13:
>>>>>> 	ret;
>>>>>> @L11:
>>>>>>   // not a Circle, deoptimize with OptimizedTypeCheckViolated reason
>>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>> 
> 
>