instanceof code and OptimizedTypeCheckViolated
Tom Rodriguez
tom.rodriguez at oracle.com
Tue Jan 28 09:06:15 PST 2014
On Jan 28, 2014, at 5:32 AM, Deneau, Tom <tom.deneau at amd.com> wrote:
> Yes, the test runs N = 20 iterations alternating types
> and then forces a compile.
>
> I understand the 50-50 results, but I didn't understand
> the taken=0, not-taken=0 results. Shouldn't the profile
> data be the same for every run?
Actually it’s odd that you get any profile at all. Normally HotSpot doesn’t create a method data object (MDO) until the method has run quite a few times, usually 1/3 of CompileThreshold. Some of the tests call HotSpotJavaMethod.reprofile to force the creation of an MDO earlier, but I don’t think you want to do that here.
tom
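
To make the 50-50 versus maturity point concrete, here is a minimal Java sketch. It is not HotSpot or Graal code: BranchCounts, isMature and the threshold of 100 are hypothetical names and values chosen only for illustration, and the real is_mature check in HotSpot may use different criteria. It runs the same N = 20 alternating pattern as the test and shows that the branch counts come out 10/10, yet fall below an assumed minimum execution count.

```java
public class ProfileMaturitySketch {
    interface Shape {}
    static final class Circle implements Shape {}
    static final class Square implements Shape {}

    /** Hypothetical stand-in for one branch profile entry; not a HotSpot type. */
    static final class BranchCounts {
        int taken;
        int notTaken;

        int total() {
            return taken + notTaken;
        }

        /** Fraction of executions that took the branch; -1.0 is this sketch's "no data" marker. */
        double takenProbability() {
            return total() == 0 ? -1.0 : (double) taken / total();
        }

        /** Assumed rule: only trust the profile after at least minCount executions. */
        boolean isMature(int minCount) {
            return total() >= minCount;
        }
    }

    /** Builds an array that alternates Circle and non-Circle, like the test input. */
    static Shape[] alternating(int n) {
        Shape[] shapes = new Shape[n];
        for (int i = 0; i < n; i++) {
            shapes[i] = (i % 2 == 0) ? new Circle() : new Square();
        }
        return shapes;
    }

    /** Counts how often the instanceof check succeeds or fails. */
    static BranchCounts profile(Shape[] shapes) {
        BranchCounts counts = new BranchCounts();
        for (Shape s : shapes) {
            if (s instanceof Circle) {
                counts.taken++;
            } else {
                counts.notTaken++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        BranchCounts counts = profile(alternating(20));
        int assumedMinCount = 100; // arbitrary illustrative threshold
        System.out.println("taken=" + counts.taken + " notTaken=" + counts.notTaken);
        System.out.println("takenProbability=" + counts.takenProbability());
        System.out.println("mature=" + counts.isMature(assumedMinCount));
    }
}
```

With a filter of this kind, a compiler would ignore the 20-sample profile rather than speculate on it, which is the effect described for the is_mature change.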
>
> -- Tom
>
>> -----Original Message-----
>> From: Tom Rodriguez [mailto:tom.rodriguez at oracle.com]
>> Sent: Monday, January 27, 2014 7:32 PM
>> To: Deneau, Tom
>> Cc: Douglas Simon; graal-dev at openjdk.java.net
>> Subject: Re: instanceof code and OptimizedTypeCheckViolated
>>
>>
>> On Jan 27, 2014, at 2:07 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
>>
>>> When I use the -Dgraal.traceMethodDataFilter=class.method
>>> flag, I see that the times it is failing, the branchData is
>>> 50-50, but when it is succeeding, it is like there is no profile info at all
>>> (unless I am reading this incorrectly). Why would there be no profile data?
>>> For example...
>>
>> Isn't that how the test is constructed? It appears to run N = 20
>> iterations alternating types and then forces a compile.
>>
>> A count of 10 (well 20) is very low to be used as trustworthy profile
>> information. I'm pushing some changes that make Graal use the is_mature
>> test that C2 does to filter profile data that has relatively low
>> execution counts, so it's likely this problem will disappear for you.
>>
>>>
>>> Raw method data for com.oracle.graal.compiler.hsail.test.lambda.InstanceOfTest.lambda$runTest$4(int):
>>> 0 bci: 11 TypeCheckData count(0) null_seen(FALSE) exception_seen(FALSE) nonprofiled_count(0) entries(0)
>>> 56 bci: 14 BranchData taken(0, -1.00) not_taken(0, 2.00) displacement(56)
>>> 88 bci: 18 JumpData taken(0) displacement(24)
>>>
>>>
>>> vs. other times seeing
>>>
>>> 0 bci: 11 TypeCheckData count(2147483647) null_seen(FALSE) exception_seen(FALSE) nonprofiled_count(0) entries(1)
>>
>> That big count is fairly odd looking. I wonder if that's a printing
>> bug.
>>
>> tom
>>
>>>
>>> com.oracle.graal.compiler.hsail.test.lambda.VirtualCallTest$Circle (10, 1.00)
>>> 56 bci: 14 BranchData taken(10, 0.50) not_taken(10, 0.50) displacement(56)
>>> 88 bci: 18 JumpData taken(10) displacement(24)
>>>
>>> -- Tom
>>>
>>>
>>>> -----Original Message-----
>>>> From: Doug Simon [mailto:doug.simon at oracle.com]
>>>> Sent: Monday, January 27, 2014 3:45 PM
>>>> To: Tom Rodriguez
>>>> Cc: Deneau, Tom; graal-dev at openjdk.java.net
>>>> Subject: Re: instanceof code and OptimizedTypeCheckViolated
>>>>
>>>>
>>>> On Jan 27, 2014, at 10:37 PM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>>>>
>>>>> It's probably some minor difference in the profile. You can use
>>>>> -Dgraal.traceMethodDataFilter=class.method to dump the method data used
>>>>> for that method. In general I would think for GPU we'd want to avoid
>>>>> any overly optimistic profile based optimization since deopt is likely
>>>>> to be hugely expensive for GPU or at least a waste of compute resources.
>>>>> TypeCheckMinProfileHitProbability appears to control this code, so you
>>>>> might try overriding it with a higher value.
>>>>>
>>>>> I'm actually unclear under what conditions instanceofWithProfile will
>>>>> win over a straight instanceof test. Maybe interface types or deep
>>>>> hierarchies?
>>>>
>>>> Definitely over these latter cases as not only will the generated code
>>>> be tighter, but types after the instanceof node in the graph could
>>>> potentially be much tighter. There may be idiomatic Truffle code where
>>>> this is particularly true?
>>>>
>>>> -Doug
>>>>
>>>>> On Jan 27, 2014, at 12:59 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
>>>>>
>>>>>> Question on some instanceof codegen in one of our hsail junits
>>>>>>
>>>>>> Java code is:
>>>>>>
>>>>>> float[] outArray;
>>>>>> Shape[] inShapeArray;
>>>>>> (gid)->{
>>>>>> outArray[gid] = (inShapeArray[gid] instanceof Circle ? 1.0f :
>>>>>> 2.0f); });
>>>>>>
>>>>>> The inShapeArray is 50/50 Circle and non-Circle (and we have executed
>>>>>> as Java first so have some profiling)
>>>>>>
>>>>>> Yet the generated code is taking a deopt path for reason
>>>>>> OptimizedTypeCheckViolated when it is not a Circle.
>>>>>> So the code to store 2.0 is never generated.
>>>>>>
>>>>>> In igv, I can see that this happens when we inline
>>>>>> InstanceOfSnippets.instanceofWithProfile
>>>>>>
>>>>>> This doesn't happen on every codegen, sometimes the 2.0 store block
>>>>>> is generated nicely.
>>>>>>
>>>>>> This used to work fine until I think a few weeks ago...
>>>>>>
>>>>>> Any suggestions?
>>>>>>
>>>>>> -- Tom
>>>>>>
>>>>>>
>>>>>> @L7:
>>>>>> ld_global_u32 $d0, [$d0 + 8];
>>>>>> shl_u64 $d0, $d0, 3;
>>>>>> mov_b64 $d2, 0x1000fcd88; // check if class == Circle
>>>>>> cmp_ne_b1_s64 $c0, $d2, $d0;
>>>>>> cbr $c0, @L11;
>>>>>> @L8:
>>>>>> ld_global_s32 $s1, [$d1 + 12];
>>>>>> cmp_ge_b1_u32 $c0, $s0, $s1;
>>>>>> cbr $c0, @L10;
>>>>>> @L9:
>>>>>> cvt_s64_s32 $d0, $s0; // code that stores 1.0f for Circle case
>>>>>> mul_s64 $d0, $d0, 4;
>>>>>> add_u64 $d1, $d1, $d0;
>>>>>> mov_b32 $s0, 1.0f;
>>>>>> st_global_f32 $s0, [$d1 + 16];
>>>>>> ret;
>>>>>> @L1:
>>>>>> mov_b32 $s0, -21259;
>>>>>> @L13:
>>>>>> ret;
>>>>>> @L11:
>>>>>> // not a Circle, deoptimize with OptimizedTypeCheckViolated reason
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>