RemoveNeverExecutedCode
Deneau, Tom
tom.deneau at amd.com
Fri Aug 22 18:23:03 UTC 2014
Historically for the GPU compiles we have always run with -G:-RemoveNeverExecutedCode. In the early days of junit testing, we didn't have Deoptimization working so if we didn't do enough profiling or if the testing paths took some path that wasn't part of the profiling data, we would get the wrong answer.
Now that we can deoptimize, we could theoretically go back to the default -G:+RemoveNeverExecutedCode. But we would still want a decent profile so that we didn't deoptimize unnecessarily.
On the code below, we really do want to compile both branches even if the profiling data doesn't show one as taken, so I guess this is a case where we use -G:-RemoveNeverExecutedCode.
    if (a1[n] != a2[n]) {
        isEqual = false;
    }
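For reference, a minimal self-contained sketch of the CPU warm-up run described further down the thread, where the lambda is executed sequentially through IntStream.range().forEach() before compiling for HSAIL. The class name, the method name and the position of the mismatching element are assumptions made for illustration. Only the lambda body and the array size come from the thread.

```java
import java.util.stream.IntStream;

// Hypothetical class name; the thread only shows the lambda body.
class EqualsKernelWarmup {
    // The lambda writes to a field, since a lambda cannot assign to a captured local.
    boolean isEqual = true;

    // Runs the lambda sequentially on the CPU so the interpreter can collect a profile.
    boolean run(double[] a1, double[] a2) {
        isEqual = true;
        IntStream.range(0, a1.length).forEach(n -> {
            if (a1[n] != a2[n]) {
                isEqual = false;
            }
        });
        return isEqual;
    }

    public static void main(String[] args) {
        int size = 2_000_000;            // array size used in the thread
        double[] a1 = new double[size];
        double[] a2 = new double[size];
        a2[size / 2] = 1.0;              // single mismatch; the position is an arbitrary choice
        System.out.println(new EqualsKernelWarmup().run(a1, a2));
    }
}
```

Whether the mismatching path ends up in the profile depends on when that element is visited relative to the start of profiling, which is what the rest of the thread discusses.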
One minor advantage I noticed with -G:+RemoveNeverExecutedCode is that it lets us compile some methods we otherwise could not, because a never-taken path contains a recursive method call (which we don't support yet) or certain virtual calls (which, never having been executed, have no profiling info and can't be inlined). So it is a tradeoff, but I suppose we will stick with RemoveNeverExecutedCode being off.
-- Tom
-----Original Message-----
From: Tom Rodriguez [mailto:tom.rodriguez at oracle.com]
Sent: Friday, August 22, 2014 12:45 PM
To: Deneau, Tom
Cc: Douglas Simon; graal-dev at openjdk.java.net
Subject: Re: RemoveNeverExecutedCode
On Aug 22, 2014, at 9:10 AM, Deneau, Tom <tom.deneau at amd.com> wrote:
> Doug --
>
> Yes, for the bytecode in question this tells me that the branchProbability is 1.0 but I was trying to understand why.
>
> For instance, 10000 elements in array, the first 10 should all not take the branch, but I see...
>
> executionCount at 13: 9488; branchProbability at 13: 1.000000; exceptionSeen at 13: FALSE;
>
> It looks like maybe the profiling only kicks in after the first 512 times a bytecode is executed (since executionCount above is 9488)? Also if I move the location of the differing elements to past 512, I see that it gets recorded. Is there some logic like that in the hotspot profiling interpreter?
Profiling only kicks in after a certain number of invocations. It's usually 33% of the first tier compile threshold, though I think it works differently in tiered. I'm having trouble finding the exact logic. Anyway, it's a profile so there's no guarantee about what's in it. For unreachable code it will eventually come to a proper stable state about what has been reached which is all we normally care about.
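A toy model of the effect, as a sketch only: it assumes that executions before some warm-up point leave no trace in the profile. It is not HotSpot's actual logic. The warm-up value of 512 is the guess from the quoted message, and the class and method names are hypothetical.

```java
// Toy model only: this is not HotSpot's profiling logic.
class ToyBranchProfile {
    // Returns the recorded probability that the bytecode branch is taken, i.e. that
    // the elements are equal and the "isEqual = false" assignment is skipped.
    static double branchProbability(int elements, int mismatchIndex, int warmup) {
        long taken = 0;
        long notTaken = 0;
        for (int n = 0; n < elements; n++) {
            if (n < warmup) {
                continue;            // executions before profiling starts are not recorded
            }
            if (n == mismatchIndex) {
                notTaken++;          // falls through into "isEqual = false"
            } else {
                taken++;             // jumps over the assignment
            }
        }
        long total = taken + notTaken;
        return total == 0 ? Double.NaN : (double) taken / total;
    }
}
```

Under this assumption, 10000 elements with the mismatch at index 5 and a warm-up of 512 give 9488 recorded executions and a probability of exactly 1.0. Moving the mismatch to index 600 gives a value slightly below 1.0.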
For GPU you might want a more conservative notion of never executed, possibly taking into account the maturity of the branch itself or maybe trusting those counts is just a bad idea for GPU. Is that what you're trying to figure out?
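One way such a conservative check could look, as a rough sketch: the helper, its parameters and the maturity threshold are all hypothetical and are not part of Graal's API.

```java
// Hypothetical helper; the names and the threshold are illustrative, not Graal API.
class ConservativeNeverExecuted {
    // Treats a path as removable only if the profile never saw it AND the branch
    // was profiled often enough for that observation to be trusted.
    static boolean mayRemove(double probabilityOfPath, long executionCount, long minMatureCount) {
        boolean profileSaysNever = probabilityOfPath == 0.0;
        boolean mature = executionCount >= minMatureCount;
        return profileSaysNever && mature;
    }
}
```

A count-based check like this only guards against immature profiles. It would not catch a path that ran only before profiling started, since the recorded count can still be large in that case.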
tom
>
> -- Tom
>
>
>
> -----Original Message-----
> From: Doug Simon [mailto:doug.simon at oracle.com]
> Sent: Friday, August 22, 2014 7:39 AM
> To: Deneau, Tom
> Cc: Tom Rodriguez; graal-dev at openjdk.java.net
> Subject: Re: RemoveNeverExecutedCode
>
> You should get all the info you need with ResolvedJavaMethod.getProfilingInfo().toString().
>
> On Aug 22, 2014, at 2:34 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
>
>> I don't know the details of how the profile is collected but...
>> In my case with 2000000 elements and 1 different, in the profile, the branchTakenProbability is showing up as 1.0.
>> For other combinations of elements and differs I saw this:
>> Elems     Differs   BTP (branchTakenProbability)
>> 1000000   1000      < 1.0
>> 1000000   10        1.0
>> 100000    10        1.0
>> 10000     10        1.0
>> 1000      10        < 1.0
>>
>> -- Tom
>>
>>
>> -----Original Message-----
>> From: Tom Rodriguez [mailto:tom.rodriguez at oracle.com]
>> Sent: Thursday, August 21, 2014 7:12 PM
>> To: Deneau, Tom
>> Cc: graal-dev at openjdk.java.net
>> Subject: Re: RemoveNeverExecutedCode
>>
>>
>> On Aug 21, 2014, at 4:39 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
>>
>>> I am trying to benchmark an HSAIL kernel that is implementing the following lambda
>>>     n -> {
>>>         if (a1[n] != a2[n]) {
>>>             isEqual = false;
>>>         }
>>>     }
>>>
>>> where a1 and a2 are arrays of doubles. So it is basically doing an Arrays.equals.
>>>
>>> I set up the test to have 2,000,000 elements in the arrays and one of them does not match.
>>> Before I compile for HSAIL, I enable some profiling by running the above lambda on the cpu in non-parallel mode (IntStream.range().forEach()) so the lambda gets executed 2000000 times (and takes the false branch once).
>>>
>>> But even though one of the elements does not match, when I compile thru graal with the default -G:+RemoveNeverExecutedCode, I see that the isEqual = false path has been considered "not executed" and has been removed.
>>>
>>> Is taking a branch once out of 2,000,000 times considered "not executed"?
>>> Or is there some other flaw here?
>>
>> Remember this means never executed according to the profile, so if it doesn't show up in the profile then it didn't happen. Did the profile include the taken branch?
>>
>> tom
>>
>>>
>>> Of course I can work around this in this case by forcing -G:-RemoveNeverExecutedCode but I'd like to understand this.
>>> This is running in -server mode.
>>>
>>> -- Tom
>>>
>>
>