RFR: 8293491: Avoid unexpected deoptimization in loop exit due to incorrect branch profiling

Jie Fu jiefu at openjdk.org
Thu Sep 8 15:10:43 UTC 2022


On Thu, 8 Sep 2022 10:32:53 GMT, Tobias Hartmann <thartmann at openjdk.org> wrote:

>> Hi all,
>> 
>> Please review this patch which fixes the unexpected deoptimizations in loop exit due to incorrect branch profiling.
>> 
>> # Background
>> 
>> While analyzing our big data Apps, we observed unexpected deoptimizations in loop exit due to incorrect branch profiling.
>> 
>> Here is a reproducer.
>> 
>> public class UnexpectedLoopExitDeopt {
>>     public static final int N = 20000000;
>> 
>>     public static int d1[] = new int[N];
>>     public static int d2[] = new int[N];
>> 
>>     public static void main(String[] args) {
>>         System.out.println(test(d1));
>>         System.out.println(test(d2));
>>     }
>> 
>>     public static int test(int[] a) {
>>         int sum = 0;
>>         for(int i = 0; i < a.length; i++) {
>>             sum += a[i];
>>         }
>>         return sum;
>>     }
>> }
>> 
>> 
>> The following is the compilation sequence.
>> 
>>      77    1       3       java.lang.Object::<init> (1 bytes)
>>      83    2       3       java.lang.String::isLatin1 (19 bytes)
>>      84    6       3       jdk.internal.util.Preconditions::checkIndex (18 bytes)
>>      84    3       3       java.lang.String::charAt (25 bytes)
>>      85    4       3       java.lang.StringLatin1::charAt (15 bytes)
>>      86    7       3       java.lang.String::coder (15 bytes)
>>      86    8       3       java.lang.String::hashCode (60 bytes)
>>      87    5       3       java.lang.String::checkIndex (10 bytes)
>>      87    9       3       java.lang.String::length (11 bytes)
>>      93   10     n 0       java.lang.invoke.MethodHandle::linkToStatic(LLLLLLL)L (native)   (static)
>>      96   11     n 0       java.lang.invoke.MethodHandle::linkToSpecial(LLLL)L (native)   (static)
>>      96   12     n 0       java.lang.Object::hashCode (native)   
>>      97   13     n 0       java.lang.invoke.MethodHandle::invokeBasic(LLLLLL)L (native)   
>>      98   14       3       java.util.Objects::requireNonNull (14 bytes)
>>      98   15     n 0       java.lang.invoke.MethodHandle::linkToSpecial(LLLLLLLL)L (native)   (static)
>>      98   16       1       java.lang.Enum::ordinal (5 bytes)
>>     101   17     n 0       java.lang.invoke.MethodHandle::linkToSpecial(LLLL)V (native)   (static)
>>     102   18     n 0       java.lang.invoke.MethodHandle::invokeBasic(LL)L (native)   
>>     212   19 %     3       UnexpectedLoopExitDeopt::test @ 4 (24 bytes)
>>     213   20 %     4       UnexpectedLoopExitDeopt::test @ 4 (24 bytes)
>>     221   19 %     3       UnexpectedLoopExitDeopt::test @ 4 (24 bytes)   made not entrant
>>     221   21       4       UnexpectedLoopExitDeopt::test (24 bytes)
>>     230   20 %     4       UnexpectedLoopExitDeopt::test @ 4 (24 bytes)   made not entrant     <--- Unexpected deopt
>> 0
>>     242   21       4       UnexpectedLoopExitDeopt::test (24 bytes)   made not entrant         <--- Unexpected deopt
>> 0
>> 
>> 
>> The last two deopts (made not entrant) happened in the loop exit which are unexpected.
>> 
>> 
>> # Reason
>> 
>> The unexpected deopts were caused by the incorrect branch profiling count (0 taken count for loop predicate).
>> 
>> Here is the profiling data for `UnexpectedLoopExitDeopt::test`.
>> We can see that for `if_icmpge` @ bci=7, the count for `not taken` is 264957, while 0 for `taken`.
>> The profile count for zero taken is obvious incorrect since the loop will finally exit (when `i >= a.length`).
>> So the taken count should be at least 1 for `if_icmpge` @ bci=7.
>> 
>> 0 iconst_0
>> 1 istore_1
>> 2 iconst_0
>> 3 istore_2
>> 
>> 4 iload_2
>> 5 fast_aload_0
>> 6 arraylength
>> 7 if_icmpge 22
>>   0   bci: 7    BranchData          taken(0) displacement(56)
>>                                     not taken(264957)
>> 
>> 10 iload_1
>> 11 fast_aload_0
>> 12 iload_2
>> 13 iaload
>> 14 iadd
>> 15 istore_1
>> 16 iinc #2 1
>> 19 goto 4
>>   32  bci: 19   JumpData            taken(266667) displacement(-32)
>> 
>> 22 iload_1
>> 23 ireturn
>> 
>> 
>> # Fix
>> 
>> The main idea is to detect if the branch taken target is a loop exit.
>> If so, set the taken count to be at least 1.
>> This is fine because most loops should be finite and would execute the loop exit code at lease once.
>> For infinite loops like `while (true) {...}`, the patch won't change the original behaviour since there is no loop exit.
>> 
>> # Testing
>> 
>> tier1~3 on Linux/x64, no regression
>> 
>> Thanks.
>> Best regards,
>> Jie
>
> src/hotspot/share/ci/ciMethodBlocks.cpp line 166:
> 
>> 164:           if (dest_bci < bci) {
>> 165:             next_block->set_is_loop_exit();
>> 166:           }
> 
> I think this loop detection logic is both wrong and incomplete. For example, javac generates no `goto` for the following loop:
> 
>         int i = 0;
>         do {
>             
>         } while (i++ < 10);
> 
> 
> And in the following case, the block after the `goto` corresponding to the `continue` statement is not the loop exit:
> 
>         int i = 0;
>         label:
>         while (true) {
>             i++;
>             if (i == 1)
>                 continue label;
>             if (i == 2)
>                 break;
>         }
> 
> 
> I just quickly hacked this, there are probably better examples.

Nice catch!

I missed the `do {...} while (condition)` loop pattern and the `continue` statement.
So we should find better way to identify the loop exit block, right?

-------------

PR: https://git.openjdk.org/jdk/pull/10200


More information about the hotspot-compiler-dev mailing list