RFR: 8295023: Interpreter(AArch64): Implement -XX:+PrintBytecodeHistogram and -XX:+PrintBytecodePairHistogram options [v4]

Nick Gasson ngasson at openjdk.org
Mon Oct 17 09:42:12 UTC 2022


On Thu, 13 Oct 2022 04:12:22 GMT, Hao Sun <haosun at openjdk.org> wrote:

>> In this patch, we implement the functions histogram_bytecode() and histogram_bytecode_pair() for the AArch64 part of the interpreter. As in count_bytecode(), we use atomic operations to update the counters.
>> 
>> Below is part of the output produced with the -XX:+PrintBytecodeHistogram and -XX:+PrintBytecodePairHistogram options after this patch.
>> 
>> 
>> $ java -XX:+PrintBytecodeHistogram --version | head -20
>> openjdk 20-internal 2023-03-21
>> OpenJDK Runtime Environment (fastdebug build 20-internal-adhoc.haosun.jdk-src-dev)
>> OpenJDK 64-Bit Server VM (fastdebug build 20-internal-adhoc.haosun.jdk-src-dev, mixed mode)
>> 
>> Histogram of 5004099 executed bytecodes:
>> 
>>   absolute  relative  code    name
>> ----------------------------------------------------------------------
>>     319124     6.38%    dc    fast_aload_0
>>     313397     6.26%    e0    fast_iload
>>     251436     5.02%    b6    invokevirtual
>>     227428     4.54%    19    aload
>>     166054     3.32%    a7    goto
>>     159167     3.18%    2b    aload_1
>>     151803     3.03%    de    fast_aaccess_0
>>     136787     2.73%    1b    iload_1
>>     124037     2.48%    36    istore
>>     118791     2.37%    84    iinc
>>     118121     2.36%    1c    iload_2
>>     110484     2.21%    a2    if_icmpge
>> 
>> $ java -XX:+PrintBytecodePairHistogram --version | head -20
>> openjdk 20-internal 2023-03-21
>> OpenJDK Runtime Environment (fastdebug build 20-internal-adhoc.haosun.jdk-src-dev)
>> OpenJDK 64-Bit Server VM (fastdebug build 20-internal-adhoc.haosun.jdk-src-dev, mixed mode)
>> 
>> Histogram of 4804441 executed bytecode pairs:
>> 
>>   absolute  relative    codes    1st bytecode        2nd bytecode
>> ----------------------------------------------------------------------
>>      77602    1.615%    84 a7    iinc                goto
>>      49749    1.035%    36 e0    istore              fast_iload
>>      48931    1.018%    e0 10    fast_iload          bipush
>>      46294    0.964%    e0 b6    fast_iload          invokevirtual
>>      42661    0.888%    a7 e0    goto                fast_iload
>>      42243    0.879%    3a 19    astore              aload
>>      40138    0.835%    19 b9    aload               invokeinterface
>>      36617    0.762%    dc 2b    fast_aload_0        aload_1
>>      35745    0.744%    b7 dc    invokespecial       fast_aload_0
>>      35384    0.736%    19 b6    aload               invokevirtual
>>      35035    0.729%    b6 de    invokevirtual       fast_aaccess_0
>>      34667    0.722%    dc b6    fast_aload_0        invokevirtual
>> 
>> 
>> To verify correctness, I used the trace information produced by -XX:+TraceBytecodes as a cross-reference. The hit counts for individual bytecodes and bytecode pairs can be obtained by parsing the trace. I then compared these hit counts with the corresponding "absolute" columns. I randomly selected several bytecodes and bytecode pairs, and the manual comparison showed that the "absolute" columns are correct.
>> 
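
The cross-check described above can be sketched roughly as follows. This is only an illustration, not the script used for the patch: it assumes the bytecode mnemonics have already been extracted, in execution order, from the -XX:+TraceBytecodes output (the trace format itself is not parsed here), and the helper names count_bytecodes and count_bytecode_pairs are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: tallies how often each mnemonic occurs in a sequence
// that is assumed to have been extracted from the trace beforehand.
std::map<std::string, long> count_bytecodes(const std::vector<std::string>& trace) {
    std::map<std::string, long> counts;
    for (const std::string& name : trace) {
        ++counts[name];
    }
    return counts;
}

// Hypothetical helper: tallies adjacent (1st, 2nd) mnemonic pairs in the
// same sequence, mirroring the two name columns of the pair histogram.
std::map<std::pair<std::string, std::string>, long>
count_bytecode_pairs(const std::vector<std::string>& trace) {
    std::map<std::pair<std::string, std::string>, long> counts;
    for (std::size_t i = 1; i < trace.size(); ++i) {
        ++counts[std::make_pair(trace[i - 1], trace[i])];
    }
    return counts;
}
```

The resulting tallies for a few selected bytecodes and pairs would then be compared against the "absolute" columns of the two histograms.
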
>> Note-1: count_bytecode() is updated as well. 1) Caller-saved registers are used as temporary registers, so there is no need to save/restore them. 2) atomic_addw() should be used since the counter is of int type.
>> 
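
Point 2 of Note-1 (a 32-bit atomic add for an int-sized counter) can be illustrated in portable C++. This is a sketch under stated assumptions, not the generated AArch64 code: the table size of 256, the relaxed memory ordering, and the names g_counters, histogram_bytecode_sketch and read_counter_sketch are all hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Assumption: one counter slot per 8-bit bytecode value.
constexpr int kNumberOfCodes = 256;

// int-sized counters, so each update is a 32-bit atomic add.
// Static storage duration means the counters start at zero.
static std::atomic<std::int32_t> g_counters[kNumberOfCodes];

// Hypothetical stand-in for the per-bytecode increment: atomically bumps
// the counter for `code` by one. Relaxed ordering is an assumption of this
// sketch; only the atomicity of the add matters for the count.
void histogram_bytecode_sketch(std::uint8_t code) {
    g_counters[code].fetch_add(1, std::memory_order_relaxed);
}

// Reads the current count for `code`.
std::int32_t read_counter_sketch(std::uint8_t code) {
    return g_counters[code].load(std::memory_order_relaxed);
}
```

Because each increment is a single atomic read-modify-write, concurrent callers do not lose updates, which a plain load/add/store sequence could.
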
>> Note-2: As shown by the update in file templateInterpreterGenerator.cpp, function histogram_bytecode() should be invoked only inside !PRODUCT scope.
>
> Hao Sun has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove the atomic operation to "_index"

Marked as reviewed by ngasson (Reviewer).

-------------

PR: https://git.openjdk.org/jdk/pull/10642

