RFR: 8295023: Interpreter(AArch64): Implement -XX:+PrintBytecodeHistogram and -XX:+PrintBytecodePairHistogram options [v2]
Hao Sun
haosun at openjdk.org
Wed Oct 12 07:50:15 UTC 2022
> In this patch, we implement the functions histogram_bytecode() and histogram_bytecode_pair() for the AArch64 interpreter. As in count_bytecode(), the counters are updated with atomic operations.
>
> Below is part of the output produced with the -XX:+PrintBytecodeHistogram and -XX:+PrintBytecodePairHistogram options after this patch.
>
>
> $ java -XX:+PrintBytecodeHistogram --version | head -20
> openjdk 20-internal 2023-03-21
> OpenJDK Runtime Environment (fastdebug build 20-internal-adhoc.haosun.jdk-src-dev)
> OpenJDK 64-Bit Server VM (fastdebug build 20-internal-adhoc.haosun.jdk-src-dev, mixed mode)
>
> Histogram of 5004099 executed bytecodes:
>
>   absolute  relative  code  name
> ----------------------------------------------------------------------
>     319124     6.38%    dc  fast_aload_0
>     313397     6.26%    e0  fast_iload
>     251436     5.02%    b6  invokevirtual
>     227428     4.54%    19  aload
>     166054     3.32%    a7  goto
>     159167     3.18%    2b  aload_1
>     151803     3.03%    de  fast_aaccess_0
>     136787     2.73%    1b  iload_1
>     124037     2.48%    36  istore
>     118791     2.37%    84  iinc
>     118121     2.36%    1c  iload_2
>     110484     2.21%    a2  if_icmpge
>
> $ java -XX:+PrintBytecodePairHistogram --version | head -20
> openjdk 20-internal 2023-03-21
> OpenJDK Runtime Environment (fastdebug build 20-internal-adhoc.haosun.jdk-src-dev)
> OpenJDK 64-Bit Server VM (fastdebug build 20-internal-adhoc.haosun.jdk-src-dev, mixed mode)
>
> Histogram of 4804441 executed bytecode pairs:
>
>   absolute  relative  codes  1st bytecode    2nd bytecode
> ----------------------------------------------------------------------
>      77602    1.615%  84 a7  iinc            goto
>      49749    1.035%  36 e0  istore          fast_iload
>      48931    1.018%  e0 10  fast_iload      bipush
>      46294    0.964%  e0 b6  fast_iload      invokevirtual
>      42661    0.888%  a7 e0  goto            fast_iload
>      42243    0.879%  3a 19  astore          aload
>      40138    0.835%  19 b9  aload           invokeinterface
>      36617    0.762%  dc 2b  fast_aload_0    aload_1
>      35745    0.744%  b7 dc  invokespecial   fast_aload_0
>      35384    0.736%  19 b6  aload           invokevirtual
>      35035    0.729%  b6 de  invokevirtual   fast_aaccess_0
>      34667    0.722%  dc b6  fast_aload_0    invokevirtual
>
>
> To verify correctness, I used the trace produced by -XX:+TraceBytecodes as a cross-reference. The hit counts for individual bytecodes and bytecode pairs can be obtained by parsing the trace. I compared these counts with the corresponding "absolute" columns for several randomly selected bytecodes and bytecode pairs, and the manual comparison showed that the "absolute" columns are correct.
>
> Note-1: count_bytecode() is updated. 1) Caller-saved registers are used as temporaries, so there is no need to save/restore them. 2) atomic_addw() should be used since the counter is of int type.
>
> Note-2: As shown by the update in templateInterpreterGenerator.cpp, histogram_bytecode() should be invoked only inside the !PRODUCT scope.
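
The cross-check described in the quoted text (counting hits from a trace and comparing them with the "absolute" column) can be sketched roughly as follows. This is not the script used for the PR. It assumes the bytecode mnemonics have already been extracted from the -XX:+TraceBytecodes output into a list, since the trace line format is not shown in this message, and the names histogram() and relative() are illustrative only. The last two lines recompute two "relative" values from the totals quoted above.

```python
from collections import Counter


def histogram(mnemonics):
    """Count single bytecodes and adjacent bytecode pairs in an executed sequence."""
    singles = Counter(mnemonics)
    # Each pair is (previous bytecode, current bytecode).
    pairs = Counter(zip(mnemonics, mnemonics[1:]))
    return singles, pairs


def relative(count, total, digits=2):
    """Share of the total as a percentage string, like the 'relative' column."""
    return f"{100.0 * count / total:.{digits}f}%"


# Hypothetical, hand-written sequence standing in for mnemonics parsed from a trace.
trace = ["iload_1", "iinc", "goto", "iload_1", "iinc", "goto", "aload"]
singles, pairs = histogram(trace)

print(singles["iinc"], pairs[("iinc", "goto")])

# Figures taken from the histograms quoted above.
print(relative(319124, 5004099))            # fast_aload_0
print(relative(77602, 4804441, digits=3))   # iinc goto
```

Counts obtained this way for a handful of bytecodes or pairs can then be compared by hand with the corresponding rows of the VM output.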
Hao Sun has updated the pull request incrementally with one additional commit since the last revision:
Introduce atomic_orrw
Introduce atomic_orrw() function as suggested by aph.
Also remove atomic_incw(), which is dead code.
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/10642/files
- new: https://git.openjdk.org/jdk/pull/10642/files/7e8b738a..0db39758
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=10642&range=01
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=10642&range=00-01
Stats: 52 lines in 3 files changed: 20 ins; 28 del; 4 mod
Patch: https://git.openjdk.org/jdk/pull/10642.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/10642/head:pull/10642
PR: https://git.openjdk.org/jdk/pull/10642