RFR: 8374045: Add support to run benchmarking with fragmented CodeCache
Chad Rakoczy
duke at openjdk.org
Thu Jan 29 20:36:48 UTC 2026
On Fri, 19 Dec 2025 20:08:20 GMT, Chad Rakoczy <duke at openjdk.org> wrote:
> [JDK-8374045](https://bugs.openjdk.org/browse/JDK-8374045)
>
> This PR adds a new utility, CodeCacheFragmenter, to help with testing HotSpot code cache fragmentation scenarios. The tool is a Java agent that uses the WhiteBox API to create and randomly free dummy code blobs in the NonProfiled code heap to achieve a specified fill percentage. It includes configurable parameters for blob sizes, target fill percentage (0-100%), and random seeding to enable reproducible fragmentation patterns. The utility is built via a standard Makefile and produces `codecachefragmenter.jar`, which can be used as a Java agent with the `-javaagent` flag. This tool is intended for performance testing and experimentation with code cache behavior under various fragmentation conditions.
>
> This is useful to show the performance benefits of [JDK-8326205](https://bugs.openjdk.org/browse/JDK-8326205)
>
> With the same amount of free code cache memory, adding fragmentation increases execution time of Renaissance Dotty by ~5x. See https://github.com/openjdk/jdk/pull/28934#issuecomment-3820151693 for more details.
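
The fill-then-free idea described in the PR text can be illustrated with a toy model. This is only a sketch in plain Python, not the agent's code: it does not touch the WhiteBox API, and the function names, the blob size range, and the "fill completely, then free at random" order are all assumptions made for illustration.

```python
import random

def fragment(heap_size, min_blob, max_blob, fill_percentage, seed):
    """Toy model: fill a heap with blobs, then free random ones down to a target.

    Hypothetical helper; the real tool allocates blobs in the JVM code cache.
    """
    rng = random.Random(seed)  # fixed seed gives a reproducible layout
    blobs = []                 # (offset, size) of live dummy blobs
    offset = 0
    # Phase 1: allocate back-to-back blobs until the next one no longer fits.
    while True:
        size = rng.randint(min_blob, max_blob)
        if offset + size > heap_size:
            break
        blobs.append((offset, size))
        offset += size
    # Phase 2: free randomly chosen blobs until occupancy reaches the target.
    target = heap_size * fill_percentage / 100.0
    used = offset
    rng.shuffle(blobs)
    while blobs and used > target:
        _, size = blobs.pop()
        used -= size
    return sorted(blobs), used

def free_gaps(blobs, heap_size):
    """Sizes of the free regions between (and after) the surviving blobs."""
    gaps, cursor = [], 0
    for off, size in blobs:
        if off > cursor:
            gaps.append(off - cursor)
        cursor = off + size
    if cursor < heap_size:
        gaps.append(heap_size - cursor)
    return gaps

if __name__ == "__main__":
    heap = 128 * 1024 * 1024  # mirrors NonProfiledCodeHeapSize=128m
    blobs, used = fragment(heap, 1024, 4096, 50.0, seed=42)
    gaps = free_gaps(blobs, heap)
    print(f"used {used / heap:.1%}, {len(gaps)} free gaps, largest {max(gaps)} bytes")
```

In this model the free half of the heap ends up scattered across many small gaps rather than one contiguous region, which is the condition the benchmark runs below are meant to exercise.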
Performance results for 3 runs. All runs have 64m of usable code cache for C2 code.
1. NonProfiledCodeHeapSize=64m with no fragmentation.
2. NonProfiledCodeHeapSize=128m with half (64m) filled up with dummy blobs
3. NonProfiledCodeHeapSize=112m with half (56m) filled up with dummy blobs and 8m of HotCodeHeap (`-XX:HotCodeHeapSize=8m`) using #27858
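
As a sanity check on the three configurations, the sizes can be added up. This is a sketch that assumes the fill percentage applies only to the NonProfiled heap and that the dummy blobs occupy exactly that fraction; the dictionary layout and function names are illustrative, while the numbers are taken from the command lines in this post.

```python
# All sizes in MB, copied from the -XX flags of the three runs.
runs = {
    "run1": {"reserved": 72,  "non_nmethod": 8, "non_profiled": 64,  "hot": 0, "fill": 0.0},
    "run2": {"reserved": 136, "non_nmethod": 8, "non_profiled": 128, "hot": 0, "fill": 0.5},
    "run3": {"reserved": 128, "non_nmethod": 8, "non_profiled": 112, "hot": 8, "fill": 0.5},
}

def usable_for_c2(cfg):
    """Free NonProfiled space after dummy blobs, plus the hot code heap."""
    return cfg["non_profiled"] * (1.0 - cfg["fill"]) + cfg["hot"]

def segment_total(cfg):
    """Sum of the configured code heap segments."""
    return cfg["non_nmethod"] + cfg["non_profiled"] + cfg["hot"]

for name, cfg in runs.items():
    print(name, "usable:", usable_for_c2(cfg), "MB,",
          "segments:", segment_total(cfg), "MB of", cfg["reserved"], "MB reserved")
```

Under these assumptions every run leaves 64 MB for C2 code, and the segment sizes add up to the reserved code cache size in each case.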
The results show that code cache fragmentation can significantly degrade performance. With fragmentation, the final iterations slow from ~0.75 s to ~4.0 s (~5x), accompanied by a ~7x increase in branch mispredictions.
Run 3 uses [JDK-8326205](https://bugs.openjdk.org/browse/JDK-8326205) and shows that grouping hot code can significantly improve performance, reducing the degradation from ~5x to ~1.5x.
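
The ratios quoted above can be reproduced from the figures in the logs. The snippet below is a small sketch that uses only the last two iteration times and the `branch-misses` counters reported for each run; averaging just those two iterations is a simplification, since the other iterations are not shown in this post.

```python
from statistics import mean

# Last two reported iteration times per run, in milliseconds.
iteration_ms = {
    "run1": [746.699, 748.863],
    "run2": [4022.919, 4038.655],
    "run3": [1137.186, 1134.037],
}

# branch-misses:u counters from perf stat.
branch_misses = {
    "run1": 28253496347,
    "run2": 200741000177,
    "run3": 74832021136,
}

def ratio_to_baseline(values, baseline="run1"):
    """Express each run's value as a multiple of the baseline run."""
    base = values[baseline]
    return {name: value / base for name, value in values.items()}

slowdown = ratio_to_baseline({k: mean(v) for k, v in iteration_ms.items()})
miss_ratio = ratio_to_baseline(branch_misses)

for name in iteration_ms:
    print(f"{name}: {slowdown[name]:.2f}x iteration time, "
          f"{miss_ratio[name]:.2f}x branch misses")
```

With these inputs, Run 2 comes out at roughly 5.4x the baseline iteration time and 7.1x the branch misses, and Run 3 at roughly 1.5x and 2.6x.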
# Run 1 (No frag)
perf stat ./build/linux-aarch64-server-release/jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UnlockDiagnosticVMOptions -XX:+AlwaysPreTouch -XX:+PrintCodeCache -XX:-TieredCompilation -XX:ReservedCodeCacheSize=72m -XX:InitialCodeCacheSize=72m -XX:+SegmentedCodeCache -XX:+WhiteBoxAPI -XX:NonNMethodCodeHeapSize=8m -XX:ProfiledCodeHeapSize=0 -XX:NonProfiledCodeHeapSize=64m -jar renaissance-mit-0.16.0.jar -r 1190 dotty
...
====== dotty (scala) [default], iteration 1188 completed (746.699 ms) ======
====== dotty (scala) [default], iteration 1189 completed (748.863 ms) ======
Performance counter stats:
3031251.35 msec task-clock:u # 2.164 CPUs utilized
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
199250260 page-faults:u # 65.732 K/sec
3224539067410 cycles:u # 1.064 GHz
6062234799740 instructions:u # 1.88 insn per cycle
<not supported> branches:u
28253496347 branch-misses:u
1401.045295428 seconds time elapsed
1262.547605000 seconds user
1769.354116000 seconds sys
# Run 2 (128m 0.5frag 0mHCH)
perf stat ./build/linux-aarch64-server-release/jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UnlockDiagnosticVMOptions -XX:+AlwaysPreTouch -XX:+PrintCodeCache -XX:-TieredCompilation -XX:ReservedCodeCacheSize=136m -XX:InitialCodeCacheSize=136m -XX:+SegmentedCodeCache -XX:+WhiteBoxAPI -XX:NonNMethodCodeHeapSize=8m -XX:ProfiledCodeHeapSize=0 -XX:NonProfiledCodeHeapSize=128m -XX:+WhiteBoxAPI -javaagent:codecachefragmenter.jar=FillPercentage=50.0,RandomSeed=42 -Xbootclasspath/a:codecachefragmenter.jar -jar renaissance-mit-0.16.0.jar -r 1190 dotty
====== dotty (scala) [default], iteration 1188 completed (4022.919 ms) ======
====== dotty (scala) [default], iteration 1189 completed (4038.655 ms) ======
Performance counter stats:
6321396.03 msec task-clock:u # 1.135 CPUs utilized
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
70083590 page-faults:u # 11.087 K/sec
14233471439404 cycles:u # 2.252 GHz
15400934891881 instructions:u # 1.08 insn per cycle
<not supported> branches:u
200741000177 branch-misses:u
5568.255657415 seconds time elapsed
5498.865476000 seconds user
824.319759000 seconds sys
# Run 3 (112m 0.5frag 8mHCH)
perf stat ./build/linux-aarch64-server-release/jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UnlockDiagnosticVMOptions -XX:+AlwaysPreTouch -XX:+PrintCodeCache -XX:-TieredCompilation -XX:ReservedCodeCacheSize=128m -XX:InitialCodeCacheSize=128m -XX:+SegmentedCodeCache -XX:+WhiteBoxAPI -XX:NonNMethodCodeHeapSize=8m -XX:ProfiledCodeHeapSize=0 -XX:NonProfiledCodeHeapSize=112m -XX:+HotCodeHeap -XX:HotCodeHeapSize=8m -XX:+WhiteBoxAPI -javaagent:codecachefragmenter.jar=FillPercentage=50.0,RandomSeed=42 -Xbootclasspath/a:codecachefragmenter.jar -jar renaissance-mit-0.16.0.jar -r 1190 dotty
====== dotty (scala) [default], iteration 1188 completed (1137.186 ms) ======
====== dotty (scala) [default], iteration 1189 completed (1134.037 ms) ======
Performance counter stats:
3936522.85 msec task-clock:u # 1.466 CPUs utilized
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
119342149 page-faults:u # 30.317 K/sec
7050404447349 cycles:u # 1.791 GHz
9984653084133 instructions:u # 1.42 insn per cycle
<not supported> branches:u
74832021136 branch-misses:u
2684.809939553 seconds time elapsed
2735.116925000 seconds user
1203.549873000 seconds sys
-------------
PR Comment: https://git.openjdk.org/jdk/pull/28934#issuecomment-3820151693
More information about the compiler-dev mailing list