RFR: 8326205: Grouping frequently called C2 nmethods in CodeCache [v6]

George Wort duke at openjdk.org
Mon Jan 19 10:09:57 UTC 2026


On Wed, 14 Jan 2026 01:33:37 GMT, Chad Rakoczy <duke at openjdk.org> wrote:

>> ### Summary
>> This PR implements [JDK-8326205](https://bugs.openjdk.org/browse/JDK-8326205), introducing experimental support for grouping hot code within the CodeCache.
>> 
>> ### Description
>> The feature works by periodically sampling the execution of C2-compiled methods to identify hot code, then relocating those methods into a dedicated `HotCodeHeap` section of the CodeCache.
>> 
>> Sampling is performed by the `HotCodeSampler`, which runs on a new dedicated `HotCodeGrouper` thread. The thread wakes up every `HotCodeIntervalSeconds` (default 300s) and collects samples for a duration of `HotCodeSampleSeconds` (default 120s). During each sampling period, it iterates over all Java threads, inspects their last Java frame, obtains the current program counter (PC), and maps it to the corresponding nmethod. This allows the sampler to maintain a profile of the most frequently executed methods.
>> 
>> The `HotCodeGrouper` uses the sampling data to select methods for grouping. Methods are ranked by sample count to form the candidate set. The grouper then relocates these methods (along with their callees, which has been shown to improve performance on AArch64 due to better branch prediction) into the `HotCodeHeap` in descending order of hotness, continuing until the fraction of samples attributable to hot methods exceeds `HotCodeSampleRatio` (default 0.8). The process continues to ensure that the hot-method ratio remains above the threshold.
>> 
>> The `HotCodeHeap` is a new code heap segment with a default size of 20% of the non-profiled heap, though this can be overridden. This size was chosen based on the principle that roughly 20% of methods contribute to 80% of the work. Only C2-compiled nmethods are eligible for relocation, and the relocation process leverages existing infrastructure from [JDK-8316694](https://bugs.openjdk.org/browse/JDK-8316694).
>> 
>> Relocation occurs entirely on the grouper thread and runs concurrently with the application. To maintain correctness, the thread acquires the `CodeCache_lock` and `Compile_lock` during relocation but releases these locks between individual relocations to avoid blocking GC safepoints. Removal of nmethods from the `HotCodeHeap` is handled by the GC.
>> 
>> ### Performance
>> Testing has shown up to a 20% latency reduction in an internal service with a large CodeCache (512 MB). Public benchmark results are forthcoming.
>> 
>> ### Testing
>> * CodeCache tests have been updated to cover the new `HotCodeHeap`.  
>> * Added ded...
>
> Chad Rakoczy has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix builds

Hi, 

I've played around with this PR a bit and had a few thoughts.

The way the grouping is currently set up means that, if the program runs for long enough, profiled code keeps being added to the hot code heap even if it doesn't really meet the definition of hot. It also means that if the program changes "phase" and the hot code changes, the hot code heap might already be full, leaving no room to group the new hot code. Have you thought about adding some kind of refresh/reset when the hot code heap is full, to purge code that has not appeared in recent profiles?
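To make the purge idea a bit more concrete, here is a rough sketch of one possible aging policy. It is written in Java for brevity rather than HotSpot C++, and every name in it (`HotHeapAging`, `recordProfile`, `purgeCandidates`, `maxIdleProfiles`) is hypothetical and not taken from the PR. It only tracks the last profile in which each method was sampled and reports resident methods that have been absent for a given number of profiles.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: none of these names exist in the PR or in HotSpot.
final class HotHeapAging {
    // Method id -> index of the most recent profile in which the method was sampled.
    private final Map<String, Integer> lastSeen = new HashMap<>();
    private int profileIndex = 0;

    // Call once per completed sampling period with the methods that appeared in it.
    void recordProfile(Set<String> sampledMethods) {
        profileIndex++;
        for (String method : sampledMethods) {
            lastSeen.put(method, profileIndex);
        }
    }

    // Returns the resident methods that have not appeared in any of the last
    // maxIdleProfiles profiles. Methods never sampled count as last seen at index 0.
    List<String> purgeCandidates(Set<String> residentMethods, int maxIdleProfiles) {
        List<String> stale = new ArrayList<>();
        for (String method : residentMethods) {
            int seen = lastSeen.getOrDefault(method, 0);
            if (profileIndex - seen >= maxIdleProfiles) {
                stale.add(method);
            }
        }
        return stale;
    }
}
```

In this sketch the check would only need to run when the hot code heap is full, so a stable workload would never pay for it.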

Two other small configuration changes helped me try this out: adding a delay variable to avoid profiling the setup period of a program, and making the sampling period configurable.
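As a rough illustration of the delay, the sketch below shows how a start delay could shift the sampling windows. It is Java for brevity, not HotSpot code. `startDelaySeconds` is a hypothetical parameter; `intervalSeconds` and `sampleSeconds` stand in for `HotCodeIntervalSeconds` and `HotCodeSampleSeconds`. It assumes a new window opens every `intervalSeconds` once the delay has elapsed, which may not match how the PR actually schedules its sampling.

```java
// Hypothetical illustration only: the scheduling model here is an assumption.
final class SamplingSchedule {
    // Start time, in seconds since VM start, of the n-th sampling window (n >= 0).
    static long windowStartSeconds(long startDelaySeconds, long intervalSeconds, int n) {
        return startDelaySeconds + (long) n * intervalSeconds;
    }

    // True if elapsedSeconds since VM start falls inside some sampling window.
    static boolean inSampleWindow(long elapsedSeconds, long startDelaySeconds,
                                  long intervalSeconds, long sampleSeconds) {
        if (elapsedSeconds < startDelaySeconds) {
            return false; // Still in the setup period: do not sample.
        }
        long offset = (elapsedSeconds - startDelaySeconds) % intervalSeconds;
        return offset < sampleSeconds;
    }
}
```

With a 60s delay and the default 300s interval and 120s sample duration, nothing is sampled during the first minute, and later windows keep the same spacing.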

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27858#issuecomment-3759597826

