RFR: 8339573: Update CodeCacheSegmentSize and CodeEntryAlignment for ARM

Igor Veresov iveresov at openjdk.org
Mon Sep 9 18:08:05 UTC 2024


On Thu, 5 Sep 2024 00:58:10 GMT, Boris Ulasevich <bulasevich at openjdk.org> wrote:

> With this change, I have adjusted the default settings for CodeCacheSegmentSize and CodeEntryAlignment for AArch64 and ARM32. The main goal is to improve code density by reducing the number of wasted bytes (approximately **4%** waste). Improving code density may also have the side effect of boosting performance in large applications.
> 
> Each nmethod occupies a number of code cache segments (minimum allocation blocks). Since the size of an nmethod is not aligned to 128 bytes, the last occupied segment is, on average, half empty. Reducing the size of the code cache segments reduces this waste correspondingly. However, we should be careful about reducing the CodeCacheSegmentSize too much, as smaller segment sizes will increase the overhead of the CodeHeap::_segmap bitmap. A CodeCacheSegmentSize of 64 seems to be an optimal balance.
> 
> The current large default value for CodeCacheSegmentSize (64+64) was historically introduced with the comment "Tiered compilation has large code-entry alignment" which doesn't make much sense to me. The history of this comment and value is as follows:
> - The PPC port was introduced with CodeEntryAlignment=128 (recently reduced to 64: https://github.com/openjdk/jdk/commit/09a78b5d) and CodeCacheSegmentSize was adjusted accordingly for that platform.
> - Soon after, the 128-byte alignment was applied to all platforms to hide a debug mode warning (https://github.com/openjdk/jdk/commit/e8bc971d). Despite the change (and Segmented Code Cache introduced later), the warning can still be reproduced today using the -XX:+VerifyCodeCache fastdebug option in large applications (10K nmethods ~ 10K free blocks in between them).
> 
> I believe it is time to remove the comment and update the default value.
> 
> I also suggest updating the default CodeEntryAlignment value for AArch64. The current setting is much larger than for x86 and was likely based on the typical cache line size of 64 bytes. The Cortex-A57 and Cortex-A72 software optimisation guides recommend a 32-byte alignment for subroutine entry points. The Neoverse software optimisation guides do not mention a recommended entry point alignment.
> 
> For reference, the default [function_align setting in GCC](https://github.com/gcc-mirror/gcc/blob/master/gcc/config/aarch64/tuning_models/cortexa72.h#L44) is typically 16 or 32 bytes, depending on the target architecture.
> 
> Hotspot performance tests with -XX:CodeCacheSegmentSize=64 and -XX:CodeEntryAlignment=16 options showed the following results:
> - No performance impact on ...
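
The segment arithmetic in the quoted description can be sketched roughly as follows. This is only an illustration, not HotSpot code: the function names, the sample nmethod sizes and the 48 MB heap size are all hypothetical, and block headers are ignored. It shows the trade-off described above: a smaller segment size leaves fewer unused bytes in the last segment of each nmethod, but needs more segment-map entries for the same heap.

```python
# Illustrative sketch only; names and numbers are hypothetical, not from HotSpot.

def segments_needed(size_bytes: int, segment_size: int) -> int:
    """Number of whole segments required to hold size_bytes (rounded up)."""
    return (size_bytes + segment_size - 1) // segment_size


def wasted_bytes(size_bytes: int, segment_size: int) -> int:
    """Unused bytes in the last segment occupied by one allocation."""
    return segments_needed(size_bytes, segment_size) * segment_size - size_bytes


def segmap_entries(heap_bytes: int, segment_size: int) -> int:
    """Number of segments the map has to track for a heap of heap_bytes."""
    return heap_bytes // segment_size


# Hypothetical nmethod sizes in bytes, chosen only to show the arithmetic.
SAMPLE_SIZES = [200, 1000, 4100]
# Hypothetical code heap size.
HEAP_BYTES = 48 * 1024 * 1024

for seg in (128, 64):
    waste = sum(wasted_bytes(s, seg) for s in SAMPLE_SIZES)
    entries = segmap_entries(HEAP_BYTES, seg)
    print(f"segment={seg}: waste={waste} bytes, segmap entries={entries}")
```

Halving the segment size doubles the number of map entries, which is why the description warns against going much below 64.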

I don't quite remember making this change... And I don't remember any reasons as to why it might have been needed.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20864#issuecomment-2338766859

