RFR: 8343789: Move mutable nmethod data out of CodeCache [v11]
Dean Long
dlong at openjdk.org
Thu Feb 20 22:15:02 UTC 2025
On Tue, 18 Feb 2025 19:23:59 GMT, Boris Ulasevich <bulasevich at openjdk.org> wrote:
>> This change relocates mutable data (such as relocations, oops, and metadata) from the nmethod. The change follows the recent PR #18984, which relocated immutable nmethod data from the CodeCache.
>>
>> The core idea remains the same: use the CodeCache for executable code while moving additional data to the C heap. The primary motivations are improving security and enhancing code density.
>>
>> Although performance is not the main focus, testing on AArch64 CPUs, where code density plays a significant role, has shown a 1–2% performance improvement in specific scenarios, such as the CodeCacheStress test and the Renaissance Dotty benchmark.
>>
>> The numbers: immutable data constitutes **~30%** of the nmethod, and mutable data constitutes **~8%** of the nmethod. Example (statistics collected on the CodeCacheStress benchmark):
>> - nmethod_count:134000, total_compilation_time: 510460ms
>> - total allocation time malloc_mutable/malloc_immutable/CodeCache_alloc: 62ms/114ms/6333ms,
>> - total allocation size (mutable/immutable/nmethod): 64MB/192MB/488MB
>>
>> Functional testing: jtreg on arm/aarch/x86.
>> Performance testing: renaissance/dacapo/SPECjvm2008 benchmarks.
>>
>> Alternative solution (see comments): In the future, relocations can be moved to _immutable_data.
>
> Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits:
>
> - Address review comments: cleanup, move fields to avoid padding, fix CodeBlob purge to call os::free, fix nmethod::print, update Layout description
> - add a separate adrp_movk function to support targets located more than 4GB away
> - Force the use of movk in combination with adrp and ldr instructions to address scenarios
> where os::malloc allocates buffers beyond the typical ±4GB range accessible with adrp
> - Fixing TestFindInstMemRecursion test failure with the -XX:+StressReflectiveCode option:
> _relocation_size can exceed 64Kb, in which case _metadata_offset does not fit into int16.
> Fix: use the _oops_size int16 field to calculate the metadata offset
> - removing dead code
> - a bit of cleanup and addressing review suggestions
> - rework movoop for not_supports_instruction_patching case: correcting in ldr_constant and relocations fixup
> - remove _code_end_offset
> - update jvm.hotspot.code.CodeBlob class
> - update: mutable data for all CodeBlobs with relocations
> - ... and 2 more: https://git.openjdk.org/jdk/compare/e1d0a9c8...6c3370be
Also, it seems like there are two kinds of code density we should be concerned about:
1. not polluting icache lines with data
2. maximizing near calls in the codecache
For 1), aligning embedded data on cache line boundaries would help, but for 2) we probably would want to put any nearby DataBlobs in their own codecache segment.
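To make the two concerns concrete, here is a rough sketch, not HotSpot code. The names `align_to_cache_line` and `is_near_call` are hypothetical, and the 64-byte line size is an assumption that varies by CPU. The ±128 MiB figure is the reach of an AArch64 `B`/`BL` instruction.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed icache line size; real hardware may differ (hypothetical constant).
constexpr std::size_t kCacheLineSize = 64;
// Reach of an AArch64 B/BL instruction: a signed 26-bit word offset, i.e. +/-128 MiB.
constexpr std::int64_t kNearBranchRange = 128LL * 1024 * 1024;

// Concern 1: round an offset up to the next cache-line boundary so that
// embedded data starts on its own line instead of sharing one with code.
constexpr std::size_t align_to_cache_line(std::size_t offset) {
  return (offset + kCacheLineSize - 1) & ~(kCacheLineSize - 1);
}

// Concern 2: true if a direct branch located at `from` can reach `to`
// without a far-call stub or trampoline.
constexpr bool is_near_call(std::uint64_t from, std::uint64_t to) {
  const std::int64_t delta = static_cast<std::int64_t>(to - from);
  return delta >= -kNearBranchRange && delta < kNearBranchRange;
}
```

Every non-code blob placed between two nmethods increases the distance between them, so fewer call sites would pass a check like `is_near_call`. That is the reason for suggesting a separate segment for such blobs.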
-------------
PR Comment: https://git.openjdk.org/jdk/pull/21276#issuecomment-2672805552