RFR: 8343789: Move mutable nmethod data out of CodeCache [v13]
Boris Ulasevich
bulasevich at openjdk.org
Mon Mar 3 20:11:59 UTC 2025
On Thu, 27 Feb 2025 14:31:31 GMT, Boris Ulasevich <bulasevich at openjdk.org> wrote:
>> This change relocates mutable data (such as relocations, metadata, and JVMCI data) out of the nmethod. It follows the recent PR #18984, which relocated immutable nmethod data out of the CodeCache.
>>
>> OOPs were initially moved to a new mutable data blob, but then moved back to the nmethod due to performance issues in DaCapo benchmarks on AArch64 with ShenandoahGC (why Shenandoah: it is the only GC with supports_instruction_patching=false, so compiled code has to load oops from the oops table, which takes three instructions when the data is remote).
>>
>> Although performance is not the main focus, testing on AArch64 CPUs, where code density plays a significant role, has shown a 1–2% performance improvement in specific scenarios, such as the CodeCacheStress test and the Renaissance Dotty benchmark.
>>
>> The numbers: immutable data constitutes **~30%** of the nmethod; mutable data constitutes **~8%** of the nmethod. Example (statistics collected on the CodeCacheStress benchmark):
>> - nmethod_count:134000, total_compilation_time: 510460ms
>> - total allocation time malloc_mutable/malloc_immutable/CodeCache_alloc: 62ms/114ms/6333ms,
>> - total allocation size (mutable/immutable/nmethod): 64MB/192MB/488MB
>>
>> Functional testing: jtreg on arm/aarch/x86.
>> Performance testing: renaissance/dacapo/SPECjvm2008 benchmarks.
>>
>> Alternative solution (see comments): In the future, relocations can be moved to _immutable_data.
>
> Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 14 commits:
>
> - cleanup
> - returning oops back to nmethods. jtreg: Ok, performance: Ok. todo: cleanup
> - Address review comments: cleanup, move fields to avoid padding, fix CodeBlob purge to call os::free, fix nmethod::print, update Layout description
> - add a separate adrp_movk function to support targets located more than 4GB away
> - Force the use of movk in combination with adrp and ldr instructions to address scenarios
> where os::malloc allocates buffers beyond the typical ±4GB range accessible with adrp
> - Fixing TestFindInstMemRecursion test failure with the -XX:+StressReflectiveCode option:
> _relocation_size can exceed 64KB; in this case _metadata_offset does not fit into int16.
> Fix: use the _oops_size int16 field to calculate the metadata offset
> - removing dead code
> - a bit of cleanup and addressing review suggestions
> - rework movoop for not_supports_instruction_patching case: correcting in ldr_constant and relocations fixup
> - remove _code_end_offset
> - ... and 4 more: https://git.openjdk.org/jdk/compare/3c9d64eb...56c0cc78
As agreed, I moved oops back to the nmethod, significantly reducing the change. All AArch64-specific modifications (the long-load encoding with adrp+movk+ldr and its relocation patching) were reverted.
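For readers unfamiliar with the AArch64 detail here: adrp can only form a page address within roughly ±4GB of the instruction, so oops kept inside the nmethod stay within adrp's reach of the code, while os::malloc'ed mutable data may not, which is why the longer adrp+movk+ldr sequence mentioned above was needed. Below is a minimal standalone sketch of that reachability check (illustrative only, not the HotSpot implementation; the addresses in main are made up):

```c++
// Sketch of the reachability question behind the reverted adrp+movk+ldr change:
// adrp forms a page address from a signed 21-bit page offset, i.e. roughly
// +/-4GB around the instruction. Targets returned by os::malloc can fall
// outside that window, which is why remote data needed a third instruction.
#include <cstdint>
#include <cstdio>

static bool adrp_reachable(uint64_t insn_addr, uint64_t target_addr) {
  // Compare 4KB page numbers; adrp's immediate is a signed 21-bit page delta.
  int64_t page_delta = (int64_t)(target_addr >> 12) - (int64_t)(insn_addr >> 12);
  return page_delta >= -(INT64_C(1) << 20) && page_delta < (INT64_C(1) << 20);
}

int main() {
  uint64_t code = 0x0000007f80000000;  // hypothetical CodeCache address
  std::printf("%d\n", adrp_reachable(code, code + (1ULL << 30)));  // 1: within +/-4GB
  std::printf("%d\n", adrp_reachable(code, code + (8ULL << 30)));  // 0: beyond adrp range
  return 0;
}
```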
Testing results:
- Builds: AArch & x86, client build, GraalVM build
- jtreg (hotspot & jdk tier1-3): G1/ZGC/Shenandoah/Xcomp/TieredStopAtLevel=3/-TieredCompilation - No regressions
- DaCapo & Renaissance benchmarks - No regressions
Here is the PrintNMethodStatistics printout. It shows a significant reduction in CodeCache usage for a large application (the Renaissance Dotty benchmark).
Statistics for 20625 bytecoded nmethods for C1:
total size = 121587728 (100%)
in CodeCache = 80406760 (66.130653%)
header = 4950000 (6.156199%)
constants = 640 (0.000796%)
main code = 69890600 (86.921303%)
stub code = 4923768 (6.123575%)
oops = 476752 (0.592925%)
mutable data = 10163920 (8.359330%)
relocation = 6810824 (67.009811%)
metadata = 3353096 (32.990185%)
immutable data = 31017048 (25.510014%)
dependencies = 606216 (1.954461%)
nul chk table = 724344 (2.335309%)
handler table = 222464 (0.717231%)
scopes pcs = 15817888 (50.997398%)
scopes data = 13646136 (43.995602%)
Statistics for 8290 bytecoded nmethods for C2 | Statistics for 8442 bytecoded nmethods for JVMCI
total size = 66679688 (100%) | total size = 46208136 (100%)
in CodeCache = 26004920 (38.999763%) | in CodeCache = 19489616 (42.177887%)
header = 1989600 (7.650860%) | header = 2026080 (10.395690%)
constants = 1920 (0.007383%) | constants = 540288 (2.772184%)
main code = 20949456 (80.559586%) | main code = 14737620 (75.617805%)
stub code = 2702064 (10.390588%) | stub code = 1904548 (9.772117%)
oops = 295560 (1.136554%) | oops = 213544 (1.095681%)
mutable data = 6564928 (9.845469%) | mutable data = 4168848 (9.021892%)
relocation = 3542736 (53.964584%) | relocation = 1671384 (40.092228%)
                                  | JVMCI data = 202608 (4.860048%)
metadata = 3022192 (46.035416%) | metadata = 2294856 (55.047726%)
immutable data = 34109840 (51.154766%) | immutable data = 22549672 (48.800220%)
dependencies = 988000 (2.896525%) | dependencies = 460104 (2.040402%)
nul chk table = 554680 (1.626158%) | nul chk table = 618888 (2.744554%)
handler table = 1787424 (5.240201%) | handler table = 20664 (0.091638%)
scopes pcs = 16152224 (47.353561%) | scopes pcs = 10965040 (48.626163%)
scopes data = 14627512 (42.883556%) | scopes data = 7746888 (34.354771%)
                                    | speculations = 2738088 (12.142474%)
By moving mutable data out of the CodeCache, we reduce CodeCache usage by the following percentages:
- C1: 10163920/(10163920+80406760) = 11%
- C2: 6564928/(6564928+26004920) = 20%
- JVMCI: 4168848/(4168848+19489616) = 18%
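For clarity, those reductions are computed as mutable data divided by (mutable data + what remains in the CodeCache), using the byte counts from the statistics above. A tiny C++ check of the arithmetic (illustrative only):

```c++
#include <cstdio>

int main() {
  // Sizes in bytes, copied from the PrintNMethodStatistics output above.
  // Mutable data now lives outside the CodeCache, so the relative reduction is
  // mutable / (mutable + in CodeCache).
  struct Row { const char* name; double mutable_data; double in_code_cache; };
  const Row rows[] = {
      {"C1",    10163920.0, 80406760.0},
      {"C2",     6564928.0, 26004920.0},
      {"JVMCI",  4168848.0, 19489616.0},
  };
  for (const Row& r : rows) {
    double reduction = r.mutable_data / (r.mutable_data + r.in_code_cache);
    std::printf("%-6s %.0f%%\n", r.name, reduction * 100.0);  // prints ~11%, ~20%, ~18%
  }
  return 0;
}
```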
-------------
PR Comment: https://git.openjdk.org/jdk/pull/21276#issuecomment-2695425491
PR Comment: https://git.openjdk.org/jdk/pull/21276#issuecomment-2695429860