RFR: Refactor AOTCodeCache layout to store preload entries separately [v2]

Fri Sep 5 14:59:05 UTC 2025

> Currently the mechanism to lay out final AOTCodeCache entries in `AOTCodeCache::finish_write()` is a bit convoluted.
> Code data is initially written in a temporary buffer and then assembled in the final buffer in `AOTCodeCache::finish_write()`.
> 
> In the temporary buffer, AOTCodeEntry structs are added from the end of the buffer, and the payload (the actual compiled code) is added from the start of the buffer. That means the temporary buffer holds the AOTCodeEntry structs in reverse order.
> 
> ACE=AOTCodeEntry
> 
> | payload | ... | ACE[n] | ACE[n-1] | ... | ACE[0] |
> 
> 
> When assembling the final buffer, the AOTCodeEntry structs are first copied, in the correct order, into the free space of the temporary buffer:
> 
> 
> | payload | ...| ACE[0] | ACE[1] | ... | ACE[n] |... | ACE[n] | ACE[n-1] | ... | ACE[0] |
> 
> 
> and then the whole memory block is copied into the final buffer.
> This means the temporary buffer has to be somewhat larger than the data it holds, because it needs room for a second copy of the AOTCodeEntry structs.
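
A minimal sketch of the old two-ended temporary buffer, to make the extra space requirement concrete. The `Entry` struct, the `TempBuffer` class and all method names are hypothetical stand-ins, not the actual AOTCodeCache code: payload grows up from the start, entries grow down from the end, and the in-order copy only succeeds if the gap between them can hold a second set of entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for AOTCodeEntry; the real struct has more fields.
struct Entry {
  uint32_t id;
  uint32_t offset;  // payload offset from the start of the buffer
  uint32_t size;    // payload size in bytes
};

class TempBuffer {
 public:
  explicit TempBuffer(size_t capacity)
      : _buf(capacity), _entries_bottom(capacity) {}

  // Payload is appended at the low end, the entry is pushed at the high end.
  bool add(uint32_t id, const void* code, uint32_t size) {
    if (_payload_top + size + sizeof(Entry) > _entries_bottom) return false;
    Entry e{id, static_cast<uint32_t>(_payload_top), size};
    std::memcpy(_buf.data() + _payload_top, code, size);
    _payload_top += size;
    _entries_bottom -= sizeof(Entry);
    std::memcpy(_buf.data() + _entries_bottom, &e, sizeof(Entry));
    ++_count;
    return true;
  }

  // Slot i counted from the low address: ACE[n], ACE[n-1], ..., ACE[0].
  Entry stored(uint32_t i) const {
    return read(_entries_bottom + i * sizeof(Entry));
  }

  // Old-style fix-up: write ACE[0..n] in creation order right after the
  // payload, inside the same buffer. Fails when the gap is too small.
  bool copy_entries_in_order() {
    if (_payload_top + _count * sizeof(Entry) > _entries_bottom) return false;
    for (uint32_t i = 0; i < _count; i++) {
      Entry e = stored(_count - 1 - i);
      std::memcpy(_buf.data() + _payload_top + i * sizeof(Entry), &e,
                  sizeof(Entry));
    }
    return true;
  }

  // Entry i of the in-order copy; valid after copy_entries_in_order().
  Entry ordered(uint32_t i) const {
    return read(_payload_top + i * sizeof(Entry));
  }

 private:
  Entry read(size_t at) const {
    Entry e;
    std::memcpy(&e, _buf.data() + at, sizeof(Entry));
    return e;
  }

  std::vector<uint8_t> _buf;
  size_t _payload_top = 0;  // first free byte after the payload
  size_t _entries_bottom;   // lowest byte used by the reversed entries
  uint32_t _count = 0;
};
```

In this sketch a buffer that is just large enough for the payload and one set of entries accepts every `add()` call but then fails `copy_entries_in_order()`, which is the over-allocation the description refers to.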
> 
> Another issue is the search table created in `finish_write`. This table includes entries marked for preload. However, preload entries are never looked up; they get loaded at the start of the JVM in `preload_aot_code()`. Since the preload entries and the other code entries are mixed together, we also need a separate table to identify the preload entries.
> 
> This PR is an attempt to fix the above issues. It does the final assembly in the following steps:
> 1. Process AOTCodeEntry structs in the temporary buffer in reverse order and write the ones marked for preload in the final buffer
> 2. Next, add the payload for the preload entries
> 3. Next, add the AOTCodeEntry structs for non-preload code to the final buffer
> 4. Then add the payload for these entries
> 5. Finally add the search table
> 
> 
> | ACE[0] | ... | ACE[m] | payload | ACE[0] | ... | ACE[n] | payload | search_table |
> 
> 
> This layout separates the preload entries from the rest of the code, so these entries can be processed sequentially when the cache is loaded. There is no need for a separate table to identify the preload entries.
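
The assembly steps and the resulting layout can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the PR: `Entry`, `Pending`, `Header`, `assemble`, `preload_ids` and `find_code` are hypothetical names, the small `Header` is added here only so a reader can locate each region, and the input is assumed to be already in creation order (the PR gets that order by walking the temporary buffer in reverse).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

struct Entry {   // hypothetical stand-in for AOTCodeEntry
  uint32_t id, offset, size, preload;  // offset is within the final buffer
};
struct Pending { // one compiled method waiting to be laid out
  uint32_t id; bool preload; std::string code;
};
struct Header {  // assumption of this sketch, not part of the PR text
  uint32_t preload_count, other_count, other_entries_offset, search_table_offset;
};

static void append(std::vector<uint8_t>& out, const void* p, size_t n) {
  const uint8_t* b = static_cast<const uint8_t*>(p);
  out.insert(out.end(), b, b + n);
}

// Writes one group as | ACE[0] ... ACE[k] | payload | and returns its entries.
static std::vector<Entry> write_group(std::vector<uint8_t>& out,
                                      const std::vector<Pending>& items,
                                      bool preload) {
  std::vector<const Pending*> group;
  for (const Pending& p : items) if (p.preload == preload) group.push_back(&p);
  uint32_t payload_at =
      static_cast<uint32_t>(out.size() + group.size() * sizeof(Entry));
  std::vector<Entry> entries;
  for (const Pending* p : group) {
    uint32_t size = static_cast<uint32_t>(p->code.size());
    entries.push_back(Entry{p->id, payload_at, size, preload ? 1u : 0u});
    payload_at += size;
  }
  append(out, entries.data(), entries.size() * sizeof(Entry));
  for (const Pending* p : group) append(out, p->code.data(), p->code.size());
  return entries;
}

std::vector<uint8_t> assemble(const std::vector<Pending>& items) {
  std::vector<uint8_t> out(sizeof(Header));
  Header h{};
  h.preload_count = static_cast<uint32_t>(write_group(out, items, true).size());
  h.other_entries_offset = static_cast<uint32_t>(out.size());
  std::vector<Entry> rest = write_group(out, items, false);
  h.other_count = static_cast<uint32_t>(rest.size());
  h.search_table_offset = static_cast<uint32_t>(out.size());
  // Search table: (id, entry index) pairs sorted by id, non-preload only.
  std::vector<std::pair<uint32_t, uint32_t>> table;
  for (uint32_t i = 0; i < rest.size(); i++) table.push_back({rest[i].id, i});
  std::sort(table.begin(), table.end());
  for (const auto& t : table) { append(out, &t.first, 4); append(out, &t.second, 4); }
  std::memcpy(out.data(), &h, sizeof(Header));
  return out;
}

// Preload entries are walked sequentially; no lookup table is consulted.
std::vector<uint32_t> preload_ids(const std::vector<uint8_t>& buf) {
  Header h;
  std::memcpy(&h, buf.data(), sizeof(Header));
  std::vector<uint32_t> ids;
  for (uint32_t i = 0; i < h.preload_count; i++) {
    Entry e;
    std::memcpy(&e, buf.data() + sizeof(Header) + i * sizeof(Entry), sizeof(Entry));
    ids.push_back(e.id);
  }
  return ids;
}

// Non-preload code is found by binary search over the search table.
bool find_code(const std::vector<uint8_t>& buf, uint32_t id, std::string* code) {
  Header h;
  std::memcpy(&h, buf.data(), sizeof(Header));
  uint32_t lo = 0, hi = h.other_count;
  while (lo < hi) {
    uint32_t mid = lo + (hi - lo) / 2;
    uint32_t kv[2];  // kv[0] = id, kv[1] = index into the non-preload entries
    std::memcpy(kv, buf.data() + h.search_table_offset + mid * 8, 8);
    if (kv[0] == id) {
      Entry e;
      std::memcpy(&e, buf.data() + h.other_entries_offset + kv[1] * sizeof(Entry),
                  sizeof(Entry));
      code->assign(reinterpret_cast<const char*>(buf.data()) + e.offset, e.size);
      return true;
    }
    if (kv[0] < id) lo = mid + 1; else hi = mid;
  }
  return false;
}
```

In this sketch the search table covers only the non-preload entries, so a lookup for a preload id fails by construction, while the preload group is read front to back without any index.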
> 
> I have added the new functionality in separate methods suffixed with `_new` (e.g. `finish_write_new` and `preload_aot_code_new`), and they are guarded by the `UseNewCode` flag.
> 
> **Performance impact:**
> 
> Startup numbers for spring-boot-getting-started:
> 
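> | Run     | Old CDS + AOT | New CDS + AOT |
> |---------|---------------|---------------|
> | 1       | 263           | 275           |
> | 2       | 265           | 278           |
> | 3       | 266           | 272           |
> | 4       | 277           | 271           |
> | 5       | 265           | 265           |
> | 6       | 264           | 261           |
> | 7       | 266           | 263           |
> | 8       | 258           | 266           |
> | 9       | 275           | 268           |
> | 10      | 277           | 263           |
> | Geomean | 267.53        | 268.15        |
> | Stdev   | 6.14          | 5.34          |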
> 
> 
> AOTCache size comparison:
> 
> -XX:-UseNewCode: 65613824 bytes
> -XX:+UseNewCode: 65597440 bytes...

Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision:

  Remove UseNewCode and older version of functions

  Signed-off-by: Ashutosh Mehra <asmehra at redhat.com>

-------------

Changes:
  - all: https://git.openjdk.org/leyden/pull/95/files
  - new: https://git.openjdk.org/leyden/pull/95/files/0e76a2d3..f1e95d7b

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=leyden&pr=95&range=01
 - incr: https://webrevs.openjdk.org/?repo=leyden&pr=95&range=00-01

  Stats: 293 lines in 2 files changed: 0 ins; 263 del; 30 mod
  Patch: https://git.openjdk.org/leyden/pull/95.diff
  Fetch: git fetch https://git.openjdk.org/leyden.git pull/95/head:pull/95

PR: https://git.openjdk.org/leyden/pull/95