RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches [v2]
Mikhail Ablakatov
mablakatov at openjdk.org
Wed Jun 11 15:37:47 UTC 2025

> In the A64 ISA, the B (direct branch) instruction can encode a target within a ±128MB range relative to the instruction. Due to this limitation, when generating static stubs, HotSpot conservatively emits indirect branches for calls to c2i interface stubs. These indirect branches are implemented using a four-instruction sequence: three instructions to materialize the target address in a register, followed by a BR instruction to perform the jump.
> 
> This patch optimizes static stub generation when the code cache is small enough to guarantee that the target entry point of the c2i interface stub lies within the direct branch range. In such cases, a single direct B instruction can be used instead of the indirect sequence, saving 3 instructions (12 bytes) per static stub.
> 
> Below is an example of the optimization's impact, measured using the movie-lens benchmark from the Renaissance benchmark suite:
> 
> | Metric      | Before        | After         | Difference |
> |-------------|---------------|---------------|------------|
> | totalInHeap | Avg: 1883.875 | Avg: 1871.667 | -0.65%     |
> |             | Sum: 6653848  | Sum: 6616344  | -0.56%     |
> | stubCode    | Avg: 103.164  | Avg: 87.285   | -15.38%    |
> |             | Sum: 364376   | Sum: 308552   | -15.33%    |
> 
> Full jtreg passed on AArch64.
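
The range argument in the description can be sketched in a few lines of C++. This is only an illustration, not HotSpot code: the names `reachable_by_direct_branch`, `code_cache_fits_branch_range` and `bytes_saved_per_stub` are hypothetical, and the sketch assumes that both the static stub and its branch target live inside one contiguous code cache. The B instruction carries a signed 26-bit word offset, which gives a byte range of [-2^27, 2^27 - 4].

```cpp
#include <cstdint>

// B encodes a signed 26-bit offset counted in 4-byte words, so the
// reachable byte offsets are [-2^27, 2^27 - 4] (about +/-128 MiB).
constexpr int64_t kBranchRange = int64_t{1} << 27;

// True if a B placed at address `from` can reach address `to`.
// Both addresses are expected to be 4-byte aligned.
constexpr bool reachable_by_direct_branch(uint64_t from, uint64_t to) {
  return static_cast<int64_t>(to - from) >= -kBranchRange &&
         static_cast<int64_t>(to - from) <   kBranchRange &&
         ((to - from) & 3) == 0;
}

// Conservative whole-cache check (assumption: branch and target are
// both inside one contiguous code cache). If the cache is no larger
// than 2^27 bytes, any two instructions in it are within B range.
constexpr bool code_cache_fits_branch_range(uint64_t code_cache_size) {
  return code_cache_size <= static_cast<uint64_t>(kBranchRange);
}

// Size accounting from the description: the indirect form is three
// instructions to materialize the address plus BR, the direct form is
// a single B, and every A64 instruction is 4 bytes long.
constexpr int kInsnBytes     = 4;
constexpr int kIndirectInsns = 4;
constexpr int kDirectInsns   = 1;

constexpr int bytes_saved_per_stub() {
  return (kIndirectInsns - kDirectInsns) * kInsnBytes;
}
```

Under these assumptions `bytes_saved_per_stub()` evaluates to 12, matching the 3 instructions (12 bytes) quoted above. A whole-cache check of this kind is deliberately conservative: it can be evaluated once, without knowing where an individual stub or its target will end up.
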
Mikhail Ablakatov has updated the pull request incrementally with one additional commit since the last revision:
  address review comments: use pd_patch_instruction directly
  
  MacroAssembler::pd_patch_instruction can distinguish between a single
  `b` and the `movz movk movk br` sequence. Strictly speaking, the method
  patches not a single instruction but a semantically linked sequence of
  instructions. Use it directly instead of the `NativeJump` and
  `NativeGeneralJump` wrapper classes to simplify the implementation and
  get rid of an extra icache invalidation.
  
  Other changes in the patch simply clean up code that became redundant.
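
  To illustrate how a patcher can tell the two stub shapes apart by
  looking at the first instruction word, here is a minimal C++ sketch.
  It is not the HotSpot implementation of pd_patch_instruction: the
  names `is_b`, `is_movz_x`, `classify`, `encode_b` and `decode_b_offset`
  are hypothetical, and the sketch assumes the address-materializing
  sequence begins with a 64-bit MOVZ. Only the A64 opcode masks are
  taken from the instruction set encoding.

```cpp
#include <cstdint>

// B: bits 31..26 are 000101, bits 25..0 hold the signed word offset.
constexpr bool is_b(uint32_t insn) {
  return (insn & 0xFC000000u) == 0x14000000u;
}

// 64-bit MOVZ: bits 31..23 are 1 10 100101.
constexpr bool is_movz_x(uint32_t insn) {
  return (insn & 0xFF800000u) == 0xD2800000u;
}

// Hypothetical classification of a stub's branch by its first word.
enum class StubBranchKind { Direct, Indirect, Unknown };

constexpr StubBranchKind classify(uint32_t first_insn) {
  return is_b(first_insn)      ? StubBranchKind::Direct
       : is_movz_x(first_insn) ? StubBranchKind::Indirect
                               : StubBranchKind::Unknown;
}

// Encode a B with the given byte offset. The caller is expected to
// pass a 4-byte aligned offset within [-2^27, 2^27 - 4].
constexpr uint32_t encode_b(int64_t byte_offset) {
  return 0x14000000u |
         (static_cast<uint32_t>(byte_offset >> 2) & 0x03FFFFFFu);
}

// Recover the byte offset from a B: sign-extend imm26, then scale by 4.
constexpr int64_t decode_b_offset(uint32_t insn) {
  return ((static_cast<int64_t>(insn & 0x03FFFFFFu) ^ 0x02000000)
          - 0x02000000) * 4;
}
```

  With one entry point that dispatches on the instruction kind, the
  caller does not need to know which form the stub was emitted in.
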
-------------
Changes:
  - all: https://git.openjdk.org/jdk/pull/25702/files
  - new: https://git.openjdk.org/jdk/pull/25702/files/a904f1c1..7ef1c4ae
Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25702&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25702&range=00-01
  Stats: 28 lines in 3 files changed: 0 ins; 25 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/25702.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25702/head:pull/25702
PR: https://git.openjdk.org/jdk/pull/25702