RFR: 8280481: Duplicated static stubs in NMethod Stub Code section [v7]
Evgeny Astigeevich
duke at openjdk.java.net
Fri May 20 16:32:02 UTC 2022
On Fri, 20 May 2022 16:20:24 GMT, Evgeny Astigeevich <duke at openjdk.java.net> wrote:
>> Calls of Java methods have stubs to the interpreter for the cases when an invoked Java method is not compiled. Calls of static Java methods and final Java methods have statically bound information about a callee during compilation. Such calls can share stubs to the interpreter.
>>
>> Each stub to the interpreter has a relocation record (accessed via `relocInfo`) which provides the address of the stub and the address of its owner. A `relocInfo` stores an offset from the previously known relocatable address. The address of a stub is calculated as the address provided by the previous `relocInfo` plus the offset.
>>
>> Each Java call has:
>> - A relocation for a call site.
>> - A relocation for a stub to the interpreter.
>> - A stub to the interpreter.
>> - If far jumps are used (arm64 case):
>>   - A trampoline relocation.
>>   - A trampoline.
>>
>> We cannot avoid creating relocations. They are needed to support patching call sites and stubs.
>>
>> One approach to creating shared stubs is to keep track of the stubs already created. If the needed stub exists, we use its address and create only the needed relocation information. The `relocInfo` for a newly created stub will have a positive offset. As relocations for different stubs can be created after that, a relocation for a shared stub will have a negative offset relative to the address provided by the previous relocation:
>>
>> reloc1 ---> 0x0: stub1
>> reloc2 ---> 0x4: stub2 (reloc2.addr = reloc1.addr + reloc2.offset = 0x0 + 4)
>> reloc3 ---> 0x0: stub1 (reloc3.addr = reloc2.addr + reloc3.offset = 0x4 - 4)
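
The arithmetic in the three lines above can be reproduced with a small standalone model. This is only a sketch: `RelocDelta`, `resolve_addresses` and `has_negative_offset` are hypothetical names invented for illustration and are not HotSpot types or functions. Each record keeps only a signed distance from the address produced by the previous record.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a relocation record: it stores only
// the signed distance from the address produced by the previous record.
struct RelocDelta {
    std::int32_t offset;
};

// Rebuild absolute addresses by accumulating the deltas from a start address.
std::vector<std::int64_t> resolve_addresses(std::int64_t start,
                                            const std::vector<RelocDelta>& relocs) {
    std::vector<std::int64_t> addrs;
    std::int64_t current = start;
    for (const RelocDelta& r : relocs) {
        current += r.offset;
        addrs.push_back(current);
    }
    return addrs;
}

// True if any record would have to move the address backwards.
bool has_negative_offset(const std::vector<RelocDelta>& relocs) {
    for (const RelocDelta& r : relocs) {
        if (r.offset < 0) {
            return true;
        }
    }
    return false;
}
```

Feeding it the deltas `{0, +4, -4}` with a start address of `0x0` gives the addresses `0x0`, `0x4`, `0x0`, and `has_negative_offset` reports the backwards step needed to reuse `stub1`.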
>>
>> According to [relocInfo.hpp](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/code/relocInfo.hpp#L237):
>>
>> // [About Offsets] Relative offsets are supplied to this module as
>> // positive byte offsets, but they may be internally stored scaled
>> // and/or negated, depending on what is most compact for the target
>> // system. Since the object pointed to by the offset typically
>> // precedes the relocation address, it is profitable to store
>> // these negative offsets as positive numbers, but this decision
>> // is internal to the relocation information abstractions.
>>
>> However, `CodeSection` does not support negative offsets. It [assumes](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/asm/codeBuffer.hpp#L195) that the addresses relocations point at grow upward:
>>
>>     class CodeSection {
>>       ...
>>      private:
>>       ...
>>       address _locs_point; // last relocated position (grows upward)
>>       ...
>>       void set_locs_point(address pc) {
>>         assert(pc >= locs_point(), "relocation addr may not decrease");
>>         assert(allocates2(pc), "relocation addr must be in this section");
>>         _locs_point = pc;
>>       }
>>
>> Negative offsets reduce the offset range by half. This can increase the number of filler records, the empty `relocInfo` records used to reduce offset values. Also, negative offsets are only needed for `static_stub_type`; the other 13 types don’t need them.
>>
>> This PR implements another approach: postponed creation of stubs. First we collect requests for creating shared stubs. Then we have the finalisation phase, where shared stubs are created in `CodeBuffer`. This approach does not need negative offsets. Supported platforms are x86, x86_64 and aarch64.
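
The two phases described above (collect requests, then finalise) can be sketched as follows. This is a toy model under stated assumptions, not the code in the PR: `SharedStubCollector`, `EmittedReloc`, `addresses_never_decrease`, the string callee keys and the fixed `stub_size` are all hypothetical. Call sites that target the same callee are grouped so each callee gets exactly one stub, and relocations are emitted in stub order so the relocated address never moves backwards.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical result record: which call site is bound to which stub address.
struct EmittedReloc {
    int call_offset;
    int stub_addr;
};

// Hypothetical collector modelling "postponed creation of stubs".
class SharedStubCollector {
public:
    // Phase 1: only remember that a call site needs a stub for this callee.
    void request(const std::string& callee, int call_offset) {
        pending_[callee].push_back(call_offset);
    }

    // Phase 2: emit one stub per callee, followed by the relocations of all
    // call sites sharing it. Stub addresses are handed out in increasing order.
    std::vector<EmittedReloc> finalize(int stub_section_start, int stub_size) {
        std::vector<EmittedReloc> relocs;
        int next_stub = stub_section_start;
        for (const auto& entry : pending_) {
            for (int call_offset : entry.second) {
                relocs.push_back({call_offset, next_stub});
            }
            next_stub += stub_size;
        }
        pending_.clear();
        return relocs;
    }

private:
    // Grouped by callee; std::map iterates in key order, which is enough here.
    std::map<std::string, std::vector<int>> pending_;
};

// Checks the property the upward-growing relocation point relies on.
bool addresses_never_decrease(const std::vector<EmittedReloc>& relocs) {
    for (std::size_t i = 1; i < relocs.size(); ++i) {
        if (relocs[i].stub_addr < relocs[i - 1].stub_addr) {
            return false;
        }
    }
    return true;
}
```

With requests for `A.foo`, `B.bar` and `A.foo` again, `finalize(0x100, 4)` binds both `A.foo` call sites to the stub at `0x100` and the `B.bar` call site to `0x104`, so no relocation needs a negative offset.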
>>
>> There is a new diagnostic option: `UseSharedStubs`. Its default value is true on x86, x86_64 and aarch64.
>>
>> **Results from [Renaissance 0.14.0](https://github.com/renaissance-benchmarks/renaissance/releases/tag/v0.14.0)**
>> Note: 'Nmethods with shared stubs' is the total number of such nmethods counted during a benchmark's run. 'Final # of nmethods' is the number of nmethods in the CodeCache when the JVM exited.
>> - AArch64
>>
>> +------------------+-------------+----------------------------+---------------------+
>> | Benchmark | Saved bytes | Nmethods with shared stubs | Final # of nmethods |
>> +------------------+-------------+----------------------------+---------------------+
>> | dotty | 820544 | 4592 | 18872 |
>> | dec-tree | 405280 | 2580 | 22335 |
>> | naive-bayes | 392384 | 2586 | 21184 |
>> | log-regression | 362208 | 2450 | 20325 |
>> | als | 306048 | 2226 | 18161 |
>> | finagle-chirper | 262304 | 2087 | 12675 |
>> | movie-lens | 250112 | 1937 | 13617 |
>> | gauss-mix | 173792 | 1262 | 10304 |
>> | finagle-http | 164320 | 1392 | 11269 |
>> | page-rank | 155424 | 1175 | 10330 |
>> | chi-square | 140384 | 1028 | 9480 |
>> | akka-uct | 115136 | 541 | 3941 |
>> | reactors | 43264 | 335 | 2503 |
>> | scala-stm-bench7 | 42656 | 326 | 3310 |
>> | philosophers | 36576 | 256 | 2902 |
>> | scala-doku | 35008 | 231 | 2695 |
>> | rx-scrabble | 32416 | 273 | 2789 |
>> | future-genetic | 29408 | 260 | 2339 |
>> | scrabble | 27968 | 225 | 2477 |
>> | par-mnemonics | 19584 | 168 | 1689 |
>> | fj-kmeans | 19296 | 156 | 1647 |
>> | scala-kmeans | 18080 | 140 | 1629 |
>> | mnemonics | 17408 | 143 | 1512 |
>> +------------------+-------------+----------------------------+---------------------+
>>
>> - X86_64
>>
>> +------------------+-------------+----------------------------+---------------------+
>> | Benchmark | Saved bytes | Nmethods with shared stubs | Final # of nmethods |
>> +------------------+-------------+----------------------------+---------------------+
>> | dotty | 337065 | 4403 | 19135 |
>> | dec-tree | 183045 | 2559 | 22071 |
>> | naive-bayes | 176460 | 2450 | 19782 |
>> | log-regression | 162555 | 2410 | 20648 |
>> | als | 121275 | 1980 | 17179 |
>> | movie-lens | 111915 | 1842 | 13020 |
>> | finagle-chirper | 106350 | 1947 | 12726 |
>> | gauss-mix | 81975 | 1251 | 10474 |
>> | finagle-http | 80895 | 1523 | 12294 |
>> | page-rank | 68940 | 1146 | 10124 |
>> | chi-square | 62130 | 974 | 9315 |
>> | akka-uct | 50220 | 555 | 4263 |
>> | reactors | 23385 | 371 | 2544 |
>> | philosophers | 17625 | 259 | 2865 |
>> | scala-stm-bench7 | 17235 | 295 | 3230 |
>> | scala-doku | 15600 | 214 | 2698 |
>> | rx-scrabble | 14190 | 262 | 2770 |
>> | future-genetic | 13155 | 253 | 2318 |
>> | scrabble | 12300 | 217 | 2352 |
>> | fj-kmeans | 8985 | 157 | 1616 |
>> | par-mnemonics | 8535 | 155 | 1684 |
>> | scala-kmeans | 8250 | 138 | 1624 |
>> | mnemonics | 7485 | 134 | 1522 |
>> +------------------+-------------+----------------------------+---------------------+
>>
>>
>> **Testing: fastdebug and release builds for x86, x86_64 and aarch64**
>> - `tier1`...`tier4`: Passed
>> - `hotspot/jtreg/compiler/sharedstubs`: Passed
>
> Evgeny Astigeevich has updated the pull request incrementally with 522 additional commits since the last revision:
>
> - Remove non-existing option
> - Use call offset instead of caller pc
> - Add UseSharedStubs option
> - Add a test and implementation fixes
> - 8280481: Duplicated static stubs in NMethod Stub Code section
> - 8286858: Remove dead code in sun.reflect.misc.MethodUtil
>
> Reviewed-by: mchung, iris
> - 8285962: NimbusDefaults has a typo in a L&F property
>
> Reviewed-by: prr
> - 8287013: StringConcatFactory: remove short and byte mixers/prependers
>
> Reviewed-by: jlaskey
> - 8286893: G1: Recent card set coarsening statistics wrong
>
> Reviewed-by: tschatzl, ayang
> - 8286943: G1: With virtualized remembered sets, maximum number of cards configured is wrong
>
> Reviewed-by: ayang, iwalulya
> - ... and 512 more: https://git.openjdk.java.net/jdk/compare/718f3b05...3db0f157
Create a new PR after synchronizing with tip: https://github.com/openjdk/jdk/pull/8816
-------------
PR: https://git.openjdk.java.net/jdk/pull/8024