RFC: improving NMethod code locality in CodeCache

Astigeevich, Evgeny eastig at amazon.co.uk
Tue Nov 23 17:34:44 UTC 2021


Hello,
 
We’d like to discuss a proposal for improving NMethod code locality in CodeCache.

We have seen cases where the CodeCache contains more than 15,000 compiled methods, and in these cases performance suffers. The hot executable code is not contiguous, so the branch prediction hardware can become overloaded.

The current NMethod layout is contiguous and consists of the following sections:
* Header: the C++ part of NMethod (class members and other C++ state). Its size is ‘sizeof(NMethod)’: 344 bytes on arm64 and 352 bytes on x86_64 in JDK 17.
* Relocation
* Constant pool
* Instructions (main code)
* Stub code
* Oops
* Metadata: Class related metadata
* Scopes data: Debugging information
* Scopes pcs: Debugging information
* Dependencies
* Handler table: Exception handler table
* Nul chk table: Implicit Null Pointer exception table
* Speculations
* JVMCI data
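
To make the effect of this layout concrete, here is a minimal sketch that models one nmethod as a run of back-to-back sections and derives each section's offset from the start of the header. Only the 344-byte header size comes from the numbers above; every other size is a made-up placeholder, the names `layout` and `non_code_bytes` are hypothetical, and alignment padding is ignored.

```python
# Section order follows the list above. Sizes other than the header are
# illustrative placeholders, not measurements.
SECTION_SIZES = [
    ("header", 344),         # sizeof(NMethod) on JDK 17 arm64, per the text
    ("relocation", 96),
    ("consts", 0),
    ("insts", 1200),         # main code
    ("stub", 264),           # stub code
    ("oops", 8),
    ("metadata", 64),
    ("scopes_data", 416),
    ("scopes_pcs", 224),
    ("dependencies", 16),
    ("handler_table", 48),
    ("nul_chk_table", 32),
    ("speculations", 0),
    ("jvmci_data", 0),
]

CODE_SECTIONS = ("insts", "stub")


def layout(sections):
    """Return {name: (begin, end)} with offsets measured from the header start."""
    offsets = {}
    cursor = 0
    for name, size in sections:
        offsets[name] = (cursor, cursor + size)
        cursor += size
    return offsets


def non_code_bytes(sections):
    """Bytes of non-executable data that sit between the code of two
    identical nmethods placed back to back."""
    total = sum(size for _, size in sections)
    code = sum(size for name, size in sections if name in CODE_SECTIONS)
    return total - code


offsets = layout(SECTION_SIZES)
print("insts occupies", offsets["insts"], "of", offsets["jvmci_data"][1], "bytes")
print("non-code bytes between consecutive code regions:", non_code_bytes(SECTION_SIZES))
```

In this model the executable bytes of one nmethod are always separated from the next nmethod's code by its trailing data sections plus the next header and relocation data.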

We collected the section sizes of C2 nmethods in the DaCapo and Renaissance benchmarks on x86_64 and arm64. The data for the C2 nmethods were obtained with ‘-XX:+LogCompilation’.
Summary of results for JDK 17 with tiered compilation:
* DaCapo:
    * arm64 (full data https://github.com/eastig/codecache/blob/master/jdk17/dacapo_c2_sizes_arm64.csv): 
+---------------------+---------+------------+-----------+
|                     |   min   |   max      |   median  |
+---------------------+---------+------------+-----------+
| C2 nmethods         | 152     | 5215       | 916       |
| Total size - bytes  | 271,576 | 38,367,872 | 4,072,616 |
+---------------------+---------+------------+-----------+

Total size of each section as a proportion of the total size of C2 nmethods:

+---------------+-------+-------+--------+
|    Section    |  min  |  max  | median |
+---------------+-------+-------+--------+
| header        | 4.7%  | 19.3% | 8.0%   |
| consts        | 0.0%  | 0.1%  | 0.0%   |
| instrs        | 39.7% | 49.7% | 44.5%  |
| stub code     | 8.9%  | 11.3% | 10.1%  |
| oops          | 0.2%  | 0.4%  | 0.3%   |
| metadata      | 2.0%  | 3.0%  | 2.3%   |
| scopes data   | 12.2% | 18.6% | 15.9%  |
| scopes pcs    | 7.8%  | 9.0%  | 8.4%   |
| deps          | 0.3%  | 0.8%  | 0.5%   |
| handler table | 1.3%  | 3.3%  | 2.1%   |
| nul_chk table | 1.0%  | 1.6%  | 1.6%   |
+---------------+-------+-------+--------+

    * x86_64 (full data https://github.com/eastig/codecache/blob/master/jdk17/dacapo_c2_sizes_x86_64.csv):
+---------------------+---------+------------+-----------+
|                     |   min   |   max      |   median  |
+---------------------+---------+------------+-----------+
| C2 nmethods         | 155     | 5135       | 889       |
| Total size - bytes  | 264,800 | 35,026,312 | 3,985,744 |
+---------------------+---------+------------+-----------+

Total size of each section as a proportion of the total size of C2 nmethods:

+---------------+-------+-------+--------+
|    Section    |  min  |  max  | median |
+---------------+-------+-------+--------+
| header        | 5.2%  | 20.6% | 8.3%   |
| consts        | 0.0%  | 0.6%  | 0.1%   |
| instrs        | 49.2% | 60.7% | 55.3%  |
| stub code     | 1.1%  | 1.9%  | 1.4%   |
| oops          | 0.1%  | 0.3%  | 0.2%   |
| metadata      | 1.6%  | 2.9%  | 2.0%   |
| scopes data   | 12.2% | 19.6% | 16.8%  |
| scopes pcs    | 7.8%  | 9.2%  | 8.5%   |
| deps          | 0.3%  | 0.8%  | 0.5%   |
| handler table | 1.5%  | 3.5%  | 2.0%   |
| nul_chk table | 0.9%  | 1.6%  | 1.1%   |
+---------------+-------+-------+--------+

* Renaissance:
    * arm64 (full data https://github.com/eastig/codecache/blob/master/jdk17/renaissance_c2_sizes_arm64.csv):
+---------------------+---------+------------+-----------+
|                     |   min   |   max      |   median  |
+---------------------+---------+------------+-----------+
| C2 nmethods         | 155     | 7447       | 1198      |
| Total size - bytes  | 366,248 | 52,840,528 | 4,989,392 |
+---------------------+---------+------------+-----------+

Total size of each section as a proportion of the total size of C2 nmethods:

+---------------+-------+-------+--------+
|    Section    |  min  |  max  | median |
+---------------+-------+-------+--------+
| header        | 4.8%  | 14.6% | 8.5%   |
| consts        | 0.0%  | 0.1%  | 0.0%   |
| instrs        | 35.7% | 45.6% | 42.8%  |
| stub code     | 8.3%  | 12.0% | 10.1%  |
| oops          | 0.2%  | 0.6%  | 0.4%   |
| metadata      | 2.0%  | 4.1%  | 3.0%   |
| scopes data   | 12.4% | 20.8% | 16.1%  |
| scopes pcs    | 7.8%  | 8.9%  | 8.4%   |
| deps          | 0.4%  | 1.0%  | 0.5%   |
| handler table | 1.2%  | 3.9%  | 2.4%   |
| nul_chk table | 0.9%  | 1.3%  | 1.1%   |
+---------------+-------+-------+--------+

    * x86_64 (full data https://github.com/eastig/codecache/blob/master/jdk17/renaissance_c2_sizes_x86_64.csv):

+---------------------+---------+------------+-----------+
|                     |   min   |   max      |   median  |
+---------------------+---------+------------+-----------+
| C2 nmethods         | 158     | 7242       | 938       |
| Total size - bytes  | 354,952 | 47,019,560 | 3,791,764 |
+---------------------+---------+------------+-----------+

Total size of each section as a proportion of the total size of C2 nmethods:

+---------------+-------+-------+--------+
|    Section    |  min  |  max  | median |
+---------------+-------+-------+--------+
| header        | 5.4%  | 15.7% | 9.7%   |
| consts        | 0.0%  | 0.1%  | 0.0%   |
| instrs        | 46.1% | 54.4% | 52.7%  |
| stub code     | 1.3%  | 1.9%  | 1.4%   |
| oops          | 0.2%  | 0.5%  | 0.3%   |
| metadata      | 1.9%  | 3.4%  | 2.6%   |
| scopes data   | 12.7% | 23.6% | 17.4%  |
| scopes pcs    | 8.0%  | 9.4%  | 8.6%   |
| deps          | 0.4%  | 1.0%  | 0.5%   |
| handler table | 1.3%  | 4.0%  | 2.5%   |
| nul_chk table | 1.0%  | 1.4%  | 1.2%   |
+---------------+-------+-------+--------+

The data show that, because of the intervening non-executable data in NMethods, executable code is sparse in the CodeCache. The data also show that the largest contributors of non-executable data are the header and the scopes sections. Arm64 and x86_64 look consistent except for the stub code: on arm64 the stub code is 4-5 times bigger.
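
As a rough check of this reading, the median proportions from the tables can be added up per benchmark suite and architecture. This is only a sketch: the medians are taken per section, so their sums are approximate, and the name `summarize` is hypothetical.

```python
# Median percentages copied from the tables above.
MEDIAN_PCT = {
    ("DaCapo", "arm64"): {
        "header": 8.0, "instrs": 44.5, "stub code": 10.1,
        "scopes data": 15.9, "scopes pcs": 8.4,
    },
    ("DaCapo", "x86_64"): {
        "header": 8.3, "instrs": 55.3, "stub code": 1.4,
        "scopes data": 16.8, "scopes pcs": 8.5,
    },
    ("Renaissance", "arm64"): {
        "header": 8.5, "instrs": 42.8, "stub code": 10.1,
        "scopes data": 16.1, "scopes pcs": 8.4,
    },
    ("Renaissance", "x86_64"): {
        "header": 9.7, "instrs": 52.7, "stub code": 1.4,
        "scopes data": 17.4, "scopes pcs": 8.6,
    },
}


def summarize(pct):
    """Return (executable %, header + scopes %) from median proportions."""
    executable = round(pct["instrs"] + pct["stub code"], 1)
    header_and_scopes = round(
        pct["header"] + pct["scopes data"] + pct["scopes pcs"], 1
    )
    return executable, header_and_scopes


for (suite, arch), pct in MEDIAN_PCT.items():
    executable, header_and_scopes = summarize(pct)
    print(f"{suite:12} {arch:7} executable ~{executable}%  "
          f"header+scopes ~{header_and_scopes}%")
```

On these medians, a little over half of the C2 nmethod bytes are executable, and the header plus the two scopes sections account for roughly a third.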

We’d like to have an option to configure the CodeCache so that C2 nmethods keep their executable code separate from their non-executable data. Since JDK-8152664 “Support non-continuous CodeBlobs in HotSpot” (https://bugs.openjdk.java.net/browse/JDK-8152664) was fixed, NMethod sections can be located in different places in memory. The discussion of that change is at https://mail.openjdk.java.net/pipermail/hotspot-dev/2016-April/022500.html. Separating code will complicate maintenance of the CodeCache, because several regions of memory will need to be allocated and released for each nmethod.

The implementation work can be done under the existing JDK-7072317 “move metadata from CodeCache” (https://bugs.openjdk.java.net/browse/JDK-7072317).

There can be different approaches for the implementation:

1. What to separate:
    a. All code (main plus stub) from other sections.
    b. Or only main code because this is the code where an application should spend most of the time.
    c. Or the header and scopes sections.
2. Where to put:
    a. Different segments for code and nmethod data. This will require updating NMethod because it uses code_offset and stub_offset relative to header_begin.
    b. The same segment but in a different part (e.g., code grows from lower addresses upwards and metadata from high addresses downwards). This might allow NMethod to keep using code_offset and stub_offset.
    c. Or in a completely different place (C-heap, Metaspace,...)
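
Option 2b can be pictured with a toy model of a single segment in which code is bump-allocated upwards from the low end and nmethod data downwards from the high end. This is only a sketch under simplifying assumptions: the class `TwoEndedSegment` and its method are hypothetical, nothing is ever freed, alignment is ignored, and it does not reflect how the CodeCache heap is actually implemented.

```python
class TwoEndedSegment:
    """Hypothetical segment: code grows up from offset 0, data grows down
    from the end of the segment."""

    def __init__(self, size):
        self.size = size
        self.code_top = 0        # next free byte for code (grows upwards)
        self.data_bottom = size  # lowest byte used by data (grows downwards)

    def allocate(self, code_size, data_size):
        """Reserve space for one nmethod.

        Returns (code_begin, data_begin), or None if the code and data
        regions would collide.
        """
        if self.code_top + code_size > self.data_bottom - data_size:
            return None
        code_begin = self.code_top
        self.code_top += code_size
        self.data_bottom -= data_size
        return code_begin, self.data_bottom


seg = TwoEndedSegment(4096)
first = seg.allocate(code_size=1464, data_size=1248)
second = seg.allocate(code_size=600, data_size=500)
third = seg.allocate(code_size=400, data_size=0)  # no longer fits

print(first, second, third)
```

In this model the code of the second nmethod starts exactly where the code of the first one ends, so the executable bytes stay contiguous, while both the code and the data of an nmethod remain at a fixed distance within one segment. Releasing nmethods would leave holes at both ends, which is where the extra maintenance cost mentioned above would show up.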

We need to investigate whether separating sections that are frequently accessed during normal execution of the code (e.g., the oops section) hurts performance. We might also need to change NMethodSweeper to preserve code locality.

We would like to get feedback on the above approaches (or something different) before implementing JDK-7072317.
 
Comments welcome!
 
Thanks,
Evgeny Astigeevich, AWS Corretto Team







