RFR: 8350852: Implement JMH benchmark for sparse CodeCache
Evgeny Astigeevich
eastigeevich at openjdk.org
Tue Mar 4 21:47:52 UTC 2025
On Tue, 4 Mar 2025 17:34:41 GMT, Evgeny Astigeevich <eastigeevich at openjdk.org> wrote:
>> This benchmark is used to check the performance impact of the code cache being sparse.
>>
>> We use the C2 compiler to compile the same Java method multiple times to produce as much code as needed. The Java method is not trivial: it adds two 40-digit positive integers. These compiled methods represent the active methods in the code cache. We split the active methods into groups and put each group into a fixed-size code region. Each code region is aligned to its size. The CodeCache becomes sparse when code regions are not fully filled. We measure the time taken to call all active methods.
>>
>> Results: code region size 2M (2097152 bytes)
>> - Intel Xeon Platinum 8259CL
>>
>> |activeMethodCount |groupCount |Methods/Group |Score |Error |Units |Diff |
>> |--- |--- |--- |--- |--- |--- |--- |
>> |128 |1 |128 |19.577 |0.619 |us/op | |
>> |128 |32 |4 |22.968 |0.314 |us/op |17.30% |
>> |128 |48 |3 |22.245 |0.388 |us/op |13.60% |
>> |128 |64 |2 |23.874 |0.84 |us/op |21.90% |
>> |128 |80 |2 |23.786 |0.231 |us/op |21.50% |
>> |128 |96 |1 |26.224 |1.16 |us/op |34% |
>> |128 |112 |1 |27.028 |0.461 |us/op |38.10% |
>> |256 |1 |256 |47.43 |1.146 |us/op | |
>> |256 |32 |8 |63.962 |1.671 |us/op |34.90% |
>> |256 |48 |5 |63.396 |0.247 |us/op |33.70% |
>> |256 |64 |4 |66.604 |2.286 |us/op |40.40% |
>> |256 |80 |3 |59.746 |1.273 |us/op |26% |
>> |256 |96 |3 |63.836 |1.034 |us/op |34.60% |
>> |256 |112 |2 |63.538 |1.814 |us/op |34% |
>> |512 |1 |512 |172.731 |4.409 |us/op | |
>> |512 |32 |16 |206.772 |6.229 |us/op |19.70% |
>> |512 |48 |11 |215.275 |2.228 |us/op |24.60% |
>> |512 |64 |8 |212.962 |2.028 |us/op |23.30% |
>> |512 |80 |6 |201.335 |12.519 |us/op |16.60% |
>> |512 |96 |5 |198.133 |6.502 |us/op |14.70% |
>> |512 |112 |5 |193.739 |3.812 |us/op |12.20% |
>> |768 |1 |768 |325.154 |5.048 |us/op | |
>> |768 |32 |24 |346.298 |20.196 |us/op |6.50% |
>> |768 |48 |16 |350.746 |2.931 |us/op |7.90% |
>> |768 |64 |12 |339.445 |7.927 |us/op |4.40% |
>> |768 |80 |10 |347.408 |7.355 |us/op |6.80% |
>> |768 |96 |8 |340.983 |3.578 |us/op |4.90% |
>> |768 |112 |7 |353.949 |2.98 |us/op |8.90% |
>> |1024 |1 |1024 |368.352 |5.961 |us/op | |
>> |1024 |32 |32 |463.822 |6.274 |us/op |25.90% |
>> |1024 |48 |21 |457.674 |15.144 |us/op |24.20% |
>> |1024 |64 |16 |477.694 |0.986 |us/op |29.70% |
>> |1024 |80 |13 |484.901 |32.601 |us/op |31.60% |
>> |1024 |96 |11 |480.8 |27.088 |us/op |30.50% |
>> |1024 |112 |9 |474.416 |10.053 |us/op |28.80% |
>>
>> - AArch64 Neoverse N1
>>
>> |activeMethodCount |groupCount |Methods/Group |Score |Error |Units |Diff |...
>
> Hi @vnkozlov,
>
> I'd appreciate it if you could take a look at this.
> @eastig Can you make the compiled code of different/random sizes, to be more representative of a real application?
Yes, I can do this.
I think the size of the compiled code is not very important.
What is important is the time spent in an invoked nmethod.
I can add a benchmark with a random distribution of nmethods of different sizes.
For the current benchmark, where all nmethods have the same size, I can add a parameter that makes them run for different amounts of time.
What do you think?
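
For reference when reading the quoted results: the Methods/Group and Diff columns appear to follow from simple arithmetic. Below is a minimal Java sketch, assuming Methods/Group is activeMethodCount / groupCount rounded to the nearest integer and Diff is the relative slowdown against the groupCount = 1 row with the same activeMethodCount. The class and method names are hypothetical and are not part of the PR.

```java
import java.util.Locale;

// Hypothetical helper, not part of the PR: reproduces two derived columns
// of the results table under the assumptions stated above.
public class SparseCodeCacheTableMath {

    // Assumption: Methods/Group is the ratio rounded to the nearest integer.
    static long methodsPerGroup(int activeMethodCount, int groupCount) {
        return Math.round((double) activeMethodCount / groupCount);
    }

    // Assumption: Diff is the slowdown relative to the dense baseline
    // (groupCount = 1) for the same activeMethodCount, in percent.
    static double diffPercent(double score, double baseline) {
        return (score - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Scores taken from the Intel Xeon Platinum 8259CL rows with 128 active methods.
        double baseline = 19.577;               // groupCount = 1
        int[] groupCounts = {32, 48, 112};
        double[] scores = {22.968, 22.245, 27.028};

        for (int i = 0; i < groupCounts.length; i++) {
            System.out.println(String.format(Locale.ROOT,
                    "groupCount=%d methods/group=%d diff=%.1f%%",
                    groupCounts[i],
                    methodsPerGroup(128, groupCounts[i]),
                    diffPercent(scores[i], baseline)));
        }
    }
}
```

With these assumptions the sketch agrees with the sampled table rows to within rounding; it only restates the table and does not measure anything.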
-------------
PR Comment: https://git.openjdk.org/jdk/pull/23831#issuecomment-2699008207