RFR: 8350852: Implement JMH benchmark for sparse CodeCache

Evgeny Astigeevich eastigeevich at openjdk.org
Fri Mar 14 18:34:54 UTC 2025


On Thu, 27 Feb 2025 22:23:23 GMT, Evgeny Astigeevich <eastigeevich at openjdk.org> wrote:

> This benchmark is used to check the performance impact of the code cache being sparse.
> 
> We use the C2 compiler to compile the same Java method multiple times to produce as much code as needed. The Java method is not trivial: it adds two 40-digit positive integers. These compiled methods represent the active methods in the code cache. We split the active methods into groups and put each group into a fixed-size code region, which is aligned to its size. The CodeCache becomes sparse when code regions are not fully filled. We measure the time taken to call all active methods.
> 
> Results: code region size 2M (2097152) bytes
> - Intel Xeon Platinum 8259CL
> 
> |activeMethodCount	|groupCount	|Methods/Group	|Score	|Error	|Units	|Diff	|
> |---	|---	|---	|---	|---	|---	|---	|
> |128	|1	|128	|19.577	|0.619	|us/op	|	|
> |128	|32	|4	|22.968	|0.314	|us/op	|17.30%	|
> |128	|48	|3	|22.245	|0.388	|us/op	|13.60%	|
> |128	|64	|2	|23.874	|0.84	|us/op	|21.90%	|
> |128	|80	|2	|23.786	|0.231	|us/op	|21.50%	|
> |128	|96	|1	|26.224	|1.16	|us/op	|34%	|
> |128	|112	|1	|27.028	|0.461	|us/op	|38.10%	|
> |256	|1	|256	|47.43	|1.146	|us/op	|	|
> |256	|32	|8	|63.962	|1.671	|us/op	|34.90%	|
> |256	|48	|5	|63.396	|0.247	|us/op	|33.70%	|
> |256	|64	|4	|66.604	|2.286	|us/op	|40.40%	|
> |256	|80	|3	|59.746	|1.273	|us/op	|26%	|
> |256	|96	|3	|63.836	|1.034	|us/op	|34.60%	|
> |256	|112	|2	|63.538	|1.814	|us/op	|34%	|
> |512	|1	|512	|172.731	|4.409	|us/op	|	|
> |512	|32	|16	|206.772	|6.229	|us/op	|19.70%	|
> |512	|48	|11	|215.275	|2.228	|us/op	|24.60%	|
> |512	|64	|8	|212.962	|2.028	|us/op	|23.30%	|
> |512	|80	|6	|201.335	|12.519	|us/op	|16.60%	|
> |512	|96	|5	|198.133	|6.502	|us/op	|14.70%	|
> |512	|112	|5	|193.739	|3.812	|us/op	|12.20%	|
> |768	|1	|768	|325.154	|5.048	|us/op	|	|
> |768	|32	|24	|346.298	|20.196	|us/op	|6.50%	|
> |768	|48	|16	|350.746	|2.931	|us/op	|7.90%	|
> |768	|64	|12	|339.445	|7.927	|us/op	|4.40%	|
> |768	|80	|10	|347.408	|7.355	|us/op	|6.80%	|
> |768	|96	|8	|340.983	|3.578	|us/op	|4.90%	|
> |768	|112	|7	|353.949	|2.98	|us/op	|8.90%	|
> |1024	|1	|1024	|368.352	|5.961	|us/op	|	|
> |1024	|32	|32	|463.822	|6.274	|us/op	|25.90%	|
> |1024	|48	|21	|457.674	|15.144	|us/op	|24.20%	|
> |1024	|64	|16	|477.694	|0.986	|us/op	|29.70%	|
> |1024	|80	|13	|484.901	|32.601	|us/op	|31.60%	|
> |1024	|96	|11	|480.8	|27.088	|us/op	|30.50%	|
> |1024	|112	|9	|474.416	|10.053	|us/op	|28.80%	|
> 
> - AArch64 Neoverse N1
> 
> |activeMethodCount	|groupCount	|Methods/Group	|Score	|Error	|Units	|Diff	|
> |---	|---	|---	|---	|---	|---	|---	|
> |128	|1	|128	|25.297	|0.792	|us/op	|	|
> |128	|32	|4	|31.451...
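
The Methods/Group column and the size-aligned regions described in the quoted text can be illustrated with a small sketch. This is not code from the PR: the class and method names are hypothetical, and the round-to-nearest rule is only inferred from the table rows (for example 128 methods in 48 groups is listed as 3). The region size is the 2M (2097152) bytes stated in the results header.

```java
// Hypothetical illustration, not the benchmark's implementation.
final class SparseLayoutSketch {
    // Code region size taken from the results header: 2M (2097152) bytes.
    static final long REGION_SIZE = 2L * 1024 * 1024;

    // Average number of active methods per group, rounded to nearest.
    // Assumption: this matches how the Methods/Group column was derived.
    static long methodsPerGroup(int activeMethodCount, int groupCount) {
        return Math.round((double) activeMethodCount / groupCount);
    }

    // Round an address up to the next multiple of a power-of-two alignment,
    // which is one way to make a region "aligned by its size".
    static long alignUp(long address, long alignment) {
        return (address + alignment - 1) & -alignment;
    }

    public static void main(String[] args) {
        int[] groupCounts = {1, 32, 48, 64, 80, 96, 112};
        for (int groupCount : groupCounts) {
            System.out.println("activeMethodCount=128 groupCount=" + groupCount
                    + " methodsPerGroup=" + methodsPerGroup(128, groupCount));
        }
        // An arbitrary example address rounded up to a region boundary.
        System.out.println(alignUp(1, REGION_SIZE));
    }
}
```

With few methods per 2M region, most of each region stays empty, which is the sparse layout the benchmark is meant to exercise.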

AMD 4th Gen EPYC (Genoa) results


|activeMethodCount	|groupCount	|Methods/Group	|Score	|Error	|Diff	|
|---	|---	|---	|---	|---	|---	|
|128	|1	|128	|14.71	|0.042	|	|
|128	|32	|4	|19.381	|0.04	|31.80%	|
|128	|48	|3	|19.998	|0.099	|35.90%	|
|128	|64	|2	|20.965	|0.097	|42.50%	|
|128	|80	|2	|20.988	|0.121	|42.70%	|
|128	|96	|1	|21.442	|0.141	|45.80%	|
|128	|112	|1	|20.985	|0.05	|42.70%	|
|256	|1	|256	|31.282	|0.072	|	|
|256	|32	|8	|41.516	|0.252	|32.70%	|
|256	|48	|5	|43.29	|0.45	|38.40%	|
|256	|64	|4	|45.71	|0.321	|46.10%	|
|256	|80	|3	|45.682	|0.325	|46%	|
|256	|96	|3	|47.03	|0.168	|50.30%	|
|256	|112	|2	|47.761	|0.609	|52.70%	|
|512	|1	|512	|69.02	|0.742	|	|
|512	|32	|16	|97.437	|0.9	|41.20%	|
|512	|48	|11	|97.472	|0.481	|41.20%	|
|512	|64	|8	|102.799	|0.519	|48.90%	|
|512	|80	|6	|104.942	|0.278	|52%	|
|512	|96	|5	|106.29	|0.182	|54%	|
|512	|112	|5	|109.224	|0.49	|58.20%	|
|768	|1	|768	|114.981	|1.51	|	|
|768	|32	|24	|155.305	|0.995	|35.10%	|
|768	|48	|16	|155.688	|0.362	|35.40%	|
|768	|64	|12	|158.123	|0.443	|37.50%	|
|768	|80	|10	|160.181	|0.879	|39.30%	|
|768	|96	|8	|162.661	|0.177	|41.50%	|
|768	|112	|7	|164.742	|0.342	|43.30%	|
|1024	|1	|1024	|175.37	|1.244	|	|
|1024	|32	|32	|206.198	|1.131	|17.60%	|
|1024	|48	|21	|206.476	|1.19	|17.70%	|
|1024	|64	|16	|211.615	|0.654	|20.70%	|
|1024	|80	|13	|212.683	|0.928	|21.30%	|
|1024	|96	|11	|214.103	|0.432	|22.10%	|
|1024	|112	|9	|217.517	|1.197	|24%	|

Intel 4th Gen Xeon (Sapphire Rapids) results

|activeMethodCount	|groupCount	|Methods/Group	|Score	|Error	|Diff	|
|---	|---	|---	|---	|---	|---	|
|128	|1	|128	|12.68	|0.01	|	|
|128	|32	|4	|15.61	|0.04	|23.1%	|
|128	|48	|3	|15.75	|0.05	|24.2%	|
|128	|64	|2	|16.02	|0.11	|26.4%	|
|128	|80	|2	|16.21	|0.12	|27.9%	|
|128	|96	|1	|16.48	|0.27	|30.0%	|
|128	|112	|1	|17.12	|0.59	|35.1%	|
|256	|1	|256	|25.21	|0.15	|	|
|256	|32	|8	|31.73	|0.35	|25.9%	|
|256	|48	|5	|31.74	|0.37	|25.9%	|
|256	|64	|4	|33.56	|0.59	|33.1%	|
|256	|80	|3	|33.62	|0.91	|33.3%	|
|256	|96	|3	|34.46	|0.92	|36.7%	|
|256	|112	|2	|34.92	|0.99	|38.5%	|
|512	|1	|512	|58.05	|0.96	|	|
|512	|32	|16	|69.60	|1.59	|19.9%	|
|512	|48	|11	|70.61	|1.11	|21.6%	|
|512	|64	|8	|75.67	|1.25	|30.4%	|
|512	|80	|6	|77.70	|1.59	|33.9%	|
|512	|96	|5	|79.04	|1.29	|36.2%	|
|512	|112	|5	|80.09	|0.92	|38.0%	|
|768	|1	|768	|112.73	|1.66	|	|
|768	|32	|24	|135.95	|4.22	|20.6%	|
|768	|48	|16	|137.05	|2.00	|21.6%	|
|768	|64	|12	|136.82	|2.06	|21.4%	|
|768	|80	|10	|144.65	|5.60	|28.3%	|
|768	|96	|8	|148.26	|6.11	|31.5%	|
|768	|112	|7	|152.97	|5.36	|35.7%	|
|1024	|1	|1024	|165.65	|3.10	|	|
|1024	|32	|32	|209.07	|4.72	|26.2%	|
|1024	|48	|21	|214.42	|4.14	|29.4%	|
|1024	|64	|16	|219.80	|4.28	|32.7%	|
|1024	|80	|13	|224.82	|4.11	|35.7%	|
|1024	|96	|11	|230.94	|2.56	|39.4%	|
|1024	|112	|9	|234.45	|3.49	|41.5%	|
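
The Diff values in the tables above are consistent with the relative slowdown against the groupCount = 1 row for the same activeMethodCount. A minimal sketch of that calculation follows, assuming this definition; the class and method names are hypothetical, and the scores are copied from the tables.

```java
import java.util.Locale;

// Hypothetical helper showing how a Diff value can be recomputed from two scores.
final class DiffColumnSketch {
    // Relative slowdown of a sparse configuration versus the dense baseline,
    // in percent. Assumption: baseline is the groupCount = 1 row.
    static double diffPercent(double baselineScore, double score) {
        return (score / baselineScore - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Sapphire Rapids, activeMethodCount = 128: groupCount 1 vs 32.
        double sapphireRapids = diffPercent(12.68, 15.61);
        // Genoa, activeMethodCount = 128: groupCount 1 vs 32.
        double genoa = diffPercent(14.71, 19.381);
        System.out.println(String.format(Locale.ROOT, "%.1f%%", sapphireRapids));
        System.out.println(String.format(Locale.ROOT, "%.1f%%", genoa));
    }
}
```

The Error column is not taken into account here, so small differences between neighbouring rows should be read with the reported error margins in mind.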

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23831#issuecomment-2725396685
PR Comment: https://git.openjdk.org/jdk/pull/23831#issuecomment-2725427533

