RFR: 8344009: Improve compiler memory statistics
Roberto Castañeda Lozano
rcastanedalo at openjdk.org
Thu Feb 20 14:02:53 UTC 2025
On Wed, 19 Feb 2025 09:49:54 GMT, Roberto Castañeda Lozano <rcastanedalo at openjdk.org> wrote:
>>> > Hi Thomas, this looks very useful, thanks! I will run some Oracle-internal functional and performance testing and come back with the results next week.
>>>
>>> Functional test results (Oracle internal tier1-tier5) look good.
>>>
>>> I measured C2 execution time before and after the changeset using DaCapo 23 and did not find any statistically significant difference, except for a 2-3% regression on the jython benchmark (using large input size). This small regression is IMO acceptable, particularly given that these changes can be seen as an investment to improve compiler resource utilization in the long run.
>>
>> Hi @robcasloz, interesting, I did not expect this. What did you measure? With compilation statistics vs. without, or old vs. new with both enabled? (Ideally, give me both sets of command-line args.)
>
> I measured and compared C2 speed in bytecodes/s as reported by `-XX:+CITime` (averaged over a number of repetitions). I wanted to test that the feature does not affect C2's execution time when not used, so I simply compared C2 compilation speed for `jdk-25+10` vs. `jdk-25+10` with this changeset applied on top. Both were release builds, run with `-XX:+CITime -Xbatch -XX:-TieredCompilation` (the last two flags give better stability across benchmark repetitions). I observed the regression on both the linux-x64 and macosx-aarch64 platforms. Let me know if you need more details.
> @robcasloz I identified and hopefully fixed a small issue that hit the "disabled" path. Turns out we allocate arena chunks a lot more frequently than I thought, and the new unconditional call to Thread::current() in there was hurting a bit. I now avoid this unless I know the statistic is enabled.
>
> With this patch, on my machine the difference between unpatched and patched JVM with stats disabled is below one standard deviation for the benchmark in question.
Great, thanks! Will re-run benchmarking and report results early next week.
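
For readers following along, the kind of fix described in the quoted message (skipping the thread lookup on the chunk allocation path unless the statistic is enabled) might look roughly like the sketch below. This is not the code from the PR: `ThreadInfo`, `g_stats_enabled`, `g_thread_lookups`, `current_thread_info()` and `allocate_chunk()` are hypothetical stand-ins, and the lookup counter exists only to make the effect observable in a single-threaded test.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-ins; none of these names come from the HotSpot sources.
struct ThreadInfo {
    std::size_t bytes_allocated = 0;  // per-thread running total
};

static std::atomic<bool> g_stats_enabled{false};  // assumed global on/off switch
static int g_thread_lookups = 0;  // counts lookups (not thread-safe, demo only)

// Stand-in for a thread-local lookup such as Thread::current().
ThreadInfo* current_thread_info() {
    static thread_local ThreadInfo info;
    ++g_thread_lookups;
    return &info;
}

// Allocate a chunk. The per-thread lookup happens only when statistics are
// enabled, so the disabled path pays for one relaxed atomic load and no more.
void* allocate_chunk(std::size_t size) {
    void* p = std::malloc(size);
    if (p != nullptr && g_stats_enabled.load(std::memory_order_relaxed)) {
        current_thread_info()->bytes_allocated += size;
    }
    return p;
}
```

With `g_stats_enabled` left at `false`, calls to `allocate_chunk()` never reach `current_thread_info()`. Once the flag is set, each successful allocation performs exactly one lookup.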
-------------
PR Comment: https://git.openjdk.org/jdk/pull/23530#issuecomment-2671587462