RFR(M): 8244660: Code cache sweeper heuristics is broken
Nils Eliasson
nils.eliasson at oracle.com
Thu May 14 19:54:03 UTC 2020
Hi Man,
Thank you for your very comprehensive measuring. This makes me
comfortable that this change achieves the desired goals.
Let's go with a default threshold of 0.5%. If we encounter any issues, it
is easy to change.
I changed the SweeperThreshold to be a percentage of
ReservedCodeCacheSize, capped at 1.2 MB. The default is 0.5%, which
is 1.2 MB when running with tiered compilation. The cap only applies
when the default is used, so a user has the freedom to increase
it at will.
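
For illustration only, here is a rough sketch of the rule described above. It is
not the HotSpot code: the function and constant names are hypothetical, and it
assumes "1.2 MB" means 1.2 * 1024 * 1024 bytes and that the result is truncated
to whole bytes.

```python
MB = 1024 * 1024

# Assumed values, taken from the description in this mail.
DEFAULT_PERCENT = 0.5            # default SweeperThreshold, in percent
DEFAULT_CAP_BYTES = 12 * MB // 10  # 1.2 MB, assuming binary megabytes


def sweeper_threshold_bytes(reserved_code_cache_bytes, percent=None):
    """Return a sweep threshold in bytes (hypothetical helper).

    percent=None models "the user did not set the threshold": the default
    percentage is used and the result is capped. An explicit percent is
    applied without any cap.
    """
    if percent is None:
        raw = int(reserved_code_cache_bytes * DEFAULT_PERCENT / 100)
        return min(raw, DEFAULT_CAP_BYTES)
    return int(reserved_code_cache_bytes * percent / 100)


if __name__ == "__main__":
    for size_mb in (40, 48, 240, 1024):
        print(size_mb, "MB ->", sweeper_threshold_bytes(size_mb * MB), "bytes")
    # An explicit value is not capped:
    print("1024 MB at 1% ->", sweeper_threshold_bytes(1024 * MB, percent=1.0))
```

With a 240 MB code cache the default works out to the cap itself, so the cap
only matters for larger code caches that keep the default percentage.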
I added a log_info(codecache, sweep) with the threshold in bytes
during startup for convenience.
I also added the threshold (in bytes) to the JFR
CodeSweeperConfiguration event.
webrev: http://cr.openjdk.java.net/~neliasso/8244660/webrev.02/
This patch applies on top of the cleanup patch JDK-8244658, and
patch JDK-8244278 must be applied on top of this patch to get decent
results.
Best regards,
Nils Eliasson
On 2020-05-14 04:48, Man Cao wrote:
> Hi Nils,
>
> I have done more DaCapo benchmarking with the patches.
> Overall, the results look good, and your fix indeed reduces sweep frequency
> compared to the current state.
> It retains possible performance improvement and does not introduce
> unnecessary increase in code cache usage.
>
> All results are available at
> https://cr.openjdk.java.net/~manc/8244660_benchmarks/.
> I have also included counters for used code cache size and sweeper
> statistics in the graphs.
> These metrics are collected using this patch:
> https://cr.openjdk.java.net/~manc/8244660_benchmarks/hsperfcounters_webrev/
> All runs are with "-Xms4g -Xmx4g -XX:-TieredCompilation", because
> -TieredCompilation matters a lot for our workload.
> Also note that the numbers for throughput/CPU and GC exclude the warmup
> iterations. The codecache/sweeper statistics account for all iterations
> (including warmups).
>
> Comparing 3 JDK builds:
> https://cr.openjdk.java.net/~manc/8244660_benchmarks/20200508-JDKHead-dacapoLarge4G-sweeperPatches.html
> base: current state with no pending patches
> allFixes: with patches for JDK-8244660, JDK-8244278 and JDK-8244658
> sweepAt90: with only the patch for JDK-8244278, so it's the same as the
> config I used in previous results in JDK-8244278.
> "allFixes" reduced sweep frequency compared to "base", without introducing much
> increase in code cache usage.
>
> Same as above, but with -XX:ReservedCodeCacheSize=40m:
> https://cr.openjdk.java.net/~manc/8244660_benchmarks/20200512-JDKHead-dacapoLarge4G-sweeperPatches-CodeCache40MB.html
> "allFixes" retains the throughput and CPU improvement for tradesoap,
> perhaps even better than not sweeping ("sweepAt90").
> Code cache usage for tradesoap is between "base" and not sweeping, which is
> OK in my opinion.
>
>> I think 1/100 of a 240 MB default code cache seems a bit high. During
>> startup we produce a lot of L3 code that will be thrown away. We want to
>> recycle it fairly quickly, to avoid fragmenting the code cache, but not
>> so often that we affect startup.
>> I've done some startup measurements, and then we sweep about every other
>> second in a benchmark that produces a lot of code.
>> What results are you seeing?
>
> The 1/256 capped at 1MB seems OK.
> Even with 40MB or 48MB code cache size with -TieredCompilation, it does not
> flush too frequently.
>
>> Code cache flushing has another heuristic - it might be broken too. But
>> it would be interesting to see how it works with the new sweep
>> heuristic. If you know that you have enough code cache - turning it off
>> is no loss. It only helps when you are running out of code cache.
>
>
>> When we are doing normal sweeping - we don't deoptimize cold code. That
>> is handled by the method flushing - it should only kick in when we start
>> to run out of code cache.
>
> I think we should address MethodFlushing in a separate RFE/BUG.
>
>
> Thanks for explaining this.
> I did some benchmarking with -XX:NmethodSweepActivity and
> -XX:MinPassesBeforeFlush, on top of the "allFixes" config:
> https://cr.openjdk.java.net/~manc/8244660_benchmarks/20200508-JDKHead-dacapoLarge4G-NmethodSweepActivity.html
> https://cr.openjdk.java.net/~manc/8244660_benchmarks/20200508-JDKHead-dacapoLarge4G-MinPassesBeforeFlush.html
> xalan, jython look better with small values, pmd looks worse.
> I'll follow up separately if I find anything wrong with the
> flushing/cold-code-deoptimization heuristic.
>
>> The heuristics for CodeAging may have been negatively affected by the
>> transition to handshakes. Also the SetHotnessClosure should be replaced
>> by a mechanism using the NMethodEntry barriers.
>> I see that we are missing JFR events for MethodFlushing. I have created
>> another patch for that.
> Although I'm not very familiar with these, thanks for identifying and
> fixing these issues!
>
> -Man