RFR(S): 8198756: Limit number of compiler threads for small code cache
Vladimir Kozlov
vladimir.kozlov at oracle.com
Wed Mar 21 22:36:11 UTC 2018
I hijacked this RFE and changed its Subject to implement dynamic
allocation of compiler threads.
I wrote a proposal in the RFE's comments. Please take a look.
It is still assigned to Martin ;) but we can take ownership when we
finalize the design.
Thanks,
Vladimir
On 3/2/18 2:28 AM, Doerr, Martin wrote:
> Hi Derek, Igor and Vladimir,
>
> thanks for all replies.
>
> I agree that it would be good to have something like
> UseDynamicNumberOfGCThreads.
>
> Btw. I have recently requested to activate that one by default
> (JDK-8198547).
>
> If we can’t get it for jdk11, I’d like at least to make it easier for
> customers to save memory without explicitly setting CICompilerCount.
>
> Best regards,
>
> Martin
>
> *From:*White, Derek [mailto:Derek.White at cavium.com]
> *Sent:* Friday, March 2, 2018 04:25
> *To:* Igor Veresov <igor.veresov at oracle.com>; Doerr, Martin
> <martin.doerr at sap.com>
> *Cc:* Vladimir Kozlov <vladimir.kozlov at oracle.com>;
> hotspot-compiler-dev at openjdk.java.net
> *Subject:* RE: RFR(S): 8198756: Limit number of compiler threads for
> small code cache
>
> Hi Igor, Martin,
>
> Just to throw out some other user experience:
>
> I’m typically running on machines with 98 to 224 CPUs. It’s not the case
> that **every** Java app needs to use all the CPUs for compiler threads.
> The JVM may not be the only JVM running on the system (Hadoop,
> microservices, etc), let alone the only important app on the system.
>
> Historically the GC threads have been the worst offenders in this
> regard. The GC thread’s “scaling factor” is much higher than the
> compiler thread’s scaling factor. But with options like
> UseDynamicNumberOfGCThreads, the GC tries to adjust the number of GC
> threads to the work to be done. I think it’s important that the JVM
> figure out how to scale the number of compiler threads as well.
>
> I won’t claim that Martin’s scheme is the best approach, or that it
> should be on by default, but unless a better solution is going into JDK
> 11, I’d support this scheme as an experimental flag. FWIW.
>
> * Derek White, Cavium (Purveyor of fine 224 cpu systems for the
> discerning developer).
>
> *From:*hotspot-compiler-dev
> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] *On Behalf Of
> *Igor Veresov
> *Sent:* Thursday, March 01, 2018 7:46 PM
> *To:* Doerr, Martin <martin.doerr at sap.com <mailto:martin.doerr at sap.com>>
> *Cc:* Vladimir Kozlov <vladimir.kozlov at oracle.com
> <mailto:vladimir.kozlov at oracle.com>>;
> hotspot-compiler-dev at openjdk.java.net
> <mailto:hotspot-compiler-dev at openjdk.java.net>
> *Subject:* Re: RFR(S): 8198756: Limit number of compiler threads for
> small code cache
>
> Doerr,
>
> I think the optimal number of compiler threads is the one that keeps the
> length of the compiler queues minimal. During startup, the optimal
> number of compiler threads is typically equal to the number of CPUs,
> maybe even more than that, considering threads are either C1 or C2 and
> compiles typically happen in waves using one and then the other. The
> fact that some users see code cache filling slower with fewer threads is
> just an indication of how huge their compile queues are, and this is
> certainly not good for startup. The problem of resource holding is real,
> since after startup we don’t need that many threads (unless you’re
> running something that does dynamic code generation). Perhaps the
> solution to all of this is having a dynamic pool of compiler threads
> that could expand/shrink depending on the load (the length of the
> compile queues).
>
> igor
>
> On Mar 1, 2018, at 12:31 AM, Doerr, Martin <martin.doerr at sap.com
> <mailto:martin.doerr at sap.com>> wrote:
>
> Hi Igor,
>
> we observed that the compiler threads fill up the code cache faster
> than the sweeper can clean when using a small code cache.
>
> This doesn't seem beneficial at all.
>
> Some customers try to save memory by using a very small code cache.
> It's very annoying that so much memory gets wasted for such a large
> number of idle compiler threads which hold their arenas etc.
>
> Maybe the current formula was optimized for a special scenario with
> many slow cores? Maybe SPARC Niagara?
>
> Shouldn't such scenarios use a large code cache? Maybe much more
> than 240MB?
>
> Best regards,
>
> Martin
>
> *From:*Igor Veresov [mailto:igor.veresov at oracle.com]
> *Sent:*Thursday, March 1, 2018 08:05
> *To:*Vladimir Kozlov <vladimir.kozlov at oracle.com
> <mailto:vladimir.kozlov at oracle.com>>
> *Cc:*Doerr, Martin <martin.doerr at sap.com
> <mailto:martin.doerr at sap.com>>;
> hotspot-compiler-dev at openjdk.java.net
> <mailto:hotspot-compiler-dev at openjdk.java.net>
> *Subject:*Re: RFR(S): 8198756: Limit number of compiler threads for
> small code cache
>
> I’m curious about the rationale for tying the number of threads to
> the size of the code cache. Is it because you don’t want them to
> keep holding the space for their code buffers when they are idle?
>
> igor
>
>
>
> On Feb 27, 2018, at 10:19 AM, Vladimir Kozlov
> <vladimir.kozlov at oracle.com <mailto:vladimir.kozlov at oracle.com>>
> wrote:
>
> Hi Doerr,
>
> The problem with your proposal is that we don't scale the number
> of compiler threads when we have a lot of CPUs (>1000 on big
> "slow" machines).
> By default, for tiered compilation we have 240MB for the CodeCache.
> With your formula we will always have 7 threads (2 C1 and 5 C2),
> which could be fine if the machine has < 32 procs/threads in total. But
> for big machines it may be a bottleneck for applications that are
> JIT compilation intensive (and for startup, when most JIT
> compilations happen).
>
> The main motivation of the current approach was to reach peak
> performance (C2 compilations) as fast as possible. What we
> usually observed before was a large compilation queue for C2
> because of the slow throughput of C2. It was especially
> visible with tiered compilation, where compilation thresholds
> are reached faster with first-tier compiled profiling code.
>
> And I agree that we may have a problem with the number of compiler
> threads at the beginning of the graph (< 32 CPU threads), where the
> number grows too fast:
>
> Graph for 3*log2(x)*log2(log2(x))/2
>
> [Plot omitted; marked point x: 32.0711217, y: 17.4325495]
>
>
>
> Maybe we should have a formula which takes into account both code
> cache size and the number of CPU threads.
>
> Igor Veresov was the original developer of the current formula. It would
> be nice to hear his opinion.
>
> Thanks,
> Vladimir
>
> On 2/27/18 8:10 AM, Doerr, Martin wrote:
>
> Hi,
>
> the VM currently starts a large number of compiler threads
> on systems with many CPUs regardless of the code cache size.
>
> This doesn't make sense for very small code cache sizes.
>
> The dynamically determined number of compiler threads can be
> observed by:
>
> jdk/bin/java -XX:ReservedCodeCacheSize=128m
> -XX:+PrintFlagsFinal -version|grep CICompiler
>
> I suggest using no more than 1 compiler thread per 32MB of
> code cache:
>
> http://cr.openjdk.java.net/~mdoerr/8198756_CompilerCount/webrev.00/
>
> This seems to be conservative.
>
> Please review and let me know if you have a different
> limitation proposal.
>
> Best regards,
>
> Martin
>
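For readers who want to compare the two sizing heuristics discussed in this thread, here is a minimal Java sketch. It evaluates the curve 3*log2(x)*log2(log2(x))/2 quoted above and applies the proposed limit of 1 compiler thread per 32MB of code cache. The class and method names are hypothetical. The rounding, the minimum thread counts, and the C1/C2 split (about one third C1) are assumptions chosen to match the "7 threads (2 C1 and 5 C2)" figure mentioned for a 240MB code cache. This is not the HotSpot code and not the code from the webrev.

```java
public class CompilerThreadSketch {
    // Continuous curve quoted in the thread: 3*log2(x)*log2(log2(x))/2.
    static double curve(double cpus) {
        double log = Math.log(cpus) / Math.log(2.0);
        double logLog = Math.log(log) / Math.log(2.0);
        return 3.0 * log * logLog / 2.0;
    }

    // Assumption: truncate the curve and never go below 2 threads.
    static int cpuBasedCount(int cpus) {
        if (cpus < 4) {
            return 2; // the curve is below 2 (or undefined) for tiny CPU counts
        }
        return Math.max((int) curve(cpus), 2);
    }

    // Proposed limit: at most 1 compiler thread per 32MB of code cache.
    // Assumption: integer division, and at least 1 thread.
    static int cappedCount(int cpus, int codeCacheMb) {
        int cap = Math.max(codeCacheMb / 32, 1);
        return Math.min(cpuBasedCount(cpus), cap);
    }

    // Assumed split: about one third C1, the rest C2, at least one of each.
    static int[] split(int count) {
        int c1 = Math.max(count / 3, 1);
        int c2 = Math.max(count - c1, 1);
        return new int[] {c1, c2};
    }

    public static void main(String[] args) {
        int[] cpuCounts = {4, 8, 32, 96, 224, 1024};
        int codeCacheMb = 240; // default for tiered compilation, per the thread
        for (int cpus : cpuCounts) {
            int uncapped = cpuBasedCount(cpus);
            int capped = cappedCount(cpus, codeCacheMb);
            int[] s = split(capped);
            System.out.println(cpus + " CPUs: uncapped=" + uncapped
                    + " capped=" + capped + " (C1=" + s[0] + ", C2=" + s[1] + ")");
        }
    }
}
```

The actual values for a given VM can be checked with the -XX:+PrintFlagsFinal command shown above. The sketch only illustrates why the cap dominates on machines with many CPUs and a default-sized code cache.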