Java 8 Performance Issue With Tiered Compilation Disabled

James Lent jlent at tivo.com
Mon Jun 6 12:52:06 UTC 2016


Thanks for your interest.  Answer in-line.


On 06/06/2016 04:00 AM, Tobias Hartmann wrote:

Hi James,

thanks for the report, please find some questions below.

On 02.06.2016 17:56, James Lent wrote:


We are running a Solr/Lucene based server on Java 8:

openjdk version "1.8.0_91"
OpenJDK Runtime Environment (build 1.8.0_91-b14)
OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode)

Recently we tried to run with Tiered Compilation disabled:

  -XX:-TieredCompilation

In production the system performed very well for about 70 hours and then
performance start to degrade very sharply.  HotSpot logs indicated that
100s of methods had reached their recompile limit.  A sample of the logs
is below:

made not compilable on levels 0 1 2 3 4 java.lang.String::equals (81
bytes) decompile_count > PerMethodRecompilationCutoff
made not compilable on levels 0 1 2 3 4
org.apache.lucene.util.OpenBitSetIterator::nextDoc (136 bytes)
decompile_count > PerMethodRecompilationCutoff
made not compilable on levels 0 1 2 3 4
org.apache.solr.search.SolrConstantScoreQuery$ConstantScorer::score (5
bytes) decompile_count > PerMethodRecompilationCutoff
made not compilable on levels 0 1 2 3 4
org.roaringbitmap.Util::fillArrayAND (92 bytes) decompile_count >
PerMethodRecompilationCutoff

The methods that hit this limit include Java, Lucene, Solr, and
RoaringBitMap methods.

I was able to reproduce this in a dev environment by just running very
light real world traffic.  What I observed was that many methods were
being recompiled 3 or more times per hour.

I was able to "fix" this issue by one of two methods:

1) Increase max code cache size:

  -XX:ReservedCodeCacheSize=240m

2) Disable code cache flushing:

  -XX:-UseCodeCacheFlushing

After running for over 12 hours the maximum number of decompiles for a
method is 14 and that method hit 11 decompiles very quickly and then
leveled off (with the first fix it is still at 11).  Only 7 methods have
8 or more decompiles and this has held steady for hours now).

I also tried the following based on a suggestion from Chris Newland,
but, it did not help:

  -XX:-UseBimorphicInlining

My theory is that the Code Cache flushing algorithm coupled with the
Code Cache demands of my application are causing the HotSpot compiler to
"thrash".  Keep cycling the methods through the Code Cache:

1) Compile method A
2) Compile method B
3) Kick out method A and compile method C
4) Kick out method B and compile method A

Observations:

1) When Tiered Compilation is disabled the Code Cache max drops from
240mb to 48mb.



Did the problem also show up with Tiered Compilation?

This problem did not.  Of course with tiered compilation enabled the code cache max seems to default to 240mb.  I did not try running tiered with a smaller max code cache size.  I did not that tiered seems to use a lot more code cache.  The same application exceeds 70mb when it is enabled.

However with tiered compilations enabled we were seeing a hard to understand drop in performance after about 4.5 hours in both production (2 second response times suddenly jump to 3 to 5 seconds).  I was able to reproduce this issue in a dev environment with surprisingly little trouble.  I investigated this issue from many angles to try to understand what was going on (heap, linux file cache, background network traffic, processor load, EC2 variability).  The one thing that did seem to work in dev was turning off Tiered compilation (a desperation idea).  I have not been able to identify any interesting compiler logs that correlate with that issue.  When we tried it in production it worked very well for 70 hours and then this issue occurred which was actually much worse.  I was not a hero.

I sent an email on this first issue to the hotspot-dev mailing list (before the "turn off tiered compilation fix" failed in production), but, was not expecting much help because I could not provide much data to work with.  Just a description of the symptoms.

2) When not constrained (i.e. when one of my two "fixes" is applied) my
application under this traffic load levels out at just below 22mb in the
Code Cache.
3) When neither of my fixes is applied then the Code Cache usage peaks
about 18mb, but, is highly variable minute to minute.



You could try to use a lower value of -XX:NmethodSweepActivity to avoid aggressive sweeping when the code cache is getting full. But I doubt that this helps because there is still enough space in the code cache.

I will try to give it a try as time permits, but, I have torn down my dev environment for now.  I was able to reproduce this issue right on my dev box, but doing so consumes most of its resources.  I may try to set it up before I leave tonight.





We are planning to implement fix one above in production soon.  Any
comments?



I think increasing the code cache size does not hurt and prevents aggressive sweeping in this case because a larger code cache size leads to a larger per method hotness threshold for flushing.

Did you try to reproduce the problem with a JDK 9 early access build? The symptoms you are describing sound quite similar to an issue we encountered with JDK 9. This was fixed by code aging [1] but the fix was never backported to JDK 8. You can enable/disable code aging with -XX:UseCodeAging in 9. If the problem turns out to be fixed, we could provide a prototype backport of code aging for testing in your environment.

Thanks for the info.  I have not tried with Java 9.  I will put that on the list to try.




Best regards,
Tobias

[1] http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/975d903e1de3



I can provide HotSpot logs.  Chris has provided me the required options
to get the more detailed information.

________________________________

This email and any attachments may contain confidential and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments) by others is prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete this email and any attachments. No employee or agent of TiVo Inc. is authorized to conclude any binding agreement on behalf of TiVo Inc. by email. Binding agreements with TiVo Inc. may only be made by a signed written agreement.


James Lent
Senior Software Engineer

Digitalsmiths
A TiVo Company

www.digitalsmiths.com<http://www.digitalsmiths.com/>
jlent at digitalsmiths.com<mailto:jlent at digitalsmiths.com>  | office 919.460.4747

________________________________

This email and any attachments may contain confidential and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments) by others is prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete this email and any attachments. No employee or agent of TiVo Inc. is authorized to conclude any binding agreement on behalf of TiVo Inc. by email. Binding agreements with TiVo Inc. may only be made by a signed written agreement.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20160606/226add76/attachment-0001.html>


More information about the hotspot-compiler-dev mailing list