JIT stops compiling after a while (java 8u45)

Tobias Hartmann tobias.hartmann at oracle.com
Wed Mar 2 08:51:17 UTC 2016


Hi Martin,

On 01.03.2016 20:18, Martin Traverso wrote:
>> For real-world applications I hope that this is a much smaller issue, but if you must load and execute loads and loads of short-lived classes then it might be reasonable to disable concurrent class unloading (at the cost of getting serial Full GCs instead).
> 
> Unfortunately, this is not a theoretical issue for us. We see this problem running Presto (http://prestodb.io), which generates bytecode for every query it processes. For now, we're working around it with a background thread that watches the size of the code cache and calls System.gc() when it gets close to the max (https://github.com/facebook/presto/commit/91e1b3bb6bbfffc62401025a24231cd388992d7c).
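
The workaround described above (poll the code cache size, call System.gc() near the limit) could be sketched roughly as follows. This is not the code from the linked Presto commit; the class name, the 80% threshold and the polling period are assumptions made for illustration. The pool names matched are "Code Cache" (JDK 8), "CodeCache" (JDK 9+ with -XX:-SegmentedCodeCache) and the "CodeHeap ..." pools of the segmented code cache.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the JDK or of Presto.
final class CodeCacheWatchdog {
    // Assumed threshold: request a GC once 80% of the code cache is in use.
    static final double THRESHOLD = 0.8;

    // Returns used/max summed over all code cache pools, or -1.0 if no pool reports a maximum.
    static double codeCacheUsedFraction() {
        long used = 0;
        long max = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            boolean isCodeCache = name.equals("Code Cache")
                    || name.equals("CodeCache")
                    || name.startsWith("CodeHeap");
            if (!isCodeCache) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null || usage.getMax() < 0) {
                continue; // pool invalid or maximum undefined
            }
            used += usage.getUsed();
            max += usage.getMax();
        }
        return max > 0 ? (double) used / max : -1.0;
    }

    static boolean shouldCollect(double fraction, double threshold) {
        return fraction >= threshold;
    }

    // Starts a daemon thread that polls the code cache and requests a GC when it is nearly full.
    static ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "code-cache-watchdog");
            t.setDaemon(true);
            return t;
        });
        executor.scheduleWithFixedDelay(() -> {
            if (shouldCollect(codeCacheUsedFraction(), THRESHOLD)) {
                System.gc(); // only a request; a no-op with -XX:+DisableExplicitGC
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return executor;
    }
}
```

Calling CodeCacheWatchdog.start(1) once at startup would poll every second. Note that with G1, System.gc() triggers a full collection unless -XX:+ExplicitGCInvokesConcurrent is set, so the threshold trades compilation stalls against GC pauses.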

Okay, I changed JDK-8023191 from enhancement to bug and set fix version to 9. We can then backport this to 8u.

Best regards,
Tobias

> 
> Martin
> 
> 
> 
> On Tue, Mar 1, 2016 at 5:17 AM, Mikael Gerdin <mikael.gerdin at oracle.com> wrote:
> 
>     Hi,
> 
>     On 2016-03-01 13:35, Tobias Hartmann wrote:
> 
>         Hi,
> 
>         I just had another look, and it turns out that even with 8u40+ class unloading is triggered. I missed that because it happens *much* later (compared to 8u33), when the code cache has already filled up and compilation is disabled. At this point we don't recover because new classes are loaded and new OSR nmethods are compiled rapidly.
> 
>         Summary:
>         The code cache fills up due to OSR nmethods that are not being flushed. With 8u33 and earlier, G1 did more aggressive class unloading (probably due to more allocations or different heuristics) and this allowed the sweeper to flush enough OSR nmethods to continue compilation. With 8u40 and later, class unloading happens long after the code cache is full.
> 
> 
>     Before 8u40, G1 could only unload classes at Full GCs.
>     From 8u40 onward, G1 can also unload classes at the end of a concurrent GC cycle, avoiding a Full GC.
> 
>     If you run the test with CMS with +CMSClassUnloadingEnabled you will probably see similar problematic results since the class unloading in G1 is very similar to the one in CMS.
>     I haven't investigated in depth why the classes do not get unloaded in the G1 and CMS cases, but there are several known quirks in how concurrent class unloading behaves which cause these collectors to unload classes later than a serial Full GC would.
> 
>     Running G1 with -XX:-ClassUnloadingWithConcurrentMark
>     or CMS with -XX:-CMSClassUnloadingEnabled
>     disables concurrent class unloading completely and works around the issue you are seeing.
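
To confirm from inside a running JVM whether the two flags named above are actually in effect, one option is to read them through the HotSpot diagnostic MXBean. This is a minimal sketch, assuming a HotSpot-based JVM; the class name and the fallback strings are made up for illustration. A flag that does not exist in the running JDK (for example the CMS flag on JDKs that no longer ship CMS) is reported as unknown rather than causing a failure.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper for inspecting class unloading flags at runtime.
final class ClassUnloadingFlags {
    // Returns the current value of a -XX flag, or a placeholder if it cannot be read.
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unavailable";
            }
            return bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            // Thrown when the running JVM has no option with this name.
            return "unknown to this JVM";
        }
    }

    public static void main(String[] args) {
        for (String flag : new String[] {"ClassUnloadingWithConcurrentMark", "CMSClassUnloadingEnabled"}) {
            System.out.println(flag + " = " + flagValue(flag));
        }
    }
}
```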
> 
>     For real-world applications I hope that this is a much smaller issue, but if you must load and execute loads and loads of short-lived classes then it might be reasonable to disable concurrent class unloading (at the cost of getting serial Full GCs instead).
> 
> 
> 
>         I think we should fix this by flushing "cold" OSR nmethods as well (JDK-8023191). Thomas Schatzl mentioned that we could also trigger a concurrent mark if the code cache is full and hope that some classes are unloaded but I'm afraid this is too invasive (and does not help much in the general case).
> 
> 
>     If it is possible to flush OSR nmethods without doing a full class unloading cycle then I think that path is preferable.
> 
>     /Mikael
> 
> 
> 
>         Opinions?
> 
>         Best regards,
>         Tobias
> 
>         On 01.03.2016 11:27, Tobias Hartmann wrote:
> 
>             Hi Nileema,
> 
>             thanks for reporting this issue!
> 
>             CC'ing the GC team because this seems to be a GC issue (see evaluation below).
> 
>             On 29.02.2016 23:59, nileema wrote:
> 
>                 We are seeing an issue with the CodeCache becoming full which causes the
>                 compiler to be disabled in jdk-8u45 to jdk-8u72.
> 
>                 We had seen a similar issue in Java7 (old issue:
>                 http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-August/011333.html).
>                 This issue went away with earlier versions of Java 8.
> 
> 
>             Reading the old conversation, I'm wondering if this could again be a problem with OSR nmethods that are not flushed? The bug (JDK-8023191) is still open - now assigned to me.
> 
>             Doing a quick experiment, it looks like we mostly compile OSR methods:
>                22129 2137 %     3       Runnable_412::run @ 4 (31 bytes)
>                22130 2189 %     4       Runnable_371::run @ 4 (31 bytes)
>                22134 2129 %     3       Runnable_376::run @ 4 (31 bytes)
>                22136 2109 %     3       Runnable_410::run @ 4 (31 bytes)
> 
>             Currently, OSR nmethods are not flushed just because the code cache is full, but only if the nmethod becomes invalid (class loading/unloading, uncommon trap, ...).
> 
>             With your test, class unloading should happen and therefore the OSR nmethods *should* be flushed.
> 
>                 We used the test http://github.com/martint/jittest to compare the behavior
>                 of jdk-8u25 and jdk-8u45. For this test, we did not see any CodeCache full
>                 messages with jdk-8u25 but did see them with 8u45+ (8u60 and 8u74).
>                 Test results comparing 8u25, 8u45 and 8u74:
>                 https://gist.github.com/nileema/6fb667a215e95919242f
> 
>                 In the results you can see that 8u25 starts collecting the code cache much
>                 sooner than 8u45. 8u45 very quickly hits the limit for code cache. If we
>                 force a full gc when it is about to hit the code cache limit, we see the
>                 code cache size go down.
> 
> 
>             You can use the following flags to get additional information:
>             -XX:CICompilerCount=1 -XX:+PrintCompilation -XX:+PrintMethodFlushing -XX:+TraceClassUnloading
> 
>             I did some more experiments with 8u45:
> 
>             java -mx20g -ms20g -XX:ReservedCodeCacheSize=20m -XX:+TraceClassUnloading -XX:+UseG1GC -jar jittest-1.0-SNAPSHOT-standalone.jar | grep "Unloading"
>             -> We do *not* unload any classes. The code cache fills up with OSR nmethods that are not flushed.
> 
>             Removing the -XX:+UseG1GC flag solves the issue:
> 
>             java -mx20g -ms20g -XX:ReservedCodeCacheSize=20m -XX:+TraceClassUnloading -jar jittest-1.0-SNAPSHOT-standalone.jar | grep Unloading
>             -> Prints plenty of [Unloading class Runnable_40 0x00000007c0086028] messages and the code cache does not fill up.
>             -> OSR nmethods are flushed because the classes are unloaded:
>                 21670  970 %     4       Runnable_87::run @ -2 (31 bytes)   made zombie
> 
>             The log files look good:
> 
>             1456825330672   112939  10950016        10195496        111.28
>             1456825331675   118563  11432256        10467176        112.41
>             1456825332678   125935  11972928        10778432        115.72
>             [Unloading class Runnable_2498 0x00000007c0566028]
>             ...
>             [Unloading class Runnable_34 0x00000007c0082028]
>             1456825333684   131493  10220608        5382976         117.46
>             1456825334688   137408  10359296        5636120         116.81
>             1456825335692   143593  7635136         5914624         114.21
> 
>             After the code cache fills up, we unload classes and therefore flush methods and start over again.
> 
>             I checked several releases to see whether classes are unloaded:
>             - 8u27: success
>             - 8u33: success
>             - 8u40: fail
>             - 8u45: fail
>             - 8u76: fail
> 
>             The regression was introduced in 8u40.
> 
>             I also tried with the latest JDK 9 build and it fails as well (had to change the bean name from "Code Cache" to "CodeCache" and run with -XX:-SegmentedCodeCache). Again, -XX:-UseG1GC -XX:+UseParallelGC solves the problem.
> 
>             Can someone from the GC team have a look?
> 
>                 Is this a known issue?
> 
> 
>             I'm not aware of any related issue.
> 
>             Best regards,
>             Tobias
> 
>                 Thanks!
> 
>                 Nileema
> 
> 
> 
> 
> 


