RFR 8220613: [TEST] java/util/Arrays/TimSortStackSize2.java times out with fastdebug build

Aleksey Shipilev shade at redhat.com
Mon Mar 18 17:12:52 UTC 2019


On 3/14/19 4:23 PM, Roger Riggs wrote:
> I didn't have a good idea where to look, the times do seem excessive.
> Suggestions?

You do it with profilers. Since the bottleneck with fastdebug is probably in the JVM code itself, you need a
native profiler. On Linux, you do this:

 $ CONF=linux-x86_64-server-fastdebug perf record -g \
       make images run-test TEST=java/util/Arrays/TimSortStackSize2.java

...and then open "perf report" and meditate. Sometimes it is easier to produce the high-level
flamegraph, for instance with https://github.com/KDAB/hotspot:
 http://cr.openjdk.java.net/~shade/8220613/perf-fastdebug.png
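
If the hotspot GUI is not at hand, Brendan Gregg's FlameGraph scripts give roughly the same kind of
picture; assuming a checkout at ~/FlameGraph and the perf.data left behind by the run above, something
like this should do:

 $ perf script > perf.out
 $ ~/FlameGraph/stackcollapse-perf.pl perf.out | ~/FlameGraph/flamegraph.pl > flamegraph.svg

Either way you end up with a picture like the one linked above.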

What can you see here? G1 ConcurrentRefineThread spends a lot of time verifying stuff, as it would
in fastdebug builds. This is one of the major contributors to this difference:

release timing:
  real	0m12.485s
  user	0m40.930s
  sys	0m3.840s

fastdebug timing:
  real	0m32.030s
  user	1m58.519s
  sys	0m5.172s

So, there is a 3-4x difference. It is way off what was stated in the original problem:
  Release images build: 4 seconds
  Fastdebug images build: 2.5 minutes
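
The real/user/sys numbers above are plain `time` output; to redo the comparison yourself over the whole
make invocation, something along these lines should work (the release CONF name is assumed to follow
the usual linux-x86_64-server-release pattern):

 $ time CONF=linux-x86_64-server-release \
       make images run-test TEST=java/util/Arrays/TimSortStackSize2.java
 $ time CONF=linux-x86_64-server-fastdebug \
       make images run-test TEST=java/util/Arrays/TimSortStackSize2.java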

Anyway, if you apply this:

diff -r 98e21d4da074 test/jdk/java/util/Arrays/TimSortStackSize2.java
--- a/test/jdk/java/util/Arrays/TimSortStackSize2.java  Mon Mar 18 15:21:33 2019 +0100
+++ b/test/jdk/java/util/Arrays/TimSortStackSize2.java  Mon Mar 18 17:52:09 2019 +0100
@@ -71,4 +71,5 @@
             OutputAnalyzer output = ProcessTools.executeTestJava(xmsValue,
                                                                  xmxValue,
+                                                                 "-XX:+UseParallelGC",
                                                                  "TimSortStackSize2",
                                                                  "67108864");

Then timings become:

release:
  real	0m16.004s
  user	0m41.382s
  sys	0m4.660s


fastdebug:
  real	0m17.292s
  user	1m8.225s
  sys	0m4.068s

Repeating the profiling step then shows that C2 becomes the hot part. Falling back to C1 would not help
the fastdebug timing, though: whatever we save on faster compilation is lost again running the less
optimized code it generates.
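
If you want to double-check that, the C1-only experiment does not even need a test patch. Assuming the
JTREG="VM_OPTIONS=..." keyword of the run-test framework, and given that executeTestJava is supposed to
prepend test.vm.opts/test.java.opts to the forked JVM, something like this should be enough:

 $ CONF=linux-x86_64-server-fastdebug make images run-test \
       TEST=java/util/Arrays/TimSortStackSize2.java \
       JTREG="VM_OPTIONS=-XX:TieredStopAtLevel=1"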


-Aleksey


