long classpaths and JDK-8162492

Thu Dec 15 04:13:41 UTC 2016

Thanks for investigating! I went ahead and filed
https://bugs.openjdk.java.net/browse/JDK-8171294 with the repro.

And here's a prototype of eager scanning:
http://cr.openjdk.java.net/~cushon/8171294/webrev.00/.

On Wed, Dec 14, 2016 at 3:43 AM, Claes Redestad <claes.redestad at oracle.com>
wrote:

> Thanks for the report and reproducer.
>
> I took it for a spin and analyzed behavior for bit, and in additional
> to a massive amount of calls to Files.exist (which would be avoided
> by actually scanning eagerly), the current design also has a problem
> with rampant retained set growth (quadratic, it seems), which leads to
> GC thrashing (and would eventually have javac crash with an OOM on
> a long enough classpath).
>
> This is easily seen by running with -J-Xlog:gc
>
> [7.161s][info][gc] GC(31) Pause Young (G1 Evacuation Pause)
> 403M->132M(911M) 38.814ms
> [7.925s][info][gc] GC(32) Pause Young (G1 Evacuation Pause)
> 295M->137M(911M) 50.975ms
> ...
> [76.455s][info][gc] GC(80) Pause Young (G1 Evacuation Pause)
> 1026M->778M(6102M) 191.624ms
> [78.020s][info][gc] GC(81) Pause Young (G1 Evacuation Pause)
> 1057M->784M(6102M) 164.749ms
>
> real    11m11.944s
> user    98m15.140s
> sys     1m19.092s
>
> Not caching the failed lookup at all[1] leads to a much slower growth
> rate for the retained set and a quicker execution overall:
>
> [21.354s][info][gc] GC(39) Pause Young (G1 Evacuation Pause)
> 515M->88M(726M) 12.811ms
> [22.727s][info][gc] GC(40) Pause Young (G1 Evacuation Pause)
> 514M->88M(726M) 14.445ms
> ...
> [75.846s][info][gc] GC(79) Pause Young (G1 Evacuation Pause)
> 622M->108M(872M) 20.874ms
> [77.391s][info][gc] GC(80) Pause Young (G1 Evacuation Pause)
> 623M->109M(872M) 17.676ms
>
> real    7m3.745s
> user    8m34.560s
> sys     0m5.040s
>
> Not saying this[1] is anywhere near a good final solution, but helps to
> underscore that this might be more critical than a simple corner-case
> performance bug.
>
> Thanks!
>
> /Claes
>
> [1]
> diff -r 50135a630f35 src/jdk.compiler/share/classes
> /com/sun/tools/javac/file/JavacFileManager.java
> --- a/src/jdk.compiler/share/classes/com/sun/tools/javac/file/JavacFileManager.java
> Tue Dec 13 12:25:58 2016 -0800
> +++ b/src/jdk.compiler/share/classes/com/sun/tools/javac/file/JavacFileManager.java
> Wed Dec 14 12:39:52 2016 +0100
> @@ -586,7 +586,7 @@
>                  Path relativePath = path.resolveAgainst(fileSystem);
>
>                  if (!Files.exists(relativePath)) {
> -                    relativePath = null;
> +                    return null;
>                  }
>
>                  pathCache.put(path, relativePath);
>
>
>
> On 2016-12-13 05:18, Liam Miller-Cushon wrote:
>
>> I noticed a performance regression with long classpaths and large
>> numbers of sources. This is related to JDK-8162492, but the benchmarks
>> in that bug are for long classpaths with a single source file, and my
>> proposed fix doesn't address that.
>>
>> Under JDK 9 I'm seeing compilation time scale as the product of the
>> classpath length and the number of calls to getFileObject or list. The
>> example below takes 17s with JDK 8 and 8m40s with JDK 9.
>>
>> I think the issue is with the cache strategy in ArchiveContainer.
>> ArchiveContainer calls Files.exists for each path and caches the result,
>> but if it's queried for a large number of distinct paths it doesn't get
>> cache hits, and ends up doing a lot of work.
>>
>> I'd like to propose bringing back the cache strategy that was used in
>> ZipArchive:
>> http://hg.openjdk.java.net/jdk8/jdk8/langtools/file/1ff9d511
>> 8aae/src/share/classes/com/sun/tools/javac/file/ZipArchive.java#l73.
>> ZipArchive scans all entries in the zip once, and puts the directory
>> names in a cache. If getFileObject/list are called many times with paths
>> that aren't in the archive it doesn't have to do any work, and the cache
>> stays small.
>>
>> I have a couple of question about how to proceed:
>>
>> - does bringing back the ZipArchive-style cache sound reasonable, or are
>> there other advantages to the new approach used in ArchiveContainer?
>>
>> - should I add this to JDK-8162492, or does it deserve a separate bug?
>>
>>
>> Here's the repro -
>>
>> generate the test inputs (start in a fresh directory):
>>
>> $ javac -cp asm.jar Test.java && java -cp asm.jar:. Test
>>
>> compile:
>>
>> $ $JAVAC8 -fullversion
>> javac full version "1.8.0_122-ea-b04"
>> $ time $JAVAC8 @params.txt
>> real0m17.385s
>> user0m40.855s
>> sys0m1.305s
>>
>> $ $JAVAC9 -fullversion
>> javac full version "9-ea+148"
>> $ time $JAVAC9 @params.txt
>> real8m40.530s
>> user32m8.614s
>> sys0m57.024s
>>
>> === Test.java ===
>> import static java.nio.charset.StandardCharsets.UTF_8;
>> import static org.objectweb.asm.Opcodes.ACC_PUBLIC;
>> import static org.objectweb.asm.Opcodes.ACC_SUPER;
>>
>> import java.io.File;
>> import java.nio.file.Files;
>> import java.nio.file.Paths;
>> import java.util.ArrayList;
>> import java.util.List;
>> import java.util.jar.JarEntry;
>> import java.util.jar.JarOutputStream;
>> import org.objectweb.asm.ClassWriter;
>>
>> class Test {
>>
>>     static final int JARS = 1000;
>>     static final int CLASSES = 100;
>>     static final int SOURCES = 1000;
>>
>>     public static void main(String[] args) throws Exception {
>>         List<String> jars = new ArrayList<>();
>>         for (int i = 0; i < JARS; i++) {
>>             String jarName = "lib" + i + ".jar";
>>             jars.add(jarName);
>>             try (JarOutputStream jos =
>>                     new
>> JarOutputStream(Files.newOutputStream(Paths.get(jarName)))) {
>>                 for (int j = 0; j < CLASSES; j++) {
>>                     String name = String.format("b%d%d/B%d%d", i, j, i,
>> j);
>>                     jos.putNextEntry(new JarEntry(name + ".class"));
>>                     ClassWriter cw = new ClassWriter(0);
>>                     cw.visit(52, ACC_PUBLIC | ACC_SUPER, name, null,
>> "java/lang/Object", null);
>>                     jos.write(cw.toByteArray());
>>                 }
>>             }
>>         }
>>         List<String> sources = new ArrayList<>();
>>         for (int i = 0; i < SOURCES; i++) {
>>             StringBuilder sb = new StringBuilder();
>>             sb.append(String.format("package a%d;\n", i));
>>             sb.append(String.format("class A%d {\n", i));
>>             for (int j = 0; j < CLASSES; j++) {
>>                 String name = String.valueOf((i + j) % JARS) + j;
>>                 sb.append(String.format("b%s.B%s x%d;\n", name, name,
>> j));
>>             }
>>             sb.append(" }\n");
>>             String file = String.format("A%d.java", i, i);
>>             sources.add(file);
>>             Files.write(Paths.get(file), sb.toString().getBytes(UTF_8));
>>         }
>>         List<String> params = new ArrayList<>();
>>         params.addAll(sources);
>>         params.add("-cp");
>>         params.add(String.join(File.pathSeparator, jars));
>>         Files.write(Paths.get("params.txt"), params, UTF_8);
>>     }
>> }
>> ===
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/compiler-dev/attachments/20161214/05cb2855/attachment-0001.html>