Proposed javac argument processing performance improvement
Jonathan Gibbons
jonathan.gibbons at oracle.com
Tue Jun 14 16:34:14 PDT 2011
David,
Looking at your patch, you take a hit at the end by converting the Set
to a List by way of an array. At a minimum, it would be worth
overloading List.from to accept an Iterable<A> -- that would allow you
to eliminate the intermediate array. For bonus points, you could change
the return type of processArgs to Iterable<File> and avoid the copy
altogether. However, in that case, you would need to fix up a couple of
isEmpty() calls in Main. The primary use of the result is in
Main.java:412, which already accepts Iterable<File>. There are no other
uses that I can see within the main javac source code. I found 3 uses in
the regression tests that would be unaffected.
I believe there are a couple of regression tests that need to be fixed
up if you change the type of JavaCompiler.filenames.
-- Jon
On 06/13/2011 04:06 PM, David Schlosnagle wrote:
> On Mon, Jun 13, 2011 at 2:57 PM, Dan Smith<daniel.smith at oracle.com> wrote:
>> Thank you very much for this useful analysis and patch!
>>
>> I'll work on getting this problem addressed. The current target for new patches is JDK 8; I'll also create a request to backport the fix to a JDK 7 update.
> Excellent, if there is anything I can do to help get this back ported
> into JDK 7 and ideally 6, let me know.
>
>> Out of curiosity, what sort of an impact does this have on total compilation time? Given the n^2 growth, does processArgs start to dominate compilation time as n gets larger, or does it remain a fixed (probably small?) proportion of the total compilation time?
> After profiling the current implementation on Windows, processArgs
> does end up taking just over 51% overall javac execution time for
> large numbers of input files with around 4000 input files. If you're
> interested in the details, I attached output from a YourKit tracing
> profile session for javac with 4411 files as input on Windows. (Note
> that the execution times are extremely long due to the tracing
> overhead for including java.* and com.sun.* traces.)
>
> For 4411 input files on Windows,
> com.sun.tools.javac.util.List.contains, java.io.File.equals,
> java.io.File.compareTo, java.io.Win32FileSystem.compare,
> java.lang.String.compareToIgnoreCase, and
> java.lang.String$CaseInsensitiveComparator.compare are all executed
> (N+1)*(N/2) = 9,730,666 times each. On non-Windows filesystems, the
> implementation of java.io.File.compareTo just uses String.compareTo,
> so it is not as expensive as the case insensitive comparison of
> Windows.
>
> With the proposed patch, there will only be O(N) invocations of
> java.io.File.hashCode (and java.io.File.equals on collision).
> Unfortunately I don't have an OpenJDK Windows build right now, but I
> gathered some stats from a single pass on my Mac OS X OpenJDK 7 build
> compiling essentially a empty classes "public class TestN {}".
>
> Before:
> =======
> # files processArgs javac real javac user javac sys
> 1 4.987ms 0m2.576s 0m0.725s 0m0.136s
> 1000 40.397ms 0m2.032s 0m3.470s 0m0.368s
> 2000 86.796ms 0m3.551s 0m5.275s 0m0.625s
> 3000 219.402ms 0m6.076s 0m8.510s 0m1.121s
> 4000 217.765ms 0m7.272s 0m11.209s 0m1.219s
> 5000 320.309ms 0m10.277s 0m16.011s 0m1.910s
> 6000 425.571ms 0m11.712s 0m18.127s 0m2.181s
> 7000 503.229ms 0m12.089s 0m18.869s 0m1.960s
> 8000 618.292ms 0m15.680s 0m21.288s 0m2.109s
> 9000 783.368ms 0m16.208s 0m22.597s 0m2.289s
> 10000 926.978ms 0m18.804s 0m25.991s 0m2.382s
>
> After:
> ======
> # files processArgs javac real javac user javac sys
> 1 2.062ms 0m0.869s 0m0.887s 0m0.121s
> 1000 32.369ms 0m4.702s 0m3.729s 0m0.415s
> 2000 63.746ms 0m6.610s 0m5.914s 0m0.768s
> 3000 67.950ms 0m5.936s 0m8.116s 0m1.050s
> 4000 76.996ms 0m7.306s 0m11.667s 0m1.293s
> 5000 93.124ms 0m9.083s 0m15.092s 0m1.812s
> 6000 111.525ms 0m10.497s 0m17.183s 0m2.038s
> 7000 136.811ms 0m12.383s 0m19.278s 0m2.214s
> 8000 136.578ms 0m12.343s 0m18.999s 0m1.902s
> 9000 231.338ms 0m15.045s 0m21.841s 0m2.199s
> 10000 156.525ms 0m17.555s 0m25.835s 0m2.393s
>
> - Dave
More information about the compiler-dev
mailing list