Speeding up jmod

Chris Hegarty chris.hegarty at oracle.com
Sun May 1 15:56:40 UTC 2016


> On 1 May 2016, at 12:53, Alan Bateman <Alan.Bateman at oracle.com> wrote:
> 
> On 01/05/2016 03:02, Claes Redestad wrote:
>> Hi,
>> 
>> Alan asked me to take a look at jmod performance (also jlink, but saving that for another day), so I set
>> up a naive benchmark[1] and started profiling.
>> 
>> ... and saw nothing really suspicious except that time is split between doing I/O and executing native code in
>> libz.so, which I guess isn't surprising. Oddly enough the only java methods that even show up in
>> profiles are related to writing, so I figured taking a closer look at the code for writing output from jmod
>> wouldn't hurt. Turns out I was wrong, since I soon found that the output stream used by JmodTask is
>> unbuffered...
>> 
>> Applied a trivial patch[2] and results of running the micro with -f 10 -i 1 -bm ss (which is more or less like
>> running jmod standalone):
>> 
>> Benchmark                   Mode  Cnt  Score   Error  Units
>> JmodBenchmark.jmodJavaBase    ss   10  1.966 ± 0.297   s/op # before
>> JmodBenchmark.jmodJavaBase    ss   10  1.196 ± 0.142   s/op # after
>> 
>> Seems like a notable reduction right there. Timing runs of jmod standalone gives analogous results on
>> real time, but user time is still almost as high.
>> 
>> Poking around further, it's obvious that JIT threads are eating a larger portion of my cycles now - likely C2 is
>> ramping up but not having time to get much done in the short life-time of jmod, which is mostly spent in
>> native code anyhow. Switching to running short-running apps with only C1 can be profitable, especially on
>> machines with a lot of cores (like the 2x8x2 machine I'm running this on), so I ran the numbers:
>> 
>> Again, with time:
>> 
>> Benchmark                   Mode  Cnt  Score   Error  Units
>> JmodBenchmark.jmodJavaBase    ss   10  1.175 ± 0.147   s/op
>> 
>> real    0m17.140s
>> user    0m54.868s
>> sys    0m4.172s
>> 
>> -XX:TieredStopAtLevel=1
>> 
>> Benchmark                   Mode  Cnt  Score   Error  Units
>> JmodBenchmark.jmodJavaBase    ss   10  1.075 ± 0.194   s/op
>> 
>> real    0m14.810s
>> user    0m15.556s
>> sys    0m1.584s
>> 
>> Yep, running with only C1 improves things a lot in this case and in my environment.
>> 
>> I suggest accepting the patch[2] as well as switching the jmod runner to run with -XX:TieredStopAtLevel=1
>> or similar. Both are likely needed for most to see any effect on build times.
>> 
>> A long term alternative to consider might be to implement a server-based jmod akin to the javac server.
> 
> Thanks Claes, this is good analysis!

Yes, this is great work. Thanks Claes.

> The create method should be using a BufferedOutputStream,

This was an oversight in the original implementation. The output
should be buffered.
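
To illustrate why the missing buffer matters, here is a minimal sketch. It is
not the JmodTask code: BufferingDemo, CountingSink and countSinkCalls are
hypothetical names made up for this example. It counts how many write calls
reach the underlying stream when many small writes are issued, with and
without a BufferedOutputStream in between.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class BufferingDemo {

    /** Hypothetical sink that only counts the write calls that reach it. */
    static final class CountingSink extends OutputStream {
        int calls;

        @Override
        public void write(int b) {
            calls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
        }
    }

    /** Issues smallWrites 16-byte writes and returns the number of calls seen by the sink. */
    static int countSinkCalls(boolean buffered, int smallWrites) throws IOException {
        CountingSink sink = new CountingSink();
        try (OutputStream out = buffered ? new BufferedOutputStream(sink, 8192) : sink) {
            byte[] chunk = new byte[16];
            for (int i = 0; i < smallWrites; i++) {
                out.write(chunk, 0, chunk.length);
            }
        } // closing a BufferedOutputStream flushes whatever is still buffered
        return sink.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("unbuffered sink calls: " + countSinkCalls(false, 1000));
        System.out.println("buffered sink calls:   " + countSinkCalls(true, 1000));
    }
}
```

With a real file or zip stream underneath, each call that reaches the sink
can turn into a system call or a trip into native code, so collapsing many
small writes into a few large ones is where the saving comes from.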

> I'm surprised that it isn't. I'll get that patch in the current refresh, although it looks like this helps more with the benchmark than with the build.
> 
> I changed make/CreateJmods.java to use -XX:TieredStopAtLevel=1 and it makes a big difference in the build. The wall clock time to create the jmods on my local machine drops from 46s to 22s. I also tried a remote Windows machine and the time to create the jmods also dropped by about 20s.

Wow, this is a real win. Good find.
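
For anyone who wants to try the C1-only setting on another short-running
tool, here is a rough sketch of launching a child JVM with the flag from
Java. It is not how the build wires it up: C1OnlyLauncher and c1OnlyCommand
are hypothetical names, and "-version" is just a stand-in for the real tool
arguments.

```java
import java.io.IOException;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class C1OnlyLauncher {

    /** Builds a command line for a child JVM that stops tiered compilation at C1. */
    static List<String> c1OnlyCommand(String... javaArgs) {
        // Assumption: reuse the java launcher of the JDK running this code.
        String javaBin = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-XX:TieredStopAtLevel=1"); // C1 only, C2 never kicks in
        cmd.addAll(Arrays.asList(javaArgs));
        return cmd;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // "-version" is a placeholder; substitute the tool's main class and arguments.
        Process p = new ProcessBuilder(c1OnlyCommand("-version")).inheritIO().start();
        System.exit(p.waitFor());
    }
}
```

Running the same command under time(1) with and without the flag gives the
real/user comparison Claes shows above.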

> I'm sure Erik will have advice on how to fit this in. As things stand, the VM options for the jmod command are configured in spec.gmk.in to use $(JAVA_TOOL_FLAGS_SMALL). Maybe it's time to change JAVA_TOOL_FLAGS_SMALL, as it seems to be -XX:+UseSerialGC and some heap settings at this time.

I would expect that a number of other tools could benefit from this
too.

-Chris.


