Multi-threaded javac

David Schlosnagle schlosna at
Wed Jul 14 21:04:44 PDT 2010


Thanks for the javac updates. I'm not planning any major changes right
now. I need to get a better feel for the javac code and where we're
experiencing bottlenecks, so I'm going to spend some time in the
profiler. I'll also take a look at creating a buffered diagnostic log.
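
As a rough illustration of the buffered diagnostic log idea, here is a
minimal sketch. It assumes the public javax.tools API (DiagnosticListener)
rather than javac's internal AbstractLog, and the class name
BufferedDiagnostics is hypothetical, not something that exists in javac.
Each file being parsed gets its own buffer, and the buffer is flushed to the
shared listener as one group once parsing of that file is complete.

```java
import java.util.ArrayList;
import java.util.List;
import javax.tools.Diagnostic;
import javax.tools.DiagnosticListener;

/**
 * Hypothetical helper (not part of javac): holds the diagnostics reported
 * for one source file and forwards them to a shared listener as a group.
 * Assumes each instance is only used by the single thread parsing that file.
 */
final class BufferedDiagnostics<S> implements DiagnosticListener<S> {
    private final List<Diagnostic<? extends S>> buffer =
            new ArrayList<Diagnostic<? extends S>>();
    private final DiagnosticListener<? super S> sink;

    BufferedDiagnostics(DiagnosticListener<? super S> sink) {
        this.sink = sink;
    }

    /** Called while the file is being parsed: record, do not report yet. */
    @Override
    public void report(Diagnostic<? extends S> diagnostic) {
        buffer.add(diagnostic);
    }

    /**
     * Called once parsing of the file is complete. Locking on the shared
     * sink keeps this file's diagnostics together, so they are not
     * interleaved with those flushed by other threads.
     */
    void flush() {
        synchronized (sink) {
            for (Diagnostic<? extends S> d : buffer) {
                sink.report(d);
            }
        }
        buffer.clear();
    }
}
```

A worker thread would create one BufferedDiagnostics per file, hand it to
whatever does the parsing, and call flush() when that file is done. How this
would be wired into javac's own Log is left open here.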

- Dave

On Monday, July 12, 2010, Jonathan Gibbons <jonathan.gibbons at> wrote:
> David,
> Most of what I wrote in my note on "Towards a multi-threaded javac" remains valid, and much of the work that I did to support that investigation has indeed made its way into javac. Further work became stalled when it seemed likely that the gains to be had by more work would not be as good as initially expected.   The initial estimates were based on a different but related compiler which used ANTLR for the parser. That compiler had more to gain by performing the parsing in parallel.  javac has its own parser, and the parser is a smaller fraction of the overall compilation time, so the benefits of parallelizing the parsing would be correspondingly less.
> You don't say what sort of patches you'd like to attempt. I think any substantial changes to the core of the compiler would be viewed with some amount of scepticism. I think changes to leverage the work we've done so far to parallelize parsing and class file writing would be met with some amount of interest. If I were to return to this work, the next step I would take would be to update the parser to optionally buffer error messages, using a new impl of AbstractLog that can buffer diagnostics until parsing the file is complete. The buffered diagnostics can then be reported as a group, and not interleaved with any diagnostics that might have occurred in other files being read at the same time.
> -- Jon
> On 07/06/2010 01:41 AM, David Schlosnagle wrote:
> Hello friendly compiler-devs!
> One of the projects I work on has a large number of Java files (around
> 25,000 total), and I've been thinking of ways to decrease the overall
> compilation time, especially in the case of clean builds. We have
> recently undergone an effort to modularize a previously monolithic code
> base into smaller, more maintainable modules, very similar to the
> modularization work done for Jigsaw. This modularization does allow us
> to build a proper module dependency graph and build some modules in
> parallel; however, we still have several large modules (in the 2,000
> to 7,000 compilation unit range), a couple of which are a serial
> bottleneck in the overall project compilation. Until we can refactor
> these into smaller modules, we're stuck with long compilations that
> don't fully utilize available resources. Due to the dependency
> structure, several of these modules can only be compiled one at a
> time. I'm looking for ways to improve the throughput of javac,
> especially on multi-core machines, possibly using finer grained
> parallelism than what is currently offered by JSR-199 and
> I've read through Jonathan Gibbons' blog post "Towards a
> multi-threaded javac" [1] and related enhancement requests [2] [3]
> [4], and I was wondering if there is any additional work along these
> lines targeted for JDK7. I saw some changes in the repository already
> that seem to be related such as the changes to
> (6724071: refactor Log into a front end
> and back end [5] [6] and 6720185: DiagnosticFormatter refactoring [7]
> [8]) and (6724551: Use Queues
> instead of Lists to link compiler phases [9]); however, I don't see
> the changes mentioned related to parallel parsing in the current JDK7
> langtools repository or further work on making the class generation
> asynchronous.
> I've also read through the OpenJDK Compilation Overview [10] and gone
> through a fair amount of code, and I'd like to
> take a shot at trying to parallelize some of the process, but before I
> spend too much time on this, is this something reasonable to work on
> and submit a patch for review and possible inclusion? Are there any
> known issues blocking progress on these changes to javac? Does anyone
> have any advice or warnings before I start hacking on a patch? I'm
> aware that many developers may not want to incur the CPU and memory
> overhead this would entail, so I'd assume this functionality would not
> be enabled by default. I was thinking an option similar to `make -j`
> would be a reasonable starting point for defining the max number of
> concurrent tasks or desired parallelism level. I'd love to hear
> others' thoughts on the subject.
> [1]:
> [2]:
> [3]:
> [4]:
> [5]:
> [6]:
> [7]:
> [8]:
> [9]:
> [10]:
> Thanks,
> Dave
