javac lexer parser rewrite
Jonathan Gibbons
jonathan.gibbons at oracle.com
Tue Feb 7 07:15:02 PST 2012
For javac, just write a small program to call the (internal) javac
classes directly.
-- Jon
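
A minimal sketch of such a timing program, under stated assumptions. It does not call the internal javac classes mentioned above. It uses the supported javax.tools and com.sun.source.util.JavacTask API instead, so it times lexing and parsing together, not pure lexing. Isolating the lexer would mean driving the internal scanner classes, which is not shown here. The names ParseTimer, StringSource, Result and timeParse are hypothetical, chosen for illustration only.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.util.JavacTask;

import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.net.URI;
import java.util.Collections;
import java.util.List;

public class ParseTimer {

    /** In-memory source file, so the sketch needs no files on disk (hypothetical helper). */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/') + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Outcome of one timed run: compilation units parsed and elapsed wall-clock nanoseconds. */
    static final class Result {
        final int units;
        final long nanos;

        Result(int units, long nanos) {
            this.units = units;
            this.nanos = nanos;
        }
    }

    /** Parses (lexes + parses, no attribution or code generation) the given sources and times it. */
    static Result timeParse(List<? extends JavaFileObject> sources) throws IOException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler available (running on a JRE?)");
        }
        // Assumption: the system compiler is javac, so the task can be cast to JavacTask.
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null, sources);

        long start = System.nanoTime();
        int units = 0;
        for (CompilationUnitTree unit : task.parse()) {
            units++;
        }
        long elapsed = System.nanoTime() - start;
        return new Result(units, elapsed);
    }

    public static void main(String[] args) throws IOException {
        StringSource src = new StringSource("Hello", "class Hello { int x = 1 + 2; }");
        Result r = timeParse(Collections.singletonList(src));
        System.out.println("units parsed: " + r.units + ", elapsed ns: " + r.nanos);
    }
}
```

For a measurement comparable to the numbers quoted below, the in-memory source would be replaced by the real source files, and the run repeated several times so that JVM warm-up does not dominate the result.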
On 02/07/2012 03:29 AM, leszekp at safe-mail.net wrote:
> Hello
> I would gladly experiment with lexing/parsing to see whether machine-generated code is of comparable speed.
> I wonder how you measure pure lexing time. Do you have some 'special' wrapper around javac to do it?
>
> Leszek
>
> -------- Original Message --------
> From: Maurizio Cimadamore <maurizio.cimadamore at oracle.com>
> To: leszekp at safe-mail.net
> Cc: compiler-dev at openjdk.java.net
> Subject: Re: javac lexer parser rewrite
> Date: Tue, 07 Feb 2012 10:27:10 +0000
>
>> Hi
>> let me start by saying that I agree with you - the current parser/lexer
>> architecture is messy and it represents a barrier for other people to
>> chime in and start to contribute. However, when I was working on a
>> parser improvement related to lambda expressions (I added lookahead
>> support), I was surprised to see how fast the javac lexer/parser actually
>> are. Here are some 'unofficial' numbers taken on my machine (each run
>> corresponds to lexing the 'jdk/src' folder of the JDK 8 repo):
>>
>> Run1: 0m6.501s
>> Run2: 0m6.205s
>> Run3: 0m6.936s
>>
>> AVG: 6.547s
>> TOTAL FILES: 7846
>> AVG TIME/FILE: 0.83 * 10^-3 s
>>
>> So, is it messy? Sure - is it fast? Yes, like hell. So, to summarise, I
>> think that any effort to try to improve our parser/lexer architecture is
>> definitely welcome - however, anyone embarking on such a project
>> should keep the above numbers in mind - if you can achieve the same
>> speed (well, even marginally slower would be acceptable) then it'd be an
>> option well worth considering.
>>
>> Maurizio
>>
>> On 07/02/12 09:57, leszekp at safe-mail.net wrote:
>>> Hello
>>>
>>> The javac scanner and parser are currently handwritten. The code, especially in the parser, is quite messy and
>>> hard to read and modify.
>>> It is possible to rewrite the lexer and parser using some kind of Java parser generator.
>>> It would improve readability and allow for easier modifications.
>>>
>>> There is a 'compiler grammar' project (which seems dormant), in which the Java lexer and parser were rewritten
>>> using ANTLR. But ANTLR-generated parsers are very slow.
>>>
>>> Many lexer and parser generators exist that can process 'classic' regular expressions for the lexer or
>>> context-free grammars for the parser and produce fast code (e.g. JFlex, Beaver, the Jikes Parser Generator and more).
>>>
>>> What do you think about it? Is there a need for such a thing? Is it worth the effort?
>>>
>>> Regards
>>> Leszek Piotrowicz