javac lexer parser rewrite
leszekp at Safe-mail.net
Tue Feb 7 03:29:58 PST 2012
Hello
I would gladly experiment with lexing/parsing to see whether machine-generated code is of comparable speed.
I wonder how you measure pure lexing time. Do you have some 'special' wrapper around javac to do it?
Leszek
-------- Original Message --------
From: Maurizio Cimadamore <maurizio.cimadamore at oracle.com>
To: leszekp at safe-mail.net
Cc: compiler-dev at openjdk.java.net
Subject: Re: javac lexer parser rewrite
Date: Tue, 07 Feb 2012 10:27:10 +0000
> Hi
> let me start by saying that I agree with you - the current parser/lexer
> architecture is messy and it represents a barrier for other people to
> chime in and start to contribute. However, when I was working on a
> parser improvement related to lambda expressions (I added lookahead
> support), I was surprised to see how fast the javac lexer/parser actually
> are. Here are some 'unofficial' numbers taken on my machine (each run
> corresponds to lexing the 'jdk/src' folder of the JDK 8 repo):
>
> Run1: 0m6.501s
> Run2: 0m6.205s
> Run3: 0m6.936s
>
> AVG: 6.547
> TOTAL FILES: 7846
> AVG TIME/FILE: 0.83 * 10^-3 s
>
> So, is it messy? Sure - is it fast? Yes, like hell. So, to summarise, I
> think that any effort to try to improve our parser/lexer architecture is
> definitely welcome - however, anyone embarking on such a project
> should keep the above numbers in mind - if you can achieve the same
> speed (well, even marginally slower would be acceptable) then it'd be an
> option well worth considering.
>
> Maurizio
>
> On 07/02/12 09:57, leszekp at safe-mail.net wrote:
> > Hello
> >
> > The javac scanner and parser are currently handwritten. The code, especially in the parser, is quite messy and
> > hard to read and modify.
> > It is possible to rewrite the lexer and parser using some kind of Java parser generator.
> > It would improve readability and allow for easier modifications.
> >
> > There is a project 'compiler grammar' (which seems dormant). The Java lexer and parser were rewritten
> > using antlr. But antlr-generated parsers are very slow.
> >
> > Many lexer and parser generators exist which are able to process 'classic' regular expressions for the lexer or
> > context-free grammars for the parser and produce fast code (e.g. jflex, beaver, jikes parser generator and more).
> >
> > What do you think about it? Is there a need for such thing? Is it worth the effort?
> >
> > Regards
> > Leszek Piotrowicz
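
The per-file figure in the quoted numbers can be rechecked from the three run times and the file count: dividing the average run time by 7846 files gives a bit under one millisecond per file. A minimal sketch of that arithmetic in Java follows; the class and method names are illustrative only and are not part of javac.

```java
import java.util.Locale;

public class LexTimingAverage {
    // Mean of the measured wall-clock times, in seconds.
    public static double mean(double[] seconds) {
        double sum = 0.0;
        for (double s : seconds) {
            sum += s;
        }
        return sum / seconds.length;
    }

    // Average time spent per file, in seconds.
    public static double perFile(double totalSeconds, int files) {
        return totalSeconds / files;
    }

    public static void main(String[] args) {
        double[] runs = {6.501, 6.205, 6.936}; // Run1..Run3 as quoted in the message
        int files = 7846;                      // TOTAL FILES as quoted in the message

        double avg = mean(runs);
        double each = perFile(avg, files);

        System.out.printf(Locale.ROOT, "avg run:  %.3f s%n", avg);
        System.out.printf(Locale.ROOT, "per file: %.2f ms%n", each * 1e3);
    }
}
```

Any generated lexer would need to land in the same range on the same input to count as "comparable speed" in the sense discussed above.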