The difference between comment and document about the parser

Guoxiong Li lgxbslgx at gmail.com
Sat May 22 12:10:15 UTC 2021


Hi Jon,

I recently read more about parsing and about the code of the javac parser.
It seems that the javac parser is more precisely described as an LL(k)
parser than as an LL(1) parser, because it looks ahead more than one token.
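
To illustrate what I mean by looking ahead more than one token, here is a
toy sketch. It is not javac code: the class `LookaheadDemo` and the helpers
`peek`, `isIdent` and `classify` are made up for this mail, and tokens are
plain strings. It only shows that `(x) -> x` and `(x) + 1` start with the
same token, so a decision based on a single token cannot tell them apart.

```java
import java.util.List;

public class LookaheadDemo {
    // Hypothetical helper: the token k positions ahead of pos, or "<EOF>" past the end.
    static String peek(List<String> tokens, int pos, int k) {
        int i = pos + k;
        return i < tokens.size() ? tokens.get(i) : "<EOF>";
    }

    // Hypothetical helper: a very rough identifier check, good enough for the demo.
    static boolean isIdent(String token) {
        return !token.isEmpty() && Character.isJavaIdentifierStart(token.charAt(0));
    }

    // Classify the construct starting at tokens[pos].
    // The current token "(" is the same for both constructs, so the decision
    // inspects the next three tokens as well (a fixed k of 4 in this toy case).
    static String classify(List<String> tokens, int pos) {
        if (!peek(tokens, pos, 0).equals("(")) {
            return "other";
        }
        boolean lambda = isIdent(peek(tokens, pos, 1))
                && peek(tokens, pos, 2).equals(")")
                && peek(tokens, pos, 3).equals("->");
        return lambda ? "lambda" : "parenthesized";
    }

    public static void main(String[] args) {
        System.out.println(classify(List.of("(", "x", ")", "->", "x"), 0)); // prints "lambda"
        System.out.println(classify(List.of("(", "x", ")", "+", "1"), 0));  // prints "parenthesized"
    }
}
```

A real parser has to handle many more shapes than this one pattern, so
the sketch says nothing about which value of k javac would need.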

I agree with you that both the page of the `Compiler Grammar` project
and the comment in JavacParser.java need to be revised or clarified.

I will submit a PR to revise the comment in JavacParser.java later.

However, I have never revised a document on the official website.
Could you point me to the corresponding location and sponsor the change?

Best Regards,
-- Guoxiong


On Sun, Jan 24, 2021 at 3:50 AM Jonathan Gibbons <
jonathan.gibbons at oracle.com> wrote:

> For the comment in JavacParser.java, I suggest the phrase "with code
> derived systematically from an LL(1) grammar" is replaced by something like
> "according to the grammar described in the Java Language Specification",
> and for bonus points, possibly even give the URL
> https://docs.oracle.com/javase/specs/index.html. I would recommend not
> citing any specific version, because that would need to be updated on each
> release.  For additional pedantry, you could also include reference to the
> preview features, which are not part of JLS itself.
>
> -- Jon
> On 1/23/21 9:05 AM, Jonathan Gibbons wrote:
>
> Going back in the public record, I note that JLS 3rd Edition, chapter
> 18[1] says:
>
> *The grammar presented in this chapter is the basis for the reference
> implementation. Note that it is not an LL(1) grammar, though in many cases
> it minimizes the necessary look ahead.*
>
> -- Jon
>
> 1: https://docs.oracle.com/javase/specs/jls/se6/html/syntax.html
> On 1/22/21 12:49 AM, Guoxiong Li wrote:
>
> Hi all,
>
> The comment on class jdk.compiler/com.sun.tools.javac.parser.JavacParser
> reads as follows.
>
> ```
> /** The parser maps a token sequence into an abstract syntax
>  *  tree. It operates by recursive descent, with code derived
>  *  systematically from an LL(1) grammar. For efficiency reasons, an
>  *  operator precedence scheme is used for parsing binary operation
>  *  expressions.
> ```
>
> And the page of the `Compiler Grammar` project[1] says the following.
>
> > The parser that is currently in the javac compiler is a hand-written
> > LALR parser.
>
> One says LL(1) and the other says LALR. I think the comment may
> be right.
> Whichever description is correct, the inconsistency is not
> acceptable and the two descriptions need to be unified.
>
> What is your opinion? Any idea is appreciated.
>
> [1] http://openjdk.java.net/projects/compiler-grammar/
>
> Best Regards.
>
> -- xiong
>
>