An alternative to "restricted keywords" + helping automatic modules

forax at forax at
Fri May 19 13:53:47 UTC 2017

> From: "Stephan Herrmann" <stephan.herrmann at>
> To: "John Rose" <john.r.rose at>, jigsaw-dev at
> Cc: "Rémi Forax" <forax at>
> Sent: Friday, 19 May 2017 12:37:07
> Subject: Re: Re: An alternative to "restricted keywords" + helping automatic
> modules

> A quick question to keep the ball rolling:

> Do we agree on the following assessment of the status quo?

> The definition of "restricted keywords" implies (without explicitly saying so),
> that classification of a word as keyword vs. identifier can only be made
> *after* parsing has accepted the enclosing ModuleDeclaration.
> (With some tweaks, this can be narrowed down to
> "after the enclosing ModuleDirective has been accepted")

> This definition is not acceptable.
I agree that this is not acceptable but this is not what we are proposing. 

You do not have to wait for the reduction of ModuleDeclaration (or ModuleDirective): the parser knows its parsing state (the LR item) during parsing, not only at the end.
An LR parser cannot know, at a given point during parsing, which production will be reduced later, but it does know which terminals it can shift next without causing an error.

In the middle of parsing, the parser shifts a terminal to go from one state to another, so for any given state it knows whether one of the terminals it can shift belongs to the set of restricted keywords. It can then either instruct the lexer, before the token is scanned, to activate the restricted-keyword automaton, or, after the token has been scanned, classify it as a keyword instead of as an identifier.

The idea is that the parser reports not only when it reduces a production but also when it is about to shift a restricted keyword.
So a token can be classified as an identifier or as a keyword because the parser is able to bubble up the fact that its current state (the LR item) may recognize a keyword.
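
To make this concrete, here is a minimal sketch in Java. It is not javac and not a real LR parser: a small hand-written state machine stands in for the LR states, it only handles "requires" directives, and every name in it (RestrictedKeywordDemo, State, classify) is made up for the illustration. It also assumes that a word which the current state can shift as a restricted keyword is always treated as a keyword, so a module name cannot start with "transitive" or "static".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RestrictedKeywordDemo {

    // Hypothetical parser states; each one lists the restricted keywords
    // it is able to shift (an empty list means "identifiers only").
    enum State {
        DIRECTIVE_START("requires"),
        AFTER_REQUIRES("transitive", "static"),
        EXPECT_NAME_PART,
        AFTER_NAME_PART;

        final List<String> shiftableKeywords;

        State(String... keywords) {
            this.shiftableKeywords = Arrays.asList(keywords);
        }
    }

    static boolean isWord(String text) {
        return Character.isJavaIdentifierStart(text.charAt(0));
    }

    static void expect(boolean ok, String text, State state) {
        if (!ok) {
            throw new IllegalStateException("unexpected '" + text + "' in state " + state);
        }
    }

    // Classifies each token using only the current parser state.
    static List<String> classify(String source) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*|[.;]").matcher(source);
        State state = State.DIRECTIVE_START;
        while (m.find()) {
            String text = m.group();
            switch (state) {
                case DIRECTIVE_START:
                    expect(state.shiftableKeywords.contains(text), text, state);
                    out.add("KEYWORD(" + text + ")");
                    state = State.AFTER_REQUIRES;
                    break;
                case AFTER_REQUIRES:
                    if (state.shiftableKeywords.contains(text)) {
                        // The state can shift this word as a keyword: classify it so.
                        out.add("KEYWORD(" + text + ")");
                    } else {
                        expect(isWord(text), text, state);
                        out.add("IDENTIFIER(" + text + ")");
                        state = State.AFTER_NAME_PART;
                    }
                    break;
                case EXPECT_NAME_PART:
                    // No keyword can be shifted here, so any word is an identifier.
                    expect(isWord(text), text, state);
                    out.add("IDENTIFIER(" + text + ")");
                    state = State.AFTER_NAME_PART;
                    break;
                case AFTER_NAME_PART:
                    expect(!isWord(text), text, state);
                    out.add(text.equals(".") ? "DOT" : "SEMICOLON");
                    state = text.equals(".") ? State.EXPECT_NAME_PART : State.DIRECTIVE_START;
                    break;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "requires" and "transitive" each appear once as keyword, once as identifier.
        System.out.println(classify("requires transitive requires.transitive;"));
    }
}
```

The point of the sketch is that no token is classified by looking at the finished directive: each decision uses only the state the parser is in when the token arrives.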

> comments?
> Stephan


> ----- Original message -----

> Subject: Re: An alternative to "restricted keywords" + helping automatic modules
> Date: Fri, 19 May 2017 07:27:31 CEST
> From: John Rose<john.r.rose at>
> To: Stephan Herrmann<stephan.herrmann at>

> On May 18, 2017, at 1:59 AM, Stephan Herrmann < stephan.herrmann at >
> wrote:

>> In all posts I could not find a real reason against escaping,
>> aside from aesthetics. I don't see this as sufficient motivation
>> for a less-than-perfect solution.

> So, by disregarding esthetics...

>> Clarity:
>> I'm still not completely following your explanations, partly because
>> of the jargon you are using. I'll leave it to Alex to decide if he
>> likes the idea that JLS would have to explain terms like dotted
>> production.

>> Compare this to just adding a few more rules to the grammar,
>> where no hand-waving is needed for an explanation.
>> No, I did not say that escaping is a pervasive change.
>> I never said that the grammar for ordinary compilation units
>> should be changed.
>> If you like we only need to extend one rule for the scope of
>> modular compilation units: Identifier. It can't get simpler.

>> Completeness:
>> I understand you as saying, module names cannot start with
>> "transitive". Mind you, that every modifier that will be added
>> to the grammar for modules in the future will cause conflicts for
>> names that are now legal, and you won't have a means to resolve this.

>> By contrast, we can use the escaping approach even to solve one
>> more problem that has been briefly touched on this list before:

>> Automatic modules suffer from the fact that some artifact names may
>> have Java keywords in their name, which means that these artifacts
>> simply cannot be used as automatic modules, right?
>> Why not apply escaping also here? *Any* dot-separated sequence
>> of words could be used as module name, as long as module references
>> have a means to escape any keywords in that sequence.

>> Suitability for implementation:
>> As said, your proposal resolves one problem, but still IDE
>> functionality suffers from restricted keywords, because scanning
>> and parsing need more context information than normal.

> …we obtain the freedom for IDEs to disregard abnormal
> amounts of context, saving uncounted machine cycles,

>> - Recovery after a syntax error will regress.

> …and we make life easier for all ten writers of error recovery
> functions,

>> - Scanning arbitrary regions of code is not possible.

> …we unleash the power of an army of grad students to study
> bidirectional parsing of module files,

>> Remember:
>> In an IDE code with syntax errors is the norm, not an exception,
>> as the IDE provides functionality to work on incomplete code.

> …and ease the burdens of the thousands who must spend their
> time looking at syntax errors for their broken module files.

> Nope, not for me. Give me esthetics, please. Really.

> — John

> ---- End of original message ----
