REPL code review -- CompletenessAnalyzer
Robert Field
robert.field at oracle.com
Tue Sep 22 05:46:15 UTC 2015
On 09/11/15 08:25, Maurizio Cimadamore wrote:
> * CompletenessAnalyzer
>
> - Location position constants appear more complex than necessary
> (or I'm missing something); it appears that constants like XEXPR1o,
> XDECL1o, XSTMT1o are never used? The only uages I see are two:
>
> - to build other constants (such as XEXPR1, XDECL1 and XSTMT1)
>
> - in a switch inside parseUnit
Exactly, they define the bits and the semantics of those bits. They are
applied as a set of bits.
The comments were broken, I'm updating as such --
// Location position kinds -- a token is ...
private static final int XEXPR = 0b1;
// OK in expression (not first)
private static final int XDECL = 0b10;
// OK in declaration (not first)
private static final int XSTMT = 0b100;
// OK in statement framework (not first)
private static final int XEXPR1o = 0b1000;
// OK first in expression
private static final int XDECL1o = 0b10000;
// OK first in declaration
private static final int XSTMT1o = 0b100000;
// OK first or only in statement framework
private static final int XEXPR1 = XEXPR1o | XEXPR;
// OK in expression (anywhere)
private static final int XDECL1 = XDECL1o | XDECL;
// OK in declaration (anywhere)
private static final int XSTMT1 = XSTMT1o | XSTMT;
// OK in statement framework (anywhere)
private static final int XANY1 = XEXPR1o | XDECL1o |
XSTMT1o; // Mask: first in statement, declaration, or expression
private static final int XTERM = 0b100000000;
// Can terminate (last before EOF)
private static final int XERRO = 0b1000000000;
// Is an error
So, for example "boolean" is XEXPR | XDECL1, which is XEXPR | XDECL |
XDECL1o -- meaning it can occur in an expression (in a cast), or at the
beginning or later in a declaration. If we encounter "boolean" at the
beginning of a snippet, parseUnit() masks to just look at allowed in
first position, in this case XDECL1o. So, it knows it is the beginning
of a declaration.
>
> Since no token is really using those constants explicitly, maybe they
> could be removed in favor of their more general counterparts (this
> also kind of means that i.e. XEXPR1 == XEXPR, etc.)
See above, it is being used.
>
> - I believe in general the design of location codes + 'belongs'
> field feels weird, and given this code is quite convoluted (because
> the task is hard!) the code might benefit from some cleanup. My
> suggestions:
> (1) come up with a cleaner enum which contains: { EXPR, DECL,
> STMT, AMBIGUOUS, ERR } - use this for classification purposes in
> parseUnit and related predicates (i.e. what is this token like?)
> (2) extract the 'isOkToTerminate' property onto a separate boolean
> flag on the token. After all, this is an orthogonal property of all
> tokens.
Does my explanation clarify? Am I still missing a simplification?
An enum won't work as they are unions of possibilities.
>
> - watch out for commented out code/dead code (here and elsewhere)
>
> Sometimes the completeness analysis doesn't behave as expected:
>
> -> new
> >> ;
> >> ;
> >> ;
> >> ;
> >> ;
> >> ;
> >> Foo();
> | Error:
> | <identifier> expected
> | new
> | ^
> | Error:
> | '(' or '[' expected
> | ;
> | ^
>
> In the above, I found it confusing that the ';' is not enough to tell
> the analysis logic that I'm done with the statement - and the analysis
> will keep waiting until it matches a '()'.
>
OK, "new" has special processing that has makes it error prone. In the
webrev below I remove that special processing at a small loss of error
detection, and add a general check for bad sequencing.
http://cr.openjdk.java.net/~rfield/CompletenessAnalyzer_v0/
This fixes "new ;" but does not fix "(;". The lexical analysis does not
have enough context to know this isn't the beginning of a
for-statement. Basically, everything between matching parenthesis,
brackets, or braces is currently ignored -- effectively "(...)",
"[...]", or "{...}". A result of this is, for example, that "else" is
returned from completion as "unknown" (erroneous) so it is immediately
parsed and an error returned; Whereas "(else" is returned from
completion as "incomplete" prompting for more input.
To fix this, the nesting analysis would need to be moved into the parser
(where is typically belongs) which would be a fairly significant
change. I will poke at that and see how complex it is. Your thoughts on
its importance?
Thanks,
Robert
More information about the kulla-dev
mailing list