REPL code review
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Fri Sep 11 15:25:36 UTC 2015
Hi,
this is a review for REPL - I focused on the javac/language-related
parts and I think I came up with some interesting examples. I tried to
cover as many areas as possible and, as a result, the comments are not
as deep. I can dive more into specific sections if required. Overall the
feeling I get from this is very good. The overall machinery seems to be
cleanly defined - I like the fact that you use JDI to completely hide
the REPL frontend from how user-defined classes are loaded/redefined. I
did run into some bugs - some of them minor, some of them more worrisome
- I hope this is the kind of feedback you were looking for.
One general high-level comment, which I also pointed out elsewhere, is
that I'm not sure jshell really belongs in langtools; while it's
semantically (obviously) related to langtools - it is a rather different
beast w.r.t. all other tools in langtools-land; the fact that it depends
on the JDK (for jline, and for JDI in general) makes it very hard to run
it on top of a random JDK while bootstrapping the langtools classes -
a technique widely used to be able to run langtools tools w/o having to do
a full build cycle. More specifically, talking about IDE integration, I
don't see how IntelliJ/Netbeans langtools projects (and the langtools
internal developer ant build) could be updated to be able to run/debug
jshell w/o a full build.
Below are the more detailed review comments (organized by class):
* TreeDependencyScanner
- not sure why Object is being used as a return type for the
visitor methods; you could probably just use Void and then use scan()
instead of scanAndReduce() [given there's nothing to reduce] ?
- I'm not sure you need the scanDeclaration method - it is only
used in Scan.processVariable - which runs it on a variable's type - but
variable types are already scanned by TDS references with the right
'decl' set?
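To illustrate the Void suggestion, here is a minimal standalone scanner (made up for this email, not the actual TreeDependencyScanner) that collects referenced identifiers; with Void as the return type, plain scan() is enough, since there is nothing to reduce:

```java
import com.sun.source.tree.IdentifierTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;
import java.net.URI;
import java.util.*;
import javax.tools.*;

public class DepScan {
    // Void-returning scanner: no reduce step is needed
    static class RefCollector extends TreeScanner<Void, Set<String>> {
        @Override
        public Void visitIdentifier(IdentifierTree node, Set<String> refs) {
            refs.add(node.getName().toString());
            return null; // nothing to reduce
        }
    }

    static Set<String> refs(String source) throws Exception {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///Snippet.java"), JavaFileObject.Kind.SOURCE) {
            @Override public CharSequence getCharContent(boolean ignore) { return source; }
        };
        JavacTask task = (JavacTask) javac.getTask(null, null, d -> {}, null, null, List.of(file));
        Set<String> out = new TreeSet<>();
        RefCollector collector = new RefCollector();
        task.parse().forEach(cu -> collector.scan(cu, out));
        return out;
    }

    public static void main(String[] args) throws Exception {
        // the identifiers B (supertype) and C (field type) are collected
        System.out.println(refs("class A extends B { C c; }"));
    }
}
```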
* Eval
- just curious - you seem to do a lot of String-based analysis (see
example in computeQualifiedParameterTypes()) - why is it not possible to
rely on the more robust Element API to do that? I.e. couldn't a snippet
corresponding to a declaration have an Element? Or, even more generally,
why not just create real AST nodes for wraps?
- UnresolvedExtractor - it would probably help to use
DiagnosticFlag.RECOVERABLE instead of hard-coded strings to filter out
resolution error (I also noted this somewhere below). However, I also
note how the logic for extracting the unresolved symbol is inherently
fragile and dependent on what is the shape of the javac diagnostics. I
wonder if looking at the analyzed tree might not provide a more robust
solution here.
- the Range logic is clever - but, again, it seems like a big
workaround for the fact that wrapping is defined in terms of strings -
not trees; any reason you went down this path?
- corralled - I found the Snippet hierarchy underused when it comes
to corralling; the task of corralling a snippet is basically performed
by Eval through some helper routines - couldn't we have each snippet
define how corralling should be done? You go half-way there by having a
corral() method in the Snippet base class, but it doesn't do much, other
than retrieving whatever value has been set from outside.
- the corralling logic doesn't seem to work on supertypes using FQN:
-> class A extends B.Inner { static class Inner { } }
| Error:
| package B does not exist
| class A extends B.Inner { static class Inner { } }
| ^-----^
I would have expected the snippet to wait for B.
- the updates of corralled snippets seem to happen at the wrong
time - this could lead to chains of symbols that are not usable:
-> class D extends E { }
| Added class D, however, it cannot be referenced until class E is declared
-> class E { D d; }
| Added class E, however, it cannot be referenced until class D is declared
I guess what happens here is that, since the decision on whether to
corral or not is based on whether we can compile cleanly, here we
run into trouble. The first snippet obviously cannot be compiled
cleanly, so it's corralled; the second one cannot compile cleanly
because the first one is not 'defined' - so you get another resolution
error and another corralled snippet. Unfortunately, the two snippets
taken together form a legal source. Btw, I think the current behavior is
preventing stuff like this:
-> class A extends B { }
| Added class A, however, it cannot be referenced until class B is declared
-> class B extends A { }
| Added class B, however, it cannot be referenced until class A is declared
from looping forever in Eval.declare; perhaps the code could be made
more robust by looking explicitly for update loops (i.e. this can be
done by keeping a stack of all the updated snippets and see if we
encounter the same snippet twice).
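The suggested guard could look something like the following sketch (names and shape invented here; it simply mirrors the "same snippet twice" idea, and diamond-shaped dependencies would need some refinement):

```java
import java.util.*;

// Hypothetical update-loop guard: while propagating updates, remember
// which snippets have already been re-processed in this cascade, and
// stop as soon as one of them comes up a second time.
public class UpdateGuard {
    // maps a snippet id to the ids of the snippets it triggers for update
    private final Map<String, List<String>> triggers = new HashMap<>();

    void addDependency(String snippet, String dependent) {
        triggers.computeIfAbsent(snippet, k -> new ArrayList<>()).add(dependent);
    }

    // returns the snippets updated, in order, or empty if a loop was detected
    Optional<List<String>> propagate(String start) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(start);
        while (!work.isEmpty()) {
            String s = work.pop();
            if (!seen.add(s)) {
                return Optional.empty(); // same snippet updated twice: loop
            }
            for (String dep : triggers.getOrDefault(s, List.of())) {
                work.push(dep);
            }
        }
        return Optional.of(new ArrayList<>(seen));
    }
}
```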
- on update loops - I guess it is indeed possible to have loops,
even with current corralling limitations (by exploiting redefinition):
-> class Outer { class Inner extends Foo { } }
| Added class Outer, however, it cannot be referenced until class Foo
is declared
-> class Foo { }
| Added class Foo
-> class Foo extends Outer { }
An update-loop check is in order, I think. Also, interestingly - I'm
not able to stop the REPL once I get into such a loop. Actually, later I
found out it doesn't loop forever - I guess eventually it runs out of
stack (after several minutes) and exits saying this:
| Replaced class Foo
| Update overwrote class Foo
Which seems bogus anyway - given there's cyclic inheritance (no error is
reported!).
* ReplParser - in general it looks good. The question I have here is
more design-oriented: why do we need it? I believe the main use case is
that you need to parse the input source to be able to decide between
expressions and declarations. Can't we wrap the snippet in the following
two ways:
class Temp$Decl {
    <snippet here>
}
and:
class Temp$Expr {
    Object o = <snippet here>
}
and then see who wins? This would also require moving the wrapping
logic from Eval to some earlier stage (when the input is deemed
complete). It's probably a very big change, so I'm merely throwing this
as a suggestion for future improvements - it's outside the scope of the
review.
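For concreteness, a rough standalone sketch of the "wrap it both ways and see who wins" idea (class names and wrapping shapes are made up; the real jshell wrapping is considerably more involved):

```java
import com.sun.source.util.JavacTask;
import java.net.URI;
import java.util.*;
import javax.tools.*;

public class SnippetClassifier {

    enum Kind { DECLARATION, EXPRESSION, AMBIGUOUS, ERROR }

    // try both wrappings and classify based on which one attributes cleanly
    static Kind classify(String snippet) throws Exception {
        boolean decl = compiles("class Temp$Decl { " + snippet + " }");
        boolean expr = compiles("class Temp$Expr { Object o = " + snippet + "; }");
        if (decl && expr) return Kind.AMBIGUOUS;
        if (decl) return Kind.DECLARATION;
        if (expr) return Kind.EXPRESSION;
        return Kind.ERROR;
    }

    static boolean compiles(String source) throws Exception {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        DiagnosticCollector<JavaFileObject> diags = new DiagnosticCollector<>();
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///Wrap.java"), JavaFileObject.Kind.SOURCE) {
            @Override public CharSequence getCharContent(boolean ignore) { return source; }
        };
        JavacTask task = (JavacTask) javac.getTask(null, null, diags, null, null, List.of(file));
        task.parse();
        task.analyze();   // attribute only; no class files are written
        return diags.getDiagnostics().stream()
                .noneMatch(d -> d.getKind() == Diagnostic.Kind.ERROR);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(classify("1 + 2"));        // only the Expr wrap compiles
        System.out.println(classify("int i = 42;"));  // only the Decl wrap compiles
    }
}
```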
* TypePrinter - so, this is needed when you need to print the type of
some expression; it seems the exact copy of the printer used by the
RichDiagnosticFormatter, except for the fact that you want to omit the
captured type-variable details. So, maybe you could split this process
in two: first you map the original type onto a type where all captured
variables have been replaced by their bounds (i.e. from List<#CAP1> to
List<Object>) - then you call the RichDiagnosticFormatter printer onto
it (the best way to do that would probably be to create a fake
diagnostic with a type as its argument, and then feed that to the
diagnostic formatter - that should give you back a string). The
RichDiagnosticFormatter, by default, will also generate stuff that you
probably don't want, such as 'where' clauses to disambiguate
type-variable names with the same names but different owners - but such
features can be selectively disabled (via javac command line options).
* MemoryFileManager
- Maybe the code of 'list' could be more optimized/lazy by returning
a compound iterator - i.e. an iterator that keeps scanning the items in
the first iterator and, when finished moves onto the next iterator.
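A generic version of such a compound iterator could be sketched like this (illustrative only, not tied to MemoryFileManager's types):

```java
import java.util.*;

// Lazy compound iterator: drains each underlying iterator in turn,
// advancing to the next one only when the current one is exhausted.
public class CompoundIterator<T> implements Iterator<T> {
    private final Iterator<Iterator<T>> iterators;
    private Iterator<T> current = Collections.emptyIterator();

    public CompoundIterator(List<Iterator<T>> its) {
        this.iterators = its.iterator();
    }

    @Override
    public boolean hasNext() {
        // skip over exhausted (possibly empty) iterators
        while (!current.hasNext() && iterators.hasNext()) {
            current = iterators.next();
        }
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        return current.next();
    }
}
```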
- comments: is it normal to repeat all comments from the
JavaFileManager interface?
- watch out for unused methods: dumpClasses, findGeneratedClass,
findGeneratedBytes, inferModuleName, listModueLocations (maybe this
stuff is for Jigsaw? If so, maybe add a comment)
* CompletenessAnalyzer
- Location position constants appear more complex than necessary (or
I'm missing something); it appears that constants like XEXPR1o, XDECL1o,
XSTMT1o are never used? The only usages I see are these two:
- to build other constants (such as XEXPR1, XDECL1 and XSTMT1)
- in a switch inside parseUnit
Since no token really uses those constants explicitly, maybe they
could be removed in favor of their more general counterparts (which
also effectively means that XEXPR1 == XEXPR, etc.)
- I believe in general the design of location codes + 'belongs'
field feels weird, and given this code is quite convoluted (because the
task is hard!) the code might benefit from some cleanup. My suggestions:
(1) come up with a cleaner enum which contains: { EXPR, DECL, STMT,
AMBIGUOUS, ERR } - use this for classification purposes in parseUnit and
related predicates (i.e. what is this token like?)
(2) extract the 'isOkToTerminate' property onto a separate boolean
flag on the token. After all, this is an orthogonal property of all tokens.
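The shape I have in mind is roughly this (all names invented here; they deliberately do not match CompletenessAnalyzer's current constants):

```java
// Hypothetical cleanup: one classification enum, plus an orthogonal
// 'okToTerminate' flag carried by each token.
public class Tokens {
    enum Kind { EXPR, DECL, STMT, AMBIGUOUS, ERR }

    static class Token {
        final Kind kind;
        final boolean okToTerminate;  // orthogonal to the classification

        Token(Kind kind, boolean okToTerminate) {
            this.kind = kind;
            this.okToTerminate = okToTerminate;
        }
    }

    public static void main(String[] args) {
        Token semi = new Token(Kind.STMT, true);    // ';' can end a unit
        Token newKw = new Token(Kind.EXPR, false);  // 'new' cannot
        System.out.println(semi.okToTerminate + " " + newKw.okToTerminate);
    }
}
```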
- watch out for commented out code/dead code (here and elsewhere)
Sometimes the completeness analysis doesn't behave as expected:
-> new
>> ;
>> ;
>> ;
>> ;
>> ;
>> ;
>> Foo();
| Error:
| <identifier> expected
| new
| ^
| Error:
| '(' or '[' expected
| ;
| ^
In the above, I found it confusing that the ';' is not enough to tell
the analysis logic that I'm done with the statement - and the analysis
will keep waiting until it matches a '()'.
* SourceCodeAnalysisImpl - generally looks good, and the tab completion
logic is quite clever:
- in scopeContent() what is the difference between the
'scopeIterable' and a call to Scope.getLocalElements (which is routed
via Scope.getSymbols)?
- the logic for filtering based on target type is cool - but could
use some cleanup - for instance there are two smart filters, and it's
not clear why those are needed; it seems like smartTypeFilter is only
used in imports, and the main difference is that it is allowed to
operate on type elements, while the smartFilter can't; but the typing
logic is the same (except for isAssignable vs. isSubtype - which
shouldn't be important in the smartTypeFilter case anyway). I suggest
just using smartFilter (without the class filtering logic), and then
adding the proper extra filtering logic
(i.e. must be a class, must be a non-class) as a separate 'kind' filter.
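In other words, compose one typing-based filter with a separate kind filter, along these lines (a toy sketch with invented stand-in types, not the actual Element-based predicates):

```java
import java.util.List;
import java.util.function.Predicate;

// One smart (typing-based) filter, composed with an orthogonal 'kind'
// filter, instead of two near-duplicate smart filters.
public class Filters {
    record Candidate(String name, boolean isClass, boolean typeMatches) { }

    // single typing-based filter (stand-in for the isAssignable check)
    static final Predicate<Candidate> smartFilter = Candidate::typeMatches;

    // orthogonal kind filters
    static final Predicate<Candidate> classesOnly = Candidate::isClass;
    static final Predicate<Candidate> nonClassesOnly = classesOnly.negate();

    static List<Candidate> complete(List<Candidate> scope, Predicate<Candidate> kind) {
        return scope.stream().filter(smartFilter.and(kind)).toList();
    }

    public static void main(String[] args) {
        List<Candidate> scope = List.of(
                new Candidate("String", true, true),
                new Candidate("s", false, true),
                new Candidate("i", false, false));
        System.out.println(complete(scope, nonClassesOnly)); // just 's'
    }
}
```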
- found a bug with varargs (varargs expansion does not take place):
-> class Foo { static void m(int... i) { } }
| Replaced class Foo
| Update overwrote class Foo
-> Foo.m(Display all 395 possibilities? (y or n)
- found bugs around constructors; asMemberOf doesn't get called, and
that leads to useless smart filters:
-> class Baz<X> {
>> Baz(X x) { }
>> }
| Added class Baz
-> String s = "";
| Added variable s of type String with initial value ""
-> new Baz(Display all 391 possibilities? (y or n)
-> new Baz<String>(Display all 391 possibilities? (y or n)
-> Baz<String> bz = new Baz<>(Display all 391 possibilities? (y or n)
The problem is that the type of the constructor is Baz(X) and no
variable in scope can possibly have type X, so all elements in the scope
are cleared out. The first two cases should be fixable by simply
adding a call to asMemberOf() on the constructor element. The third case
is more difficult - perhaps some heuristics could be attempted, but the
general case is hard.
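Assuming Types.asMemberOf accepts constructor elements (the javac implementation does handle them), the fix for the first two cases could be sketched as follows; the class names are just the ones from the transcript above:

```java
import com.sun.source.util.JavacTask;
import java.net.URI;
import java.util.List;
import javax.lang.model.element.ExecutableElement;
import javax.lang.model.element.TypeElement;
import javax.lang.model.type.DeclaredType;
import javax.lang.model.type.ExecutableType;
import javax.lang.model.util.ElementFilter;
import javax.lang.model.util.Elements;
import javax.lang.model.util.Types;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

// View the raw constructor Baz(X) as a member of Baz<String>, so that
// X is substituted with String before any smart filtering happens.
public class CtorAsMember {

    static String ctorParamViewedAsMemberOf() throws Exception {
        String src = "class Baz<X> { Baz(X x) { } }";
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///Baz.java"), JavaFileObject.Kind.SOURCE) {
            @Override public CharSequence getCharContent(boolean ignore) { return src; }
        };
        JavacTask task = (JavacTask) javac.getTask(null, null, d -> {}, null, null, List.of(file));
        task.parse();
        task.analyze();                                  // attribute the class
        Elements elements = task.getElements();
        Types types = task.getTypes();

        TypeElement baz = elements.getTypeElement("Baz");
        ExecutableElement ctor =
                ElementFilter.constructorsIn(baz.getEnclosedElements()).get(0);
        DeclaredType bazOfString = types.getDeclaredType(
                baz, elements.getTypeElement("java.lang.String").asType());

        // raw constructor type is Baz(X); viewed as a member of
        // Baz<String>, the parameter type becomes String
        ExecutableType viewed = (ExecutableType) types.asMemberOf(bazOfString, ctor);
        return viewed.getParameterTypes().get(0).toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(ctorParamViewedAsMemberOf());
    }
}
```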
- smart completion doesn't always work as expected - in particular,
target-type-based suggestions seem to occur only once:
-> fs.m(
ls <press tab to see more>
-> fs.m
| Error:
| cannot find symbol
| symbol: variable m
| fs.m
| ^--^
-> fs.m(Display all 389 possibilities? (y or n)
After first time, all members will be displayed (unless you start
working on a different instance...)
* TaskFactory:
- I see that cntResolve counts the recoverable errors - javac does
have a notion of recoverable errors that is used in annotation
processing - all recoverable diagnostics are marked with a special flag
(DiagnosticFlag.RECOVERABLE). I wonder if it would be more robust to
rely on that, rather than using raw diagnostic text (as the latter might
change over time).
- I like the hierarchy of tasks - it is very easy to see which tasks
are meant to operate on source code and which on synthetic wraps.
Maurizio
More information about the kulla-dev
mailing list