Can we get the IDE for free, too? What do we need for full IDE Integration for Truffle Languages?

Wed Aug 17 11:02:16 UTC 2016

On Wednesday, 10 August 2016 13:17:47 CEST Stefan Marr wrote:
> Hi:
> 
> I recently experimented with IDE support for my Truffle-based languages, and
> wrote up a few notes on what I see Truffle could do. Long post below.
> 
> If you prefer the HTML version, it’s here:
> http://stefan-marr.de/2016/08/can-we-get-the-ide-for-free-too/
> 
> Comments welcome.

Nice. What are the next steps? Here are those that I have in mind:

== Implement Language Server Protocol ==

Having a trivial implementation of https://github.com/Microsoft/language-server-protocol/ in Graal would be a natural start, right? It would give IDEs 
something to connect to and display. I know you can get some structural info 
about SOM programs that could be used for code completion. Having that working 
could inspire other people to join.
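
To give a sense of how small a first step could be: the protocol's base layer
frames each JSON-RPC message with a `Content-Length` header followed by a blank
line. Below is a minimal sketch of that framing in Java. The class name
`LspFraming` is made up for illustration and is not part of Graal or Truffle.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, for illustration only. */
final class LspFraming {
    /** Prefixes a JSON payload with the Content-Length header of the base protocol. */
    static byte[] frame(String json) {
        byte[] content = json.getBytes(StandardCharsets.UTF_8);
        // The length counts bytes of the encoded content, not characters.
        String header = "Content-Length: " + content.length + "\r\n\r\n";
        byte[] head = header.getBytes(StandardCharsets.US_ASCII);
        byte[] out = new byte[head.length + content.length];
        System.arraycopy(head, 0, out, 0, head.length);
        System.arraycopy(content, 0, out, head.length, content.length);
        return out;
    }
}
```

A server would write such frames to its output stream and parse the same
header on input before handing the JSON to a language-specific handler.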

== Generate Java API for the Language Server Protocol ==

I could probably generate Java API bindings for the protocol, so we don't have 
to write them by hand and update them every time the protocol evolves. There is a 
[TypeScript to Java transpiler](https://dukescript.com/update/2016/07/01/transcript-to-java.html)
and this could be a nice test for it.

== Use Language Server Protocol in NetBeans ==

I know guys in NetBeans who work on the editor infrastructure and could call 
the protocol and display the necessary information in the IDE.

-jt

> Best regards
> Stefan
> 
> 
> # Can we get the IDE for free, too?
> # What do we need for full IDE Integration for Truffle Languages?
> 
> With the [Truffle language implementation framework][4], we got a powerful
> foundation for implementing languages as simple interpreters. In combination
> with the Graal compiler, Truffle interpreters execute their programs as
> [very efficient native code][5].
> 
> Now that we got just-in-time compilation essentially "[for free][6]", _can
> we get IDE integration for our Truffle languages as well?_
> 
> In case you wonder, this is inspired by the [language server protocol][3]
> project of Microsoft, RedHat, Eclipse Che, and others. Their goal is to
> develop a language-agnostic protocol that connects so-called _language
> servers_ to IDEs. That made me wonder whether we could provide the
> infrastructure needed for such a language server as part of Truffle.
> 
> In the remainder of this post, I'll briefly discuss what IDE features would
> be desirable as a start, which of those Truffle could currently support, and
> how far we could get with a language-agnostic API as part of Truffle.
> 
> ## 1. Which IDE Features would be desirable?
> 
> Generally, I am thinking about features available in IDEs such as Eclipse,
> Visual Studio, NetBeans, or Pharo. On the one hand, there are tools that
> help to understand the execution of a program. Typically, this includes
> debugging support, inspecting of values, but it can also be about profiling
> to identify performance issues. Such execution-related aspects are covered
> by Truffle already today. The framework comes with support for a debugger
> and a profiler. The debugger can be used across Truffle languages for
> instance in [NetBeans][7] or in [a web-based experiment of mine][8].
> 
> Features that are not strictly related to execution are however not
> supported. In the research community, this area is something where
> [Language Workbench projects][1] [[1]] excel. They often come with their
> own domain-specific languages to define language grammars, and use
> transformation or compilation approaches to generate a wide range of tools
> from such specifications.
> 
> The tools I find most essential for an IDE include:
>  - support for highlighting (syntactic and semantic)
>  - code browsing, structural representation
>  - code completion (incl. signature information, and API documentation)
>  - reporting of parsing and compilation errors
>  - reporting of potential bugs and code quality issues
>  - quick fix functionality
>  - refactoring support
> 
> ![Code Completion for SOMns in VS Code](figures/code-completion.png)
> 
> For code completion, as in the figure above, one needs of course
> language-specific support, ideally taking the current context into account,
> and adjusting the set of proposals to what is possible at that lexical
> location.
> 
> ![Go to Definition for SOMns in VS Code](figures/go-to-definition.png)
> 
> Similarly, for the _go to definition_ example above, it is most useful if
> the language semantics are taken into account. My prototype currently does
> not take into account that the receiver of a `println` in this case is a
> literal, which would allow it to reduce the set of possible definitions to
> the one in the `String` class.
> 
> ![Parsing Error in SOMns in VS Code](figures/parse-error.png)
> 
> Other features are simpler mappings of already available functionality.
> The error message above is shown on parsing errors, and helps to understand
> why the parser complains. Typical language implementations already print
> this kind of error on the command line, which is enough for basic programming.
> 
> ![Navigating a file based on its program entities](figures/symbol-info.png)
> 
> The other features are clearly more language-specific, as is the last
> example above, the browsing of a file based on the entities it defines.
> However, most of these elements can be mapped onto a common set of concepts
> that can be handled in a language-agnostic way in an IDE.
> 
> While the [DynSem][9] project might bring all these generation-based
> features to Truffle languages, I wonder whether we can do more in a
> bottom-up approach based on the interpreters we already got. [Ensō][2], a
> self-describing DSL workbench, seems to go the route of interpretation over
> transformation as well.
> 
> ## 2. What does Truffle currently support?
> 
> As mentioned already above, Truffle currently focuses on providing a
> framework geared towards tooling for language execution. This focuses
> mainly on providing the implementation framework for the languages
> themselves, but includes also support for [language instrumentation][10]
> that can be used to implement debuggers, profilers, tools for collecting
> dynamic metrics, [coverage tracking][11], [dynamic code reloading][12],
> etc.
> 
> The framework is based on the idea that AST nodes, i.e., the basic
> executable elements to which a parser transforms an input program, can be
> _tagged_. An instrumentation tool can then act based on these tags and for
> instance add extra behavior to a program or track its execution. AST nodes
> are correlated to their lexical representation in a program by so-called
> `SourceSections`. A `SourceSection` encodes the coordinates in the program
> file.
> 
> Unfortunately, this is where the support from the Truffle framework ends.
> Tooling aspects revolving around the lexical aspects of programs are
> currently not supported.
> 
> ## 3. Can we provide a language-agnostic API to implement tooling focused on lexical aspects?
> 
> The most basic aspect that is currently missing in Truffle for any kind of
> lexical support is mapping any location in a source to a corresponding
> semantic entity. There are two underlying features that are currently
> missing for that. First, we would need to actually retain information about
> code that is not part of a method, i.e., is not part of the ASTs that are
> built for Truffle. Second, we would need a simple lookup data structure
> from a location in the source to the corresponding element. To implement
> for instance code completion, we need to identify the code entity located
> at the current position in the source, as well as its context so that we
> can propose sensible completion options.
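
A minimal sketch of such a lookup structure, assuming the parser records a
character range for every entity it sees, including declarations that never
end up in an AST. The names `Entity` and `SourceIndex` are hypothetical and not
part of Truffle. The linear scan keeps the sketch short; an interval tree
would be the obvious replacement for large files.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical record of a program entity and the character range it covers. */
final class Entity {
    final String kind;  // e.g. "class", "method", "send"
    final String name;
    final int start;    // inclusive character offset
    final int end;      // exclusive character offset

    Entity(String kind, String name, int start, int end) {
        this.kind = kind;
        this.name = name;
        this.start = start;
        this.end = end;
    }

    boolean contains(int offset) { return start <= offset && offset < end; }
    int length() { return end - start; }
}

/** Hypothetical per-source index, filled by the parser. */
final class SourceIndex {
    private final List<Entity> entities = new ArrayList<>();

    void record(Entity e) { entities.add(e); }

    /** Returns the smallest recorded entity covering the offset, or null. */
    Entity innermostAt(int offset) {
        Entity best = null;
        for (Entity e : entities) {
            if (e.contains(offset) && (best == null || e.length() < best.length())) {
                best = e;
            }
        }
        return best;
    }

    /** Returns all covering entities, outermost first: the lexical context. */
    List<Entity> contextAt(int offset) {
        List<Entity> result = new ArrayList<>();
        for (Entity e : entities) {
            if (e.contains(offset)) result.add(e);
        }
        result.sort((a, b) -> Integer.compare(b.length(), a.length()));
        return result;
    }
}
```

`innermostAt` answers "what is under the cursor", while `contextAt` gives the
enclosing method and class that a completion routine would consult.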
> 
> Let's go over the list of things I wanted.
> 
> ### 3.1 Support for Highlighting (Syntactic and Semantic)
> 
> This needs two things. First, a set of common tags to identify concepts such
> as keywords, literals, or method definitions. Second, it needs an API to
> communicate that information during parsing to Truffle so that it is stored
> along with the `Source` information for instance, and can be used for
> highlighting.
> 
> To support semantic instead of purely syntactic highlighting, Truffle would
> furthermore need support for dynamic tags. Currently, Truffle assumes that
> tags are not going to change after AST creation. For many dynamic
> languages, this is however too early. Only executing the code will
> determine whether an operation is for instance a field read or an access to
> a global.
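
A rough sketch of what such a recording API could look like, including tags
that are refined after execution. Everything here (`HighlightTag`,
`TaggedRange`, `HighlightRecorder`) is a made-up illustration and does not
correspond to existing Truffle classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical common tag set; the names are illustrative only. */
enum HighlightTag {
    KEYWORD, LITERAL, METHOD_DEFINITION, UNRESOLVED_READ, FIELD_READ, GLOBAL_READ
}

/** A source range whose tag may be refined after parsing. */
final class TaggedRange {
    final int start;   // character offset
    final int length;
    private HighlightTag tag;

    TaggedRange(int start, int length, HighlightTag tag) {
        this.start = start;
        this.length = length;
        this.tag = tag;
    }

    HighlightTag tag() { return tag; }

    /** Called once execution has revealed what the operation really is. */
    void refine(HighlightTag newTag) { this.tag = newTag; }
}

/** The parser reports ranges here; an IDE-facing service reads them back. */
final class HighlightRecorder {
    private final List<TaggedRange> ranges = new ArrayList<>();

    TaggedRange record(int start, int length, HighlightTag tag) {
        TaggedRange range = new TaggedRange(start, length, tag);
        ranges.add(range);
        return range;
    }

    List<TaggedRange> ranges() { return Collections.unmodifiableList(ranges); }
}
```

The parser would tag a read as `UNRESOLVED_READ`, and the node that later
specializes to a field read or a global read would call `refine`.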
> 
> ### 3.2 Code Browsing, Structural Representation
> 
> Communicating structural information might be a bit more challenging in a
> language-agnostic way. One could go with a superset of concepts that is
> common to various languages and then provide an API that records this
> information on a per-`Source` basis, which could then be queried from an
> IDE.
> 
> One challenge here, especially from the perspective of editing, would be to
> choose data structures that are easily and efficiently _evolvable/updatable_
> during the editing of a source file. Assuming that the editor provides the
> information about only parts of a source having changed, it should be
> possible to leverage that.
> 
> Note, this also requires a parser that is flexible enough to parse such
> chunks. This is however something that would be language-specific,
> especially since Truffle leaves the parsing aspect completely to the
> languages.
> 
> ### 3.3 Code Completion (incl. signature information, and API documentation)
> 
> For code completion, one needs a mapping from the source locations to the
> 'lexically dominating' entities. By that I mean not necessarily the
> structure of an AST, but, as with highlighting, a mapping from the source
> location to the most relevant element from a user's perspective. Assuming
> we got that for highlighting already, we would need language-specific
> _lookup_ routines to determine the relevant elements for code completion.
> And those should then probably also return all the language-specific
> information about signatures (name and properties of arguments, e.g.,
> types) as well as API documentation.
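
One possible shape for such a language-specific lookup routine, as a sketch
only. The `CompletionLookup` interface is what a language would implement;
`PrefixLookup` is a deliberately naive implementation that ignores context and
only filters by prefix. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical completion proposal carrying signature and documentation. */
final class CompletionItem {
    final String label;
    final String signature;
    final String documentation;

    CompletionItem(String label, String signature, String documentation) {
        this.label = label;
        this.signature = signature;
        this.documentation = documentation;
    }
}

/** What a language would implement; a real version would also take the context. */
interface CompletionLookup {
    List<CompletionItem> complete(String prefix);
}

/** Naive implementation: prefix filter over a fixed list, sorted by label. */
final class PrefixLookup implements CompletionLookup {
    private final List<CompletionItem> known;

    PrefixLookup(List<CompletionItem> known) {
        this.known = new ArrayList<>(known);
    }

    @Override
    public List<CompletionItem> complete(String prefix) {
        List<CompletionItem> result = new ArrayList<>();
        for (CompletionItem item : known) {
            if (item.label.startsWith(prefix)) result.add(item);
        }
        result.sort(Comparator.comparing((CompletionItem item) -> item.label));
        return result;
    }
}
```

A context-aware version would first narrow `known` to what is visible at the
cursor, for instance only the methods of `String` when the receiver is a
string literal.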
> 
> ### 3.4 Reporting of Parsing and Compilation Errors
> 
> Depending on how far one wants to take that, this could be as simple as an
> API to report one or more parsing or compilation exceptions.
> 
> I see two relevant aspects here that should be considered. The first is that
> the current `PolyglotEngine` design of Truffle does not actually expose a
> `parse()` operation. It is kept sealed off under the hood. Second,
> depending on the language and the degree of convenience one would want to
> provide to users, the parser might want to continue parsing after the first
> error and report multiple issues in one go. This might make the parser much
> more complex, but for compilation, i.e., structural or typing issues
> unrelated to the syntax, one might want to report all issues, instead of
> aborting after the first one. Such features might require very different
> engineering decisions compared to implementations that abort on the first
> error, but it would improve the programming experience dramatically.
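
The difference between aborting and collecting can be shown with a toy
checker. This is not a real parser, just a parenthesis matcher that keeps
going after the first problem and returns every issue with its position. The
`Diagnostic` and `ParenChecker` names are invented for this sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical error record with 1-based line and column. */
final class Diagnostic {
    final int line;
    final int column;
    final String message;

    Diagnostic(int line, int column, String message) {
        this.line = line;
        this.column = column;
        this.message = message;
    }
}

/** Toy checker: collects all parenthesis errors instead of throwing on the first. */
final class ParenChecker {
    static List<Diagnostic> check(String source) {
        List<Diagnostic> diagnostics = new ArrayList<>();
        Deque<int[]> open = new ArrayDeque<>();  // positions of unmatched '('
        int line = 1;
        int column = 1;
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c == '(') {
                open.push(new int[] {line, column});
            } else if (c == ')') {
                if (open.isEmpty()) {
                    // Record the problem and continue scanning.
                    diagnostics.add(new Diagnostic(line, column, "unmatched ')'"));
                } else {
                    open.pop();
                }
            }
            if (c == '\n') {
                line++;
                column = 1;
            } else {
                column++;
            }
        }
        while (!open.isEmpty()) {
            int[] pos = open.pop();
            diagnostics.add(new Diagnostic(pos[0], pos[1], "unclosed '('"));
        }
        return diagnostics;
    }
}
```

An IDE integration would forward the whole list in one go, which is the
shape the language server protocol's diagnostics notification expects.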
> 
> ### 3.5 Reporting of Potential Bugs and Code Quality Issues
> 
> This doesn't seem to be fundamentally different from the previous issue. The
> question is whether an API for querying such information is something that
> belongs in the `PolyglotEngine`, or whether there should be another entry
> point for such tooling altogether. Since I have a strong dynamic languages
> background, I'd argue the `PolyglotEngine` is the right place. I want to
> execute code to learn more about the behavior of a program. I want to run
> unit tests to get the best feedback and information (including types and
> semantic highlighting) about my code. So, I'd say it all belongs in there.
> 
> ### 3.6 Quick-Fix Functionality and Refactoring Support
> 
> I haven't really experimented with these aspects, but there seems to be a
> language-specific and a language-agnostic component to it. The
> language-specific component would be to identify the entities that need to
> be changed by a quick fix or refactoring, as well as determining the
> replacement. The actual operation however seems to be fairly
> language-independent and could be a service provided by a common
> infrastructure to change `Source` objects/files.
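
The language-independent half could be as small as the following sketch: the
language computes a list of replacements, and shared code applies them to the
text. It assumes the edits do not overlap. `TextEdit` and `SourceEditor` are
hypothetical names, and the sketch works on plain strings rather than Truffle
`Source` objects.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical replacement of the range [start, end) with new text. */
final class TextEdit {
    final int start;
    final int end;
    final String replacement;

    TextEdit(int start, int end, String replacement) {
        this.start = start;
        this.end = end;
        this.replacement = replacement;
    }
}

/** Language-agnostic application of edits; assumes non-overlapping ranges. */
final class SourceEditor {
    static String apply(String source, List<TextEdit> edits) {
        List<TextEdit> sorted = new ArrayList<>(edits);
        // Apply from the back so that earlier offsets stay valid.
        sorted.sort((a, b) -> Integer.compare(b.start, a.start));
        StringBuilder text = new StringBuilder(source);
        for (TextEdit edit : sorted) {
            text.replace(edit.start, edit.end, edit.replacement);
        }
        return text.toString();
    }
}
```

A rename refactoring would then only need to find the ranges of all
occurrences, which is the language-specific part.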
> 
> ## 4. Conclusion
> 
> To me it seems that there is huge potential for Truffle to provide more
> language-agnostic infrastructure to realize standard and perhaps
> non-standard IDE features by providing additional APIs to be implemented by
> languages. Getting basic features is reasonably straightforward and would
> help anyone using a language that has traditionally lacked IDE support.
> 
> However, there are also a couple of challenges that might be at odds with
> Truffle as a framework for languages that are mostly tuned for peak
> performance. In my own experience, adapting the SOMns parser to provide all
> the necessary information for highlighting, code completion, go to
> definition, etc., requires quite a few design decisions that depart from the
> straightforward parsing that I did before to just directly construct the
> AST. On the one hand, I need to retain much more information than before.
> My Truffle ASTs are very basic and contain only elements relevant for
> execution. For editing in an IDE however, we want all the _declarative_
> elements as well. On the other hand, one probably wants a parser that is
> incremental, and perhaps works on the chunk that the editor identified as
> changed. If the parser wasn't designed from the start to work like that,
> this seems to require quite pervasive changes to the parser. Similarly, one
> would need a different approach to parsing to continue after a parse error
> was found. On top of that comes the aspect of storing the desired
> information in an efficient data structure. Perhaps that is something where
> persistent data structures would be handy.
> 
> While there are challenges and tradeoffs, for language implementers like me,
> this would be a great thing. I'd love to experiment with my language and
> still get the benefits of an IDE. Perhaps not exactly for free, but with a
> reasonable effort. While Language Workbenches provide such features, I
> personally prefer the bottom up approach. Instead of specifying my
> language, I'd rather express the one truth of its semantics as an
> interpreter. In the end, I want to execute programs. So, let's start there.
> 
> ### PS
> 
> I'd like to thank [Michael L. Van De Vanter][14] for discussions on the
> topic. The code for my experiments with the language server protocol and
> SOMns is available on GitHub as part of the [SOMns-vscode][15] project. So
> far, I haven't tried to integrate it with Truffle however.
> 
> And I have to admit, I am currently mostly focusing on the [MetaConc][16]
> project, where we aim a little higher and try to go beyond what people come
> to expect when debugging concurrent applications.
> 
> ### References
> 
> 1. Erdweg, S.; van der Storm, T.; Völter, M.; Boersma, M.; Bosman, R.; Cook,
> W. R.; Gerritsen, A.; Hulshout, A.; Kelly, S.; Loh, A.; Konat, G. D. P.;
> Molina, P. J.; Palatnik, M.; Pohjonen, R.; Schindler, E.; Schindler, K.;
> Solmi, R.; Vergu, V. A.; Visser, E.; van der Vlist, K.; Wachsmuth, G. H. &
> van der Woning, J. (2013), [The State of the Art in Language
> Workbenches][1], Springer, pp. 197-217.
> 
> [1]: http://homepages.cwi.nl/~storm/publications/lwc13paper.pdf
> [2]: http://enso-lang.org/
> [3]: https://github.com/Microsoft/language-server-protocol/
> [4]: https://github.com/graalvm/truffle#readme
> [5]: https://github.com/smarr/are-we-fast-yet/blob/master/docs/performance.md#performance-results
> [6]: http://stefan-marr.de/papers/oopsla-marr-ducasse-meta-tracing-vs-partial-evaluation/
> [7]: https://youtu.be/ewdzDqPsn38
> [8]: https://vimeo.com/smarr/web-debugger-01
> [9]: https://github.com/metaborg/dynsem
> [10]: http://lafo.ssw.jku.at/javadoc/truffle/latest/com/oracle/truffle/api/instrumentation/package-summary.html
> [11]: https://github.com/MetaConc/CoverallsTruffle#readme
> [12]: http://2016.ecoop.org/event/icooolps-2016-trufflereloader-a-low-overhead-language-neutral-reloader
> [13]: https://github.com/smarr/SOMns
> [14]: http://vandevanter.net/mlvdv/
> [15]: https://github.com/smarr/SOMns-vscode/tree/master/server
> [16]: http://ssw.jku.at/Research/Projects/MetaConc/