Snippet specification feedback
Pavel Rappo
pavel.rappo at oracle.com
Thu Mar 30 10:21:16 UTC 2023
Here's a more substantial reply to your email, as promised.
> On 15 Mar 2023, at 21:14, Tagir Valeev <amaembo at gmail.com> wrote:
>
> Hello!
>
> I'm working on @snippet support improvement in IntelliJ IDEA. I'm using the following page as the primary source of information:
> https://docs.oracle.com/en/java/javase/19/docs/specs/javadoc/doc-comment-spec.html#snippet
>
> Unfortunately, it leaves many questions about how the snippets should be parsed and rendered. Here's my feedback, hopefully it could be helpful.
>
> 1. Attribute value syntax. The spec says "An attribute value may be an identifier, unsigned integer, or enclosed in either single or double quote characters; no escape characters are supported". I assume that {@snippet class=pkg.Class} is a malformed tag, as pkg.Class is not an identifier and not quoted. Nevertheless, it's parsed by the javadoc tool and displayed, as if it were {@snippet class="pkg.Class"}. Similarly {@snippet file=pkg/Class.java} also works, while according to the spec it should not. Is it an implementation problem (implementation is more permissive than required by spec) or a spec problem? If such tag should be accepted, could you specify which non-identifier symbols exactly are allowed in unquoted values?
That seems like a pure specification issue. I think the Standard Doclet Specification (spec) switches between Java, HTML and maybe some other types of identifiers too freely, without proper indication. You can get a hint of that when the snippet section suddenly starts talking about _simple_ identifiers: as far as I know, Java does not categorise identifiers. _Names_ can be qualified or simple; identifiers cannot [^1][^2].
> 2. "region" attribute. It's not explicitly specified that such an attribute exists. One can only guess from existing samples and javadoc tool implementation. Only "id", "lang", "class", and "file" attributes are mentioned in the specification. It would be nice to specify the "region" attribute as well.
Not sure what you mean here. There's (i) a subsection on regions and also (ii) individual snippet tag subsections that mention this attribute.
> 3. "class" attribute. It's specified that "The location of the external code can be specified either by class name". However, in fact it's not the class name. If I create a file snippet-files/Xyz.java and declare a class named "class Abc {}" there, I should refer to it with {@snippet class=Xyz}, rather than {@snippet class=Abc}. It looks to me that the class attribute is just an alternative to a file attribute, which can be constructed from class using `classAttribute.replace('.', '/')+".java"`. It does not check whether the class with a given name and package statement actually appears in a given file. It would be nice to clarify this part of the spec.
Jon would probably be a better person to answer this question.
> 4. Markup tags placement. It's specified: "They are placed in // comments (or the equivalent in other languages or formats)". To me, it's quite a vague statement. Apparently, parser cannot understand every single existing file format in the world, so it may have no idea how comments are represented in the target file format. Moreover, target file format may have no formal specification. For example, if we have external snippet with .txt extension, which kind of comment prefix should we use to define regions? I tried
>
> outside
> # @start region="hello"
> mytest
> # @end
>
> The javadoc tool fails to find the region in this case. But I can argue that # represents a comment line in my text files. I strongly feel that this part should be specified more precisely: either list all possible preceding symbols, or provide another exact description about which preceding characters are recognized as comment start. Should the parser behavior actually depend on the language (specified by 'lang' attribute or file extension)?
If I recall correctly, initially, we didn't want to allow authors to choose the EOL-comment marker. Instead, the marker was and is inferred from the type (the lang attribute) of the snippet. I cannot remember the rationale behind it. It might be because we felt it was "too much too soon", or it might have had something to do with the fact that EOL comments aren't simple: for example, in .properties, # or ! means a comment line [^3], not an end-of-line comment.
Naturally, snippets whose lang attribute has the value "java" or "properties" assume the comment markers of those languages. An inline snippet whose lang attribute is unspecified uses //.
Eventually, someone will need to parse an external or hybrid snippet that does not use any of those. We should carefully think about it; I'm not ready to propose anything at this time.
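To make the inference described above concrete, here is a minimal sketch of choosing the markup comment marker from the lang attribute. It is not the javadoc implementation: the class and method names are hypothetical, and three details are assumptions made only for illustration (matching lang case-insensitively, using # rather than ! for properties, and returning no marker at all for any other value).

```java
import java.util.Optional;

public class MarkerSketch {

    // Hypothetical helper: picks the end-of-line comment marker from the
    // value of the lang attribute; null stands for "attribute unspecified".
    static Optional<String> markerFor(String lang) {
        // Assumption: the value is compared case-insensitively.
        if (lang == null || lang.equalsIgnoreCase("java")) {
            return Optional.of("//");
        }
        if (lang.equalsIgnoreCase("properties")) {
            // Assumption: '#' is used, although '!' also starts a comment line.
            return Optional.of("#");
        }
        // Assumption: the sketch knows no marker for any other language.
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(markerFor(null));
        System.out.println(markerFor("properties"));
        System.out.println(markerFor("txt"));
    }
}
```

With a scheme like this, the .txt example from the question has no marker to fall back on, which is the gap mentioned above.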
> 5. Markup tag arguments format. It's not specified completely. There is a sample `@start region=name` which implies that "name=value" format is used for arguments, but it's completely unclear which characters are allowed, which are not, whether the quotation is supported, are there any escape characters, etc. This is especially important, as arguments may contain regular expressions which are known to contain non-trivial characters. One may guess that markup tag arguments are formatted exactly like snippet tag attributes, but it would be nice to specify this explicitly.
Generally, what the snippet parser wants to avoid is ambiguities related to these symbols: ", ', }. Aside from those and the Unicode escapes [^4], there are no escapes in snippets, and there is only one special character combination to avoid in inline snippets: */. That combination wouldn't be an issue if doc comments were hosted in // instead of /* ... */ comments; but that's a discussion for another day.
> 6. Whitespace rendering. While it's said that "Markup comments do not appear in the generated output", the spec does not say anything about preceding whitespace. E.g., consider the following snippet:
>
> /**
> * {@snippet lang=Java :
> * System.out.println(1);
> * // @replace substring=2 replacement=3:
> * System.out.println(2);
> * }
> */
>
> We exclude the // comment from the rendering. However, there are four spaces before it. Should they be rendered? The javadoc tool does not render them. It would be nice to specify this behavior.
This question spawned (an internal?) discussion just before the feature was integrated. Early experiments suggested that authors do expect standalone markup to disappear without a trace. So not only should the markup comment and any preceding whitespace go away, but the freed-up empty line should too.
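The expectation above can be sketched as follows. This is a simplified illustration, not the javadoc implementation: the class and method names are hypothetical, the regular expression is an assumed approximation of a // markup comment, and the markup itself (for example, the replacement requested by @replace) is not applied.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MarkupStripSketch {

    // Assumed approximation of a markup comment: optional whitespace,
    // "//", optional whitespace, then "@tag" and the rest of the line.
    private static final Pattern MARKUP =
            Pattern.compile("\\s*//\\s*@\\w+.*$");

    // Hypothetical helper: removes markup comments together with the
    // whitespace before them, and drops lines that become empty.
    static List<String> render(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            Matcher m = MARKUP.matcher(line);
            if (!m.find()) {
                out.add(line);          // ordinary line, kept as is
                continue;
            }
            String rest = line.substring(0, m.start());
            if (!rest.isEmpty()) {
                out.add(rest);          // code before a trailing markup comment
            }
            // otherwise the line held only markup and vanishes entirely
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> body = List.of(
                "System.out.println(1);",
                "    // @replace substring=2 replacement=3:",
                "System.out.println(2);");
        render(body).forEach(System.out::println);
    }
}
```

For the snippet in the question, the sketch keeps the two println lines and drops the line in between, including its four leading spaces.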
> 7. Common indentation. It looks like the common indentation is stripped from the rendered snippet, similarly to text blocks. Is it an implementation detail or should be specified?
This is intentional and was spelled out in the JEP, but somehow went missing from the spec. This behavior is meant to facilitate pasting snippet content into a documentation comment without the need to reindent that content afterwards, which might be painful in some code editors.
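As an analogy only, String.stripIndent (available since Java 15) applies the text block indentation algorithm to an arbitrary string and shows the kind of result an author would see. Whether javadoc uses exactly this algorithm for snippets is an assumption; the class and method names below are hypothetical.

```java
public class IndentSketch {

    // Hypothetical helper: removes the indentation shared by all lines,
    // using the same algorithm that Java applies to text blocks.
    static String stripCommonIndent(String body) {
        return body.stripIndent();
    }

    public static void main(String[] args) {
        // Pasted content that is indented by four spaces as a whole.
        // Note: there is deliberately no trailing newline, because
        // stripIndent also counts the indentation of the last line.
        String body = "    if (x) {\n        run();\n    }";
        System.out.println(stripCommonIndent(body));
    }
}
```

The relative indentation of run() inside the if block is preserved; only the four spaces common to every line are removed.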
> 8. @highlight type. It's not specified which highlight type is used by default. Javadoc tool uses 'bold'. It would be nice to specify this explicitly. Also, it's said that "Valid type names are bold, italic, and highlighted". To me it means that any other type name should be reported as invalid. However, javadoc tool just ignores it. It's unclear to me from the spec whether it's possible to define custom CSS class names and use them here.
This question can be better answered by Jon or Hannes.
> 9. Link type. It's unclear to me how "linkplain" should differ from "link". It looks like, javadoc tool renders them in the same way, as the whole snippet is monospaced.
link and linkplain were modelled after their standard doclet namesakes. While they look the same now, I don't think that they have to stay like this in the future. Jon may have something to add to this.
> Sorry if I'm missing some information.
>
> With best regards,
> Tagir Valeev.
>
-Pavel
[^1]: https://docs.oracle.com/javase/specs/jls/se20/html/jls-3.html#jls-3.8
[^2]: https://docs.oracle.com/javase/specs/jls/se20/html/jls-6.html#jls-6.2
[^3]: https://docs.oracle.com/en/java/javase/20/docs/api/java.base/java/util/Properties.html#load(java.io.InputStream)
[^4]: https://docs.oracle.com/javase/specs/jls/se20/html/jls-3.html#jls-3.3