RFR: JDK-8298405: Support Markdown in Documentation Comments [v6]

Fri Jan 19 17:53:43 UTC 2024

On Mon, 8 Jan 2024 21:26:50 GMT, Jonathan Gibbons <jjg at openjdk.org> wrote:

>> Please review a patch to add support for Markdown syntax in documentation comments, as described in the associated JEP.
>> 
>> Notable features:
>> 
>> * support for `///` documentation comments in `JavaTokenizer`
>> * new module `jdk.internal.md` -- a private copy of the `commonmark-java` library
>> * updates to `DocCommentParser` to treat `///` comments as Markdown
>> * updates to the standard doclet to render Markdown comments in HTML
>
> Jonathan Gibbons has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision:
> 
>  - Merge with upstream/master
>  - Merge remote-tracking branch 'upstream/master' into 8298405.doclet-markdown-v3
>  - Address review comments
>  - Fix whitespace
>  - Improve handling of embedded inline taglets
>  - Customize support for Markdown headings
>  - JDK-8298405: Support Markdown in Documentation Comments

src/jdk.compiler/share/classes/com/sun/tools/javac/parser/DocCommentParser.java line 259:

> 257:         LineKind lineKind = textKind == DocTree.Kind.MARKDOWN ? peekLineKind() : null;
> 258: 
> 259:         if (DEBUG) System.err.println("starting content " + showPos(bp) + " " + newline);

Debug output is useful. I wonder if we should consider https://openjdk.org/jeps/264.

src/jdk.compiler/share/classes/com/sun/tools/javac/parser/DocCommentParser.java line 1295:

> 1293:             switch (ch) {
> 1294:                 case ' ' -> indent++;
> 1295:                 case '\t' -> indent = 4;

Shouldn't it be like this or it does not matter?
 ```suggestion
                 case '\t' -> indent += 4;

src/jdk.compiler/share/classes/com/sun/tools/javac/parser/DocCommentParser.java line 1336:

> 1334:                     switch (initialLineKind) {
> 1335:                         case CODE_FENCE -> {
> 1336:                             if (lineKind == LineKind.CODE_FENCE && ch == term && count(ch) == count) {

https://spec.commonmark.org/0.30/#example-124 shows that the closing fence may be longer than the opening one: consider `count(ch) >= count`.

That said, I note that on my experiment the resulting output was identical with or without the change I ask you to consider. Perhaps I haven't yet understood how the parsing works.

src/jdk.compiler/share/classes/com/sun/tools/javac/parser/DocCommentParser.java line 1377:

> 1375:          * @see <a href="https://spec.commonmark.org/0.30/#atx-headings">ATX Headings</a>
> 1376:          */
> 1377:         ATX_HEADER(Pattern.compile("#{1,6}( .*|$)")),

Nothing wrong here, I just didn't know that an ATX header "opening sequence of # characters" can be followed by the end of line". Interesting.

src/jdk.compiler/share/classes/com/sun/tools/javac/parser/DocCommentParser.java line 1421:

> 1419:             case '#', '=', '-', '+', '_', '`', '~' -> {
> 1420:                 String line = peekLine();
> 1421:                 for (LineKind lk : LineKind.values()) {

Nothing wrong here, just noting that this is one more way one can depend on an enum constant order. Change it, and a peeked line kind might change too (e.g. `OTHER` matches everything.)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16388#discussion_r1458858629
PR Review Comment: https://git.openjdk.org/jdk/pull/16388#discussion_r1452560905
PR Review Comment: https://git.openjdk.org/jdk/pull/16388#discussion_r1452650328
PR Review Comment: https://git.openjdk.org/jdk/pull/16388#discussion_r1452629053
PR Review Comment: https://git.openjdk.org/jdk/pull/16388#discussion_r1452632699