RFR: 8299080: Wrong default value of snippet lang attribute
Pavel Rappo
prappo at openjdk.org
Mon Jul 1 13:39:29 UTC 2024
Please review this bugfix to the way the language of a snippet is determined and processed.
The language of a snippet affects the form of snippet markup and enables external syntax highlighting, such as that provided by prism.js. The language of a snippet is [determined](https://docs.oracle.com/en/java/javase/22/docs/specs/javadoc/doc-comment-spec.html#snippet) as follows:
> A snippet may specify a `lang` attribute, which identifies the kind of content in the snippet. For an inline snippet, the default value is `java`. For an external snippet, the default value is derived from the extension of the name of the file containing the snippet's content.
This PR fixes two issues. The first is a specification issue. The spec says nothing about the language of a hybrid snippet, which has features of both inline and external snippets. It makes sense to specify that, in the absence of the `lang` attribute, the language of a hybrid snippet is derived from the file extension. Put differently, when determining the language, a hybrid snippet behaves like an external snippet, not like an inline snippet.
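To make the three kinds of snippet concrete, here is a minimal sketch in a doc comment setting. The class name, the method names, and the file `config.properties` are hypothetical, and each method simply returns the default language its snippet would get under the rule proposed here, for illustration only.

```java
// Hypothetical holder class; the file name and snippet bodies are illustrative only.
final class SnippetKinds {

    /**
     * Inline snippet: the content is in the tag itself.
     * With no {@code lang} attribute, the default language is {@code java}.
     * {@snippet :
     * int answer = 42;
     * }
     */
    public static String inlineDefault() { return "java"; }

    /**
     * External snippet: the content lives in a separate file.
     * With no {@code lang} attribute, the default comes from the file extension.
     * {@snippet file="config.properties"}
     */
    public static String externalDefault() { return "properties"; }

    /**
     * Hybrid snippet: the content is in the tag and also in a separate file.
     * Under the proposed rule it behaves like an external snippet here,
     * so the default again comes from the file extension.
     * {@snippet file="config.properties" :
     * greeting=hello
     * }
     */
    public static String hybridDefault() { return "properties"; }
}
```

Running `javadoc` on such a class would additionally require the referenced file to exist with matching content; the sketch only shows the tag shapes.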
The second issue is an implementation issue. If the `lang` attribute or the file extension is `java` or `properties`, then the form of markup corresponds to that language and the HTML construct modelling the snippet is attributed with `class=language-java` or `class=language-properties` respectively. This is expected. However, if the `lang` attribute or the file extension is neither of those, or the `lang` attribute is left at its default, then the form of markup is assumed to be that of `java`, but the HTML construct modelling the snippet is not attributed, which means that the language is not passed through to third-party syntax highlighters.
Stepping out of this PR for a moment, there is clearly a conflation between the language of a snippet and the form of snippet markup: the two are linked and controlled by a single knob. That, and the design whereby every snippet in an unsupported language can use markup for the Java language, was purposeful: it was considered simple and practical.
This PR proposes that the language of a snippet is determined and processed as follows:
1. If the `lang` attribute is present, then its value is the language; if that value is empty, then the language is undefined
2. Otherwise,
    1. If the snippet is inline, then the language is `java`
    2. Otherwise (i.e. the snippet is external or hybrid), the language is determined as follows:
        1. If the `class` attribute is present, then the language is `java`
        2. Otherwise, the value of the `lang` attribute is assumed equal to the extension of the file specified in the `file` attribute; if the file has no extension or the extension cannot be determined, the language is undefined
3. If the language is `java` or `properties`, then snippet markup is processed accordingly
4. Otherwise, snippet markup is processed as if the language were `java`
5. If the language is defined as `<val>`, then HTML is attributed with `class=language-<val>`; if the language is undefined, no such attribute is present
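
The rules above can be sketched as three small functions. This is not the code from the PR: the class, enum, and method names are hypothetical, a `null` argument stands for an absent attribute, an empty `Optional` stands for "undefined", and the way the extension is extracted (text after the last dot of the file name) is an assumption.

```java
import java.util.Optional;

// A sketch only: all names here are hypothetical, not javadoc internals.
final class SnippetLanguage {

    /** The form of markup used to process the snippet body. */
    public enum MarkupForm { JAVA, PROPERTIES }

    /**
     * Steps 1 and 2: determine the language. A null argument means the
     * attribute is absent; an empty Optional means the language is undefined.
     */
    public static Optional<String> language(String lang, String cls, String file) {
        if (lang != null) {
            // Step 1: an explicit lang wins; an empty value means undefined.
            return lang.isEmpty() ? Optional.empty() : Optional.of(lang);
        }
        if (cls == null && file == null) {
            // Step 2, inline snippet: no external reference at all.
            return Optional.of("java");
        }
        if (cls != null) {
            // Step 2, external or hybrid snippet that refers to a class.
            return Optional.of("java");
        }
        // Step 2, external or hybrid snippet that refers to a file.
        return extension(file);
    }

    /** Assumption: the extension is whatever follows the last dot of the file name. */
    static Optional<String> extension(String file) {
        int nameStart = Math.max(file.lastIndexOf('/'), file.lastIndexOf('\\')) + 1;
        int dot = file.lastIndexOf('.');
        if (dot < nameStart || dot == file.length() - 1) {
            return Optional.empty(); // no extension, or nothing after the dot
        }
        return Optional.of(file.substring(dot + 1));
    }

    /** Steps 3 and 4: only "properties" gets its own markup form; all else is Java. */
    public static MarkupForm markupForm(Optional<String> language) {
        return language.filter("properties"::equals).isPresent()
                ? MarkupForm.PROPERTIES
                : MarkupForm.JAVA;
    }

    /** Step 5: the HTML class value, present only when the language is defined. */
    public static Optional<String> htmlClass(Optional<String> language) {
        return language.map(v -> "language-" + v);
    }
}
```

Under these assumptions, `language(null, null, "build.gradle")` yields `gradle`, so the markup is processed as if the language were `java` while the HTML is attributed with `class=language-gradle`.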
-------------
Commit messages:
- Initial commit
Changes: https://git.openjdk.org/jdk/pull/19971/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=19971&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8299080
Stats: 197 lines in 5 files changed: 128 ins; 28 del; 41 mod
Patch: https://git.openjdk.org/jdk/pull/19971.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/19971/head:pull/19971
PR: https://git.openjdk.org/jdk/pull/19971