RFR: 8299080: Wrong default value of snippet lang attribute
Pavel Rappo
prappo at openjdk.org
Mon Jul 1 15:22:18 UTC 2024
On Mon, 1 Jul 2024 14:03:52 GMT, Chen Liang <liach at openjdk.org> wrote:
>> Please review this bugfix to the way the language of a snippet is determined and processed.
>>
>> The language of a snippet affects the form of snippet markup and enables external syntax highlighting, such as that provided by prism.js. The language of a snippet is [determined](https://docs.oracle.com/en/java/javase/22/docs/specs/javadoc/doc-comment-spec.html#snippet) as follows:
>>
>>> A snippet may specify a `lang` attribute, which identifies the kind of content in the snippet. For an inline snippet, the default value is `java`. For an external snippet, the default value is derived from the extension of the name of the file containing the snippet's content.
>>
>> There are two issues that this PR fixes. The first issue is a specification issue. The spec says nothing about the language of a hybrid snippet, which has features of both inline and external snippets. It makes sense to specify that in the absence of the `lang` attribute, the language of a hybrid snippet is derived from the file extension. Put differently, when determining the language, a hybrid snippet behaves like an external snippet, not like an inline snippet.
>>
>> The second issue is an implementation issue. If the `lang` attribute or the file extension is `java` or `properties`, then the form of markup corresponds to that language and the HTML construct modelling the snippet is attributed with `class=language-java` or `class=language-properties` respectively. This is expected. However, if the `lang` attribute or the file extension is neither of those, or the `lang` attribute is default, then the form of markup is assumed to be that of `java`, but the HTML construct modelling the snippet is not attributed, which means that the language is not passed through to third-party syntax highlighters.
>>
>> Stepping out of this PR for a moment, there is clearly a conflation between the language of a snippet and the form of snippet markup. Those are linked and controlled by a single knob. That and the design whereby every snippet in an unsupported language can use markup for the Java language was purposeful: it was considered simple and practical.
>>
>> This PR proposes that the language of a snippet is determined and processed as follows:
>>
>> 1. If the `lang` attribute is present, then its value is the language; if that value is empty, then the language is undefined
>> 2. Otherwise,
>> 1. If the snippet...
>
> src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/taglets/SnippetTaglet.java line 498:
>
>> 496: return null;
>> 497: }
>> 498: return (lastPeriod == fileName.length() - 1) ? null : fileName.substring(lastPeriod + 1);
>
> Some files, like `.gitignore`, only have suffixes, yet they are valid languages. What do you think?
Sure, that's why there's `<= 0` and not `< 0` one line above that:
    int lastPeriod = fileName.lastIndexOf('.');
    if (lastPeriod <= 0) {
        return null;
    }
So, if the only `.` in `fileName` is the leading one, as in `.gitignore`, the extension is null.
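
To make the edge cases concrete, here is a minimal, self-contained sketch assembled from the two fragments quoted in this thread. The class name `ExtensionSketch`, the method name `extensionOf`, and its signature are hypothetical and chosen only for illustration; the real method in `SnippetTaglet` may be shaped differently.

```java
public class ExtensionSketch {

    // Hypothetical helper combining the quoted fragments; not the actual SnippetTaglet code.
    public static String extensionOf(String fileName) {
        int lastPeriod = fileName.lastIndexOf('.');
        // -1: no period at all; 0: the only period is the leading one (e.g. ".gitignore")
        if (lastPeriod <= 0) {
            return null;
        }
        // A trailing period leaves nothing after it, so there is no extension either
        return (lastPeriod == fileName.length() - 1) ? null : fileName.substring(lastPeriod + 1);
    }

    public static void main(String[] args) {
        String[] names = {"Example.java", ".gitignore", "README", "trailing.", "a.b.properties"};
        for (String name : names) {
            System.out.println(name + " -> " + extensionOf(name));
        }
    }
}
```

Under these assumptions, `Example.java` yields `java` and `a.b.properties` yields `properties`, while `.gitignore`, `README`, and `trailing.` all yield null.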
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/19971#discussion_r1661212422