RFR: 8335122: Reorganize internal low-level support for HTML in jdk.javadoc [v4]
Hannes Wallnöfer
hannesw at openjdk.org
Fri Jul 26 10:54:38 UTC 2024
On Wed, 24 Jul 2024 22:09:46 GMT, Jonathan Gibbons <jjg at openjdk.org> wrote:
>> Please review a change to reorganize the internal low-level support for HTML in the jdk.javadoc module.
>>
>> Hitherto, there are two separate sets of classes for low-level support for HTML in the `jdk.javadoc` module: one, in doclint, focused on reading and checking classes, the other, in the standard doclet, focused on generating HTML. This PR merges those two sets, into a new package `jdk.javadoc.internal.html` that is now used by both `doclint` and the standard doclet.
>>
>> There was a naming "anti-clash" -- `HtmlTag` in `doclint` vs `TagName` in the standard doclet. The resolution is to use `HtmlTag`, since the merged class is more than just the tag name.
>>
>> A few minor bugs were found and fixed. Other minor cleanup was done, but otherwise, there should be no big surprises here. But, one small item of note: `enum HtmlStyle` was split into `interface HtmlStyle` and `enum HtmlStyles implements HtmlStyle` to avoid having a doclet-specific enum class in the new `internal.html` package. The naming follows `HtmlId` and `HtmlIds`.
>>
>> There is no attempt at this time to simplify `HtmlTag` and `HtmlAttr` to remove support for older versions of HTML.
>
> Jonathan Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>
> Cleanup use of HtmlStyle and HtmlStyles
src/jdk.javadoc/share/classes/jdk/javadoc/internal/html/HtmlTag.java line 87:
> 85: attrs(AttrKind.HTML4, CLEAR)),
> 86:
> 87: BUTTON(BlockType.OTHER, EndKind.REQUIRED,
Several tag constants that use `BlockType.OTHER` in this enum are defined as [Phrasing Content](https://html.spec.whatwg.org/#phrasing-content) in the HTML5 spec. Since HTML5 phrasing content roughly corresponds to pre-HTML5 inline content, these tags should use `BlockType.INLINE` here. This includes the following tags:
- BUTTON
- INPUT
- LABEL
- LINK
- SCRIPT
These tags were also flagged as `phrasingContent` in the old doclet `TagName` enum. I'm not sure whether marking them as `INLINE` content will break DocLint tests.
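To illustrate the suggestion, here is a minimal sketch using a simplified stand-in for the enum. `PhrasingTagsSketch`, `SketchTag`, and `inlineTags()` are hypothetical names for this example only, and the constructor is reduced to a single argument; the real `HtmlTag` constants also take an `EndKind` and attribute lists, which are omitted here.

```java
import java.util.EnumSet;
import java.util.Set;

// Simplified stand-in for illustration only; this is NOT the actual
// jdk.javadoc.internal.html.HtmlTag class.
class PhrasingTagsSketch {
    // Assumed to mirror the three block types discussed in the review.
    enum BlockType { BLOCK, INLINE, OTHER }

    // Hypothetical reduced enum: only the block type is modeled.
    enum SketchTag {
        // The five tags from the list above, marked INLINE instead of OTHER.
        BUTTON(BlockType.INLINE),
        INPUT(BlockType.INLINE),
        LABEL(BlockType.INLINE),
        LINK(BlockType.INLINE),
        SCRIPT(BlockType.INLINE),
        // A block-level tag for contrast.
        P(BlockType.BLOCK);

        final BlockType blockType;

        SketchTag(BlockType blockType) {
            this.blockType = blockType;
        }
    }

    // Collects every tag whose block type is INLINE.
    static Set<SketchTag> inlineTags() {
        Set<SketchTag> result = EnumSet.noneOf(SketchTag.class);
        for (SketchTag tag : SketchTag.values()) {
            if (tag.blockType == BlockType.INLINE) {
                result.add(tag);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(inlineTags());
    }
}
```

A check like `inlineTags()` could also serve as a quick way to see which tags would be affected before running the DocLint tests.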
It might seem like a good idea to use [HTML5 content categories](https://developer.mozilla.org/en-US/docs/Web/HTML/Content_categories) in the new merged code, but the new categories are more complex and overlapping, and they don't include list and table content, so there is not much to gain beyond perhaps more up-to-date terminology.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/19916#discussion_r1692888920