RFR: JDK-8268335: Find better way to exclude empty HTML elements [v8]

Tue May 17 10:56:27 UTC 2022

> Please review a change to straighten out the mechanism to avoid generating empty HTML elements in javadoc. 
> 
> Currently this is implemented using the `Content.isValid()` method in `HtmlTree.add(Content)` to check whether the the content argument should be added or dropped. This seems wrong since we don't really want to check for validity but non-emptiness (we want to drop empty elements, not print a warning or error).
> 
> So the first decision was to get rid of the `Content.isValid()` method and use `Content.isEmpty()` instead. I considered keeping `isValid()` and using it to actually do some validation checks, but in the end decided it didn't make much sense and scrapped the method.
> 
> My initial approach was to take care of emptiness at the content creation site, but it soon turned out there are too many sites potentially creating empty content. After fixing 40+ sites using a new `Content.addNonEmpty(Content)` method I still got empty elements that shouldn't be there in the JDK documentation.
> 
> Therefore `HtmlTree.add(Content)` still checks for and drops empty content arguments. Instead, I added a new `HtmlTree.addUnchecked(Content)` method that adds the argument without the checks. The new method is only used 5 times in the whole codebase. It replaces the use of the special `HtmlTree.EMPTY` content token which is no longer needed.
> 
> A few remarks about the rewritten `HtmlTree.isEmpty()` method: its purpose is to determine whether content is missing in an element that should have content. It therefore always returns `false` for elements that are expected or allowed to be without content. This is maybe a bit counter-intuitive, but it is what is needed. We could do a combined check such as `isVoid() || !isEmpty()` but that would still leave out elements that *can* be empty, such as `<script>`.
> 
> The implementation of `HtmlTree.isEmpty()` is deliberately simple and conservative. For instance, it doesn't look at attributes to decide whether an element is allowed to be empty. The rationale for this is to make the behavior as predictable as possible and avoid surprises.
> 
> Since we no longer have a validity check in `HtmlTree.add(Content)` I had to remove a few lines of code in `TestHtmlDocument.java` that expected invalid tags to be dropped. Otherwise there are no test changes, as this is a noreg-cleanup task. I did compare the javadoc JDK documention before and after this change, the output is unchanged.

Hannes Wallnöfer has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits:

 - Merge master
 - Tweak discardability criteria
 - Merge branch 'master' into JDK-8268335
 - Introduce Content.isDiscardable()
 - Fix merge of branch 'master' into JDK-8268335

   # Conflicts:
   #	src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/markup/HtmlTree.java
 - Merge branch 'master' into JDK-8268335
 - Merge branch 'master' into JDK-8268335
 - JDK-8268335: Remove unnecessary if statement
 - JDK-8268335: Find better way to exclude empty HTML elements

-------------

Changes: https://git.openjdk.java.net/jdk/pull/7159/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7159&range=07
  Stats: 143 lines in 17 files changed: 42 ins; 52 del; 49 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7159.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7159/head:pull/7159

PR: https://git.openjdk.java.net/jdk/pull/7159