RFR: 8352249: Remove incidental whitespace in traditional doc comments

Hannes Wallnöfer hannesw at openjdk.org
Tue Mar 18 17:05:44 UTC 2025


Please review a patch to remove incidental indentation from traditional doc comments. This adds a `stripIndent()` method to the `Tokens.Comment` interface and a new `JavadocTokenizer.StrippedComment` nested class that implements indentation stripping while maintaining a map of position offsets to the original comment. 
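To illustrate what "incidental indentation" means here, the following is a minimal sketch of the general technique, not the code in this patch. The class `IndentStripSketch` and its method are hypothetical names; the sketch assumes indentation consists of spaces only and that the leading `*` characters have already been removed from the comment text.

```java
// Hypothetical helper for illustration only; not the javac implementation.
final class IndentStripSketch {

    /** Removes the smallest common run of leading spaces from all non-blank lines. */
    static String stripIndent(String text) {
        String[] lines = text.split("\n", -1); // -1 keeps trailing empty lines

        // Find the minimum indentation among non-blank lines.
        int min = Integer.MAX_VALUE;
        for (String line : lines) {
            if (line.isBlank()) {
                continue; // blank lines do not influence the common indentation
            }
            int indent = 0;
            while (indent < line.length() && line.charAt(indent) == ' ') {
                indent++;
            }
            min = Math.min(min, indent);
        }
        if (min == Integer.MAX_VALUE) {
            min = 0; // only blank lines: nothing to strip
        }

        // Drop that many leading characters from every line.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            sb.append(line.substring(Math.min(min, line.length())));
            if (i < lines.length - 1) {
                sb.append('\n');
            }
        }
        return sb.toString();
    }
}
```

With the input `" Example:\n <pre>\n   int x = 1;\n </pre>"`, the common one-space indentation is removed while the extra two spaces inside `<pre>` are preserved relative to the other lines.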

While the patch changes `javadoc` output by removing leading whitespace, the change is generally not visible in the browser except in `<pre>` elements, which is the point of this enhancement. 

The change affects most tree positions in the AST checker tests in javac/doctree, but mostly does not affect the structure of the parsed trees, with the exception of `BreakIterator` tests in `FirstSentenceTest.java`. 

`BreakIterator` does not recognize `.\n` immediately followed by a lowercase letter as a sentence break, but it does recognize the break if the letter is uppercase *or* if there is a space between the line break and the letter. I find this rule a bit peculiar, but AFAICT we can't influence the behavior of `BreakIterator`, so I have added tests that cover both the lowercase and uppercase behavior.
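For anyone who wants to try this outside of the test suite, a small sketch along these lines should do. `FirstSentenceSketch` is a hypothetical name, and the choice of `Locale.ENGLISH` is an assumption; only `java.text.BreakIterator` itself is real API.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical helper for experimenting with sentence breaks; not part of the patch.
final class FirstSentenceSketch {

    /** Returns the end offset of the first sentence as reported by BreakIterator. */
    static int firstSentenceEnd(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        it.first();
        int end = it.next();
        // next() returns DONE when there is no further boundary (e.g. empty text).
        return end == BreakIterator.DONE ? text.length() : end;
    }
}
```

Comparing `firstSentenceEnd("First.\nsecond.")`, `firstSentenceEnd("First.\nSecond.")` and `firstSentenceEnd("First.\n second.")` shows whether the three cases described above are treated differently on a given JDK.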

The source position lookup in the stripped comment is implemented by creating a new `OffsetMap` that translates from the stripped to the original comment, then using the original comment's `OffsetMap` to translate the position to the source file. `OffsetMap` is relatively lightweight (usually two `int[]` elements per comment paragraph), so the added overhead should be small.
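The two-step lookup can be sketched roughly as follows. This is a simplified stand-in under stated assumptions, not javac's `OffsetMap`: the classes `OffsetMapSketch` and `PositionLookupSketch`, the two parallel `int[]` arrays, and the binary search are all illustrative choices.

```java
import java.util.Arrays;

// Hypothetical stand-in for an offset map; names and layout are assumptions.
final class OffsetMapSketch {
    private final int[] from; // sorted start offsets in the derived text
    private final int[] to;   // corresponding start offsets in the underlying text

    OffsetMapSketch(int[] from, int[] to) {
        this.from = from.clone();
        this.to = to.clone();
    }

    /** Maps an offset in the derived text to an offset in the underlying text. */
    int map(int pos) {
        int i = Arrays.binarySearch(from, pos);
        if (i < 0) {
            i = -i - 2; // index of the last entry whose start is <= pos
        }
        if (i < 0) {
            throw new IllegalArgumentException("offset before first entry: " + pos);
        }
        return to[i] + (pos - from[i]);
    }
}

// Chains the two maps: stripped comment -> original comment -> source file.
final class PositionLookupSketch {
    static int sourcePosition(OffsetMapSketch strippedToComment,
                              OffsetMapSketch commentToSource,
                              int strippedPos) {
        return commentToSource.map(strippedToComment.map(strippedPos));
    }
}
```

As a worked example, take the source text `"/**\n * Foo.\n *   bar\n */"`. The original comment text is `" Foo.\n   bar"`, so a comment-to-source map would hold `{0, 6}` and `{6, 14}`. The stripped text is `"Foo.\n  bar"`, so the stripped-to-comment map would hold `{0, 5}` and `{1, 7}`. The `b` at stripped offset 7 then maps to comment offset 9 and source offset 17.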

Inspired by [JDK-8305688](https://bugs.openjdk.org/browse/JDK-8305688), I did various test builds with restricted jobs and memory settings, but didn't notice any change in processing or memory overhead for API docs builds.

-------------

Commit messages:
 - Rename method
 - Update comment
 - Clean up code, add comments, tests and @bug id
 - Updated copyright year in testSourceTab breaks test
 - Update remaining doctree tests & copyright headers
 - Avoid losing the last of multiple trailing newlines in stripComment
 - Remove unnecessary variable
 - Don't rely on trailing newline
 - Strip indentation from traditional doc comments

Changes: https://git.openjdk.org/jdk/pull/24032/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24032&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8352249
  Stats: 1386 lines in 63 files changed: 268 ins; 5 del; 1113 mod
  Patch: https://git.openjdk.org/jdk/pull/24032.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24032/head:pull/24032

PR: https://git.openjdk.org/jdk/pull/24032


More information about the compiler-dev mailing list