RFR: 8352249: Remove incidental whitespace in traditional doc comments
Hannes Wallnöfer
hannesw at openjdk.org
Tue Mar 18 17:05:44 UTC 2025
Please review a patch to remove incidental indentation from traditional doc comments. This adds a `stripIndent()` method to the `Tokens.Comment` interface and a new `JavadocTokenizer.StrippedComment` nested class that implements indentation stripping while maintaining a map of position offsets to the original comment.
While the patch changes `javadoc` output by removing leading whitespace, the change is generally not visible in the browser except in `<pre>` elements, which is the point of this enhancement.
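As a rough illustration of what "incidental indentation" means here, the sketch below removes the leading whitespace shared by all non-blank lines of a comment body. It is not the code in the patch: the class `IndentStripSketch` and the method `stripCommonIndent` are hypothetical names, and the input is assumed to be the comment text after the leading ` *` markers have already been removed.

```java
final class IndentStripSketch {

    /** Removes the leading whitespace common to all non-blank lines (hypothetical helper). */
    static String stripCommonIndent(String text) {
        String[] lines = text.split("\n", -1);

        // Find the smallest indentation among non-blank lines.
        int min = Integer.MAX_VALUE;
        for (String line : lines) {
            if (line.isBlank()) {
                continue; // blank lines do not constrain the common indent
            }
            int i = 0;
            while (i < line.length() && (line.charAt(i) == ' ' || line.charAt(i) == '\t')) {
                i++;
            }
            min = Math.min(min, i);
        }
        if (min == Integer.MAX_VALUE) {
            min = 0; // only blank lines: nothing to strip
        }

        // Drop that many leading characters from every line.
        StringBuilder sb = new StringBuilder();
        for (int n = 0; n < lines.length; n++) {
            if (n > 0) {
                sb.append('\n');
            }
            String line = lines[n];
            sb.append(line.substring(Math.min(min, line.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String body = " Returns the sum.\n <pre>\n   int x = 1;\n </pre>";
        System.out.println(stripCommonIndent(body));
    }
}
```

In this sketch the single space that follows each ` *` is treated as incidental and removed, while the two extra spaces inside the `<pre>` block survive, which is the case where the difference is visible in a browser.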
The change shifts most tree positions in the AST checker tests in javac/doctree, but leaves the structure of the parsed trees largely unchanged. The exception is the `BreakIterator` tests in `FirstSentenceTest.java`.
`BreakIterator` does not recognize `.\n` immediately followed by a lower-case letter as a sentence break, but it does recognize the break if the letter is upper-case *or* if there is a space between the line break and the letter. I find this rule a bit peculiar, but AFAICT we can't influence the behavior of `BreakIterator`, so I have added tests that cover both the lower-case and upper-case behavior.
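The three cases described above can be probed with a small program like the following. This is only a sketch: the class name `SentenceBreakProbe`, the sample strings, and the choice of `Locale.ENGLISH` are assumptions made for illustration and are not taken from the tests in the patch. The exact segmentation may depend on the JDK version, so no output is shown.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class SentenceBreakProbe {

    /** Splits text at the sentence boundaries reported by BreakIterator. */
    static List<String> sentences(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            out.add(text.substring(start, end));
        }
        return out;
    }

    public static void main(String[] args) {
        String[] samples = {
            "First sentence.\nsecond part.",   // line break, then lower-case letter
            "First sentence.\nSecond part.",   // line break, then upper-case letter
            "First sentence.\n second part."   // line break, space, lower-case letter
        };
        for (String s : samples) {
            List<String> parts = sentences(s);
            System.out.println(parts.size() + " segment(s): " + parts);
        }
    }
}
```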
The source position lookup in the stripped comment is implemented by creating a new `OffsetMap` that translates from the stripped to the original comment, then using the original comment's `OffsetMap` to translate the position to the source file. `OffsetMap` is relatively lightweight (usually 2 `int[]` elements per comment paragraph), so the added overhead is not too bad.
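The two-step lookup can be pictured with the simplified model below. It is a sketch under assumptions, not javac's actual `OffsetMap`: the class `OffsetMapSketch`, its parallel `from`/`to` arrays, and the `sourcePos` helper are hypothetical. Each entry marks the start of a run of characters that map with a constant offset, and the example values correspond to a comment `/**\n * foo\n *  bar\n */` starting at source offset 0.

```java
import java.util.Arrays;

final class OffsetMapSketch {
    private final int[] from; // run start positions in the "from" text, ascending
    private final int[] to;   // corresponding positions in the "to" text

    OffsetMapSketch(int[] from, int[] to) {
        if (from.length != to.length || from.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        this.from = from.clone();
        this.to = to.clone();
    }

    /** Translates a position by locating the last run that starts at or before it. */
    int translate(int pos) {
        int i = Arrays.binarySearch(from, pos);
        if (i < 0) {
            i = -i - 2; // index of the last entry smaller than pos
        }
        if (i < 0) {
            throw new IllegalArgumentException("position before first run: " + pos);
        }
        return to[i] + (pos - from[i]);
    }

    /** Chains stripped -> original -> source, as described in the text above. */
    static int sourcePos(OffsetMapSketch strippedToOriginal,
                         OffsetMapSketch originalToSource,
                         int strippedPos) {
        return originalToSource.translate(strippedToOriginal.translate(strippedPos));
    }

    public static void main(String[] args) {
        // Original comment text " foo\n  bar"; stripped text "foo\n bar".
        OffsetMapSketch strippedToOriginal =
                new OffsetMapSketch(new int[] {0, 4}, new int[] {1, 6});
        OffsetMapSketch originalToSource =
                new OffsetMapSketch(new int[] {0, 5}, new int[] {6, 13});
        // Position 5 in the stripped text is the 'b' of "bar".
        System.out.println(sourcePos(strippedToOriginal, originalToSource, 5));
    }
}
```

In this model each line of the comment contributes one pair of integers per map, which is in the same spirit as the lightweight representation mentioned above, although the real layout may differ.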
Inspired by [JDK-8305688](https://bugs.openjdk.org/browse/JDK-8305688), I did various test builds with restricted jobs and memory settings, but didn't notice any change in processing or memory overhead for API docs builds.
-------------
Commit messages:
- Rename method
- Update comment
- Clean up code, add comments, tests and @bug id
- Updated copyright year in testSourceTab breaks test
- Update remaining doctree tests & copyright headers
- Avoid losing the last of multiple trailing newlines in stripComment
- Remove unnecessary variable
- Don't rely on trailing newline
- Strip indentation from traditional doc comments
Changes: https://git.openjdk.org/jdk/pull/24032/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24032&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8352249
Stats: 1386 lines in 63 files changed: 268 ins; 5 del; 1113 mod
Patch: https://git.openjdk.org/jdk/pull/24032.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/24032/head:pull/24032
PR: https://git.openjdk.org/jdk/pull/24032