RFR: 8337111: Bad HTML checker for generated documentation [v5]

Nizar Benalla nbenalla at openjdk.org
Fri Dec 20 14:55:54 UTC 2024


> Doccheck's human-generated reports are great at previewing a "chessboard" of results, giving the reader a quick glimpse of the quality/health of the documentation. But these tests needed to be automated, and they didn't easily translate to something that can be integrated into CI.
> 
> This PR includes an HTML and internal link test on `api/java.base` and a BadChars and Doctype test on the entire generated documentation bundle.
> 
> Here is an example of the output after running all tests on `api/java.base`
> 
> Note: There is an active PR to fix the broken anchors left in `java.base` so this is not a blocker.
> 
> 
> 
> STDOUT:
> STDERR:
> test: test
> Tidy found errors in the generated HTML
> /Users/nizarbenalla/Work/jdk-repos/jdk1/build/macosx-aarch64/images/docs/api/java.base/java/lang/Class.html:323:87: Warning: <a> anchor "nest" already defined
> Tidy output end.
> 
> 
> api/java.base/java/util/concurrent/StructuredTaskScope.ShutdownOnFailure.html:245: id not found: api/java.base/java/util/concurrent/StructuredTaskScope.ShutdownOnFailure.html#TreeStructure
> api/java.base/java/util/concurrent/StructuredTaskScope.ShutdownOnSuccess.html:242: id not found: api/java.base/java/util/concurrent/StructuredTaskScope.ShutdownOnSuccess.html#TreeStructure
> api/java.base/java/lang/Class.html:323: name already declared: nest
> api/java.base/java/lang/Module.html:291: id not found: api/java.base/java/lang/foreign/package-summary.html#restricted
> api/java.base/java/lang/Module.html:434: id not found: api/java.base/java/lang/foreign/package-summary.html#restricted
> api/java.base/java/lang/foreign/MemorySegment.html:725: id not found: api/java.base/java/lang/foreign/package-summary.html#restricted
> 
> Link Checker Report
> Checked 3446 files.
> Found 445059 references to 48205 anchors in 5770 files and 64 other URIs.
>      1 duplicate ids
>      3 missing ids
> 
> Hosts
>     20 docs.oracle.com
>      1 tools.ietf.org
>      1 www.ietf.org
>      1 jcp.org
>      4 www.rfc-editor.org
>      7 unicode.org
>     10 www.unicode.org
>     20 www.w3.org
> Exception running test test: java.lang.Exception: One or more HTML checkers failed: [java.lang.RuntimeException: Tidy found errors in the generated HTML, java.lang.RuntimeException: LinkChecker encountered errors. Duplicate IDs: 1, Missing IDs: 3, Missing Files: 0, Bad Schemes: 0]
> java.lang.Exception: One or more HTML checkers failed: [java.lang.RuntimeException: Tidy found errors in the generated HTML, java.lang.RuntimeException: LinkChecker encountered errors. Duplicate IDs: 1, Missing IDs: 3, Mi...
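
For readers unfamiliar with what the "duplicate ids" and "missing ids" counts in the report above mean, here is a minimal sketch of that kind of check on a single page. It is not the LinkChecker code from the PR: the class name `AnchorCheckSketch`, its methods, and the regex-based scanning are all assumptions made for illustration, and it only handles same-page `href="#..."` references with double-quoted attributes.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the checker from the PR.
class AnchorCheckSketch {
    // Simplified attribute matching: double-quoted values only.
    private static final Pattern ID = Pattern.compile("\\sid=\"([^\"]+)\"");
    private static final Pattern FRAGMENT_HREF = Pattern.compile("\\shref=\"#([^\"]+)\"");

    // Collects every id declared in the page; ids seen more than once go into 'duplicates'.
    private static Set<String> declaredIds(String html, Set<String> duplicates) {
        Set<String> seen = new HashSet<>();
        Matcher m = ID.matcher(html);
        while (m.find()) {
            if (!seen.add(m.group(1))) {
                duplicates.add(m.group(1));
            }
        }
        return seen;
    }

    // Ids declared more than once in the page.
    static Set<String> duplicateIds(String html) {
        Set<String> duplicates = new TreeSet<>();
        declaredIds(html, duplicates);
        return duplicates;
    }

    // Same-page fragments that are referenced but never declared as an id.
    static Set<String> missingIds(String html) {
        Set<String> declared = declaredIds(html, new TreeSet<>());
        Set<String> missing = new TreeSet<>();
        Matcher m = FRAGMENT_HREF.matcher(html);
        while (m.find()) {
            if (!declared.contains(m.group(1))) {
                missing.add(m.group(1));
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String html = "<a id=\"nest\"></a><p id=\"nest\"></p>"
                + "<a href=\"#nest\">ok</a><a href=\"#TreeStructure\">broken</a>";
        System.out.println("duplicate ids: " + duplicateIds(html));
        System.out.println("missing ids: " + missingIds(html));
    }
}
```

A real checker would parse the HTML properly and resolve references across files, as the cross-file `package-summary.html#restricted` entries in the report suggest; the sketch only shows the bookkeeping behind the two counters.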

Nizar Benalla has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 12 additional commits since the last revision:

 - Improve external link checker, swapping the number of the latest release using a regex
 - Merge remote-tracking branch 'upstream/master' into new-docs-tests-suit
 - Rename method and usage to be more concise
 - make regex more robust in case of single quote in legacy doctype
 - Add a test for external links
   separate checks into different jtreg tests
   fix typos
 - Merge remote-tracking branch 'upstream/master' into new-docs-tests-suit
 - Merge remote-tracking branch 'upstream/master' into new-docs-tests-suit
 - add file with all vetted links
 - improve some parts based on review comments
 - Merge remote-tracking branch 'upstream/master' into new-docs-tests-suit
 - ... and 2 more: https://git.openjdk.org/jdk/compare/71ced4ac...0857f83a
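
As an illustration of the doctype commit in the list above, the following is a rough sketch of a doctype regex that tolerates either quote style. It assumes "legacy doctype" refers to the HTML standard's `about:legacy-compat` form, which may be quoted with single or double quotes; the actual pattern in the PR may differ, and the class and method names here are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch, not the Doctype checker from the PR.
class DoctypeSketch {
    // Matches <!DOCTYPE html>, optionally followed by the legacy-compat
    // system identifier in matching double or single quotes (case-insensitive).
    private static final Pattern HTML5_DOCTYPE = Pattern.compile(
            "(?i)<!doctype\\s+html"
            + "(\\s+system\\s+(\"about:legacy-compat\"|'about:legacy-compat'))?"
            + "\\s*>");

    // Returns true if the given line consists of an HTML5 doctype declaration.
    static boolean isHtml5Doctype(String line) {
        return HTML5_DOCTYPE.matcher(line.strip()).matches();
    }

    public static void main(String[] args) {
        System.out.println(isHtml5Doctype("<!DOCTYPE html>"));
        System.out.println(isHtml5Doctype("<!DOCTYPE html SYSTEM 'about:legacy-compat'>"));
        System.out.println(isHtml5Doctype("<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN\">"));
    }
}
```

Spelling out both quoted alternatives, rather than using a character class for each quote, keeps a mismatched pair such as `"about:legacy-compat'` from being accepted.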

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/21879/files
  - new: https://git.openjdk.org/jdk/pull/21879/files/964ca5e2..0857f83a

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=21879&range=04
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=21879&range=03-04

  Stats: 15981 lines in 564 files changed: 11526 ins; 2508 del; 1947 mod
  Patch: https://git.openjdk.org/jdk/pull/21879.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/21879/head:pull/21879

PR: https://git.openjdk.org/jdk/pull/21879

