RFR: 8337111: Bad HTML checker for generated documentation [v3]

Hannes Wallnöfer hannesw at openjdk.org
Thu Dec 19 15:55:50 UTC 2024


On Fri, 13 Dec 2024 17:07:23 GMT, Nizar Benalla <nbenalla at openjdk.org> wrote:

>> Doccheck's human-generated reports are great at previewing a "chessboard" of results, giving the reader a quick glimpse at the quality/health of the documentation. But these tests needed to be automated, and they didn't easily translate to something that can be integrated into CI.
>> 
>> This PR includes an HTML and internal link test on `api/java.base` and a BadChars and Doctype test on the entire generated documentation bundle.
>> 
>> Here is an example of the output after running all tests on `api/java.base`
>> 
>> Note: There is an active PR to fix the broken anchors left in `java.base` so this is not a blocker.
>> 
>> 
>> 
>> STDOUT:
>> STDERR:
>> test: test
>> Tidy found errors in the generated HTML
>> /Users/nizarbenalla/Work/jdk-repos/jdk1/build/macosx-aarch64/images/docs/api/java.base/java/lang/Class.html:323:87: Warning: <a> anchor "nest" already defined
>> Tidy output end.
>> 
>> 
>> api/java.base/java/util/concurrent/StructuredTaskScope.ShutdownOnFailure.html:245: id not found: api/java.base/java/util/concurrent/StructuredTaskScope.ShutdownOnFailure.html#TreeStructure
>> api/java.base/java/util/concurrent/StructuredTaskScope.ShutdownOnSuccess.html:242: id not found: api/java.base/java/util/concurrent/StructuredTaskScope.ShutdownOnSuccess.html#TreeStructure
>> api/java.base/java/lang/Class.html:323: name already declared: nest
>> api/java.base/java/lang/Module.html:291: id not found: api/java.base/java/lang/foreign/package-summary.html#restricted
>> api/java.base/java/lang/Module.html:434: id not found: api/java.base/java/lang/foreign/package-summary.html#restricted
>> api/java.base/java/lang/foreign/MemorySegment.html:725: id not found: api/java.base/java/lang/foreign/package-summary.html#restricted
>> 
>> Link Checker Report
>> Checked 3446 files.
>> Found 445059 references to 48205 anchors in 5770 files and 64 other URIs.
>>      1 duplicate ids
>>      3 missing ids
>> 
>> Hosts
>>     20 docs.oracle.com
>>      1 tools.ietf.org
>>      1 www.ietf.org
>>      1 jcp.org
>>      4 www.rfc-editor.org
>>      7 unicode.org
>>     10 www.unicode.org
>>     20 www.w3.org
>> Exception running test test: java.lang.Exception: One or more HTML checkers failed: [java.lang.RuntimeException: Tidy found errors in the generated HTML, java.lang.RuntimeException: LinkChecker encountered errors. Duplicate IDs: 1, Missing IDs: 3, Missing Files: 0, Bad Schemes: 0]
>> java.lang.Exception: One or more HTML checkers failed: [java.lang.RuntimeException: Tidy found errors in the generated HTML, java.lang.Ru...
>
> Nizar Benalla has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision:
> 
>  - Add a test for external links
>    separate checks into different jtreg tests
>    fix typos
>  - Merge remote-tracking branch 'upstream/master' into new-docs-tests-suit
>  - Merge remote-tracking branch 'upstream/master' into new-docs-tests-suit
>  - add file with all vetted links
>  - improve some parts based on review comments
>  - Merge remote-tracking branch 'upstream/master' into new-docs-tests-suit
>  - Merge remote-tracking branch 'upstream/master' into new-docs-tests-suit
>  - Convert parts of doccheck into tests

test/docs/jdk/javadoc/doccheck/ExtLinksJdk.txt line 266:

> 264: https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/math/BigDecimal.html
> 265: https://docs.oracle.com/en/java/javase/23/docs/specs/man/java.html
> 266: https://docs.oracle.com/en/java/javase/24/docs/specs/man/java.html

Is it right that some links still point to version 23 and 24 resources, while most are updated to 25?
I'm afraid it will be impractical to manually update these links twice a year, so we should use some placeholder/macro to insert the current feature release. But that will only work if the links are also generated uniformly with the current feature release (for example by the `@extLink` taglet).
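
To make the placeholder idea concrete, here is a rough sketch of how a test could expand the vetted-links file before comparing. The `@@JDK_VERSION@@` token, the class name, and the method names are made up for illustration and are not part of the PR. The only real API assumed is `Runtime.version().feature()`.

```java
import java.util.List;
import java.util.stream.Collectors;

public class ExtLinkPlaceholders {
    // Hypothetical placeholder token; the PR does not define one.
    static final String PLACEHOLDER = "@@JDK_VERSION@@";

    // Replace the placeholder in one vetted-link line with the given feature release.
    static String expand(String line, int featureVersion) {
        return line.replace(PLACEHOLDER, Integer.toString(featureVersion));
    }

    // Expand every line of the vetted-links list; lines without the placeholder pass through unchanged.
    static List<String> expandAll(List<String> lines, int featureVersion) {
        return lines.stream()
                .map(l -> expand(l, featureVersion))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Feature release of the JDK running the test, e.g. 25.
        int feature = Runtime.version().feature();
        List<String> vetted = List.of(
                "https://docs.oracle.com/en/java/javase/" + PLACEHOLDER + "/docs/specs/man/java.html",
                "https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/math/BigDecimal.html");
        expandAll(vetted, feature).forEach(System.out::println);
    }
}
```

With something like this, a hard-coded version such as the 23 link above would stay as it is and still need manual vetting, which is why the links in the docs would have to be generated uniformly for the approach to pay off.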

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/21879#discussion_r1892568276

