RFR: 8337111: Bad HTML checker for generated documentation
Nizar Benalla
nbenalla at openjdk.org
Thu Nov 14 13:14:57 UTC 2024
On Thu, 14 Nov 2024 11:58:47 GMT, Hannes Wallnöfer <hannesw at openjdk.org> wrote:
>> Doccheck's human-generated reports are great at previewing a "chessboard" of results, giving the reader a quick glimpse of the quality/health of the documentation. But these tests needed to be automated, and they didn't easily translate to something that can be integrated into a CI.
>>
>> This PR includes an HTML and internal link test on `api/java.base` and a BadChars and Doctype test on the entire generated documentation bundle.
>>
>> Here is an example of the output after running all tests on `api/java.base`
>>
>> Note: There is an active PR to fix the broken anchors left in `java.base` so this is not a blocker.
>>
>>
>>
>> STDOUT:
>> STDERR:
>> test: test
>> Tidy found errors in the generated HTML
>> /Users/nizarbenalla/Work/jdk-repos/jdk1/build/macosx-aarch64/images/docs/api/java.base/java/lang/Class.html:323:87: Warning: <a> anchor "nest" already defined
>> Tidy output end.
>>
>>
>> api/java.base/java/util/concurrent/StructuredTaskScope.ShutdownOnFailure.html:245: id not found: api/java.base/java/util/concurrent/StructuredTaskScope.ShutdownOnFailure.html#TreeStructure
>> api/java.base/java/util/concurrent/StructuredTaskScope.ShutdownOnSuccess.html:242: id not found: api/java.base/java/util/concurrent/StructuredTaskScope.ShutdownOnSuccess.html#TreeStructure
>> api/java.base/java/lang/Class.html:323: name already declared: nest
>> api/java.base/java/lang/Module.html:291: id not found: api/java.base/java/lang/foreign/package-summary.html#restricted
>> api/java.base/java/lang/Module.html:434: id not found: api/java.base/java/lang/foreign/package-summary.html#restricted
>> api/java.base/java/lang/foreign/MemorySegment.html:725: id not found: api/java.base/java/lang/foreign/package-summary.html#restricted
>>
>> Link Checker Report
>> Checked 3446 files.
>> Found 445059 references to 48205 anchors in 5770 files and 64 other URIs.
>> 1 duplicate ids
>> 3 missing ids
>>
>> Hosts
>> 20 docs.oracle.com
>> 1 tools.ietf.org
>> 1 www.ietf.org
>> 1 jcp.org
>> 4 www.rfc-editor.org
>> 7 unicode.org
>> 10 www.unicode.org
>> 20 www.w3.org
>> Exception running test test: java.lang.Exception: One or more HTML checkers failed: [java.lang.RuntimeException: Tidy found errors in the generated HTML, java.lang.RuntimeException: LinkChecker encountered errors. Duplicate IDs: 1, Missing IDs: 3, Missing Files: 0, Bad Schemes: 0]
>> java.lang.Exception: One or more HTML checkers failed: [java.lang.RuntimeException: Tidy found errors in the generated HTML, java.lang.Ru...
>
> test/docs/jdk/javadoc/doccheck/doccheckutils/Log.java line 43:
>
>> 41:
>> 42: public void log(Path path, int line, String message, Object... args) {
>> 43: errors.add(formatErrorMessage(path, line, message, args));
>
> It's strange that this class is called `Log` and has several `log` methods, yet those methods are also used to report and track errors. It seems like some checkers use this class to track errors, while others use it purely for logging. Maybe the two features should be separated, for example by adding a dedicated `logError` method?
I mostly use this to store errors, but also anything I want the test to output later. I can find a way to separate this.
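
One possible shape for that separation, as a minimal sketch rather than the code in the PR: informational output and errors go into separate lists, and only `logError` affects whether a checker fails. The `logError`, `getMessages`, `getErrors` and `hasErrors` names, and the `<path>:<line>: <message>` format produced by `formatErrorMessage` here, are assumptions made for illustration.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not the Log class from the PR: plain output and
// errors are stored separately so a checker can fail on errors only.
class Log {
    private final List<String> messages = new ArrayList<>();
    private final List<String> errors = new ArrayList<>();

    // Purely informational output that the test prints later.
    public void log(String message, Object... args) {
        messages.add(String.format(message, args));
    }

    // Records a problem that should cause the checker to fail.
    public void logError(Path path, int line, String message, Object... args) {
        errors.add(formatErrorMessage(path, line, message, args));
    }

    // Assumed format "<path>:<line>: <message>"; the real helper may differ.
    private static String formatErrorMessage(Path path, int line, String message, Object... args) {
        return path + ":" + line + ": " + String.format(message, args);
    }

    public List<String> getMessages() {
        return Collections.unmodifiableList(messages);
    }

    public List<String> getErrors() {
        return Collections.unmodifiableList(errors);
    }

    public boolean hasErrors() {
        return !errors.isEmpty();
    }

    public static void main(String[] args) {
        Log log = new Log();
        log.log("Checked %d files.", 3446);
        log.logError(Path.of("Class.html"), 323, "name already declared: %s", "nest");
        log.getMessages().forEach(System.out::println);
        log.getErrors().forEach(System.err::println);
    }
}
```

With this split, a checker that only wants to print a summary calls `log`, while one that needs to fail the test calls `logError` and checks `hasErrors()` at the end.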
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/21879#discussion_r1842196584