Integrated: JDK-8304433: cleanup sentence breaker code in DocTreeMaker

Jonathan Gibbons jjg at openjdk.org
Mon Mar 20 15:18:01 UTC 2023


On Sun, 19 Mar 2023 00:12:31 GMT, Jonathan Gibbons <jjg at openjdk.org> wrote:

> Please review a localized cleanup of the code in `DocTreeMaker` to split a description into the _first sentence_ and _the rest_.
> 
> There are two main aspects to the refactoring:
> 
> 1. All the related code is moved into a new nested class, `SentenceBreaker`, within `DocTreeMaker`. This provides a stronger distinction between that code and the other, primary methods in `DocTreeMaker` that implement `DocTreeFactory`.
> 
> 2. The use of casts is improved to reduce the overall number required. In particular, the `cast` method is used early on to convert `List<DocTree>` to `List<DCTree>`. This is similar to existing usage of that method elsewhere.
> 
> There are no significant changes in functionality.   There is just one small observable difference, detected by one of the tests.  The refactored code uses `List.of()` for an empty list; the older code used empty `javac` `List` objects. This was detected in a test that uses reflection to verify the result of scanning a tree, and the reflective check was for `javac` `List` objects (only).  The fix in the test is to just remove the import for `javac.util.List` and rely on the import for `List` in `java.util.*`, which picks up `java.util.List`, and which is a supertype of `javac` `List`.
> 
> There is followup work that could be done, but is deferred to keep this change simpler and cleaner, with no significant change in functionality.
> 
> 1. The handling of HTML tags in the first sentence is "less than ideal" and could be improved. In particular, most block tags, including lists and tables, do not terminate the first sentence, and could even lead to incorrect partitioning and downstream HTML generation. Presumably, this does not occur in practice.  Related: the code that ignores `EndElement` tags in the first position also seems suspect.
> 
> 2. When a `List<DocTree>` is split, the two partial lists use `javac` `List`s, not standard lists. This does have the benefit that such lists are immutable by design.   That being said, it might be worth looking at the possibility of using unmodifiable "standard" lists instead.

This pull request has now been integrated.

Changeset: 80e97972
Author:    Jonathan Gibbons <jjg at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/80e979720a052fbc944b0d85ab25daa831942f19
Stats:     282 lines in 2 files changed: 86 ins; 71 del; 125 mod

8304433: cleanup sentence breaker code in DocTreeMaker

Reviewed-by: hannesw

-------------

PR: https://git.openjdk.org/jdk/pull/13091


More information about the compiler-dev mailing list