RFR: JDK-8253735: Cleanup SearchIndexItem API

Jonathan Gibbons jjg at openjdk.java.net
Tue Oct 6 15:01:12 UTC 2020


On Tue, 6 Oct 2020 11:53:39 GMT, Hannes Wallnöfer <hannesw at openjdk.org> wrote:

>> This pull-request is for a substantial refactoring and cleanup of the code
>> to build the static index pages and the files for the interactive search
>> 
>> There is no significant change in functionality, and all tests continue to
>> pass without change.
>> 
>> ## Improvements
>> 
>> * `SearchIndexItem` is merged into `IndexItem`
>> * `SearchIndexItems` (the collection of `SearchIndexItem`s indexed by
>>     `Category`) is merged into `IndexBuilder`, which now maintains
>>     both maps: index items grouped by first character, and grouped by category
>> * In `Category`, the members `INDEX` and `SYSTEM_PROPERTY` are merged into
>>     single new entry `TAGS`, such that the members of `Category` now directly
>>     correspond with the JavaScript files generated for interactive search
>> * `IndexItem` now provides access to the `DocTree` node for those items
>>     that previously were `SearchIndexItem`. This can be used to differentiate
>>     between items for `{@index...}` and `{@systemProperty}`.
>>     This specific change was a primary motivation for all this work, in order
>>     to facilitate supporting additional new tags that can contribute items
>>     for the index.
>> * All index items (i.e. including those that previously were `SearchIndexItem`)
>>     are now created by static factory methods instead of directly calling
>>     constructors. This allows many values to be precomputed and stored in
>>     final fields or made available by overriding accessor methods in
>>     anonymous subtypes.
>> * The comparators for index items have been cleaned up. Previously, one
>>     of the comparators was "unstable", meaning that it could fall back on
>>     the value of mutable fields, and the `toString` output (which was used
>>     to generate the content for the JavaScript files.)
>> * Work to generate the JavaScript files has been moved out of
>>     `AbstractIndexBuilder` and its subtypes into a new class, `HtmlIndexBuilder`,
>>     that subclasses the main `IndexBuilder`.
>>     To facilitate that, some methods were moved from `HtmlDocletWriter` to
>>     `Links`. This is not ideal, because it introduces a dependence on `Utils`
>>     in `Links` that was not there before, but the choice was pragmatic and the
>>     least bad of the alternatives.  Long term, we might want to move most
>>     of the `formats/html/markup` package into a new more standalone package that
>>     does not rely on other javadoc internals, and at that time, the factory
>>     objects like `Links`, `TableHead` and `Table` would just move up to the
>>     `formats/html` package.
>> 
>> ## The Changes
>> 
>> The work is done in a series of steps/commits, each with a specific focus
>> of the work involved. It may be instructive to review the changes in each
>> commit, to follow the overall evolution of the work. From the Git log,
>> the changes are as follows (oldest first)
>> 
>> * Move `SearchIndexItem`, `SearchIndexItems` to `toolkit.utils`
>> * Cleanup `SearchIndexItem`
>> * fix bug
>> * Simplify `SearchIndexItems`
>> * simplify statement
>> * Merge `SearchIndexItems` into `IndexBuilder`
>> * Cleanup `IndexItem` prior to merge with `SearchIndexItem`
>> * Cleanup `SearchIndexItem` prior to merge with `IndexItem`
>> * Merge `SearchIndexItem` into `IndexItem`
>>     (without changing how items are added to the index collections)
>> * simplify adding index items to the index builder
>> * move comparators for index items into IndexBuilder
>> * improve init for some index items
>>     improve comments in IndexItem
>> * Bug fix: obsolete call to add an item to the index
>>     Improve comparators used to build index
>> * Move `getAnchor` methods from `HtmlDocletWriter` to `Links`.
>>     This is slightly undesirable because it means that `Links` requires access to `Utils`, but it is less bad than the
>>     alternatives.
>> * Move code to complete initialization of index items and to write JavaScript index files from `AbstractIndexWriter` and
>>   subtypes to new subtype of `IndexBuilder`: `HtmlIndexBuilder`
>> 
>> At each stage, the repo should build and all javadoc tests should pass (and there is no reason to believe that any
>> other tests may fail.)
>> ## Future work
>> 
>> Instead of maintaining collections of `SortedSet`s in `IndexBuilder`, it might
>> be more effective to use `List` instead, and just sort the list as needed.
>> 
>> There is little need to eagerly build both maps in `IndexBuilder`. As long as
>> there is at least one collection, such as the `itemsByFirstCharacter`, we could
>> defer generating `itemsByCategory` until we need to write out the JavaScript
>> files.  There is one place in the code, in `SystemPropertyWriter`, where we
>> look at `itemsByCategory` to determine whether there were any
>> `{@systemProperty...}` tags, but at the time we need that information, it
>> would not be significantly more expensive to scan `indexByFirstCharacter`,
>> because the items for elements need not have been added at this time.
>
> src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/toolkit/util/IndexItem.java line 159:
> 
>> 157:     /**
>> 158:      * Creates an index item for a type element.
>> 159:      * Note: use {@code getElement()} to access this value, not {@code getTypeElement}.
> 
> Maybe rename `getTypeElement()` to `getContainingTypeElement()` to avoid this ambiguity? I note that this would blur
> the method with the `setContaining*(String)` methods used by `HtmlIndexBuilder` and sometimes derived from this
> `TypeElement`, but I'm not sure this would be a bad thing.

Generally a good suggestion; I'l investigate to ensure there is no ambiguity, but I'm sure we can come up with some
naming that is less bad than it is now.

> src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/toolkit/util/IndexItem.java line 506:
> 
>> 504:      * @return a string representing this item in JSON notation
>> 505:      */
>> 506:     public String toJavaScript() {
> 
> The generated output is technically JSON, which is a (very small) subset of JavaScript. Maybe `toJSON()` would be more
> informative as a method name. This may apply to other occurrences of the term `JavaScript` in this patch, but of course
> the term is correct when refering to the search implementation as a whole.

Agreed.

-------------

PR: https://git.openjdk.java.net/jdk/pull/499


More information about the javadoc-dev mailing list