RFR [15] 8237909: Remove zipped index files feature

Tue Feb 4 15:15:43 UTC 2020

> On 3 Feb 2020, at 19:06, Jonathan Gibbons <jonathan.gibbons at oracle.com> wrote:
> 
> I concur with all the preceding discussion, about removing the zipped-files feature, and about the desirability of asynchronous loading if that is practical.
> 
> Separate RFE:  the UI should indicate if/while the results are incomplete.
> 
Good. For readers' convenience let me once again mention that there's a task
that addresses those concerns, JDK-8236935.
> There's a TODO in AbstractIndexWriter.java
> 
>  472         if (!searchIndex.isEmpty()) { // TODO: write to disk straight
Oops, thanks! This is a leftover from my exploratory work. Something to
ponder. I was (probably) thinking that there might be no need to
create a StringBuilder instance containing the complete index string.
The index could be written incrementally, from that `searchIndex` collection.
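
To make that idea a bit more concrete, here is a rough sketch. It is not the
actual AbstractIndexWriter code: the Item interface, the writeIndex method and
the variable name are all made up for illustration, and a StringWriter stands
in for a file-backed Writer.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.List;

public class IncrementalIndexSketch {

    // Hypothetical stand-in for the doclet's index items; each item is
    // assumed to be able to render itself as one JSON object string.
    public interface Item {
        String toJson();
    }

    // Writes `varName = [item,item,...];` to `out` one item at a time,
    // so the complete index string is never assembled in memory.
    public static void writeIndex(String varName, Iterable<? extends Item> items, Writer out)
            throws IOException {
        out.write(varName);
        out.write(" = [");
        boolean first = true;
        for (Item item : items) {
            if (!first) {
                out.write(',');
            }
            out.write(item.toJson());
            first = false;
        }
        out.write("];");
    }

    public static void main(String[] args) throws IOException {
        // Illustrative items only; not real javadoc index entries.
        List<Item> items = List.of(
                () -> "{\"l\":\"java.base\"}",
                () -> "{\"l\":\"java.desktop\"}");
        StringWriter out = new StringWriter(); // a file-backed Writer in practice
        writeIndex("moduleSearchIndex", items, out);
        System.out.println(out);
    }
}
```

Whether this is worth doing depends on how big the index actually gets; for
small indexes the StringBuilder approach is probably just as good.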

> I'm pleased to see this line go:
>  318             tree.add(new RawHtml("<!--[if IE]>"));
Me too. Not only does it not work in more modern versions of IE [1], but in our
case it also causes additional problems. When rendered to HTML, the check in question
looks like this:

    <!--[if IE]>
    <script type="text/javascript" src="script-dir/jszip-utils/dist/jszip-utils-ie.min.js"></script>
    <![endif]-->

There's another, orthogonal, check in jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/toolkit/resources/script.js:

    if (window.navigator.userAgent.indexOf('MSIE ') > 0 || window.navigator.userAgent.indexOf('Trident/') > 0 ||
            window.navigator.userAgent.indexOf('Edge/') > 0) {
        createElem(doc, tag, 'script-dir/jszip-utils/dist/jszip-utils-ie.js');
    }

While the former check loads the minified version of that JS library, the latter one
loads the uncompressed version. Not that I'm concerned with the space or the size
of the data transfers, those libs are tiny anyway, but this is just messy.

If this changeset is pushed, this mess will go away.

> We need a (separate?) plan of action for documenting this change and recommending that webservers delivering big API index files should be configured to use compression.

Thinking.
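
As a rough illustration of why server-side compression matters for these
files, here is a small sketch that gzips a synthetic, repetitive index and
prints both sizes. The class, the method names and the generated entries are
made up; this is not real javadoc output, and a real server may use a
different compression level.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class GzipIndexSketch {

    // Compresses `data` into the gzip format, the same format a server
    // sends when it answers with "Content-Encoding: gzip".
    public static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(data);
        }
        return buf.toByteArray();
    }

    // Builds a synthetic index with `entries` made-up items.
    public static String syntheticIndex(int entries) {
        StringBuilder sb = new StringBuilder("typeSearchIndex = [");
        for (int i = 0; i < entries; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"p\":\"com.example.pkg").append(i % 50)
              .append("\",\"l\":\"Type").append(i).append("\"}");
        }
        return sb.append("];").toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = syntheticIndex(10_000).getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        System.out.println("raw bytes:     " + raw.length);
        System.out.println("gzipped bytes: " + packed.length);
    }
}
```

The exact ratio will differ for real index files; the numbers in the table
in the original message are the ones to go by.
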
>> I intend to let this RFR sit here for at least 2 weeks.
>> The perceived severity of this change should not be underestimated.
> I think it is equally important to get this pushed early in the release, so that it can
> be tested with real-world docs, such as the EA docs for JDK 15.

I know. Let me try to come up with text for a release note, while this
review is given a couple more days of sitting still.

-Pavel

---------------------------------
[1] https://docs.microsoft.com/en-us/previous-versions/windows/internet-explorer/ie-developer/compatibility/hh801214(v=vs.85)?redirectedfrom=MSDN

> 
> -- Jon
> 
> 
> On 01/28/2020 07:55 AM, Pavel Rappo wrote:
>> Hello,
>> 
>> Please review the change for https://bugs.openjdk.java.net/browse/JDK-8237909:
>> 
>>    http://cr.openjdk.java.net/~prappo/8237909/webrev.00/
>> 
>> This change removes the "zipped index files" feature, which was introduced as
>> part of 8141492: Implement search feature in javadoc.
>> 
>> The "zipped index files" feature consists of generating the zipped index files
>> on the back end, and fetching & unzipping mechanics on the front end.
>> 
>> When documenting source files, the standard doclet accumulates an index, which is
>> later used by the JavaScript code serving the interactive search. The index
>> is written in two formats, .js (JavaScript) and .json (JSON). The latter is
>> then zipped.
>> 
>> When a browser accesses the pages using "http://" URLs, the .zip index files are
>> transferred using XHR. Those files are then unzipped by the browser, using the
>> JSZip library, and parsed as JSON. If the transfer of the .zip index files fails
>> for whatever reason, the browser falls back on the alternative mechanism. This
>> mechanism transfers the .js index files by referring to them from dynamically
>> inserted <script src="... .js"> elements. Those files need no additional
>> parsing, as the data is already hardcoded in JavaScript code.
>> 
>> One of the reasons the .zip index files transfer may fail is using javadoc pages
>> in the "standalone" mode. When a browser accesses "file://" URLs, there's no
>> HTTP server to send the XHR requests to. So the fallback mechanism kicks in and
>> the browser loads the .js index files instead.
>> 
>> Analysis
>> ========
>> 
>> From what I understand, the original intent was to reduce the transfer size of
>> the index files. The observations made during the recent upgrade of JSZip
>> (JDK-8236700) suggest that the feature is not working as intended. It is not
>> clear if it ever did. The proposal is to remove it for the following reasons:
>> 
>> 1. The feature in its current state does more harm than good (see JDK-8236922)
>> 2. Fixing, debugging, testing, and evolving it requires expertise beyond what
>>    is typical for the javadoc area
>> 3. The feature significantly complicates the front-end code and, to a lesser
>>    extent, the back-end code
>> 4. The feature relies on third-party libraries, which require tracking and
>>    periodic upgrades
>> 5. The difference in size between the .zip and .js files is not that big (see below)
>> 6. The index files are transferred once and then used from cache
>> 7. Modern HTTP servers provide compression. This makes the net result
>>    virtually the same, compare:
>> 
>>                       | (current) js + zip, MB | (proposal) js files, MB
>>     ------------------+------------------------+------------------------
>>     no compression    |                    7.4 |                     5.8
>>     HTTP compression  |                    2.7 |                     1.4
>> 
>> Had this feature worked as intended, we would always transfer only the zipped
>> index files and the transfer size would not depend on whether the server uses
>> HTTP compression. But does this really outweigh the reasons stated above?
>> 
>> To sum up, removing the zipped index files feature will make the overall
>> interactive search feature (JDK-8141492) more robust. It will be less
>> complicated, have fewer dependencies (JSZip, JSZip Utils), and will push the
>> optimization down to HTTP.
>> 
>> Testing
>> =======
>> 
>> Here is how I tested this change.
>> 
>> 1. make clean && make docs
>> 2. Standalone test
>>     2.1. Opened the browser at file://...images/docs/index.html
>> 3. HTTP test
>>     3.1. Started an HTTP server at build/...images/docs
>>     3.2. Opened the browser at http://localhost...images/docs/index.html
>> 
>> Browser cache was cleared each time immediately before accessing the index.html page.
>> In both cases I checked that no zipped index files or the related JavaScript
>> libraries were accessed, and that the search worked as intended.
>> 
>> I also tried to access the resulting javadoc pages, served by an HTTP server on
>> my laptop, from a couple of mobile devices, all of which were on the same WiFi
>> network. Everything worked as intended.
>> 
>> Thanks,
>> -Pavel
>> 
>