RFR: 8049347: HTMLDocument throws NPE for Block Tag

Thu Jun 15 11:17:07 UTC 2023

On Thu, 15 Jun 2023 09:23:47 GMT, Prasanta Sadhukhan <psadhukhan at openjdk.org> wrote:

>> Obviously. :)
>> 
>> Does it do what the `getIterator` method promises to do? It doesn't because it iterates over `LeafElement`-s only:
>> 
>> https://github.com/openjdk/jdk/blob/931625a9304ec2761ca9035d69fd33f6beadb124/src/java.desktop/share/classes/javax/swing/text/html/HTMLDocument.java#L1993-L1998
>> 
>> So, the `next` method moves the iterator to the next leaf element:
>> 
>> https://github.com/openjdk/jdk/blob/931625a9304ec2761ca9035d69fd33f6beadb124/src/java.desktop/share/classes/javax/swing/text/html/HTMLDocument.java#L2037-L2044
>> 
>> Note that it uses `isLeaf` to stop the generic `ElementIterator`.
>> 
>> ---
>> 
>> It looks as if `HTMLDocument.getIterator` is public as an implementation detail: it's used to iterate over `HTML.Tag.A` to scroll the text component to the anchor and to determine whether the document is a frame set or not.
>> 
>> Removing the `if` condition eliminates the NPE, but the method still doesn't support block elements.
>> 
>> Should we update the javadoc to state the method shouldn't be used by apps? And to throw `UnsupportedOperationException` if a block tag is passed instead of returning null?
>
>> Does it do what the `getIterator` method promises to do? It doesn't because it iterates over `LeafElement`-s only:
> 
> As per the spec wordings,
> `The [getIterator(HTML.Tag t)](https://docs.oracle.com/en/java/javase/20/docs/api/java.desktop/javax/swing/text/html/HTMLDocument.html#getIterator(javax.swing.text.html.HTML.Tag)) method can also be used for finding all occurrences of the specified HTML tag in the document.`
> 
> which I think is what we get with the fix: an `Iterator` object positioned at the next block tag.

Really? All the evidence I provided above shows that the current implementation *skips* block tags: it takes into account only elements for which the [`isLeaf`](https://github.com/openjdk/jdk/blob/83d92672d4c2637fc37ddd873533c85a9b083904/src/java.desktop/share/classes/javax/swing/text/AbstractDocument.java#L2636) method returns `true`, that is, `LeafElement`s. Block tags are represented by `BranchElement`, whose [`isLeaf`](https://github.com/openjdk/jdk/blob/83d92672d4c2637fc37ddd873533c85a9b083904/src/java.desktop/share/classes/javax/swing/text/AbstractDocument.java#L2481) method returns `false`.
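
To make the leaf/branch distinction concrete, here is a minimal sketch that parses a small document and walks it with the generic `ElementIterator`, counting `<p>` elements by what `isLeaf` reports. It deliberately does not call `HTMLDocument.getIterator`, since its behaviour for block tags is what is under discussion. The class name `BlockTagWalk`, the helper `countParagraphs`, and the sample markup are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.Element;
import javax.swing.text.ElementIterator;
import javax.swing.text.StyleConstants;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper class, not part of the JDK or of the PR.
class BlockTagWalk {

    // Counts elements tagged <p> whose isLeaf() result equals wantLeaf.
    static int countParagraphs(String html, boolean wantLeaf) {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        try {
            kit.read(new StringReader(html), doc, 0);
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException(e);
        }

        int count = 0;
        // The generic iterator visits every element, branch or leaf.
        ElementIterator it = new ElementIterator(doc);
        for (Element e = it.next(); e != null; e = it.next()) {
            Object tag = e.getAttributes().getAttribute(StyleConstants.NameAttribute);
            if (HTML.Tag.P.equals(tag) && e.isLeaf() == wantLeaf) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String html = "<html><body><p>one</p><p>two</p></body></html>";
        System.out.println("non-leaf <p>: " + countParagraphs(html, false));
        System.out.println("leaf <p>:     " + countParagraphs(html, true));
    }
}
```

Any `<p>` found this way is a branch element, so a filter that stops only where `isLeaf` returns `true` can never land on it.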

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14466#discussion_r1230852394