<Swing Dev> OOM error parsing HTML with large <pre> Tag text

Mon Mar 16 17:29:12 UTC 2020

Hi, Kenny.

Thank you for your report. Could you please file the bug here:
https://bugs.java.com/bugdatabase

On 3/16/20 10:08 am, Kenny Wong wrote:
> Hello,
> 
> After upgrading from Java 8 to 11, we started seeing OOM errors when parsing HTML files that contain a <pre> tag with a large amount of text. The test program below works fine under Java 8, but terminates with an OutOfMemoryError under Java 9 or later. If the <pre> tag text is *not* separated by '\n' characters, it works in all versions of Java.
> 
> We have tracked down the problem to a change originally committed to Java 9:
> https://github.com/openjdk/jdk/commit/17679435a174f6a7f0e450309dc8775e77df968a
> 
> Reverting the above change or replacing `Arrays.copyOf(txt, txt.length)` with `Arrays.copyOfRange(txt, offs, offs + length)` fixes the OOM error.
> 
> Thank you,
> Kenny Wong
> 
> --- 8< ---
> import java.io.StringReader;
> 
> import javax.swing.text.html.HTMLEditorKit;
> 
> public class Test {
>      public static void main(String[] args) throws Exception {
>          StringBuilder html = new StringBuilder();
>          html.append("<html><body><pre>");
>          for (int i = 0; i < 10_000; i++) {
>              html.append("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")
>                  .append("\n");
>          }
>          html.append("</pre></body></html>");
> 
>          HTMLEditorKit kit = new HTMLEditorKit();
>          kit.read(new StringReader(html.toString()), kit.createDefaultDocument(), 0);
>      }
> }
> ---
> 
> --- 8< ---
> $ java --version
> openjdk 11.0.3 2019-04-16 LTS
> OpenJDK Runtime Environment Zulu11.31+11-CA (build 11.0.3+7-LTS)
> OpenJDK 64-Bit Server VM Zulu11.31+11-CA (build 11.0.3+7-LTS, mixed mode)
> 
> -bash3.2$ java Test.java
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>      at java.base/java.util.Arrays.copyOf(Arrays.java:3841)
>      at java.desktop/javax.swing.text.DefaultStyledDocument$ElementSpec.<init>(DefaultStyledDocument.java:1267)
>      at java.desktop/javax.swing.text.html.HTMLDocument$HTMLReader.addContent(HTMLDocument.java:3909)
>      at java.desktop/javax.swing.text.html.HTMLDocument$HTMLReader.addContent(HTMLDocument.java:3883)
>      at java.desktop/javax.swing.text.html.HTMLDocument$HTMLReader.preContent(HTMLDocument.java:3787)
>      at java.desktop/javax.swing.text.html.HTMLDocument$HTMLReader.handleText(HTMLDocument.java:2766)
>      at java.desktop/javax.swing.text.html.parser.DocumentParser.handleText(DocumentParser.java:271)
>      at java.desktop/javax.swing.text.html.parser.Parser.handleText(Parser.java:409)
>      at java.desktop/javax.swing.text.html.parser.Parser.endTag(Parser.java:524)
>      at java.desktop/javax.swing.text.html.parser.Parser.parseTag(Parser.java:1934)
>      at java.desktop/javax.swing.text.html.parser.Parser.parseContent(Parser.java:2195)
>      at java.desktop/javax.swing.text.html.parser.Parser.parse(Parser.java:2372)
>      at java.desktop/javax.swing.text.html.parser.DocumentParser.parse(DocumentParser.java:135)
>      at java.desktop/javax.swing.text.html.parser.ParserDelegator.parse(ParserDelegator.java:113)
>      at java.desktop/javax.swing.text.html.HTMLEditorKit.read(HTMLEditorKit.java:263)
>      at Test.main(Test.java:16)
> ---
> 
> 
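
To illustrate why the two calls mentioned in the report behave so differently, here is a rough sketch. It is not the JDK code. It assumes that the whole <pre> text arrives as one shared buffer and that one copy is made per '\n'-terminated line, which is what the stack trace and the suggested fix appear to imply. The class and method names (CopySketch, copiedWithCopyOf, copiedWithCopyOfRange) are made up for this example.

```java
import java.util.Arrays;

public class CopySketch {
    // Hypothetical helper: copies the WHOLE buffer once per line,
    // like Arrays.copyOf(txt, txt.length), and counts the chars copied.
    static long copiedWithCopyOf(char[] txt, int lineLength) {
        long total = 0;
        for (int offs = 0; offs + lineLength <= txt.length; offs += lineLength) {
            char[] copy = Arrays.copyOf(txt, txt.length);
            total += copy.length;
        }
        return total;
    }

    // Hypothetical helper: copies only the current line's slice,
    // like Arrays.copyOfRange(txt, offs, offs + length).
    static long copiedWithCopyOfRange(char[] txt, int lineLength) {
        long total = 0;
        for (int offs = 0; offs + lineLength <= txt.length; offs += lineLength) {
            char[] copy = Arrays.copyOfRange(txt, offs, offs + lineLength);
            total += copy.length;
        }
        return total;
    }

    public static void main(String[] args) {
        int lines = 100;      // kept small so the sketch runs quickly
        int lineLength = 63;  // 62 characters plus '\n', as in the test program
        char[] txt = new char[lines * lineLength];
        System.out.println("copyOf:      " + copiedWithCopyOf(txt, lineLength));
        System.out.println("copyOfRange: " + copiedWithCopyOfRange(txt, lineLength));
    }
}
```

Under these assumptions the copyOf variant grows quadratically with the number of lines, while the copyOfRange variant stays linear. For the 10,000 lines of 63 chars in the test program that would be 10,000 copies of a 630,000-char buffer if all of them are kept alive, which would be consistent with the heap being exhausted.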

-- 
Best regards, Sergey.